Gemini Archiving and WARC

Caranatar caranatar at riseup.net
Fri Sep 4 04:54:08 BST 2020


This seems like an incredibly cynical and myopic take. It's also
expected that everything on the internet will track you, will be
constantly expanded for the purpose of commercialization instead of user
experience, etc. Yet Gemini purposefully rejects those notions in
favor of something better. The idea that the same shouldn't apply here
is odd.

-caranatar


Tom writes:

> On Wed, 02 Sep 2020 01:23:22 +0000
> acdw <acdw at acdw.net> wrote:
>
>> On 2020-09-01 (Tuesday) at 23:43, Charles E. Lehner
>> <cel at celehner.com> wrote:
>> 
>> > Hi Gemini List,
>> > 
>> > Has anyone thought about, or implemented, archiving of Gemini
>> > content/traffic?
>> > 
>> > WARC (Web ARChive)¹ is a standard format used for web archiving. It 
>> > uses text headers for metadata like in HTTP and email. It looks to
>> > me like WARC could be adapted for Gemini. The WARC spec supports
>> > multiple URI schemes, although it doesn't specify any other than
>> > http/https, ftp, and dns². Bespoke formats could also be used, of
>> > course, or just downloading files wget-style, but using a standard
>> > format could allow for interop with "the WARC ecosystem"³.
>> > 
>> > Archive Team⁴ has also worked on archiving non-HTTP protocols like
>> > FTP⁵ and Gopher⁶.
>> > 
>> > I think there is an opportunity for people to maintain high-quality 
>> > archives of Gemini content, like what the Internet Archive⁷ and 
>> > archive.today⁸ do for the HTTP(S) Web. Now is a good time to start, 
>> > while many of the original Gemini hosts⁹ are still online.
>> > 
>> > Regards,
>> > Charles E. Lehner
>> > 
>> > ¹ https://en.wikipedia.org/wiki/Web_ARChive
>> > ² https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1/#ftp-scheme
>> > ³ https://www.archiveteam.org/index.php?title=The_WARC_Ecosystem
>> > ⁴ https://www.archiveteam.org/
>> >   https://en.wikipedia.org/wiki/Archive_Team
>> > ⁵ https://www.archiveteam.org/index.php?title=FTP
>> > ⁶ https://www.archiveteam.org/index.php?title=Gopher
>> > ⁷ https://en.wikipedia.org/wiki/Internet_Archive
>> >   https://archive.org/
>> > ⁸ https://archive.today
>> >   https://en.wikipedia.org/wiki/Archive.today
>> > ⁹ gemini://gemini.circumlunar.space/servers/
>> >  
>> 
>> I personally think this is a great idea, but I know some might not be
>> so on board with it. I'm thinking of solderpunk's post (in their
>> gopherhole, actually):
>> gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/the-individual-archivist-and-ghosts-of-gophers-past.txt
>> 
>> So is there a way to opt-out of archiving for publishers? Some in the
>> community might want to know about it, though I personally am of the
>> opinion that if you've published it, it's now the property of the
>> commons.
>> 
>
> Once you publish something to the internet there is no retracting it.
> This is one of the first things I was taught the first time I used the
> net, alongside never using your real name on the net unless you're
> publishing something.

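For anyone wondering what "adapting WARC for Gemini" from Charles's
original message might look like in practice, here is a rough sketch in
Python. It wraps one raw Gemini response (status line plus body) in a
WARC/1.1-style "response" record. Treat it as an illustration only: the
function name build_warc_record is made up, gemini://example.org/ is a
placeholder, and the Content-Type value "application/gemini;
msgtype=response" is an assumption modelled on what WARC uses for HTTP.
The WARC spec does not define a Gemini media type.

```python
import uuid
from datetime import datetime, timezone

# ASSUMPTION: no media type for raw Gemini traffic is defined by the WARC
# spec; this value is hypothetical, modelled on "application/http".
GEMINI_RESPONSE_TYPE = "application/gemini; msgtype=response"


def build_warc_record(target_uri, raw_response, when=None):
    """Wrap raw Gemini response bytes in a WARC/1.1-style response record."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", when.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", GEMINI_RESPONSE_TYPE),
        ("Content-Length", str(len(raw_response))),
    ]
    # Version line, named fields, then a blank line before the block.
    head = "WARC/1.1\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers)
    head += "\r\n"
    # Each record ends with two CRLFs after the content block.
    return head.encode("utf-8") + raw_response + b"\r\n\r\n"


if __name__ == "__main__":
    # A Gemini response is "<status> <meta>\r\n" followed by the body.
    response = b"20 text/gemini\r\n# Hello, Geminispace\n"
    record = build_warc_record("gemini://example.org/", response)
    print(record.decode("utf-8"))
```

Whether existing WARC tools would accept such records is untested here;
that depends on how each tool handles URI schemes and content types it
does not know about.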

-- 
sent from emacs using mu4e

