Assuming disallow-all, and some research on robots.txt in Geminispace (Was: Re: robots.txt for Gemini formalised)

marc marcx2 at welz.org.za
Thu Nov 26 10:18:11 GMT 2020


Hello Christian

> One more thing I want to point out... copyright law isn't opt-in. It's opt-out.
> If you don't have a copyright statement or any other licensing information,
> then "all rights reserved" is automatically assumed, afaik. You can't just copy
> something just because the author didn't explicitly disallow you from doing that.

Yes - copyright legislation hasn't been repealed :-)

*But* by putting things on the web, the creator has granted the
world some implied license. The convention which has evolved for
the web is that without a robots.txt forbidding it, crawlers
are free to index and cache, and some other things too. The
boundaries of this are fuzzy, because the conditions weren't
stated at the outset.

But gemini isn't the web, and gemini is new, so maybe we can
do better and *not* rely on an implied license (all humans may
visit this capsule) plus a robots.txt that adds just one single
bit of extra information (autonomous software can crawl it too,
if not forbidden).
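
To make the difference concrete, here is a rough sketch of the two
defaults, using Python's standard urllib.robotparser. The function
name may_crawl and the default_allow flag are made up for this
illustration and are not part of any Gemini specification; the
example paths and the "examplebot" agent are hypothetical too.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_lines, agent, path, default_allow=True):
    """Decide whether `agent` may fetch `path`.

    robots_lines is the robots.txt content as a list of lines, or
    None when the capsule publishes no robots.txt at all.
    default_allow is a hypothetical switch: True models the web
    convention (no file means crawling is fine), False models the
    "assume disallow-all" idea from this thread.
    """
    if robots_lines is None:
        # No stated preference: fall back to the chosen default.
        return default_allow
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, path)


# A capsule that states a preference explicitly.
rules = ["User-agent: *", "Disallow: /private/"]

blocked = may_crawl(rules, "examplebot", "/private/diary.gmi")
open_page = may_crawl(rules, "examplebot", "/index.gmi")

# A capsule with no robots.txt, under each default.
web_style = may_crawl(None, "examplebot", "/index.gmi")
opt_in_style = may_crawl(None, "examplebot", "/index.gmi",
                         default_allow=False)
```

Only the "no robots.txt" case changes between the two defaults; a
capsule that does publish rules is treated the same either way.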

Many thoughtful people are hesitant to put their data
online - they fear that this may disadvantage them in
the future - maybe they worry about employer discrimination,
doxxing or biometric harvesting (from facial detail to writing
style) or things not yet invented.

Given that everybody has different tolerances, a mechanism
whereby people can state their preferences would be a good
thing.

Blindly copying the web robots.txt mechanism seems to be too
coarse, too vague, and too easily decoupled from the content
it is meant to cover.

regards

marc

-- CC-SA

