WWW indexing concerns (was: Gemini Universal Search)

Sean Conner sean at conman.org
Wed Feb 26 19:29:59 GMT 2020


It was thus said that the Great Andrew Kennedy once stated:
> 
> So the issue here is that the only way to opt out of being indexed is to
> contact each proxy maintainer and request that they make accommodations
> for you. That's fine with only 15 or so gemini servers, but not fair to
> proxy maintainers as gemini grows. It's also not enough to ask all proxies
> to use robots.txt, because there's nothing stopping someone from ignoring
> it either out of ignorance or in bad faith.

  There are other ways.  One way is to recognize a proxy server and block
any requests from it.  I think it would be easy to recognize one because of
all the requests from a single IP address (or block of IP addresses).  The
blocking can be at a firewall level, or the gemini server could recognize
the IP (or IP block) and close the connection or return an error.  That can
be done now.
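
  A minimal sketch of the server-side check, assuming the operator has
already worked out which addresses belong to a proxy.  The names
BLOCKED_NETWORKS and is_blocked are hypothetical, and the addresses are
reserved documentation ranges standing in for real ones.  It only shows the
matching step; closing the connection or sending an error is left to the
server.

```python
import ipaddress

# Hypothetical block list.  These are documentation-only addresses standing
# in for a proxy's single address and for blocks of addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.15/32"),    # a single IPv4 address
    ipaddress.ip_network("198.51.100.0/24"),  # an IPv4 block
    ipaddress.ip_network("2001:db8::/32"),    # an IPv6 block
]

def is_blocked(peer_ip, networks=BLOCKED_NETWORKS):
    """Return True if peer_ip falls inside any listed address or block."""
    addr = ipaddress.ip_address(peer_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa), so
    # mixed lists are safe to scan.
    return any(addr in net for net in networks)
```

  A server would call is_blocked() with the peer address right after
accepting the connection and, if it returns True, drop the connection or
answer with an error before serving anything.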

  A second one is to extend robots.txt (or some other file) to indicate a
proxying preference, but then the proxy has to make multiple requests (or
maybe not, if caching information is included).  Heck, even a DNS record
could work (like a TXT RR with the contents "v=Gemini; proxy=no", with the
TTL of the DNS record being honored).  But that relies upon the good will of
the proxy to honor that data.
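
  A rough sketch of how a well-behaved proxy might read such a TXT record.
The "v=Gemini; proxy=no" format is only the suggestion above, not an
existing standard, and the names parse_gemini_txt and proxy_allowed are
hypothetical.  The DNS lookup itself and caching by TTL are left out; the
functions take the TXT strings as already fetched.

```python
def parse_gemini_txt(txt):
    """Parse 'key=value; key=value' text.

    Returns a dict of fields, or None if the record is not marked
    v=Gemini (for example an SPF record on the same name).
    """
    fields = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        fields[key.strip().lower()] = value.strip()
    if fields.get("v", "").lower() != "gemini":
        return None
    return fields

def proxy_allowed(txt_records):
    """Return False only if some v=Gemini record says proxy=no."""
    for txt in txt_records:
        fields = parse_gemini_txt(txt)
        if fields is not None and fields.get("proxy", "").lower() == "no":
            return False
    # No opinion published: this sketch assumes proxying is allowed.
    return True
```

  Defaulting to "allowed" when no record exists is a design choice made for
this sketch; a more cautious proxy could just as well default the other way.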

  Or your idea of just asking could work just as well.

  -spc


