WWW indexing concerns (was: Gemini Universal Search)

Sean Conner sean at conman.org
Thu Feb 27 04:48:29 GMT 2020


It was thus said that the Great Steve Ryan once stated:
> On 20/02/26 02:29PM, Sean Conner wrote:
> >   A second one is to extend robots.txt to indicate proxying preference, or
> > some other file, but then there are multiple requests (or maybe
> > not---caching information could be included).  Heck, even a DNS record (like
> > a TXT RR with the contents "v=Gemini; proxy=no" with the TTL of the DNS
> > record being honored).  But that relies upon the good will of the proxy to
> > honor that data.
> > 
> >   Or your idea of just asking could work just as well.
> 
> I'm of the opinion that either a robots.txt method or a TXT record will
> do for preventing spiders/proxies; I feel that anything stronger than
> assuming good faith will always lead to an arms race, and I'm not sure
> that, for this protocol, the servers have any chance of winning such a
> war against the clients.

  To that end, I have set up a TXT record for gemini.conman.org (a sketch
of how a proxy might consult it follows the field list):

	v=Gemini; proxy=no; webproxies=yes

	v=Gemini	- TXT record for Gemini

	proxy=no	- server does not support proxying requests
	proxy=yes	- server does support proxying requests

	webproxies=no	- please do not proxy this server via the web
	webproxies=yes	- web proxying is okay
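
  For illustration, here's roughly how a web proxy could consult the
record before proxying a request (a minimal sketch in Go, not anything I
actually run; treating a missing record as "no stated preference" and
allowing proxying by default is my assumption, not part of the proposal):

	package main

	import (
		"fmt"
		"net"
		"strings"
	)

	// webProxyAllowed reports whether the host's Gemini TXT record
	// permits proxying via the web.  A missing record is treated as
	// "no stated preference" and allowed by default (an assumption).
	func webProxyAllowed(host string) (bool, error) {
		records, err := net.LookupTXT(host)
		if err != nil {
			return true, err
		}
		for _, rec := range records {
			if !strings.HasPrefix(rec, "v=Gemini") {
				continue
			}
			for _, field := range strings.Split(rec, ";") {
				if strings.TrimSpace(field) == "webproxies=no" {
					return false, nil
				}
			}
		}
		return true, nil
	}

	func main() {
		ok, _ := webProxyAllowed("gemini.conman.org")
		fmt.Println("web proxying allowed:", ok)
	}

  Per the earlier discussion, the TTL of the TXT record bounds how long a
proxy may cache the answer before asking again.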

  Discussion, questions, concerns, etc. welcome.

> If something must be kept private from proxies or spiders, perhaps
> requiring a client certificate might be for the best? I'm sure someone
> cleverer than I could figure out a way to require human intervention in
> creating a cert to access a page.

  It's fairly easy, and I do have two directories that require client
certificates (a rough sketch of the server side follows the list):

	gemini://gemini.conman.org/private	- any client certificate
	gemini://gemini.conman.org/conman-labs-private - particular client certificates required
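
  Roughly, the server side amounts to requesting (but not verifying) a
client certificate during the TLS handshake, then applying per-path
policy.  Here's a minimal sketch in Go (not how my server actually does
it; the certificate file names and the path check are placeholders):

	package main

	import (
		"bufio"
		"crypto/tls"
		"log"
		"strings"
	)

	func main() {
		// Hypothetical certificate paths.
		cert, err := tls.LoadX509KeyPair("cert.pem", "key.pem")
		if err != nil {
			log.Fatal(err)
		}
		ln, err := tls.Listen("tcp", ":1965", &tls.Config{
			Certificates: []tls.Certificate{cert},
			// Ask for a client certificate, but don't verify
			// it against a CA---Gemini client certificates are
			// typically self-signed.
			ClientAuth: tls.RequestClientCert,
		})
		if err != nil {
			log.Fatal(err)
		}
		for {
			conn, err := ln.Accept()
			if err != nil {
				continue
			}
			go handle(conn.(*tls.Conn))
		}
	}

	func handle(c *tls.Conn) {
		defer c.Close()
		// The first read drives the TLS handshake, after which
		// the peer certificate (if any) is available.
		request, err := bufio.NewReader(c).ReadString('\n')
		if err != nil {
			return
		}
		// Gate protected paths on the presence of any client
		// certificate; a stricter path would also compare the
		// certificate's fingerprint against a known list.
		if strings.Contains(request, "private") &&
			len(c.ConnectionState().PeerCertificates) == 0 {
			// Gemini status 60: client certificate required.
			c.Write([]byte("60 Client certificate required\r\n"))
			return
		}
		c.Write([]byte("20 text/gemini\r\n# Hello\r\n"))
	}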

  -spc
