Crawlers on Gemini and best practices

Stephane Bortzmeyer stephane at sources.org
Thu Dec 10 14:09:10 GMT 2020


On Thu, Dec 10, 2020 at 03:00:28PM +0100,
 Petite Abeille <petite.abeille at gmail.com> wrote 
 a message of 24 lines which said:

> Perhaps of interest:

Not exactly the same thing, since my email was about the order of
User-agent groups (when there is both "*" and "archiver") but, yes,
the Robots Exclusion Standard is a mess.
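One concrete illustration of how parser-dependent this is: CPython's
standard-library urllib.robotparser sets the "*" group aside as a
default and consults it only when no specific group matched, so for
that parser the textual order of "*" and "archiver" makes no
difference. Other robots may well scan groups in file order. The file
contents and bot names below are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt bodies with the same two groups in
# opposite order: a catch-all "*" group forbidding everything, and an
# "archiver" group allowing everything (an empty Disallow allows all).
STAR_FIRST = """\
User-agent: *
Disallow: /

User-agent: archiver
Disallow:
"""

ARCHIVER_FIRST = """\
User-agent: archiver
Disallow:

User-agent: *
Disallow: /
"""

def verdicts(robots_txt):
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return (rp.can_fetch("archiver", "/page.html"),
            rp.can_fetch("otherbot", "/page.html"))

# With CPython's parser, group order does not change the answer:
# "archiver" is allowed and "otherbot" is not, in both files.
print(verdicts(STAR_FIRST))      # (True, False)
print(verdicts(ARCHIVER_FIRST))  # (True, False)
```

Whether a given crawler behaves this way is exactly what the standard
leaves unspecified.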

> In order to be compatible to all robots, if one wants to allow
> single files inside an otherwise disallowed directory, it is
> necessary to place the Allow directive(s) first, followed by the
> Disallow.

Note that Allow is not even standard;
<http://www.robotstxt.org/robotstxt.html> says:

    there is no "Allow" field.
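For parsers that do honor Allow, the quoted advice about putting it
first can matter because some of them apply the first rule whose path
prefix matches. CPython's urllib.robotparser is one such
first-match-wins implementation; the robots.txt bodies and paths below
are made up for the sketch.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt bodies; only the Allow/Disallow order
# differs.  The intent is to expose one file inside a blocked directory.
ALLOW_FIRST = """\
User-agent: *
Allow: /private/public.html
Disallow: /private/
"""

DISALLOW_FIRST = """\
User-agent: *
Disallow: /private/
Allow: /private/public.html
"""

def can_fetch(robots_txt, path):
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch("somebot", path)

# First matching prefix wins in this parser, so only the Allow-first
# file actually exposes the single file.
print(can_fetch(ALLOW_FIRST, "/private/public.html"))     # True
print(can_fetch(DISALLOW_FIRST, "/private/public.html"))  # False
print(can_fetch(ALLOW_FIRST, "/private/secret.html"))     # False
```

A longest-match parser (as Google's documentation describes) would
allow the file in both orders, which is why the quoted text frames
Allow-first as the lowest common denominator.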
