robots.txt for Gemini formalised
Robert "khuxkm" Miles
khuxkm at tilde.team
Mon Nov 23 01:56:16 GMT 2020
November 22, 2020 6:02 PM, "Drew DeVault" <sir at cmpwn.com> wrote:
> Feedback:
>
> A web portal is a regular user agent, not a robot.
Just throwing in here for consideration that I agree with Drew: a proxy is not a robot by default. Are we implying that a browser must also follow robots.txt to be well-behaved? If so, I might just block AV-98 from reading my capsule. :)
What I would recommend in lieu of robots.txt proxy rules is making it the norm to serve a robots.txt on the web side of a proxy, to prevent web spiders from inadvertently crawling gemspace. For instance, proxy.vulpes.one blocks every robot user agent from indexing any part of the site.
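To illustrate the kind of web-side rule I mean, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the "ExampleBot" user agent, and the proxy.example URL are made up for illustration; I have not copied the actual file from any real proxy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served on the web (HTTP) side of a proxy.
# It asks every web robot to stay out of the whole site.
PROXY_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def web_spider_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are placeholders, not real services.
    allowed = web_spider_may_fetch(
        PROXY_ROBOTS_TXT,
        "ExampleBot",
        "https://proxy.example/gemini/some.capsule/",
    )
    print(allowed)  # False: the blanket Disallow covers every path
```

A well-behaved web spider that runs a check like this never requests proxied Gemini content, and no Gemini capsule has to mention the proxy in its own robots.txt.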
Is there any good use case for a proxy User-Agent in robots.txt, other than blocking web spiders from being able to crawl gemspace? If not, I would be in favor of dropping that part of the definition.
Just my two cents,
Robert "khuxkm" Miles