Requests for robots.txt
solderpunk
solderpunk at SDF.ORG
Sun Mar 22 11:51:21 GMT 2020
On Sat, Mar 21, 2020 at 09:39:46PM -0400, Sean Conner wrote:
> I don't mind the crawling, but I am concerned about the references to
> robots.txt. In the web world, robots.txt lives at the top level and *only*
> at the top level. I don't think there's been an official response from
> solderpunk about robots.txt, but I would expect it to be very similar to how
> it works on the web---the top level only.
>
> But a clarification would be nice (either way). In my opinion, it should
> only live at the top level, but I can adapt to every "directory" as well.
This is nicely timed, actually, as things like robots.txt are now
looming larger on my personal radar than they have previously - with
CAPCOM I am writing for the first time a program which automatically
makes Gemini requests, and I'm very keen on making sure that it's a
"good citizen". There hasn't been too much overt discussion of good
Gemini citizenship yet, but now that non-human clients are becoming more
common, there should be. Robots.txt is obviously part of that package.
(It's *not* super relevant to feed aggregation, because nobody publishes
a feed without the expectation that it is read entirely by bots, but
other issues, especially rate limiting, are.)
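For concreteness, here is a rough sketch of what the "top level only" reading quoted above would mean for a bot: whatever URL it is about to request, it looks for robots.txt at the root of that host and nowhere else. Nothing here is settled for Gemini. The function name robots_url and the example hosts are made up for illustration, and the rule itself is only assumed to carry over from the web.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(url: str) -> str:
    """Return the top-level robots.txt URL for the host serving `url`.

    Assumption: Gemini follows the web convention of a single robots.txt
    at the root of each host. This is not (yet) specified for Gemini.
    """
    parts = urlsplit(url)
    # Keep the scheme and host (with port, if any); drop the path, query
    # and fragment, since only the root location is consulted.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(robots_url("gemini://example.org/users/alice/log/index.gmi"))
    print(robots_url("gemini://example.org:1965/feed.xml?page=2"))
```

Under the alternative per-"directory" reading, a bot would instead have to probe every parent path of the URL, which means several extra requests per host.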
It's been many years since I read any robots.txt specs from the web. I
will refresh my memory and start thinking about this, and asking
questions, in the hopes that we can finalise some stuff soon.
Cheers,
Solderpunk