Crawlers on Gemini and best practices

Stephane Bortzmeyer stephane at sources.org
Tue Dec 8 13:36:56 GMT 2020


I have just developed a simple crawler for Gemini. Its goal is not to
build another search engine but to run surveys of geminispace. A
typical result looks something like this (real data, but limited in
size):

gemini://gemini.bortzmeyer.org/software/crawler/

I have not yet let it loose on the Internet, because I still have
some questions.

Is it "good practice" to follow robots.txt? There is no mention of it
in the specification, but it could work for Gemini as well as it does
for the Web, and I notice that some programs already request this name
on my server.
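
To make the robots.txt question concrete, here is a rough sketch of
what a crawler could do before fetching a page. It rests on several
assumptions that are not in the specification: that the file lives at
gemini://HOST/robots.txt, that it uses the same syntax as on the Web,
and that a missing file means "no restrictions". The agent name
"surveybot" and all function names are made up for illustration.
Certificate verification is turned off because many capsules use
self-signed certificates; a real crawler would probably want
trust-on-first-use instead.

```python
import socket
import ssl
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def fetch_gemini(url, timeout=10):
    """Send one Gemini request and return (header line, body bytes)."""
    parsed = urlparse(url)
    host, port = parsed.hostname, parsed.port or 1965
    ctx = ssl.create_default_context()
    # Assumption: accept any certificate (no TOFU store in this sketch).
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            # A Gemini request is the absolute URL followed by CRLF.
            tls.sendall((url + "\r\n").encode("utf-8"))
            data = b""
            while True:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    header, _, body = data.partition(b"\r\n")
    return header.decode("utf-8"), body


def is_allowed(robots_txt, url, agent="surveybot"):
    """Apply Web-style robots.txt rules to the path of a Gemini URL."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)


def may_crawl(url, agent="surveybot"):
    """Fetch the host's robots.txt, if any, and check the URL against it."""
    netloc = urlparse(url).netloc
    header, body = fetch_gemini(f"gemini://{netloc}/robots.txt")
    if not header.startswith("2"):
        # Assumption: any non-success status means no robots.txt.
        return True
    return is_allowed(body.decode("utf-8", errors="replace"), url, agent)
```

The parsing is done by Python's standard urllib.robotparser, which
only looks at the path of the URL, so it does not care that the scheme
is gemini rather than http. Whether reusing the Web rules unchanged is
the right policy is exactly the open question.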

Since Gemini (and rightly so) has no User-Agent, how can a bot
advertise its policy and a point of contact?
