[ANN] A Gemini crawler, for statistics about the geminispace

Stephane Bortzmeyer stephane at sources.org
Fri Dec 18 09:13:03 GMT 2020


On Wed, Dec 16, 2020 at 06:16:53PM -0500,
 Sean Conner <sean at conman.org> wrote 
 a message of 27 lines which said:

> > You can find the current results (the crawler did not crawl the entire
> > space yet):
> > 
> > gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi

>   One stat I haven't seen yet (yours or from GUS) is a breakdown by
> language.  How many pages had a lang parameter, then a breakdown by
> language, and how many had multiple languages per parameter (for
> example, "lang=en,fr").

Just ask :-) Now done:

gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi
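
For illustration, here is a minimal sketch of how such a breakdown can
be computed from the META field of successful responses (for example
"text/gemini; lang=en,fr"). It is not necessarily how the crawler does
it: the function names lang_tags and lang_stats are hypothetical, and so
are the sample META values at the end.

```python
from collections import Counter


def lang_tags(meta):
    """Return the language tags of a META field, or [] if it has no lang parameter."""
    # META is a media type followed by optional ';'-separated parameters.
    _, *params = meta.split(";")
    for param in params:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "lang":
            # The value may be quoted and may list several comma-separated tags.
            tags = value.strip().strip('"').split(",")
            return [t.strip().lower() for t in tags if t.strip()]
    return []


def lang_stats(metas):
    """Count pages with a lang parameter, pages with several tags, and pages per tag."""
    with_lang = 0
    multi = 0
    per_language = Counter()
    for meta in metas:
        tags = lang_tags(meta)
        if tags:
            with_lang += 1
        if len(tags) > 1:
            multi += 1
        per_language.update(tags)
    return with_lang, multi, per_language


if __name__ == "__main__":
    # Hypothetical sample, not real crawl data.
    sample = [
        "text/gemini; lang=en",
        "text/gemini; lang=en,fr",
        "text/gemini",
        "text/gemini; charset=utf-8; lang=FR",
    ]
    with_lang, multi, per_language = lang_stats(sample)
    print(with_lang, multi, per_language.most_common())
```

Tags are lowercased before counting, so "FR" and "fr" land in the same
bucket; a page declaring several languages is counted once for each of
them.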

I note:

* French is the second language after English. Cocorico, as we say in
France.

* There is one page in Finnish.

* There are more HTML than Markdown pages in the geminispace, which I
find surprising.

* There is one page in EBCDIC and one in CP-437 :-)

