[ANN] A Gemini crawler, for statistics about the geminispace
Stephane Bortzmeyer
stephane at sources.org
Fri Dec 18 16:08:41 GMT 2020
On Fri, Dec 18, 2020 at 12:12:47PM +0000,
Luke Emmet <luke at marmaladefoo.com> wrote
a message of 23 lines which said:
> > gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi
> Could it be possible to show the distribution of page sizes in geminispace?
Like this (the page was updated)?
* Less than 1 kbyte: 18465 URLs (48.7 %)
* 1 to 1000 kbytes: 15865 URLs (41.9 %)
* More than 1000 kbytes: 3559 URLs (9.4 %)
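For illustration only, here is a minimal sketch of how such a distribution could be computed from a list of page sizes in bytes. It is not taken from the crawler's code: the names `bucket` and `size_distribution`, the choice of 1 kbyte = 1000 bytes, and the handling of the exact boundaries are all assumptions.

```python
from collections import Counter

KBYTE = 1000  # assumption: 1 kbyte = 1000 bytes (not 1024)


def bucket(size_bytes):
    """Return the label of the size class for one page (hypothetical helper)."""
    if size_bytes < KBYTE:
        return "Less than 1 kbyte"
    if size_bytes <= 1000 * KBYTE:
        return "1 to 1000 kbytes"
    return "More than 1000 kbytes"


def size_distribution(sizes):
    """Map each size class to (number of URLs, percentage of all URLs)."""
    counts = Counter(bucket(s) for s in sizes)
    total = sum(counts.values())
    # An empty input yields an empty dict, so there is no division by zero.
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


if __name__ == "__main__":
    # Made-up sizes, only to show the output format.
    for label, (n, pct) in size_distribution([120, 850, 4096, 2_500_000]).items():
        print(f"* {label}: {n} URLs ({pct:.1f} %)")
```

Whether a page of exactly 1 kbyte or exactly 1000 kbytes falls in the middle class is an arbitrary choice here.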
> Is there any raw data available?
The code is available. For the data, I haven't decided yet. True, it is
only public data, and it does not even include the content of the pages,
but I don't know yet whether there is some privacy/ethical problem. Let
me check.