[tech] [psa] robustness (was Re: [ANN] A Gemini crawler, for statistics about the geminispace)

Petite Abeille petite.abeille at gmail.com
Tue Dec 22 15:26:02 GMT 2020



> On Dec 22, 2020, at 13:53, Peter Vernigorov <pitr.vern at gmail.com> wrote:
> 
> I don’t think details are boring here.

Really? Didn't get that vibe. Anyway.

> Would you mind listing some of the problems and possible solutions/workarounds?

Two simple examples: resource exhaustion and content poisoning. 

Exhaustion is the most trivial one. The adversary (a technical term, not a value judgment) tries to slow you down, fill you up, or push you into various limits on your side. E.g. throttled connections, infinite output, hung connections, any combination of the above, etc.

This is easy to deal with: always limit everything you do, be it reading, writing, waiting, computing, whatnot. Eventually something will reach these limits and let you out of the trap. You can then mark the site as hostile and/or dysfunctional. Not always clear which one is which: incompetence or malice. 
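A minimal sketch of "limit everything", in Python, for reading a response body. The names (read_capped, LimitExceeded) and the default limits are made up for illustration, not taken from any existing client. It caps both the total bytes and the total time spent:

```python
import time


class LimitExceeded(Exception):
    """Raised when the peer goes over one of our self-imposed limits."""


def read_capped(read_chunk, max_bytes=1 << 20, max_seconds=10.0,
                clock=time.monotonic):
    """Read until EOF, giving up once max_bytes or max_seconds is exceeded.

    read_chunk is any callable taking a size and returning bytes,
    e.g. socket.recv or the read method of a binary file object.
    """
    deadline = clock() + max_seconds
    chunks, total = [], 0
    while True:
        # Overall deadline: catches slow-drip (throttled) output.
        if clock() > deadline:
            raise LimitExceeded("response took too long")
        chunk = read_chunk(4096)
        if not chunk:  # EOF, the peer closed the connection
            return b"".join(chunks)
        total += len(chunk)
        # Size cap: catches infinite output.
        if total > max_bytes:
            raise LimitExceeded("response too large")
        chunks.append(chunk)
```

The deadline is only checked between reads, so a connection that hangs inside a single read still needs a timeout on the socket itself (e.g. socket.settimeout).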

For example, assuming the network stack goes through, a client has to read at most the first 1024 + some bytes of a server response to figure out what to do. Nothing more. Don't expect a well-formed response line. Assert it. Always validate. Continuously. Drop the connection as soon as something is not right. Always remember what happened.
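As a rough Python sketch of that bounded read: the Gemini spec caps META at 1024 bytes, so a full header is at most 2 (status) + 1 (space) + 1024 + 2 (CRLF) bytes. The function and exception names are hypothetical, and the strictness (status digits 1-6, mandatory space, UTF-8 META) is one reading of the spec, not the only possible one:

```python
import re

MAX_META = 1024                     # META limit from the Gemini spec
MAX_HEADER = 2 + 1 + MAX_META + 2   # status + space + META + CRLF

# Two-digit status with first digit 1-6, one space, META without CR or LF.
HEADER_RE = re.compile(rb"([1-6][0-9]) ([^\r\n]*)")


class BadResponse(Exception):
    """The server sent something that is not a valid response header."""


def read_header(stream, limit=MAX_HEADER):
    """Read and validate one response header, never reading past limit bytes.

    stream is any binary file-like object, e.g. sock.makefile("rb").
    """
    line = stream.readline(limit)
    if not line.endswith(b"\r\n"):
        raise BadResponse("header too long, truncated, or missing CRLF")
    match = HEADER_RE.fullmatch(line[:-2])
    if match is None:
        raise BadResponse("malformed status line")
    try:
        meta = match.group(2).decode("utf-8")
    except UnicodeDecodeError:
        raise BadResponse("META is not valid UTF-8")
    return int(match.group(1)), meta
```

Anything that raises BadResponse gets the connection dropped and the host noted, as above.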

Of course, there are downsides to resource exhaustion for the adversary, as it's a sort of self-inflicted denial of service. Oh well.

Content poisoning is more fun. It can be anything from feeding you continuous junk (exhaustion + poisoning) to well-formed but ill-intentioned logic bombs, busy beavers, wild goose chases; the list goes on.

For example, a trivial chase is infinite redirects. Got to stop eventually. Another limit.
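A hedged Python sketch of such a redirect limit. Here fetch is a hypothetical callable returning (status, meta) for a URL, the default of 5 is an arbitrary choice, and META is assumed to already be an absolute URL (a real client would have to resolve relative references first):

```python
class TooManyRedirects(Exception):
    """The redirect chain looped or exceeded our limit."""


def resolve(url, fetch, max_redirects=5):
    """Follow 3x redirects, at most max_redirects of them.

    Returns (final_url, status, meta) for the first non-redirect response.
    """
    seen = {url}
    for _ in range(max_redirects + 1):
        status, meta = fetch(url)
        if not 30 <= status <= 39:
            return url, status, meta
        url = meta  # assumption: META holds an absolute URL
        if url in seen:
            raise TooManyRedirects("redirect loop at " + url)
        seen.add(url)
    raise TooManyRedirects("more than %d redirects" % max_redirects)
```

The seen set catches short loops early; the counter catches chains that never repeat a URL.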

Another one could be well-formed text/gemini, but with junky links. Same as above.

Again this is easy to identify statistically, marking the adversary as dysfunctional. 
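One simple way to do that bookkeeping, sketched in Python. The class name, the thresholds, and the plain failure-ratio rule are all assumptions chosen for illustration; anything smarter would work too:

```python
from collections import defaultdict


class HostLedger:
    """Remember per-host outcomes and flag hosts that mostly misbehave."""

    def __init__(self, min_samples=20, max_failure_ratio=0.5):
        self.min_samples = min_samples
        self.max_failure_ratio = max_failure_ratio
        self.counts = defaultdict(lambda: [0, 0])  # host -> [failures, total]

    def record(self, host, ok):
        """Note one request to host; ok is False for timeouts, junk, etc."""
        entry = self.counts[host]
        entry[1] += 1
        if not ok:
            entry[0] += 1

    def is_dysfunctional(self, host):
        """True once enough samples exist and too many of them failed."""
        failures, total = self.counts.get(host, (0, 0))
        if total < self.min_samples:
            return False  # not enough evidence yet
        return failures / total > self.max_failure_ratio
```

The ledger deliberately does not try to tell malice from incompetence; it only records that the host is not worth the trouble.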

You can then move on, or retaliate, depending on the mood.

User-agents could also federate such information and use them in meaningful, if ominous, ways.

This is not a one-way street: user-agents, especially bots, can do a lot of damage at scale.

Always keep in mind Hanlon's razor: "never attribute to malice that which is adequately explained by stupidity".

https://en.wikipedia.org/wiki/Hanlon%27s_razor

Just my 2¢. Have fun.




