Proposed minor spec changes, for comment.

defdefred defdefred at protonmail.com
Tue May 19 10:35:26 BST 2020


On Tuesday 19 May 2020 09:20, solderpunk <solderpunk at SDF.ORG> wrote:
> I don't think it's viable for interactive user clients (especially light
> and simple ones) to attempt this, but in the context of, say, a search
> engine which really wants to categorise everything (which is not to say
> that GUS necessarily has to shoulder this burden!), even distinguishing
> languages with the same alphabet is possible by looking at bigram and
> trigram frequencies if there's enough text. German text will have many
> more occurrences of "lich" and "heit" than French or Spanish, etc.
Agreed, and French has éèà, Spanish has ñ¿ and German has ß :-)
Nice to have UTF-8 to display all of them in the same document...
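
For what it's worth, here is a rough sketch of the idea discussed above:
count a handful of marker n-grams and characters and pick the language
with the most hits. The MARKERS table and the guess_language() name are
made up for illustration; the marker lists are hand-picked, not derived
from any corpus, and a real classifier would compare full n-gram
frequency profiles over much more text.

```python
import unicodedata
from collections import Counter

# Hypothetical, hand-picked markers (an assumption for illustration only).
MARKERS = {
    "de": ["lich", "heit", "sch", "ß"],
    "fr": ["eau", "oi", "é", "è", "à"],
    "es": ["ción", "ñ", "¿", "ll"],
}


def ngrams(text, n):
    """Count every overlapping run of n characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def guess_language(text):
    """Return the language code with the most marker hits, or None."""
    # Compose accents (NFC) so that "e" + combining acute matches "é".
    text = unicodedata.normalize("NFC", text).lower()
    counts = Counter()
    for n in (1, 2, 3, 4):  # the markers above are 1 to 4 characters long
        counts.update(ngrams(text, n))
    scores = {
        lang: sum(counts[marker] for marker in markers)
        for lang, markers in MARKERS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("Die Freiheit ist natürlich eine Möglichkeit"))
    print(guess_language("Voilà le château près de l'eau"))
    print(guess_language("¿Dónde está la información sobre el niño?"))
```

With so few markers this is easily fooled by short or mixed-language
text, which fits the point that the approach only works "if there's
enough text".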


