IDN with Gemini?
Sean Conner
sean at conman.org
Tue Dec 8 00:04:51 GMT 2020
It was thus said that the Great Côme Chilliet once stated:
> On Monday, 7 December 2020, 19:00:02 CET, colecmac at protonmail.com wrote:
> >
> > This would then require IRI parsing libraries, and as I have explained
> > earlier, these don't exist in likely many programming languages, and
> > when they do, they are third-party.
>
> From what you said on irc, the situation is different between URI and IRI
> because most languages have URI parsing either in their stdlib or in a
> well-tested, known library. But if no project uses IRI, of course no one
> will write a library for it, this is a chicken and egg situation here.
I'm looking at RFC-3987 [1] and the changes from RFC-3986 [2] are minimal,
and it would be easy to modify my own URI parsing library [3] (which is
based directly off the BNF of RFC-3986) but that only gets me so far. The
other issue is Unicode normalization and Punycode support; for both I would
have to track down existing libraries or (and I shudder to think this)
write my own.
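
To make the normalization and Punycode step concrete, here is a minimal sketch in Python rather than Lua, assuming the standard library's "idna" codec is acceptable. That codec implements the older IDNA 2003 rules (RFC 3490), and the helper name host_to_ascii and the sample hostname are made up for illustration.

```python
import unicodedata

def host_to_ascii(host: str) -> str:
    """Hypothetical helper: turn a Unicode hostname into its ASCII form.

    The name is first normalized to NFC, then each label is encoded by
    Python's built-in "idna" codec (IDNA 2003), which emits "xn--" labels
    for anything that is not plain ASCII.
    """
    normalized = unicodedata.normalize("NFC", host)
    return normalized.encode("idna").decode("ascii")

# A precomposed and a decomposed spelling of the same hostname.
precomposed = "b\u00fccher.example"   # u-umlaut as one code point
decomposed = "bu\u0308cher.example"   # u followed by a combining diaeresis

ascii_host = host_to_ascii(precomposed)
same_result = host_to_ascii(decomposed) == ascii_host
```

Languages without such a codec in their standard library would need a third-party Punycode and normalization package, which is the dependency problem described above.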
> Also, for the purpose of a client, it seems to me the parsing needed
> (domain and query extraction) is only to search for the first "/" and the
> last "?", and some minor tweaks on the scheme maybe (which does not
> contain unicode, I will leave the scheme alone, promise).
And then do some Unicode normalization to match how filenames are stored
on your server:
http://www.example.org/résumé.html
http://www.example.org/résumé.html
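
Two URLs like these can render identically and still differ in code points, for instance when one uses a precomposed "é" and the other an "e" followed by a combining accent. A rough sketch of the comparison a server would need, again in Python and assuming NFC as the common form (the helper same_path is hypothetical):

```python
import unicodedata

# The same path written in the two common normalization forms.
path_nfc = unicodedata.normalize("NFC", "/r\u00e9sum\u00e9.html")
path_nfd = unicodedata.normalize("NFD", "/r\u00e9sum\u00e9.html")

def same_path(a: str, b: str) -> bool:
    """Hypothetical helper: compare two paths after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A plain string comparison sees two different paths ...
naive_match = path_nfc == path_nfd
# ... while the normalized comparison treats them as the same file.
normalized_match = same_path(path_nfc, path_nfd)

# The difference is visible in the UTF-8 byte lengths as well.
byte_lengths = (len(path_nfc.encode("utf-8")), len(path_nfd.encode("utf-8")))
```

Which form to normalize to depends on how the filesystem stores names, so NFC here is only an assumption.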
-spc
[1] https://tools.ietf.org/html/rfc3987
[2] https://tools.ietf.org/html/rfc3986
[3] https://github.com/spc476/LPeg-Parsers/blob/master/url.lua