[spec] IRIs, IDNs, and all that international jazz
Sean Conner
sean at conman.org
Wed Dec 23 21:59:44 GMT 2020
It was thus said that the Great Solderpunk once stated:
> Feedback welcome, especially if I've overlooked anything, which is
> certainly possible. What I'd be most interested in hearing, at this
> point, is client authors letting me know whether the standard library
> in the language their client is implemented in can straightforwardly:
>
> 1. Parse and relativise URLs with non-ASCII characters (so, yes, okay,
> technically not URLs at all, you know what I mean) in paths and/or
> domains?
> 2. Transform back and forth between URIs and IRIs?
> 3. Do DNS lookups of IDNs without them being punycoded first? You can
> test this with räksmörgås.josefsson.org.
For C, I'm sure there is code, somewhere, that can parse IRIs, but it's a
matter of finding it.
For Lua, the answers are:
1. Yes. I had to write some code [1][2], and modify some existing
code [3], but Lua now has modules to parse IRIs and URIs.
2. I can do IRI->URL, but not the other way---I have no need of a
URL->IRI conversion as of yet.
3. For my setups (systems I've been able to test), I cannot look up
IDNs as is---I *have* to convert to punycode first.
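
To make points 2 and 3 concrete, here is a rough sketch of the one-way
IRI->URI conversion and the punycode-before-lookup step. It is written in
Python rather than Lua and is not the code in [1], [2], or [3]. The names
host_to_ascii, iri_to_uri, and resolve are made up for illustration. It
assumes the standard library "idna" codec (IDNA 2003 rules) is good enough,
and it ignores userinfo and IPv6 literals.

```python
import socket
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding; "%" is kept so that
# escapes already present in the input are not encoded a second time.
_SAFE = "/:@!$&'()*+,;=%"


def host_to_ascii(host: str) -> str:
    """Punycode each label of an IDN (stdlib codec, IDNA 2003 rules)."""
    return host.encode("idna").decode("ascii")


def iri_to_uri(iri: str) -> str:
    """One-way IRI -> URI conversion; userinfo is ignored for brevity."""
    parts = urlsplit(iri)
    netloc = host_to_ascii(parts.hostname or "")
    if parts.port is not None:
        netloc += ":" + str(parts.port)
    return urlunsplit((
        parts.scheme,
        netloc,
        # quote() encodes non-ASCII text as UTF-8 before percent-escaping.
        quote(parts.path, safe=_SAFE),
        quote(parts.query, safe=_SAFE + "?"),
        quote(parts.fragment, safe=_SAFE + "?"),
    ))


def resolve(host: str, port: int = 1965):
    """Hypothetical helper: punycode first, then ask the resolver."""
    return socket.getaddrinfo(host_to_ascii(host), port)
```

Calling iri_to_uri("gemini://räksmörgås.josefsson.org/smörgås") should give
an all-ASCII URL whose host begins with "xn--" and whose path is
percent-encoded UTF-8. The reverse direction (URI->IRI) is deliberately left
out, matching what I said in point 2.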
-spc
[1] https://github.com/spc476/LPeg-Parsers/blob/master/iri.lua
[2] https://github.com/spc476/lua-conmanorg/blob/master/src/idn.c
[3] https://github.com/spc476/GLV-1.12556/blob/master/Lua/GLV-1/url-util.lua