Some reading on IRIs and IDNs
Sean Conner
sean at conman.org
Wed Dec 9 22:17:49 GMT 2020
It was thus said that the Great Stephane Bortzmeyer once stated:
> On Wed, Dec 09, 2020 at 12:26:51AM -0500,
> Sean Conner <sean at conman.org> wrote
> a message of 73 lines which said:
>
> > It does have an RFC (RFC-3492) and said RFC does contain code for
> > encoding and decoding punycode (but it's in C, and the API is
> > ... not what I would define but it can be worked with).
>
> There is an implementation of Punycode in every standard library,
> whatever your language.
>
> > so a domain name like "納豆.english.sådär.مصر" is converted thusly:
>
> In Python (but it is as simple in any other language):
>
> >>> import codecs
> >>> print(codecs.encode("納豆.English.sådär.مصر", encoding="idna"))
> b'xn--99zt52a.English.xn--sdr-rlad.xn--wgbh1c'
>
> (Note that the encodings.idna library of Python standard library is
> limited to IDN v1.)
>
> So, almost nothing to do for the programmer. I don't agree with your
> assessment that IDN is simpler than IRI.

I'm sorry, but the two languages I work in do *not* have an implementation
of punycode in their standard library. I *was* able to find code for C
(from the RFC, so at least I know it works per the RFC) but could not
find any for Lua. There's a reason why I'm having to muck with this. The
API I have for C is *not* set up to handle domain names (breaking out the
labels, prepending or removing the "xn--", etc.).
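
To make that extra work concrete, here is a minimal sketch in Python (the
language of the quoted example) of the label handling a bare punycode
routine leaves to the caller: split the name on dots, encode only the
non-ASCII labels, and add or strip the "xn--" prefix. The function names are
hypothetical, and the sketch assumes the input is already normalized; it
skips the nameprep mapping and the length checks that a full IDNA
implementation performs.

```python
# Sketch only: wraps a raw punycode codec with per-label domain handling.
# Assumption: labels are already normalized (no nameprep / case mapping here).

ACE_PREFIX = "xn--"


def to_ascii_label(label: str) -> str:
    """Encode one label; pure-ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Decode one label if it carries the ACE prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def to_ascii_domain(name: str) -> str:
    """Break out the labels, encode each one, and rejoin with dots."""
    return ".".join(to_ascii_label(part) for part in name.split("."))


def to_unicode_domain(name: str) -> str:
    """Inverse of to_ascii_domain for names produced by it."""
    return ".".join(to_unicode_label(part) for part in name.split("."))


if __name__ == "__main__":
    print(to_ascii_domain("s\u00e5d\u00e4r.example"))  # xn--sdr-rlad.example
```

In C or Lua the same wrapper has to be written by hand around whatever
punycode encoder and decoder is available, which is the part the RFC sample
code does not provide.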

It's wonderful that the language you use comes with punycode support in
its standard library. Not all languages have that. I'm looking at the list
of clients [1] and there's one client written in a language I haven't heard
of before (Vala). Other languages used are Nim, Scheme and Tcl. I would be
surprised if Vala or Nim has a punycode implementation.

-spc (But hey, write your own client that does everything you want and show
us all how easy it is)

[1] gemini://gemini.circumlunar.space/software/