Some reading on IRIs and IDNs

Sean Conner sean at conman.org
Wed Dec 9 22:17:49 GMT 2020


It was thus said that the Great Stephane Bortzmeyer once stated:
> On Wed, Dec 09, 2020 at 12:26:51AM -0500,
>  Sean Conner <sean at conman.org> wrote 
>  a message of 73 lines which said:
> 
> > It does have an RFC (RFC-3492) and said RFC does contain code for
> > encoding and decoding punycode (but it's in C, and the API is
> > ... not what I would define but it can be worked with).
> 
> There is an implementation of Punycode in every standard library,
> whatever your language.
> 
> > so a domain name like "納豆.english.sådär.مصر" is converted thusly:
> 
> In Python (but it is as simple in any other language):
> 
> >>> import codecs
> >>> print(codecs.encode("納豆.English.sådär.مصر", encoding="idna"))
> b'xn--99zt52a.English.xn--sdr-rlad.xn--wgbh1c'
> 
> (Note that the encodings.idna library of Python standard library is
> limited to IDN v1.)
> 
> So, almost nothing to do for the programmer. I don't agree with your
> assessment that IDN is simpler than IRI.

  I'm sorry, but the two languages I work in do *not* have an implementation
of punycode in their standard library.  I *was* able to find code for C
(from the RFC, so at least I know it behaves as the RFC specifies) but could
not find any for Lua.  There's a reason why I'm having to muck with this.
The API I have for C is *not* set up to handle domain names (breaking out
the labels, prepending or removing the "xn--", etc.).
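
  To make that extra label handling concrete, here is a minimal sketch of
the wrapper a bare Punycode routine needs.  It is written in Python only
because the quoted example uses it, and it leans on Python's raw "punycode"
codec as a stand-in for the RFC's C encoder.  The function names
to_ascii_domain and to_unicode_domain are made up for illustration, and the
sketch assumes the input is already normalised (NFC); it does none of the
Nameprep, case mapping or length checks that real IDNA processing requires.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ascii_domain(name: str) -> str:
    """Hypothetical helper: encode each non-ASCII label of a domain name."""
    out = []
    for label in name.split("."):          # break out the labels
        if label.isascii():
            out.append(label)              # plain ASCII labels pass through
        else:
            encoded = label.encode("punycode").decode("ascii")
            out.append(ACE_PREFIX + encoded)   # prepend the "xn--"
    return ".".join(out)


def to_unicode_domain(name: str) -> str:
    """Hypothetical helper: decode each "xn--" label of a domain name."""
    out = []
    for label in name.split("."):
        if label[:4].lower() == ACE_PREFIX:
            raw = label[4:].encode("ascii")    # remove the "xn--"
            out.append(raw.decode("punycode"))
        else:
            out.append(label)
    return ".".join(out)


if __name__ == "__main__":
    ascii_name = to_ascii_domain("納豆.English.sådär.مصر")
    print(ascii_name)
    print(to_unicode_domain(ascii_name))
```

  None of this is hard, but it is the part that has to be written by hand
(plus the normalisation step skipped here) when the language only offers the
raw encoder, or nothing at all.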

  It's wonderful that the language you use comes with punycode support in
its standard library.  Not all languages have that.  I'm looking at the list
of clients [1] and there's one client written in a language I haven't heard
of before (Vala).  Other languages used are Nim, Scheme and Tcl.  I would be
surprised if Vala or Nim have a punycode implementation.

  -spc (But hey, write your own client that does everything you want and show
	us all how easy it is)

[1]	gemini://gemini.circumlunar.space/software/

