Unicode vs. the World

Philip Linde linde.philip at gmail.com
Thu Dec 24 20:30:29 GMT 2020


On Fri, 18 Dec 2020 07:13:24 +0100
Katarina Eriksson <gmym at coopdot.com> wrote:

> Domain name resolution is outside of the scope of the Gemini specification,
> we don't know if it can handle UTF-8 or not. If the visitor's network
> administrator has set up name resolution to accept UTF-8, they should
> probably also accept the punycoded version for compatibility.

IDNA moves what would ideally be part of DNS into the application
layer; that is what the A (for "Applications") stands for. When the
standard was adopted, it was somehow decided that having every
application that uses hostnames implement IDNA was better than fixing
the underlying problem in DNS.

This probably helped early adoption, because ISPs could largely leave
their houses of cards untouched, but it puts more of a burden on
application developers, which is more expensive in the long run.

So no: the application itself has to support IDNA, at the very least.
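
As a rough illustration of what that means in practice, here is a
minimal Python sketch that converts a hostname to its ASCII form before
handing it to the resolver. It relies on Python's built-in "idna"
codec, which implements the older IDNA 2003 rules; the function name
to_ascii_host and the example hostnames are my own and not taken from
the Gemini specification.

```python
def to_ascii_host(hostname: str) -> str:
    """Return the ASCII (punycode) form of a possibly non-ASCII hostname."""
    # The built-in "idna" codec converts each label separately.
    # Assumption: IDNA 2003 behaviour is good enough for this sketch.
    return hostname.encode("idna").decode("ascii")


# ASCII-only names pass through unchanged.
print(to_ascii_host("example.org"))
# Non-ASCII labels get the "xn--" prefix.
print(to_ascii_host("bücher.example"))  # xn--bcher-kva.example
```

A client that wants current IDNA 2008 behaviour would need something
other than the built-in codec, so treat this as a starting point only.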

> Why can't servers just blindly accept non-ASCII bytes as is?

A fully compliant RFC 3986 implementation can't accept non-ASCII
characters. If that's what you have, you'll have to rewrite or replace
it. RFC 3987 covers this, but it's a bit more specific than blindly
accepting non-ASCII bytes. The section on the comparison ladder
(section 5.3) is a good read for an overview of what may need to be
implemented to avoid false negatives when matching.
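
To give one example of such a false negative: the same visible path
can arrive precomposed or decomposed, and the two differ byte for byte.
The Python sketch below normalizes a path to NFC and then
percent-encodes the non-ASCII characters as UTF-8 so that both forms
compare equal. The name comparison_key is hypothetical, and this covers
only one rung of the ladder; it does not do case or percent-escape
normalization.

```python
import unicodedata
from urllib.parse import quote


def comparison_key(path: str) -> str:
    """Map an IRI path to an ASCII-only form suitable for comparison."""
    # Unify precomposed and decomposed spellings of the same character.
    normalized = unicodedata.normalize("NFC", path)
    # Percent-encode non-ASCII as UTF-8 bytes; keep "/" and existing
    # "%" escapes untouched (assumption: input escapes are already valid).
    return quote(normalized, safe="/%")


precomposed = "/caf\u00e9"   # "é" as a single code point
decomposed = "/cafe\u0301"   # "e" followed by a combining acute accent

print(precomposed == decomposed)                                   # False
print(comparison_key(precomposed) == comparison_key(decomposed))   # True
```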

-- 
Philip