What is required to be IRI compliant?
Solderpunk
solderpunk at posteo.net
Mon Dec 28 13:54:14 GMT 2020
On Mon Dec 28, 2020 at 1:12 PM CET, William Orr wrote:
> Normalization is the process of looking for all of these synonyms for
> characters, and standardizing them to the same set of codepoints. If you
> don't normalize, you could have a case where one user gets the intended
> host for écrire.hostname and another user gets an NXDOMAIN, all
> depending on the sequence of bytes their input method produced.
...and actually, now that I think about it, this issue is not specific to
IRI support, is it? Even if we followed the web's lead and declared
that Gemini requests and text/gemini links must contain ASCII-only URLs,
and people have to do punycoding of non-ASCII hostnames and
percent-encoding of UTF-8 representations of non-ASCII paths, it's still
possible for the server and client to have different ideas about how a
hostname or path is represented, right? With one using a composed form
and the other a decomposed form? Whether you send a UTF-8 string as-is
or first punycode and/or percent-encode it so it's valid ASCII is
totally orthogonal to that question. Or have I missed something
important?
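
To make the question concrete, here is a rough Python sketch of what I
mean. It takes the "écrire" label from William's example, builds the
composed (NFC) and decomposed (NFD) forms, and pushes both through
percent-encoding and punycode. The helper name ascii_forms is made up
for illustration, and I'm assuming Python's raw "punycode" codec rather
than a full IDNA conversion, which may do more than just encode.

```python
import unicodedata
from urllib.parse import quote

word = "\u00e9crire"  # "écrire", the label from the quoted example

composed = unicodedata.normalize("NFC", word)    # é as the single code point U+00E9
decomposed = unicodedata.normalize("NFD", word)  # e followed by combining U+0301


def ascii_forms(text):
    """Hypothetical helper: return (percent-encoded UTF-8, raw punycode)."""
    return quote(text), text.encode("punycode").decode("ascii")


# The two strings render identically but are different code point sequences.
print(composed == decomposed)  # False

# Both ASCII encodings carry the difference through unchanged.
print(ascii_forms(composed))
print(ascii_forms(decomposed))

# Normalizing both sides to one form first removes the mismatch.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

If that sketch is right, the ASCII-only forms differ whenever the
underlying strings do, so the mismatch survives encoding.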
Cheers,
Solderpunk