Three possible uses for IRIs
John Cowan
cowan at ccil.org
Tue Dec 8 21:45:57 GMT 2020
On Tue, Dec 8, 2020 at 4:10 PM <colecmac at protonmail.com> wrote:
> The most difficult part of what you outlined is the Unicode normalization,
> which maybe not all languages have libraries for, and would also require
> updating every so often. But it wouldn't be a requirement for clients at all,
> just something nice to have.
>
If a client has an unnormalized IRI, it needs to normalize it before
sending it to the server. That said, a 2009 study looked at a sample of
700 million HTML documents, of which only 0.02% were not in NFC already,
which suggests that NFC text is already pretty dominant.
> I assume you mean NFC normalization?
>
Yes. When I speak of normalization, I mean NFC normalization exclusively.
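As a rough illustration of the client-side step, here is a minimal Python
sketch that puts an IRI into NFC before it is sent. The function name
normalize_iri and the example.org address are made up for the example, and
normalizing the whole IRI string in one pass is a simplifying assumption,
not something any spec requires.

```python
import unicodedata


def normalize_iri(iri: str) -> str:
    """Return the IRI in Unicode Normalization Form C (NFC).

    Hypothetical helper: it normalizes the whole string at once, which is
    a simplification chosen for this sketch.
    """
    return unicodedata.normalize("NFC", iri)


# "e" followed by U+0301 COMBINING ACUTE ACCENT is the decomposed spelling;
# NFC composes the pair into the single code point U+00E9.
decomposed = "gemini://example.org/cafe\u0301"
composed = normalize_iri(decomposed)

print(len(decomposed), len(composed))
print(composed == "gemini://example.org/caf\u00e9")
```

Text that is already in NFC comes back unchanged, so running this on every
outgoing request should be harmless for the common case.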
> What if the user named a domain/file/folder in a non-NFC way? Now does the
> server need to support NFC as well, and apply it to vhost recognition or
> local file paths to correctly match requests? That seems wrong. But so does
> the user entering something visually identical to what the sysadmin typed,
> and things not working.
>
I'm okay with that just failing, as file names are not really part of
text/gemini content. The difference will be obvious to the admin by
checking the requested URIs from the server log against the %-encoded names
of the folders.
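
To show why the mismatch is easy to spot, here is a small Python sketch that
%-encodes the same visible name in its NFC and NFD spellings. The helper name
encoded_forms and the sample name are invented for the example; real server
logs and directory listings will of course vary.

```python
import unicodedata
from urllib.parse import quote


def encoded_forms(name: str) -> dict:
    """Percent-encode the UTF-8 bytes of a name in both NFC and NFD.

    Hypothetical helper for eyeballing a logged request path against a
    folder name on disk.
    """
    return {
        form: quote(unicodedata.normalize(form, name))
        for form in ("NFC", "NFD")
    }


forms = encoded_forms("caf\u00e9")
print(forms["NFC"])  # caf%C3%A9
print(forms["NFD"])  # cafe%CC%81
```

The two spellings look the same on screen, but their %-encoded forms differ,
so a request logged in one form next to a folder named in the other stands
out right away.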
John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
I am a member of a civilization. --David Brin