Some reading on IRIs and IDNs

Michael Lazar lazar.michael22 at gmail.com
Thu Dec 10 04:16:30 GMT 2020


I've been following along with my own software in the background.

First of all, my domain registrar won't even let me put unicode characters
in an A record without automatically converting them to punycode for me.

café.mozz.us -> xn--caf-dma.mozz.us
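
The same conversion can be reproduced locally. Here is a minimal sketch using Python's built-in "idna" codec; the helper name to_ascii_hostname is made up for illustration, and the codec follows the older IDNA 2003 rules, so a registrar may disagree with it for some characters.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII (punycode) form of a Unicode hostname."""
    # The "idna" codec encodes each dot-separated label on its own and
    # leaves labels that are already plain ASCII untouched.
    return hostname.encode("idna").decode("ascii")


print(to_ascii_hostname("café.mozz.us"))  # xn--caf-dma.mozz.us
```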

Next, my naive Python test client just kind of works as-is [0][1]. It will
convert Unicode DNS names to punycode under the hood before doing the lookup.
Any Unicode in the URL (IRI?) is left alone because... why would a client ever
muck around with the URL that the user gives it? That sounds like a bad idea
to me.
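
To make that split concrete, here is a rough sketch of the idea. It is not the actual jetforce client code. The function name build_request is hypothetical, and 1965 is assumed as the default Gemini port. Only the hostname used for the lookup is converted; the request line carries the URL exactly as typed, encoded as UTF-8.

```python
from urllib.parse import urlsplit


def build_request(url: str):
    """Return (lookup_host, port, request_bytes) for a Gemini URL."""
    parts = urlsplit(url)
    # Only the name handed to DNS is converted to punycode.
    lookup_host = parts.hostname.encode("idna").decode("ascii")
    # Assumption: fall back to the default Gemini port when none is given.
    port = parts.port or 1965
    # The URL itself is sent untouched, followed by CRLF.
    request = (url + "\r\n").encode("utf-8")
    return lookup_host, port, request


host, port, request = build_request("gemini://café.mozz.us/")
print(host, port, request)
```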

My server (running jetforce) also works as-is. All I had to do was add an entry
for "café.mozz.us" as a recognized hostname, and there you go.

```
jetforce-client gemini://café.mozz.us
Welcome to AV-98!
Enjoy your patrol through Geminispace...
🇺🇸 WELCOME TO MOZZ.US 🇺🇸
```

Requesting Unicode path names also works with no changes on my part:

```
jetforce-client gemini://café.mozz.us/files/𝒻𝒶𝓃𝒸𝓎.txt
20 text/plain
This is a test file with unicode characters in the name.⏎
```

As do percent-quoted path names (the server will unquote the URL before it
attempts to load the file):

```
jetforce-client gemini://café.mozz.us/files/%F0%9D%92%BB%F0%9D%92%B6%F0%9D%93%83%F0%9D%92%B8%F0%9D%93%8E.txt
20 text/plain
This is a test file with unicode characters in the name.
```
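
The unquoting step can be checked with the standard library. This is only a sketch of the round trip using urllib.parse, not a claim about how jetforce does it internally. Both forms of the path decode to the same file name.

```python
from urllib.parse import quote, unquote

quoted = (
    "/files/%F0%9D%92%BB%F0%9D%92%B6%F0%9D%93%83"
    "%F0%9D%92%B8%F0%9D%93%8E.txt"
)

# unquote() decodes the percent-escapes as UTF-8 by default.
path = unquote(quoted)
print(path)

# Unquoting a path that has no escapes is a no-op, so a server can
# apply it to both forms of the request.
assert unquote(path) == path

# quote() goes the other way and reproduces the escaped form.
assert quote(path) == quoted
```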

Does this mean my server is already compliant? What else should I try?

- Michael

[0] https://github.com/michael-lazar/jetforce/blob/master/jetforce_client.py
[1] It's nice to finally get a win for Python after fighting with TLS for so long

