Unicode vs. the World

Petite Abeille petite.abeille at gmail.com
Tue Dec 15 13:27:36 GMT 2020


[In the spirit of Scott Pilgrim vs. the World]

There have been a handful of intertwingled* conversations about the topic.


To recap:

2020-12-04 Stephane Bortzmeyer got the ball rolling with "IDN with Gemini?": https://lists.orbitalfox.eu/archives/gemini/2020/003788.html
2020-12-08 John Cowan followed with "Three possible uses for IRIs": https://lists.orbitalfox.eu/archives/gemini/2020/003873.html
2020-12-09 Jason McBrayer contributed "Some reading on IRIs and IDNs": https://lists.orbitalfox.eu/archives/gemini/2020/003923.html

💩📯 To be charitable, we can also include Alex's self-described "shitpost" dated 2020-12-15: https://lists.orbitalfox.eu/archives/gemini/2020/004055.html

[2020-12-15T01:47:20.412Z] <nytpu> sending an message to the ML making fun of the long-running spec-changing threads. i'll probably regret it, but here goes
[2020-12-15T07:05:14.499Z] <nytpu> i've bitched about it but this is the first time i've really addressed the points other than in passing
[2020-12-15T07:05:42.682Z] <nytpu> and even then it's more a shitpost than a real rebuttal, don't take it too seriously


So what's the issue making Alex lose his marbles, thin-skin aside?

It boils down to this:

 => gemini://🐰.mozz.us/🐇.gmi 🥕Hoppity hop🥕

What to do with such a construct? Possible? Not possible? Allowed? Not allowed? First class citizen? Afterthought? How to deal with it, if at all?

Decisions, decisions, decisions.

Technically speaking, while text/gemini is Unicode friendly by default, the links are not. The location part must be encoded, following idiosyncratic, local customs, perhaps such as:

 => gemini://xn--4o8h.mozz.us/%F0%9F%90%87.gmi 🥕Hoppity hop🥕

In other words, a bit of punycode + percent encoding + glossing over normalization + other niceties. Everything must be US-ASCII clean at the end of the day.
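
To make the recipe above concrete, here is a minimal Python sketch of the punycode + percent encoding steps. The helper names (to_ascii_label, iri_to_uri) are made up for illustration. It assumes UTF-8 for the path, uses the raw punycode codec per host label, and, like the text above, glosses over normalization and the rest of the IDNA rules, so take it as a demonstration and not as how any gemini client does or should do it.

```python
from urllib.parse import quote, urlsplit


def to_ascii_label(label):
    """Punycode a single host label; ASCII labels pass through untouched.

    Assumption: no nameprep / IDNA validation, just raw punycode.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def iri_to_uri(iri):
    """Turn a Unicode address into its US-ASCII clean equivalent."""
    parts = urlsplit(iri)
    # Host: one punycode pass per dot-separated label.
    host = ".".join(to_ascii_label(label) for label in parts.hostname.split("."))
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Path and query: percent-encode the UTF-8 bytes; keep existing escapes.
    uri = f"{parts.scheme}://{host}{quote(parts.path, safe='/%')}"
    if parts.query:
        uri += "?" + quote(parts.query, safe="=&%")
    return uri


print(iri_to_uri("gemini://🐰.mozz.us/🐇.gmi"))
```

Fed the bunny link, this prints the US-ASCII form shown above. Addresses that are already ASCII come out unchanged.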

Some will make the distinction between "content" vs. "addressing":

[2020-12-15T07:35:09.590Z] <bie> also... this was never about internationalized content, but a lot of people like to pretend that it is
[2020-12-15T07:36:40.861Z] <bie> addressing != content

While there is some merit to such hair splitting, as the two have to be handled at different levels of the stack, it distracts from the crux of the problem:

=> gemini://🐰.mozz.us/🐇.gmi 🥕Hoppity hop🥕 
vs.
=> gemini://xn--4o8h.mozz.us/%F0%9F%90%87.gmi 🥕Hoppity hop🥕

As it stands, the first variant cannot be handled by gemini, neither in text/gemini nor in the protocol itself, with further technical gotchas such as address resolution and what not along the way.

It must be converted to the second variant, the US-ASCII one.
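
For what it is worth, the conversion is mechanical in the other direction as well. The sketch below is a hypothetical helper (uri_to_display is a made-up name, not something from the spec or any client). It assumes UTF-8 percent-escapes and raw punycode labels, and recovers the first variant from the second for display purposes.

```python
from urllib.parse import unquote, urlsplit


def to_unicode_label(label):
    """Decode an xn-- host label back to Unicode; leave other labels alone."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def uri_to_display(uri):
    """Recover a human-readable form of a US-ASCII address.

    Assumption: scheme, host and path only; port and query are ignored here.
    """
    parts = urlsplit(uri)
    host = ".".join(to_unicode_label(label) for label in parts.hostname.split("."))
    # unquote() decodes percent-escapes as UTF-8 by default.
    return f"{parts.scheme}://{host}{unquote(parts.path)}"


print(uri_to_display("gemini://xn--4o8h.mozz.us/%F0%9F%90%87.gmi"))
```

Whether a client should display the decoded form at all is a separate question, and part of the choice being discussed.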

So, what to do? This is what these various conversations are about. Exploring what the scope of the problem is, and what to do about it, if anything. So one can eventually reach an informed decision.

For example:

[2020-12-14T22:12:14.914Z] <remyabel> I lurk this channel and the mailing lists and keep seeing people trying to extend gemini or make it web-like, there's just no point in arguing against it
[2020-12-14T22:12:28.578Z] <CoopDot> I used to be in the US-ASCII only camp but now it's more "do the bare mininum to not forbid UTF-8 'URLs' in the spec and make strong recommendations in best-practices.gmi"

^Those are the "cannot be arsed" camp: things are the way they are, and they cannot be bothered to change anything. Technically speaking... we are done. The "not-my-problem" camp.


[2020-12-15T07:30:13.193Z] <khuxkm> honestly my issue with the iri thread was the whole "we NEED this" and "we MUST do this it's our MORAL DUTY"
[2020-12-15T07:30:52.931Z] <khuxkm> like forcing everybody to use IRIs or be non-compliant with the spec is somehow going to solve discrimination

^Those are the... hmmm... oh-so-fragile "entitled" camp.


To summarize: this is a genuine choice for gemini. And not so much a technical issue.


-- 
ʕ·ᴥ·ʔ


Tangentially unrelated, as always:

The Internet is for End Users
https://tools.ietf.org/html/rfc8890

Terminology, Power, and Inclusive Language in Internet-Drafts and RFCs
https://tools.ietf.org/id/draft-knodel-terminology-04.html


* https://en.wikipedia.org/wiki/Intertwingularity



