Some reading on IRIs and IDNs

marc marcx2 at welz.org.za
Thu Dec 10 13:12:59 GMT 2020


Hi

> > This internationalization stuff is complex and makes me want to
> > throw up hands in the air, scream a bit, and go back to the
> > simplicity of ASCII.
> 
> ASCII is not simple (think of case-insensitivity) and then only for
> people whose latin is the first script they learned.

I am struggling to take that statement seriously,
and not just because it breaks set theory :-)

Case conversion of an ASCII letter is xor 0x20 - that doesn't
even require a branch/comparison and can compile down
to a single assembly instruction.
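
A minimal sketch of the bit trick, in Python for readability. The
function names are illustrative and not from any library. Note one
assumption it makes explicit: the bare xor is only correct when the
input is already known to be a letter, so the lowercasing helper adds
a range check for everything else.

```python
def toggle_case(ch: str) -> str:
    """Flip the case of one ASCII letter by toggling bit 0x20.

    Only valid for 'A'..'Z' and 'a'..'z'; other input gives garbage.
    """
    return chr(ord(ch) ^ 0x20)


def to_lower(ch: str) -> str:
    """Lowercase one character; anything outside 'A'..'Z' passes through."""
    code = ord(ch)
    if 0x41 <= code <= 0x5A:  # 'A'..'Z'
        code |= 0x20          # set the case bit
    return chr(code)


def lower_ascii(text: str) -> str:
    """Lowercase the ASCII letters of a string, character by character."""
    return "".join(to_lower(c) for c in text)
```

For example, lower_ascii("Gemini://Example.ORG/") gives
"gemini://example.org/", and toggle_case("a") gives "A".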

Compare that with the *many* tens or even hundreds of thousands of
lines of punycode/Unicode/etc. logic.

But let's assume upper/lowercase characters in ASCII
are confusing. That would be an argument to restrict
a simple system such as Gemini URLs to a subset of ASCII
which excludes uppercase characters. Which I could support,
and which is effectively what DNS ends up doing - as
do the majority of HTTP URLs. "Lowest common denominator
for maximum interoperability" is a good maxim.
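
Such a restriction would be cheap to enforce. The check below is a
hypothetical sketch, not anything taken from the Gemini specification:
it assumes the allowed set is printable ASCII (0x21 to 0x7E) minus
'A'..'Z', and the function name is made up for illustration.

```python
def is_restricted_ascii(url: str) -> bool:
    """True if every character is printable ASCII and none is 'A'..'Z'.

    The allowed range (0x21..0x7E, no uppercase) is an assumption made
    for this sketch, not a rule from any specification.
    """
    return all(
        0x21 <= ord(c) <= 0x7E and not ("A" <= c <= "Z")
        for c in url
    )
```

With this, "gemini://example.org/docs" passes, while
"gemini://Example.org/" and any URL containing a non-ASCII character
are rejected.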

If ASCII case conversion is confusing, then that isn't
an excuse to grow the confusion by many orders of
magnitude. That makes the problem a lot worse.

  "Oops, I've burnt my toast - I know, let's solve that
   by burning down the house"

regards

marc

