Unicode vs. the World
Jason McBrayer
jmcbray at carcosa.net
Thu Dec 17 13:31:38 GMT 2020
Björn Wärmedal <bjorn.warmedal at gmail.com> writes:
> Because — as I tried to point out — there is no reasonably simple
> heuristic for determining whether a URL is already percent encoded or
> not. And percent encoding a URL that is already percent encoded
> exchanges all % characters with %25.
It's not that hard. All you have to do is percent-decode the path *first*,
then percent-encode it. Consider this URL, which is a worst case for
what you're talking about:
gemini://example.com/🐇%20🥕.gmi
Unquoting the path gives you 'gemini://example.com/🐇 🥕.gmi', of
course. And then quoting it gives you
'gemini://example.com/%F0%9F%90%87%20%F0%9F%A5%95.gmi'
which decodes correctly.
Unquoting a path that contains no percent escapes does nothing to it.
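
For illustration, here is a minimal Python sketch of the decode-then-encode
step, assuming the standard library's urllib.parse is available. The
function name normalize_path is hypothetical and only the path component
is touched; this is one way to do it, not a reference implementation.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def normalize_path(url: str) -> str:
    """Percent-decode the path first, then percent-encode it again.

    normalize_path is a hypothetical helper name used for this sketch.
    """
    parts = urlsplit(url)
    # Step 1: undo any percent-encoding that is already present.
    decoded = unquote(parts.path)
    # Step 2: encode everything that needs it, keeping "/" as a separator.
    encoded = quote(decoded, safe="/")
    return urlunsplit(parts._replace(path=encoded))


if __name__ == "__main__":
    mixed = "gemini://example.com/🐇%20🥕.gmi"
    once = normalize_path(mixed)
    print(once)
    # Running it again on an already encoded URL should change nothing.
    print(normalize_path(once) == once)
```

One caveat with this approach: an escaped slash (%2F) in the input is
decoded to a literal "/" and stays that way, so it may not suit paths
where that distinction matters.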
--
Jason McBrayer | “Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
| but stranger still is lost Carcosa.”
| ― Robert W. Chambers, The King in Yellow