[tech] Zero-width characters and tracking via pasted text
nervuri
nervuri at disroot.org
Mon Mar 22 13:59:14 GMT 2021
On Mon, Mar 15, 2021, Oliver Simmons wrote:
>> E0020: __
>> ... (E0020–E007F used for invisibly tagging texts by language)
>> E007F: __
>
>These *were* used for tagging texts by language, but have been
>deprecated in favour of using other non-Unicode metadata for this
>purpose.
>They are planned to be used in emojis and are (were?) used (but not
>widely supported) for country codes/flags with codes longer than 2
>characters (3?), such as USA states or counties of England.
>Wikipedia has a ~ok description of their history.
>=> https://en.wikipedia.org/wiki/Tags_(Unicode_block)
Thanks, I replaced "used" with "formerly used". Wikipedia says "The
release of Emoji 5.0 in March 2017 considers these characters to be
emoji for use as modifiers in special sequences." I take that to mean
that they will remain zero-width, but will generate emojis when used in
special sequences, as with the flag of England:
🏴
=
🏴<U+E0067><U+E0062><U+E0065><U+E006E><U+E0067><U+E007F>
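
To make the mapping concrete, here is a minimal Python sketch of how such a sequence could be assembled. It assumes the usual layout of the Tags block, where each tag character is the matching ASCII character plus 0xE0000 and the sequence ends with CANCEL TAG (U+E007F). The helper names subdivision_flag and strip_tags are made up for illustration, and whether the result renders as a flag depends on the font and platform.

```python
# Sketch only: helper names are hypothetical, not from any library.
BLACK_FLAG = "\U0001F3F4"   # U+1F3F4 WAVING BLACK FLAG, the visible base emoji
TAG_OFFSET = 0xE0000        # assumed mapping: tag character = ASCII code point + 0xE0000
CANCEL_TAG = "\U000E007F"   # U+E007F CANCEL TAG, terminates the sequence


def subdivision_flag(code: str) -> str:
    """Build base flag + tag characters + CANCEL TAG for a code such as 'gbeng'."""
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code.lower())
    return BLACK_FLAG + tags + CANCEL_TAG


def strip_tags(text: str) -> str:
    """Drop every character in the Tags block (U+E0000 to U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


england = subdivision_flag("gbeng")
print(" ".join(f"U+{ord(c):04X}" for c in england))
# U+1F3F4 U+E0067 U+E0062 U+E0065 U+E006E U+E0067 U+E007F
```

Note that stripping the tag characters from pasted text, as strip_tags does, also turns subdivision flags back into the plain black flag, so a filter for invisible characters has to choose between the two.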
Unicode keeps getting weirder.