[tech] Zero-width characters and tracking via pasted text
Oliver Simmons
oliversimmo at gmail.com
Mon Mar 15 16:43:33 GMT 2021
On Sun, 14 Mar 2021 at 16:55, nervuri <nervuri at disroot.org> wrote:
>
> First, as a point of reference, here are a few positive-width Unicode
> characters:
> 0020: _ _ | 00E9: _é_ | 03A9: _Ω_ | 5B57: _字_ | 1F407: __
>
All fine for me!
(Gmail seems to strip emoji in plain-text replies though, which is rather odd.)
> FFF9: __
> FFFA: __
> FFFB: __
These three show as the replacement box for me.
I've never quite understood what the "interlinear annotation"
characters are for, but I think they're some form of control character, so
having them display as a box when used incorrectly might be correct.
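
For anyone who wants to check what these three code points are, here is
a minimal sketch using Python's standard unicodedata module. The helper
name describe() is just something made up for illustration, and the
result depends on the Unicode version bundled with your Python.

```python
import unicodedata


def describe(cp: int) -> tuple[str, str]:
    """Return (character name, general category) for a code point."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.category(ch)


if __name__ == "__main__":
    for cp in (0xFFF9, 0xFFFA, 0xFFFB):
        name, category = describe(cp)
        # "Cf" is the general category for format characters.
        print(f"U+{cp:04X}  {name}  category={category}")
```

A category of "Cf" would mean they are format characters rather than
printable ones, which fits with a renderer either hiding them or showing
a fallback box.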
>
> E0020: __
> ... (E0020–E007F used for invisibly tagging texts by language)
> E007F: __
>
These *were* used for tagging texts by language, but that use has been
deprecated in favour of other, non-Unicode metadata for this
purpose.
They are now used in emoji tag sequences for flags of regions whose
codes are longer than 2 characters, such as US states or the countries
of the UK (England, Scotland, Wales), though support beyond those three
UK flags is not widespread.
Wikipedia has a ~ok description of their history.
=> https://en.wikipedia.org/wiki/Tags_(Unicode_block)
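
To make the flag use a bit more concrete, here is a rough sketch of how
such a sequence is put together, as far as I understand it: a black flag
(U+1F3F4), then one tag character per ASCII character of the region
code (code point 0xE0000 + the ASCII value), then CANCEL TAG (U+E007F).
The function names subdivision_flag() and strip_tags() are made up for
this example, and "gbeng" is the code for England.

```python
BLACK_FLAG = "\U0001F3F4"  # U+1F3F4, base character of the sequence
CANCEL_TAG = "\U000E007F"  # U+E007F, terminates the tag sequence
TAG_OFFSET = 0xE0000       # tag character = TAG_OFFSET + ASCII value


def subdivision_flag(code: str) -> str:
    """Build a flag tag sequence from a region code such as 'gbeng'."""
    code = code.lower()
    if not (code.isascii() and code.isalnum()):
        raise ValueError("expected an ASCII alphanumeric region code")
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code)
    return BLACK_FLAG + tags + CANCEL_TAG


def strip_tags(text: str) -> str:
    """Drop every character in the Tags block (U+E0000 to U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


if __name__ == "__main__":
    flag = subdivision_flag("gbeng")
    print(" ".join(f"U+{ord(c):04X}" for c in flag))
    print(len(strip_tags(flag)))  # only the black flag remains
```

Since the tag characters are invisible wherever the sequence isn't
supported, something like strip_tags() is also one way to clean pasted
text, at the cost of turning these flags into plain black flags.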
-Oliver Simmons