Three month spec freeze
Sean Conner
sean at conman.org
Wed Jun 3 01:47:45 BST 2020
It was thus said that the Great Felix Queißner once stated:
>
> I strongly recommend supporting utf-8 only. iconv -l lists 1179
> possible/known encodings. I don't want to support more than one in my
> code, most libraries don't support more than UTF-8, UTF-16, UTF-32 and
> ASCII. And for UTF-16 and UTF-32 not even a difference between little
> and big endian encoded data.
RFC-1436 (the gopher RFC) suggests ISO Latin1 (ISO-8859-1 if I'm not
mistaken) for 8-bit character sets and says *nothing* about UTF-8 (of
course, it was written before UTF-8 was an RFC, three years later). But
most of the gopher sites I hit these days are UTF-8---it's rare that I
actually encounter anything but UTF-8.
Secondly, Linux systems come with iconv. Not only is this a program, but
it's also a library which is dead simple to use as there are only three
functions:
	iconv_t iconv_open(char const *tocode,char const *fromcode);
	size_t  iconv(iconv_t cd,char **inbuf,size_t *inbytes,char **outbuf,size_t *outbytes);
	int     iconv_close(iconv_t cd);
and that's *if* you want to support conversion. The world is migrating to
UTF-8, so I don't think this will be that big of an issue in the long term. And in
case anyone is curious, I found bindings for
Lua
Python
Rust
Go
Haskell
(no comment on how well these bindings work, just that they exist).
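
For anyone who wants to see how the three calls fit together, here is a
minimal sketch in C. The helper name latin1_to_utf8() and its return
convention are made up for illustration (they are not part of iconv), and
it assumes the whole input fits in the caller's output buffer. It should
build as-is against glibc; on systems where iconv lives in a separate
library you may need to link with -liconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: convert a NUL-terminated
 * ISO-8859-1 string into UTF-8.  Returns 0 on success, -1 on failure
 * (unsupported conversion, output buffer too small, or other error). */
static int latin1_to_utf8(char const *src,char *dst,size_t dstsize)
{
  assert(src != NULL);
  assert(dst != NULL);

  if (dstsize == 0)
    return -1;

  /* Note the argument order: destination encoding first, then source. */
  iconv_t cd = iconv_open("UTF-8","ISO-8859-1");
  if (cd == (iconv_t)-1)
    return -1;

  /* iconv() takes char **, so the const has to be cast away; the input
   * buffer itself is not modified. */
  char   *in       = (char *)src;
  size_t  inbytes  = strlen(src);
  char   *out      = dst;
  size_t  outbytes = dstsize - 1;  /* reserve room for the trailing NUL */

  /* iconv() advances in/out and decrements the byte counts as it goes. */
  size_t rc = iconv(cd,&in,&inbytes,&out,&outbytes);
  iconv_close(cd);

  if (rc == (size_t)-1)
    return -1;

  *out = '\0';
  return 0;
}
```

Passing it the four bytes "caf\xE9" (Latin-1 for "café") should fill the
output with the five bytes "caf\xC3\xA9". A real client would more likely
call iconv() in a loop over fixed-size chunks and check errno for E2BIG,
EILSEQ and EINVAL rather than treat every failure the same way.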
-spc