a space case for transparent gemtext compression
Rohan Kumar
seirdy at seirdy.one
Fri Jun 18 04:20:08 BST 2021
Get ready for a wall of text.
On Thu, Jun 17, 2021 at 02:24:45PM +0200, Francis Siefken wrote:
>How would people solve such use cases elegantly and within the design
>philosophy?
Compression could be valuable to users with poor download speeds,
especially those using an overlay network like Tor with a poor uplink.
I do like the idea of compression being optional; if a client supports
compression it can get a Gemini file compressed, but otherwise a plain
version. However, this does create a huge problem: clients would have to
declare that they support compression, which opens up a can of worms
(complexity, fingerprinting, etc.) that we should stay away from.
Almaember's approach is much better:
On Thu, Jun 17, 2021 at 03:47:48PM +0200, Almaember wrote:
>In my personal opinion, the best solution would be to simply have it as
>a separate MIME-type, something along the lines of "text/gemini+gzip".
>I don't recall how this works with MIME-types, but it should be
>something like this.
>
>I do support your idea, though. I think compression would be a nice
>addition, but it doesn't belong in the protocol itself, but the file
>format.
I think the best solution is:
- Have clients optionally support pagination, like what most line-mode
clients (gmnlm, cgmnlm, diohsc) do. Show the first N lines/bytes
instead of downloading the whole thing; let the user scroll to trigger
downloading the rest. This probably should not be the default setting,
but that's up to client developers to decide.
- For big files, have authors link a compressed version: "Dear reader,
this gemini file is large. Here's a link to a compressed version:"
- Users can then select the link before they've finished
downloading/paging through the file.
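
A rough sketch of the paging idea from the first bullet, in Python. The
names `pages` and `lines_per_page` are invented for illustration and are
not taken from gmnlm, cgmnlm, or diohsc; a real client would pass the
file object of its TLS socket rather than the in-memory stream used here.

```python
import io
from itertools import islice
from typing import BinaryIO, Iterator, List


def pages(stream: BinaryIO, lines_per_page: int = 40) -> Iterator[List[str]]:
    """Yield one page of decoded lines at a time.

    The stream is only read when the caller asks for the next page, so a
    client can stop after the first page if the user never scrolls.
    """
    while True:
        # islice pulls at most lines_per_page lines from the stream.
        chunk = list(islice(stream, lines_per_page))
        if not chunk:
            return
        yield [line.decode("utf-8", errors="replace").rstrip("\r\n")
               for line in chunk]


# Stand-in for a response body; hypothetical content.
body = io.BytesIO(b"# Title\nline 1\nline 2\nline 3\nline 4\n")
reader = pages(body, lines_per_page=2)
first_page = next(reader)   # reads only the first two lines
```

With a buffered socket, Python and the OS may still read slightly ahead
of the current page; the point is only that the client stops pulling
data once a page is full and the user has not asked for more.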
This therefore doesn't need to be part of the Gemini "standard" but can
simply be a recommendation for authors and devs. If it gains a lot of
traction, perhaps it could be formalized using the word "MAY" in the
spec ("clients MAY also support the mimetype...").
Regarding a compression algorithm to pick: it should be one that's
fairly common with a lot of libraries/implementations for a variety of
platforms and programming languages, in keeping with the rationale for
choosing TLS 1.2+ over other options for transport-layer encryption.
Although I'd love to pick Lizard for its speed, it's not universal
enough to qualify.
=> https://github.com/inikep/lizard Lizard (formerly LZ5)
Our best options are therefore gzip/DEFLATE and perhaps zstd. Gzip can
actually get pretty small when compressing statically/ahead-of-time,
where compression speed is less of a constraint. Tools like Zopfli and
especially Efficient-Compression-Tool can get a dump of all posts on
seirdy.one (~100kb) down to 41.1kb, compared to 39.9kb with zstd -f19.
The difference only becomes significant with Gemini files above 200kb.
=> https://github.com/fhanau/Efficient-Compression-Tool
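
For anyone who wants to try ahead-of-time compression on their own
files, here is a minimal sketch using Python's standard `gzip` module at
its highest level. This is a stand-in, not Zopfli or
Efficient-Compression-Tool, so it will not reproduce the figures above;
`gzip_sizes` and the sample text are made up for illustration.

```python
import gzip
from typing import Tuple


def gzip_sizes(data: bytes, level: int = 9) -> Tuple[int, int]:
    """Return (original size, gzip-compressed size) in bytes."""
    # mtime=0 keeps the output byte-for-byte reproducible between runs.
    compressed = gzip.compress(data, compresslevel=level, mtime=0)
    return len(data), len(compressed)


# Hypothetical, highly repetitive gemtext; real posts compress less well.
sample = (
    "## Heading\n"
    "Some repetitive gemtext paragraph.\n"
    "=> gemini://example.org/ link\n" * 500
).encode("utf-8")

original, packed = gzip_sizes(sample)
print(f"{original} bytes -> {packed} bytes ({packed / original:.1%})")
```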
I therefore vote for the following *non*-standard:
- Encourage authors of 50kb+ gmi files to link a text/gemini+gzip
somewhere near the top.
- Encourage client developers to consider supporting pagination (can be
optional and off by default), so only the first N bytes/lines are
downloaded before the user performs an action.
- Encourage client developers to support the text/gemini+gzip MIME type.
Only after this gains traction and is well-received by users with
constrained bandwidth should we consider adding a "MAY" statement to the
Gemini spec describing the text/gemini+gzip MIME type. Compression
support should never be required.
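
On the client side, supporting the proposed type could be as small as
the sketch below. It assumes the server sends "text/gemini+gzip" in the
response's META field exactly as suggested in this thread; that type is
not registered anywhere, and `decode_body` is a hypothetical helper, not
part of any existing client.

```python
import gzip

# Proposed in this thread only; not a registered MIME type.
GZIP_GEMTEXT = "text/gemini+gzip"


def decode_body(meta: str, body: bytes) -> str:
    """Turn a successful response body into gemtext, given its META string."""
    parts = [p.strip() for p in meta.split(";")]
    mime = parts[0].lower()

    # Gemini defaults to UTF-8 when no charset parameter is given.
    charset = "utf-8"
    for param in parts[1:]:
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset" and value:
            charset = value.strip().strip('"')

    if mime == GZIP_GEMTEXT:
        body = gzip.decompress(body)
    elif mime != "text/gemini":
        raise ValueError(f"not gemtext: {mime}")
    return body.decode(charset)


plain = decode_body("text/gemini", b"# hi\n")
packed = decode_body("text/gemini+gzip; charset=utf-8",
                     gzip.compress(b"# hi\n"))
```

A client without this branch simply treats the link as an unknown type,
which is why the compressed copy has to stay optional.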
--
/Seirdy (seirdy.one)