Proposal about content-size and hash
Arav K.
nothien at uber.space
Thu Oct 29 11:11:28 GMT 2020
Hi, Gemini peoples!
I have two main proposals for getting information like content-size
(which, as has been discussed previously, would be very useful for even
slightly big responses).
1. Define specific additional MIME type parameters. For example, one
could return '20 application/gzip; content-size=42069420' as the
response line. This could also be used for giving the client a hash
of the data (parameter name e.g. 'content-hash-sha256'), useful for
verification and for checking against local caches of the same page.
To just check the MIME type (now possibly with size and hash), as
jmcbray had mentioned on a previous thread on the topic, the best
solution is to just request the page and terminate the connection
after the request line. Perhaps it should be specified in the spec
that clients may do this, and servers may want to be prepared for
this outcome.
A possible issue is that the hash will not fit within the META line,
given the 1024 byte limitation (this is especially problematic for
things like SHA 512 hashes). We could work around this either by
increasing the mandated maximum META line size (probably not
possible), or by providing this information using the second proposal
but still using this proposal (or both) for content-size.
A potential drawback is that we open ourselves up to further
extension this way, but I would argue that this avenue has always
been around. If the spec defined only these two fields, and mandated
that aside from them only parameters defined by the MIME spec may be
included, then it _should_ be fine.
2. Define an additional endpoint for retrieving meta info.
On ~chat, bjorn.warmedal layed out the possibility of using the
'/.content' URL to return a content hash. This would function like a
normal URL, one which accepts as input the URL of the page to
retrieve content information about.
I propose that the response from a '/.content?/<path>' request
returns the MIME type of the URL '/<path>', optionally including
content-size and content-hash-* parameters (this works because there
are no size restrictions for the content, unlike the META line). Its
MIME type can be bikeshed if this proposal is agreed upon. The
response format may be exactly an MIME type (i.e. no CR LF or
anything).
Unfortunately, extension becomes possible simply by using a different
MIME type. I don't know how to prevent this.
I think that proposal #1 is definitely doable, but I understand that #2
can be more problematic for some. I suppose #2 would be beneficial if
we determine that content-hash is really necessary, but further
extensibility is unwanted and should be prevented somehow.
~aravk | ~nothien
More information about the Gemini
mailing list