Questions regarding "POST" request and line endings
Felix Queißner
felix at masterq32.de
Sun May 17 15:10:36 BST 2020
First of all: thanks for the very extensive response!
> It's not that I don't think there are good uses for this.
>
> The original reason is that I was obsessed from day one with making it
> extremely hard for people to be able to extend the core Gemini protocol.
> HTTP, for example, allows as many headers as you like in
> requests/responses. Clients are expected to read them all, and handle
> the ones they can handle. This means anybody can come up with a new
> header, and if it's popular many clients/servers will implement it, and
> then it becomes a de facto part of the standard, and clients/servers
> which don't handle it are seen as "broken" or "primitive".
Yes, I can understand this. It was not my intention to create
extensibility in the protocol, only to allow a single, client-initiated
data upload to the server.
> This extensibility is of course a useful thing in many ways from an
> engineering perspective. But in the long term it is, IMHO, fundamentally
> totally incompatible with ideals like simplicity and minimalism and
> privacy and "anybody can implement it themselves over a weekend in <
> 1000 LOC". Designers of protocols which are extensible effectively lose
> a lot of control over their protocol.
Yes, true
> It's pointless me trying very
> hard to keep stuff which could be abused for tracking out of Gemini if
> it can be snuck in by popular consensus this way, because inevitably it
> will be. You've just got to limit the scope for this kind of extension
> everywhere you can.
One proposal for more privacy and less tracking:
Explicitly allow clients to remove the query string from any request,
since much of the web does its tracking via request parameters (and did
so even before cookies).
This would prevent servers from relying on per-user generated URLs
between pages, and the client can ask the user whether they want the
query parameters removed.
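A minimal sketch of what such query stripping could look like in a client, assuming Python and its standard urllib.parse module. The function name strip_query and the example URL are made up for illustration and are not part of any Gemini client.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Rebuild the URL with an empty query component; everything else is kept.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


# Illustrative URL only; the host and parameter are invented.
cleaned = strip_query("gemini://example.org/forum/thread?session=abc123")
```

A client could show the user both the original and the cleaned URL and let them pick which one to request.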
> If you take this idea seriously, you are basically forced to choose
> one kind of "thing" a lot, and then have that thing be totally implicit.
> If there's only one kind of Gemini request (something analogous to GET),
> then we don't have to explicitly put anything in the request format
> saying "this is a GET-ish request". And if there's nothing explicit
> there, nobody can write an "advanced" server which recognises a
> different value in that place.
Yeah, that's why I asked for a specific PUT in the first place. People
may start to want a more interactive version of gemini-served pages and
abuse standard features like URL queries to introduce that kind of
interactivity. That would be a point where the server could pretty
easily "trick" the user into following trackable links.
Having an explicit PUT option in the protocol and preventing servers
from relying on queries would make things simpler and more
straightforward in the long term.
> If somebody can come up with a way to distinguish GET from POST style
> requests without also opening up an obvious door to arbitrarily many
> extra request types, I'll give it some thought. But I'm not optimistic.
I actually came up with an idea, but I don't know how good it is in the end:
Respec the 10 INPUT so that it works like this:
1. Client sends the usual request header.
2. Server responds with "10 Your forum post:"
3. Client now has two options:
   1. The client drops the connection and sends no bytes. This would be
      the status quo.
   2. The client sends a single line with the MIME type of the data,
      then sends the data, similar to the server responding with a 20
      status code (so, instead of the server sending data to the client,
      the client just sends data to the server).
This would allow several things:
1. The server can notify the client that it needs to upload data, and
the client can then choose whether to upload or not.
2. With the MIME type in the upload header, the server can just drop the
connection after the MIME line, signalling to the client that the data
sent is unwanted.
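To make the proposed upload framing a bit more concrete, here is a rough sketch in Python. It only shows how the bytes sent after a "10" response might be built and split again. Terminating the MIME line with CRLF is my assumption (the proposal only says "a single line"), and the names build_upload and split_upload are hypothetical.

```python
from typing import Tuple


def build_upload(mime: str, data: bytes) -> bytes:
    """Client side: one MIME type line, then the raw data (assumed framing)."""
    if "\r" in mime or "\n" in mime:
        raise ValueError("MIME type must fit on a single line")
    # Assumption: the MIME line ends with CRLF, like a response header.
    return mime.encode("utf-8") + b"\r\n" + data


def split_upload(payload: bytes) -> Tuple[str, bytes]:
    """Server side: separate the MIME type line from the uploaded data."""
    header, sep, body = payload.partition(b"\r\n")
    if not sep:
        raise ValueError("missing MIME type line")
    return header.decode("utf-8"), body


payload = build_upload("text/plain", b"Hello\r\nworld")
mime, body = split_upload(payload)
```

A real server would read the MIME line first and could close the connection right there if it does not want that type, without ever reading the body.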
> Insisting on non-extensibility necessarily imposes limits on how much
> Gemini can do. That's okay. Limitations encourage creativity, and give
> different things their own unique style/taste/whatever. Gemini is never
> going to be able to do everything that the web can do - it can't
> possibly do that while remaining simpler. We should accept this.
Yeah, true. But the first idea that comes to my mind when I'd like to
upload a file would be:
Chunk the file into 256-byte pieces and upload the whole data via
a huge number of requests, each containing a query
/path/?offset=X&length=Y&blob=Z
where X is the offset in the uploaded file, Y is the length of the
transferred data and Z is the URL-encoded data itself.
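For illustration, the query-based workaround described above could be sketched like this in Python. The base URL is invented, chunk_urls is a hypothetical helper, and percent-encoding via urllib.parse.quote is just one way to URL-encode the bytes.

```python
from typing import List
from urllib.parse import quote

CHUNK_SIZE = 256  # chunk size taken from the description above


def chunk_urls(base: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Build one request URL per chunk of data (hypothetical helper)."""
    urls = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Percent-encode every byte that is not unreserved in a URL.
        blob = quote(chunk, safe="")
        urls.append(f"{base}?offset={offset}&length={len(chunk)}&blob={blob}")
    return urls


# 515 bytes of sample data; the host name is made up.
sample = bytes(range(256)) * 2 + b"xyz"
urls = chunk_urls("gemini://example.org/path/", sample)
```

Every chunk costs a full connection and request, and binary data can roughly triple in size once percent-encoded, which is why this is the kind of abuse an explicit upload mechanism would avoid.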
> As recently mentioned, the spec doesn't actually explicitly say anything
> about line endings in text/gemini content itself (although it should).
> It does suggest that CRLF is needed at the end of => lines, but that was
> unintentional on my part. I agree that requiring CRLF for actual
> content is strange and I suspect this will change in the next revision.
>
> CRLF *is* clearly and deliberately specced in the non-content part of
> the protocol, i.e. for requests and response headers. And the honest
> answer here is, well, that's how every internet protocol whose spec I've
> ever looked at works - HTTP, Gopher, SMTP, IRC, for example, all do
> this. I admit to being ignorant as to the exact historical reason for
> this convention. But it's a deep and wide convention adhered to by
> people who know more than I do, and for that reason I'm reluctant to
> break it without very good reason.
Thanks for clarifying!
> If people have strong feelings in either direction about the line
> terminator to be used in the protocol and in text/gemini content, I'm
> very happy to hear it.
I'd like to see a pure <LF> version, especially for the protocol header.
My client atm just reads until the first <LF>, then checks if the <CR>
is there and, if not, drops the connection to the server and responds
with "InvalidResponse".
I assume a lot of servers/clients either ignore the existence of <CR> or
drop the connection for protocol violation, because both options are the
sane thing to do. It's not like a lone <CR> or <LF> is allowed in the
header anyway.
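The header handling I describe could be sketched roughly as follows. This is Python for illustration, not my client's actual code. The names read_header and InvalidResponse mirror the description, and the max_len cap is an arbitrary safety limit I added for the sketch.

```python
import io


class InvalidResponse(Exception):
    """Raised when the response header is not terminated by <CR><LF>."""


def read_header(stream, max_len: int = 2048) -> str:
    """Read up to the first <LF>, then insist that a <CR> precedes it."""
    line = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise InvalidResponse("connection closed before end of header")
        line += byte
        if byte == b"\n":
            break
        if len(line) > max_len:  # assumed cap, not taken from the spec
            raise InvalidResponse("header too long")
    if not line.endswith(b"\r\n"):
        raise InvalidResponse("header terminated by a bare <LF>")
    return line[:-2].decode("utf-8")


# An in-memory stream stands in for the TLS connection here.
header = read_header(io.BytesIO(b"20 text/gemini\r\n# Hello\n"))
```

A lenient client would instead strip an optional trailing <CR> and accept both forms.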
Regards
xq