An outsider's view of the `gemini://` protocol

Ciprian Dorin Craciun ciprian.craciun at gmail.com
Sun Mar 1 09:10:31 GMT 2020


On Sun, Mar 1, 2020 at 10:22 AM Bradley D. Thornton
<Bradley at northtech.us> wrote:
> On 2/28/2020 2:04 AM, Ciprian Dorin Craciun wrote:
> > On Fri, Feb 28, 2020 at 11:07 AM Sean Conner <sean at conman.org> wrote:
> >>   Why is a numeric status code so bad?  Yes, the rest of the protocol is
> >> English centric (MIME types; left-to-right, UTF-8).  It just seems that
> >> using words (regardless of language) is just complexity for its own sake.
> >
> > Because numbers are hard to remember, and say nothing to a person that
> > doesn't know the spec by heart.  (For example although I do a lot of
> > HTTP related work with regard to routing and such, I never
> > remember which of the 4-5 HTTP redirect codes says "temporary redirect
> > but keep the same method" as opposed to "temporary redirect but switch
> > to `GET`".)
>
> Well, section 1.3.2 of the Gemini spec-spec says two digit codes, but
> single (first digit) is all that is required. So, a 2, a 20, and a 21
> are all success and there's no ambiguity as to anything occurring at the
> first digit level, it's just more gravy with the second digit.


Although I didn't state this before, having two digits, which in fact
are interpreted as a two level decision tree (the first digit denoting
generic class of conditions, and the second digit denoting more
fine-grained reasons), has at least the following problems:

* (given that `gemini://` wants to be simple) it's too much;  out of
the 410 non-empty lines of specification (v0.10.0), 77 are dedicated
to section `1.3.2 Status Codes` and another 93 to the `Appendix 1 --
Full two digit status codes`, i.e. ~40% of the specification is
dedicated to only these two digits...  granted these two sections also
contain additional information about how to interpret them and the
meta field, etc.;  but regardless having so many status codes does add
additional complexity to the protocol;

* conversely, it's not enough, given that we've already used more than
half of the "generic class of conditions" (i.e. first digits 1 through
6);  soon enough, as the protocol progresses and matures, we'll identify
new classes of conditions, and we'll have to either introduce a
"miscellaneous" category, or use values from other categories, thus
breaking the clear hierarchy;

* some conditions don't fall particularly well into clear categories;
for example `21 success with end of client certificate session` has to
do with TLS transient certificates management (which is `6x`);  in
fact this shouldn't even be a status code, but a "signal", because for
example a redirect or other failure could as well require the end of
client certificate session;

* another example of "unclear" status codes are `42 CGI error` and `43
proxy error`, which are part of the `4x temporary failure` group, but
might be in fact (especially in the case of 43, although granted we
have 53) permanent errors;  (even `51 not found` can be seen as a
temporary error, because perhaps the resource will exist tomorrow;)

* and speaking of proxies, we have `43 temporary proxy error` and `53
proxy request refused`, but we have no other proxy related statuses
like for example `6y` that states `proxy requires authentication`,
etc.;




So, if we really want to keep things simple why not change this into:

* (we only use one digit to denote success or failure);
* `0` (i.e. like in UNIX) means success, here is your document;
* `1` (i.e. again like in UNIX) means "undefined failure", the client
MUST display the meta field to the user as plain text;  (please note
that this "soft"-forbids the client and server to implement any clever
"extensions";)
* `2` not found / gone;  (i.e. the server is working fine, but what
you are searching for does not exist at the moment;  perhaps it
existed in the past, perhaps later it will exist;)
* `3` redirect;  neither temporary nor permanent; (because in fact
there isn't a clear definition and usage of temporary vs permanent;)
* `4` input;  (though as stated earlier I find this to be a hack, and it
can be more elegantly solved with a redirect to a `gemini+query:...`
URL;)
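
A minimal sketch of how a client could dispatch on these five one-digit
codes.  Only the digit-to-meaning mapping is taken from the list above;
the enum, the function name `classify_status`, and the assumption that
the status is the first byte of the response header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the meanings follow the one-digit proposal above. */
enum status_kind {
    STATUS_SUCCESS,    /* `0`: success, here is your document */
    STATUS_FAILURE,    /* `1`: undefined failure, show meta as plain text */
    STATUS_NOT_FOUND,  /* `2`: not found / gone */
    STATUS_REDIRECT,   /* `3`: redirect (neither temporary nor permanent) */
    STATUS_INPUT,      /* `4`: input */
    STATUS_INVALID     /* anything else: not part of the proposal */
};

/* Classify a response header by its first byte (assumed to be the status). */
static enum status_kind classify_status(const char *header)
{
    assert(header != NULL);
    switch (header[0]) {
    case '0': return STATUS_SUCCESS;
    case '1': return STATUS_FAILURE;
    case '2': return STATUS_NOT_FOUND;
    case '3': return STATUS_REDIRECT;
    case '4': return STATUS_INPUT;
    default:  return STATUS_INVALID;
    }
}
```

Anything outside `0` through `4` maps to `STATUS_INVALID` here;  how a
client should react to such a response is not covered by the proposal.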

* all automated crawlers, upon receiving a status code of `1`, should
log the failure, and not retry the same URL more than 4 times in a
24 hour window;
* all automated crawlers should pause for at least 6 hours if:
  * they have received at least 100 consecutive errors;
  * over the last 200 requests, at least 100 of them were errors;
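
The pause rule could be tracked roughly as sketched here, using a ring
buffer over the last 200 outcomes.  The struct and function names are
hypothetical, what counts as an "error" is left to the caller, and the
per-URL retry limit (4 times per 24 hours) is not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WINDOW_SIZE     200                /* "the last 200 requests" */
#define ERROR_THRESHOLD 100                /* errors that trigger a pause */
#define PAUSE_SECONDS   (6L * 60L * 60L)   /* "at least 6 hours" */

/* Hypothetical bookkeeping; zero-initialize before first use. */
struct crawl_stats {
    bool   window[WINDOW_SIZE];   /* ring buffer, true = error */
    size_t next;                  /* slot to overwrite next */
    size_t filled;                /* number of valid slots */
    size_t window_errors;         /* errors currently in the window */
    size_t consecutive_errors;    /* length of the current error streak */
};

/* Record the outcome of one request. */
static void crawl_record(struct crawl_stats *s, bool is_error)
{
    assert(s != NULL);
    if (s->filled == WINDOW_SIZE) {
        /* Window is full: the oldest outcome sits at s->next; drop it. */
        if (s->window[s->next])
            s->window_errors--;
    } else {
        s->filled++;
    }
    s->window[s->next] = is_error;
    if (is_error)
        s->window_errors++;
    s->next = (s->next + 1) % WINDOW_SIZE;
    s->consecutive_errors = is_error ? s->consecutive_errors + 1 : 0;
}

/* True if either pause condition holds; the caller then sleeps for
 * at least PAUSE_SECONDS. */
static bool crawl_should_pause(const struct crawl_stats *s)
{
    assert(s != NULL);
    return s->consecutive_errors >= ERROR_THRESHOLD
        || s->window_errors >= ERROR_THRESHOLD;
}
```

With these particular numbers, 100 consecutive errors always fit inside
the 200-request window, so the second condition already implies the
first;  both are kept only to mirror the two rules as written.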


How would we automate authentication given we have no `authentication
required`?  We display the meta to the user, the user interprets the
message, and if it prompts him to authenticate he will do so through a
menu, and the user-agent will perhaps remember this decision.




> I do fail to see why what appears to me to be a whole lot of work to
> implement what you suggest,


Now getting back to my "symbolic" status codes proposal, it's no more
work, because currently the code looks like:

````
if (status[0] == '1') {
   ...
} else if (status[0] == '2') {
   ...
}
````

Meanwhile my proposal would require one to write:

````
if (hasprefix (status, "success:")) {
   ...
} else if (hasprefix (status, "redirect:")) {
   ...
}
````

Granted, one now has to implement `hasprefix`, or find an existing
implementation, but most languages have it, and even in C one can
implement it as `strncmp (status, expected, strlen (expected)) == 0`.
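
Spelled out as a complete function, that one-liner might look like the
sketch here.  The `strncmp` expression is the one given above;  the
function signature, the `bool` return type and the `NULL` checks are
assumptions added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `status` starts with `expected`.
 * strncmp stops at the first NUL byte in either string, so a `status`
 * shorter than `expected` is handled safely and simply does not match. */
static bool hasprefix(const char *status, const char *expected)
{
    assert(status != NULL && expected != NULL);
    return strncmp(status, expected, strlen(expected)) == 0;
}
```

Note that an empty `expected` matches every status, so callers should
only pass non-empty prefixes such as `"success:"` or `"redirect:"`.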




> , especially considering that most servers
> will invariably choose to implement their own custom handlers for
> status/error codes, much like one does in Apache so the server operator
> themselves gets to choose what content to deliver as a result of a 404.


No, this proliferation of "status codes" won't happen because the
protocol won't allow for it.  (Although even today with numeric status
codes people can just invent their own, unless we clearly define
conditions for all 100 codes, and even then people can disregard their
definitions...)


The only marginal advantage I see for numeric codes is in logs as you've stated.

Ciprian.

