Scheme Section 2 quibble

Sean Conner sean at conman.org
Wed Nov 18 08:42:57 GMT 2020


It was thus said that the Great Sudipto Mallick once stated:
> While you are discussing the specs, please have a look at how
> the servers are currently responding to the edge cases.
> 
> http://ix.io/2EyQ
> 
> Request -> Response (first line only)
> The list of known servers from gemini://gus.guru/known-hosts : removed
> all non existent servers and *.flounder.online
> Test yourself: http://ix.io/2Etk
> 
> And if you can, forgive my madness.

  Thank you for running this and reporting the results.  I can describe why
you got the results for my server: gemini.conman.org

	gemini.conman.org -> 59 Bad Request
	gemini.conman.org/ -> 59 Bad Request
	gemini.conman.org// -> 59 Bad Request

  These are bad because there is neither a scheme nor an authority (the
leading '//' that introduces the authority is missing), so my server marks
them as a bad request.
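  To make the distinction concrete, here is a minimal sketch in Python that
splits a request into scheme, authority and path using the generic regular
expression from RFC 3986, Appendix B.  It is not the code my server runs
(that is LPEG); the names split_request and has_authority are made up for
this illustration.

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_request(request):
    """Return (scheme, authority, path); absent parts come back as None."""
    m = URI_RE.match(request)
    return m.group(2), m.group(4), m.group(5)

def has_authority(request):
    """True only if the request contains the '//' that introduces an authority."""
    _, authority, _ = split_request(request)
    return authority is not None

# Without a leading '//', the host name is just the start of a relative path.
print(split_request("gemini.conman.org//"))
# With '//', the host name becomes the authority and the path is what follows.
print(split_request("//gemini.conman.org"))
print(split_request("gemini://gemini.conman.org//"))
```

  Under this split, "gemini.conman.org" has no scheme and no authority, only
a path, which is the case rejected above.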

	//gemini.conman.org -> 20 text/gemini
	//gemini.conman.org/ -> 20 text/gemini
	//gemini.conman.org// -> 59 Bad Request

  These are missing the scheme, but have an authority section [1].  The URL
parser I use adds a '/' for the path if the path does not exist.  That's why
my server does not do a 31-redirect with a missing '/' at the end.  The
double slash at the end is being checked by a modified path-abempty rule. 
The ABNF from the RFC is:

	   path-abempty  = *( "/" segment )

while the URL parser I'm using is doing:

	   path_abempty <- {~ ( '/' segment)+ ~}
                        /  '' -> '/'

  The parsing code is in LPEG [2] and is equivalent to

	   path-abempty = 1*( "/" segment )
			/ 0<pchar> ; and return a '/'

and was written that way to fix an issue with the ABNF "0<pchar>" and how
parsing works in LPEG.  I can go into the details of LPEG if anyone is
interested, but suffice it to say that the LPEG path_abempty differs from
the ABNF in the RFC for a good reason, and this is why a trailing '//' after
the authority section fails to parse.
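  As a rough illustration of the "empty path becomes '/'" behaviour, here is
a sketch in Python using regular expressions instead of LPEG.  It uses the
RFC 3986 definitions of pchar and segment; the function name path_abempty
and the None-on-failure convention are assumptions of this sketch.  Note
that with the unmodified RFC segment rule a path of '//' is accepted here,
so this sketch does NOT reproduce the rejection of '//' by my server, whose
segment rule is not shown in this message.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"   (RFC 3986)
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
SEGMENT = PCHAR + "*"                                   # segment = *pchar
ONE_OR_MORE = re.compile(r"(?:/" + SEGMENT + r")+")     # 1*( "/" segment )

def path_abempty(path):
    """Return the path, with '' normalised to '/', or None if it is invalid."""
    if path == "":
        return "/"              # the empty alternative: return a '/'
    if ONE_OR_MORE.fullmatch(path):
        return path             # one or more "/" segment pairs, kept as-is
    return None                 # e.g. a path that does not start with '/'

print(path_abempty(""))         # empty path after the authority
print(path_abempty("/"))
print(path_abempty("/docs/spec.gmi"))
print(path_abempty("docs"))
```

  With the empty path normalised this way, "//gemini.conman.org" and
"//gemini.conman.org/" end up as the same request, so no redirect is needed.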

	gemini://gemini.conman.org -> 20 text/gemini
	gemini://gemini.conman.org/ -> 20 text/gemini
	gemini://gemini.conman.org// -> 59 Bad Request

  These are more normal requests, and the same explanation as above applies.
No surprises for my server (at least, not to me).  More interesting
responses come from blekksprut.net and cadence.moe:

	blekksprut.net -> 20 text/gemini
	blekksprut.net/ -> 20 text/gemini
	blekksprut.net// -> 20 text/gemini
	//blekksprut.net -> 51 not found
	//blekksprut.net/ -> 51 not found
	//blekksprut.net// -> 51 not found
	gemini://blekksprut.net -> 20 text/gemini
	gemini://blekksprut.net/ -> 20 text/gemini
	gemini://blekksprut.net// -> 20 text/gemini

	cadence.moe -> 20 text/gemini; charset=utf-8; lang=en
	cadence.moe/ -> 20 text/gemini; charset=utf-8; lang=en
	cadence.moe// -> 20 text/gemini; charset=utf-8; lang=en
	//cadence.moe -> 50 Bliz server: Not found: //cadence.moe
	//cadence.moe/ -> 50 Bliz server: Not found: //cadence.moe/
	//cadence.moe// -> 50 Bliz server: Not found: //cadence.moe//
	gemini://cadence.moe -> 20 text/gemini; charset=utf-8; lang=en
	gemini://cadence.moe/ -> 20 text/gemini; charset=utf-8; lang=en
	gemini://cadence.moe// -> 20 text/gemini; charset=utf-8; lang=en

  These results probably stem from the same issue, though possibly in
different server software.  Going quickly through the results: the servers
have no problem with the first grouping (just the domain name), yet they
*do* have an issue with the second grouping (leading '//').  Odd.
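  For anyone who wants to repeat the comparison, the three groupings can be
generated and probed along these lines.  This is a minimal Python sketch and
not the script linked in the quoted message; the function names variants and
probe are made up here.  It assumes the default Gemini port 1965, skips
certificate verification (many Gemini servers use self-signed certificates)
and does no error handling.

```python
import socket
import ssl

GEMINI_PORT = 1965  # default Gemini port

def variants(host):
    """Return the nine request forms: three prefixes times three suffixes."""
    prefixes = ["", "//", "gemini://"]
    suffixes = ["", "/", "//"]
    return [p + host + s for p in prefixes for s in suffixes]

def probe(host, request, timeout=10.0):
    """Send one request line and return the first line of the response."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # assumption: accept self-signed certs
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, GEMINI_PORT), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            # A Gemini request is the request text followed by CRLF.
            tls.sendall((request + "\r\n").encode("utf-8"))
            data = b""
            while b"\n" not in data:
                chunk = tls.recv(1024)
                if not chunk:
                    break
                data += chunk
    lines = data.splitlines()
    return lines[0].decode("utf-8", "replace") if lines else ""
```

  Calling probe(host, r) for each r in variants(host) and printing r next to
the result gives one line per request, in the same "request -> response"
layout as the listings above.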

  Again, thanks for this.

  -spc

[1]	I've been debating whether I should mark a missing scheme as a "bad
	request", as I've come around to the view that a Gemini server should
	ONLY accept an absolute URL.  I haven't ... yet.

[2]	Lua Parsing Expression Grammar

