Scheme Section 2 quibble

John Cowan cowan at ccil.org
Tue Nov 17 22:45:50 GMT 2020


On Tue, Nov 17, 2020 at 5:10 PM Sean Conner <sean at conman.org> wrote:

>   The path parsing rules state a single slash.  Not '/'+, nor '/'*, but a
> single '/'.  The only place where more than a single slash is allowed PER
> THE @#%@#$@$ ABNF is just prior to the authority, which contains the
> hostname.  THE ONLY PLACE!
>

Correct.

> I will also draw your attention to the URI-reference rule, which is there
> for some reason, which allows both a full URI, or a RELATIVE URI, which
> means that
>
>                 //example.com/path/to/resource
>
> IS A VALID URI!  IT IS NOT A HACK!  What part of the ABNF do you not
> understand?
>

Nope.  It is a valid URI reference, because it is a valid relative
reference.  It is *not* a valid URI.

In what follows, I am going to assume that "URL" and "URI" are synonymous,
which they have been for 15 years since RFC 3986 was published.

> No, the spec allows both the full URI, and a relative URI as long as it
> starts with '//' (it has the authority section).  The wording in the spec
> is bad and should be changed to clarify it, but that's the current
> specification.
>

There are two cases:

1) In a Gemini-protocol request line (section 2), the second sentence says
that an absolute URL (that is, a URI without a fragment identifier) is
required.  The third sentence says that if the "scheme://" portion is
missing (in which case it is not a URI, much less an absolute URI), it
should be prefixed with "gemini://" and presumably reparsed.  That's
straightforward.

2) In a link line (section 5.4.2), we are told that there may be an
absolute or a relative URL.  There are no relative URIs, so we can only
interpret this as meaning a relative reference.  We are also told that if
the URL lacks a scheme (which is impossible: a URI always has a scheme)
then the scheme is "gemini".

Now suppose a link line in a resource that is available from
"gemini://example.com/public/this.gmi" has the form "foo/bar/baz.gmi".  We
can interpret this in one of two incompatible ways:

2a) a truncated version of "gemini://foo/bar/baz.gmi".  Note that "foo" is
a perfectly valid host name.

2b) a relative reference, in which case it resolves to
"gemini://example.com/public/foo/bar/baz.gmi".
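
The two readings can be made concrete with a short Python sketch.  It
assumes the standard urllib.parse module, which does not know the
"gemini" scheme by default, so the sketch registers it in the module's
uses_relative and uses_netloc lists before resolving.  The helper names
interpret_2a and interpret_2b are hypothetical and exist only for this
illustration.

```python
import urllib.parse
from urllib.parse import urljoin, urlsplit

# Assumption: urljoin() only resolves relative references for schemes it
# knows about, so register "gemini" (idempotently) before using it.
for _schemes in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in _schemes:
        _schemes.append("gemini")

BASE = "gemini://example.com/public/this.gmi"
LINK = "foo/bar/baz.gmi"


def interpret_2a(link: str) -> str:
    """Implied-scheme reading: put 'gemini://' in front of the link text."""
    return "gemini://" + link


def interpret_2b(base: str, link: str) -> str:
    """Relative-reference reading: resolve the link against the base URI."""
    return urljoin(base, link)


a = interpret_2a(LINK)
b = interpret_2b(BASE, LINK)

print(a, "-> host", urlsplit(a).hostname)
# gemini://foo/bar/baz.gmi -> host foo
print(b, "-> host", urlsplit(b).hostname)
# gemini://example.com/public/foo/bar/baz.gmi -> host example.com
```

The same link text ends up pointing at two different hosts, which is why
the two readings cannot both be allowed.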

So the spec is self-contradictory.  In my view interpretation 2a is bogus
and the sentence "If the URL does not include a scheme, a scheme of
gemini:// is implied" in section 5.4.2 should be removed.  What is more, I
would like to see the equivalent sentence in section 2, "If the scheme of
the URL is not specified, a scheme of gemini:// is implied", removed as
well.

> > but that's what it is, other URI parsers that are more strict with
> > compliance to the RFC will refuse to parse a URI without scheme
> > present,
>
>   If it does, it's broken by design.  Again, see the ABNF above.
>

It is precisely the ABNF line in RFC 3986 section 3 that says a URI (as
opposed to a URI reference) has to begin with a scheme.
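
As a rough illustration of that distinction, the sketch below checks only
whether a string starts with something matching the RFC 3986 "scheme"
production followed by ":".  It is not a full URI validator; the function
name classify and the regular expression are assumptions made for this
example.

```python
import re

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# A URI must begin with a scheme followed by ":"; a relative reference
# has no scheme.  Only that prefix is checked here.
_SCHEME_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def classify(reference: str) -> str:
    """Return 'URI' if the string begins with a scheme, else 'relative reference'."""
    if _SCHEME_PREFIX.match(reference):
        return "URI"
    return "relative reference"


for ref in (
    "gemini://example.com/path/to/resource",
    "//example.com/path/to/resource",
    "foo/bar/baz.gmi",
):
    print(f"{ref!r}: {classify(ref)}")
# 'gemini://example.com/path/to/resource': URI
# '//example.com/path/to/resource': relative reference
# 'foo/bar/baz.gmi': relative reference
```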



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
It's the old, old story.  Droid meets droid.  Droid becomes chameleon.
Droid loses chameleon, chameleon becomes blob, droid gets blob back
again.  It's a classic tale.  --Kryten, Red Dwarf