An outsider's view of the `gemini://` protocol

Sean Conner sean at conman.org
Fri Feb 28 02:44:09 GMT 2020


It was thus said that the Great Ciprian Dorin Craciun once stated:
> Hello all!

  Hello.

  [ snip ]

> * caching -- given that most content is going to be static, caching
> should be quite useful;  however it doesn't seem to have been present
> as a concern either in the spec, the FAQ, or the mailing list archive;
> I'm not advocating for the whole HTTP caching headers, but perhaps for
> a simple SHA of the body so that clients can just skip downloading it
> (although this would imply a more elaborate protocol, having a
> "headers" and separate "body" phase);

  I don't think solderpunk (creator of this protocol) expects Gemini to be a
replacement for HTTP---for him, it's more of a way to cut down on the bloat
that has become the web.  In fact, everything in Gemini could be done with
HTTP.  With that said, I have made oblique references to adding something (a
timestamp) to cut down on unneeded requests.  It hasn't been taken up.

> * `Content-Length` -- I've seen this mentioned in the FAQ or the
> mailing lists;  I think the days of "unreliable" protocols have passed;
>  (i.e. we had better make sure that the intended document was
> properly delivered, in its entirety and unaltered;)

  I did bring this up early in the design, but it was rejected outright. 
This has since been brought up due to one Gemini site serving very large
files.  There has been some talk, but nothing has yet come from it.

> * status codes -- although both Gemini and HTTP use numeric status
> codes, I do believe that these are an artifact of ancient times, and
> we could just replace them with proper symbols (perhaps hierarchical
> in nature like `redirect:temporary` or `failure:temporary:slow-down`;

  I disagree.  Using "proper symbols" is overall harder to deal with.
First, they tend to be English-centric.  I mean, we could go with:

	defectum:tempus:tardius

or how about

	teip:sealadach:níos-moille

  Second, the code has to be parsed, and while this is easy in languages
like Python or Perl, you run into ... issues with Rust, C++ or Go (not to
mention the complete mess that is C).  A number is easy to parse, easy to
check, and its meaning can be translated into another language.  The Gemini
status codes (as well as HTTP and other three-digit status codes) don't even
have to be converted into a number---you can easily do a two-level check:

	if (status[0] == '2')
	{
		/* happy path */
	}
	else if (status[0] == '3')
	{
		/* redirection path */
	}
	else if (status[0] == '4')
	{
		/* temporary failure */
	}
	else if (status[0] == '5')
	{
		/* permanent failure */
	}
	else if (status[0] == '6')
	{
		/* authorization needed */
		if (status[1] == '1')
		{
			/* client cert required */
		}
		else if (status[1] == '3')
		{
			/* rejected! */
		}
	}
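
  And since the code is just a digit or two, telling the user what happened
in their own language is nothing more than a table lookup.  A quick sketch
(a fragment; the German strings are made up for illustration, and a real
client would pull them from its own message catalog):

	/* indexed by the first digit of the status code */
	static const char *const messages[] =
	{
	  [2] = "Alles in Ordnung"        , /* success            */
	  [3] = "Umleitung"               , /* redirection        */
	  [4] = "Vorübergehender Fehler"  , /* temporary failure  */
	  [5] = "Dauerhafter Fehler"      , /* permanent failure  */
	  [6] = "Zertifikat erforderlich" , /* certificate needed */
	};

	puts(messages[status[0] - '0']);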

  There was a long, drawn-out discussion between solderpunk and me about
status codes.  The compromise was the two digit codes currently in use.

> * keep-alive -- although in Gopher and Gemini the served documents
> seem to be self-contained, and usually connections will be idle while
> the user is pondering what to read, in case of crawlers having to
> re-establish each time a new connection (especially a TLS one) would
> eat a lot of resources and incur significant delays;  (not to mention
> that repeated TCP connection establishment to the same port or target
> IP might be misinterpreted as an attack by various security appliances
> or cloud providers;)

  I would think that would be a plus for this crowd, as it's less likely for
Gemini to be quickly exploited.

> Now on the transport side, somewhat related to the previous point, I
> think TLS transient certificates are an overkill...  If one wants to
> implement "sessions", one could introduce

  This is the fault of both myself and solderpunk.  When I implemented the
first Gemini server (yes, even before solderpunk, who created the protocol) I
included support for client certificates as a means of authenticating the
client.  My intent (besides playing around with that technology) was to have
fine-grained control over server requests without requiring the user to have
a password, and to that end, I have two areas on my Gemini server that
require client certificates:

	gemini://gemini.conman.org/private/

		This area will accept *any* client certificate, making it
		easy for clients to test that they do, in fact, serve up a
		client certificate.

	gemini://gemini.conman.org/conman-labs-private/

		This area requires certificates signed by my local
		certificate authority (i.e. *I* give you the cert to use). 
		This was my actual intent.
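
  In OpenSSL terms, the difference between the two areas comes down to
something like the following.  This is only a sketch, not my actual server
code; it assumes the handshake asked for a certificate (SSL_VERIFY_PEER),
that `ssl` and `path` are the TLS connection and the requested path, and the
6x values follow the snippet earlier in this mail:

	#include <string.h>
	#include <openssl/ssl.h>

	static int check_cert(SSL *ssl,const char *path)
	{
	  X509 *cert   = SSL_get_peer_certificate(ssl); /* NULL if none sent */
	  int   status = 20;                            /* assume success    */

	  if (strncmp(path,"/private/",9) == 0)
	  {
	    /* any certificate at all will do */
	    if (cert == NULL)
	      status = 61;   /* client certificate required */
	  }
	  else if (strncmp(path,"/conman-labs-private/",21) == 0)
	  {
	    /* the certificate must verify against my local CA */
	    if (cert == NULL)
	      status = 61;
	    else if (SSL_get_verify_result(ssl) != X509_V_OK)
	      status = 63;   /* certificate rejected */
	  }

	  if (cert != NULL)
	    X509_free(cert);

	  return status;
	}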

It wasn't my intent to introduce a "cookie" like feature, but solderpunk
interpreted it as one and called it "transient certificates".  I still view
this feature as "client certificates" myself, and I personally think the
term "transient certificates" is confusing.

> On a second thought, why TLS?  Why not something based on NaCL /
> `libsodium` constructs, or even the "Noise Protocol"
> (http://www.noiseprotocol.org/)? 

	1) Never, *NEVER* implement crypto yourself.

	2) OpenSSL exists and has support in most (if not all) popular
	languages.

	3) I had never even heard of the Noise Protocol.

> For example I've tried to build the
> Asuka Rust-based client and it pulled ~104 dependencies and took a few
> minutes to compile, this doesn't seem too lightweight...  

  So wait?  You tried to use something other than OpenSSL and it had too many
dependencies and took too long to compile?  Or did you mean to say that the
existing Rust-based client that uses OpenSSL had too many dependencies?  I
think you mean the latter, but it could be read as the former.

> Why not just re-use PGP to sign / encrypt requests and replies?  With
> regard to PGP, 

  There are issues with using PGP:

	https://latacora.micro.blog/2019/07/16/the-pgp-problem.html

> given that Gopher communities tend to be quite small,
> and composed of mostly "techie" people, this goes hand-in-hand with
> the "web-of-trust" that is enabled by PGP and can provide something
> that TLS can't at this moment: actual "attribution" of servers to
> human beings and trust delegation;  for example for a server one could
> generate a pair of keys and other people could sign those keys as a
> way to denote their "trust" in that server (and thus the hosted
> content).  Why not take this a step further and allow each document
> served to be signed, thus extending this "attribution" not only to the
> servers, but to the actual contents.  This way a server could provide
> a mirror / cached version of a certain document, while still proving
> it is the original one.

  The hardest problem with crypto is key management.  If anything, key
management with PGP seems more problematic than with OpenSSL and the CA
infrastructure (as bad as the CA infrastructure is).

> Now getting back to the `gemini://` protocol, another odd thing I
> found is the "query" feature.  Gemini explicitly supports only `GET`
> requests, and the `text/gemini` format doesn't support forms, yet it
> still tries to implement a "single input-box form"...  Granted it's a
> nice hack, but it's not "elegant"...  (Again, like in the case of
> sessions, it seems more as an afterthought, even though this is the
> way Gopher does it...)
> 
> Perhaps a simple "form" solution would be better?  Perhaps completely
> eliminating for the time these "queries"?  Or perhaps introducing a
> new form of URL's like for example:
> `gemini-query:?url=gemini://server/path&prompt=Please+enter+something`
> which can be served either in-line (as was possible in Gopher) and /
> or served as a redirect (thus eliminating another status code family).

  Forms lead to applications.  Applications lead to client side scripting. 
Client side scripting leads to the web ... 

  Of course there's pressure to expand the protocol.  solderpunk is trying
his hardest to keep that from happening and turning Gemini into another web
clone.
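
  (For reference, the existing query mechanism is just a status code and a
prompt; the client asks again with the user's input appended to the URL.  A
made-up exchange:)

	C: gemini://example.org/search
	S: 10 Please enter a search term
	C: gemini://example.org/search?hello%20world
	S: 20 text/gemini
	   ...results...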

> Regarding the `text/gemini` format -- and taking into account various
> emails in the archive about reflowing, etc -- makes me wonder if it is
> actually needed.  Why can't CommonMark be adopted as the HTML
> equivalent, and a more up-to-date Gopher map variant as an alternative
> for menus?  There are already countless safe CommonMark parsers
> out-there (for example in Rust there is one implemented by Google) and
> the format is well understood and accepted by a large community
> (especially the static side generators community).

  It can.  RFC-7763 defines the media type text/markdown and RFC-7764 defines
known variations that can be specified.  This could be done right now without
any changes to Gemini.  Go for it.
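
  A successful response would just declare it in the META field (a made-up
example; the variant parameter is the one RFC-7764 registers for CommonMark):

	20 text/markdown; variant=CommonMark
	# A CommonMark document

	Regular *CommonMark* text, served as-is.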

> Regarding an up-to-date Gopher map alternative, I think this is an
> important piece of the Gopher ecosystem that is missing from today's
> world:  a machine-parsable standard format of indexing documents.  I
> very fondly remember "directory" sites of yesteryear (like DMOZ or the
> countless other clones) that strove to categorize the internet not by
> "machine learning" but by human curation.

  Could you provide an example of what you mean by this?  I'm not sure why a
map alternative is needed.

> * and perhaps add support for content-based addressing (as opposed to
> server-based addressing) (i.e. persistent URL's);

  There already exist such protocols---I'm not sure what a new one based
around Gemini would buy.

> (Perhaps the closest to this ideal would be a Wikipedia style web...)

  We already have that---the Wikipedia.

  -spc

