CGI and CGI like support (was Re: [ANN] Announcing Molly Brown, a Gemini server in Go)
Sean Conner
sean at conman.org
Tue Jan 14 23:49:20 GMT 2020
It was thus said that the Great solderpunk once stated:
>
> This decision should not be interpreted as a criticism of your
> RFC-3875-derived implementation. I think it makes good sense for there
> to be an option for people to easily convert existing web CGI scripts to
> Gemini.
>
> But I do think it would be nice if there was one vaguely standard way
> for servers to implement this kind of thing, so that dynamic content
> generating code could be more portable. I think for that I'd probably
> prefer something as light as possible, and to explicitly distance Gemini
> from many of the ideas baked into RFC-3875, especially that dynamic
> content code should have access to the end user's IP address.
RFC-3875 wasn't that bad to support as there aren't that many
meta-variables (as they are called) to support, and several are optional
anyway. The RFC doesn't cover how the meta-variables are sent to the
script, but under Unixland, it's via environment variables.
Here's what I currently do:
AUTH_TYPE
Not set unless the client provides a certificate, then this
gets set to "Certificate".
CONTENT_LENGTH
Doesn't apply as there's no way to send a document to a
Gemini server.
CONTENT_TYPE
Doesn't apply.
GATEWAY_INTERFACE
Set to "CGI/1.1"
PATH_INFO
Per RFC (wording is a bit muddled), and not always set.
PATH_TRANSLATED
Per RFC, and not always set. I will say that these two are
a bit persnickety to get right.
QUERY_STRING
Must be set. If no query string, set to "".
REMOTE_ADDR
REMOTE_HOST
I take it these are the ones you object to the most. But if
I'm running a Gemini server, I *already* have your IP
address anyway. It seems silly to hide it from me, but I
don't live in Europe so take what I say with a grain of salt
or two. I set these (to just the IP address).
REMOTE_IDENT
Nobody supports RFC-1413, so I skip this one.
REMOTE_USER
If a client provides a certificate, I set this to the client
subject common name.
REQUEST_METHOD
I set this to "", as Gemini has no concept of a request
method (but see below).
SCRIPT_NAME
Per RFC. Not hard to set properly.
SERVER_NAME
Hostname of the current server. If you support multiple
hosts per Gemini server, then I would set this to the host
the client connected to.
SERVER_PORT
Set to port number of server.
SERVER_PROTOCOL
Set to "GEMINI".
SERVER_SOFTWARE
Set to "GLV-1.12556/1".
And that's it without further configuration. As a default, a CGI script
will ONLY get these environment variables (whereas your implementation leaks
the parent environment to the script---might want to check that). I allow
one to set other environment variables per script (like $PATH or $LANG or
whatever). If you need HTTP compatibility, I set some HTTP_* and change
REQUEST_METHOD to "GET" and SERVER_PROTOCOL to "HTTP/1.0". I also have an
option to set some variables that Apache sets as well.
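A minimal sketch of the defaults described above, in Python rather than the Lua the server is actually written in. The names build_cgi_env and run_script and all of their parameters are hypothetical, made up for illustration; only the variable names and values come from the list above. PATH_TRANSLATED is left out since it depends on how the server maps paths to files.

```python
import subprocess
from urllib.parse import urlsplit


def build_cgi_env(url, remote_addr, script_name, server_port,
                  path_info="", client_cn=None, extra=None):
    """Build the script's environment from scratch (hypothetical helper).

    Nothing is copied from os.environ, so the script sees only what is
    listed here plus whatever the admin adds through `extra`.
    """
    parts = urlsplit(url)
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "QUERY_STRING": parts.query,        # always set; "" if no query
        "REMOTE_ADDR": remote_addr,
        "REMOTE_HOST": remote_addr,         # just the IP address again
        "REQUEST_METHOD": "",               # Gemini has no request method
        "SCRIPT_NAME": script_name,
        "SERVER_NAME": parts.hostname or "",
        "SERVER_PORT": str(server_port),
        "SERVER_PROTOCOL": "GEMINI",
        "SERVER_SOFTWARE": "GLV-1.12556/1",
    }
    if path_info:
        env["PATH_INFO"] = path_info        # not always set
    if client_cn is not None:
        # Only present when the client offered a certificate.
        env["AUTH_TYPE"] = "Certificate"
        env["REMOTE_USER"] = client_cn
    env.update(extra or {})                 # per-script extras, e.g. PATH or LANG
    return env


def run_script(path, env):
    """Run the script with exactly `env` as its environment."""
    return subprocess.run([path], env=env, stdin=subprocess.DEVNULL,
                          capture_output=True, check=False)
```

Passing env= to subprocess.run replaces the child's environment instead of extending it, which is what keeps the parent's variables from leaking into the script.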
If the client presents a certificate, I set the following:
TLS_CIPHER
TLS_VERSION
TLS_CLIENT_HASH
TLS_CLIENT_ISSUER
TLS_CLIENT_SUBJECT
TLS_CLIENT_NOT_BEFORE
TLS_CLIENT_NOT_AFTER
TLS_CLIENT_REMAIN (time between now and TLS_CLIENT_NOT_AFTER)
TLS_CLIENT_ISSUER_* (various fields broken down)
TLS_CLIENT_SUBJECT_* (various fields broken down)
and AUTH_TYPE and REMOTE_USER as mentioned above (if Apache compatibility
requested, the names change but it's largely the same information). Details
can be seen starting here:
https://github.com/spc476/GLV-1.12556/blob/master/Lua/GLV-1/cgi.lua#L241
> I think there's a lot to recommend the way Molly Brown works, especially
> if we generalise it just a little to "Gemini CGI apps should endlessly
> read single line URLs over THING, until THING is closed, at which point
> the app should terminate".
Oh, so pretty much a Gemini server sans TLS then.
> Here THING could be stdin, or a TCP
> connection (making a CGI app basically a small self-contained server),
> or a unix domain socket. Simple servers could do what Molly currently
> does, just spawn the script, send a single URL over stdin and then close
> stdin, giving us the good old fashioned one-process-per-request model of
> traditional web CGI. But more advanced servers could give admins a way
> to configure different approaches where the process is persistent, more
> like FastCGI. Or they could round-robin load balance between multiple
> servers on a local network. The actual CGI program would see very
> little difference between these scenarios, you'd just give a slightly
> different argument to a library function which produced some kind of
> iterator over URLs. This has great power:weight.
>
> I was very happy with this idea until I realised that CGI programs
> should also have some way to get access to client certificates, not just
> the URL. :(
>
> I haven't returned since then to thinking about how to achieve this.
Perhaps a series of lines, like:
Request: gemini://example.net/foo/bar/script
TLS_Cipher: ...
TLS_Version: ...
TLS_Client_Hash: ...
TLS_Client_Issuer: ...
Ends with EOF (or a blank line or a NUL byte or some way to indicate the end
of this one request). This is consistent (each line is formatted the same
way) and I think, easy to deal with.
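To show how little code that takes, here is a rough sketch of a reader for such a format, in Python. It assumes the blank-line terminator (with EOF also ending the last request); the function name read_requests and the sample values are made up for illustration and are not part of any existing server.

```python
import io


def read_requests(stream):
    """Yield one dict of "Name: value" fields per request.

    A blank line ends a request; EOF ends the final one. This is an
    assumed framing, one of the options mentioned above.
    """
    request = {}
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            if request:                      # blank line closes a request
                yield request
                request = {}
            continue
        # Split on the first colon only, so the URL's own colons survive.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        request[name.strip()] = value.strip()
    if request:                              # EOF closes the last request
        yield request


# Hypothetical input: two requests, the first with one TLS field.
sample = (
    "Request: gemini://example.net/foo/bar/script\n"
    "TLS_Version: TLSv1.3\n"
    "\n"
    "Request: gemini://example.net/other\n"
)
requests = list(read_requests(io.StringIO(sample)))
```

The same function works unchanged on sys.stdin or on a file object wrapped around a socket, which is the "slightly different argument to a library function" idea from the quoted text.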
But whatever you come up with, I would try to avoid calling it CGI, as
that tends to lead to RFC-3875 ...
-spc