[spec] The Tragedy of &
Sean Conner
sean at conman.org
Sun Jan 31 23:16:18 GMT 2021
It was thus said that the Great Gary Johnson once stated:
> Sean Conner <sean at conman.org> writes:
>
> > Not if the CGI interface is properly written. All I had to do was write
> > this CGI script and drop it into my tests directory [1]:
> >
> > gemini://gemini.conman.org/test/pathseg.cgi
[ snip ]
> Thanks for sharing some code, Sean. I, of course, realize that one could
> write a CGI script to pick apart the PATH_INFO for user inputs. This
> issue I raised in my message was that this doesn't make any sense in the
> context of a CGI script which is looked up using the path on the remote
> filesystem.
>
> In your example, your script is located at /test/pathseg.cgi. However,
> lacking side information, I see no indicator (outside of the --
> admittedly optional -- cgi extension on your file name) of which path
> segments should be considered part of the CGI filename lookup and which
> parts are meant to be user input data in your example link:
>
> /test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
That's a particular implementation detail of GLV-1.12556 [1]. Other
servers could require the extension, or some other mechanism.
> This feels like a massive hack to me and an abuse of path segments TBH.
>
> If I were to embrace this approach, I can see that I would have to
> reprogram my server to do some additional path preprocessing magic. I
> could either:
>
> 1. Check every sequence of path segments starting from the document root
> to see if any of them correspond to an executable file or have the
> blessed CGI file extension for my server.
I see your server just accepts the requested path as is. GLV-1.12556
(once it gets into the filesystem handler) walks down from the document
root, checking each path segment for an executable file (which indicates a
CGI script) or a symbolic link (which indicates an SCGI script).
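
A minimal sketch in Python of that kind of walk. This is not GLV-1.12556's
actual code; the name find_script, the return shape, and the "any execute
bit set" test are assumptions made for illustration:

```python
import os
import stat


def find_script(docroot, request_path):
    """Walk request_path one segment at a time under docroot.

    Returns (kind, script_name, path_info) for the first segment that is
    a symbolic link ("scgi") or an executable regular file ("cgi"), or
    None if no such segment is found. Hypothetical helper.
    """
    segments = [s for s in request_path.split("/") if s]
    current = docroot
    for i, seg in enumerate(segments):
        if seg in (".", ".."):
            return None  # refuse anything that could leave the docroot
        current = os.path.join(current, seg)
        try:
            st = os.lstat(current)  # lstat so symlinks are seen as links
        except OSError:
            return None  # segment does not exist
        script_name = "/" + "/".join(segments[: i + 1])
        rest = segments[i + 1:]
        path_info = "/" + "/".join(rest) if rest else ""
        if stat.S_ISLNK(st.st_mode):
            return ("scgi", script_name, path_info)
        if stat.S_ISREG(st.st_mode) and st.st_mode & 0o111:
            return ("cgi", script_name, path_info)
        if not stat.S_ISDIR(st.st_mode):
            return None  # plain file; nothing further to walk into
    return None
```

With an executable file at test/pathseg.cgi under the document root, a
request for /test/pathseg.cgi/name=x/age=y splits into the script name
/test/pathseg.cgi and the remainder /name=x/age=y. The sketch drops empty
segments and trailing slashes, which a real server may want to preserve.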
> Once one of these 3 approaches enables the server to successfully detect
> that a particular path corresponds to a CGI script that is not actually
> located where that path is pointing, then the server would need to
> execute that script with PATH_INFO bound to the entire path. Every
> installed CGI script would then be responsible for manually removing
> SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,
> which puts an additional burden on CGI developers.
If you want to follow RFC-3875, that's not the case. PATH_INFO only
contains data past the script name (section 4.1.5). This link:

	gemini://gemini.conman.org/cgi

returns

	SCRIPT_NAME = /cgi

There is no PATH_INFO or PATH_TRANSLATED because it's not needed. However:

	gemini://gemini.conman.org/cgi/path/to/nowhere

returns

	SCRIPT_NAME = /cgi
	PATH_INFO = /path/to/nowhere
	PATH_TRANSLATED = /home/spc/projects/gemini/non-checkin/gemini.conman.org/path/to/nowhere

The work is on the server side, not the CGI script side.
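
Roughly, the server-side split looks like the following Python sketch. It
assumes the server has already worked out which prefix of the path is the
script; the function name cgi_path_vars and the document root /srv/gemini
used in the usage note are made up for illustration, and this is not the
GLV-1.12556 code:

```python
import posixpath
from urllib.parse import unquote


def cgi_path_vars(docroot, script_name, request_path):
    """Build the path-related CGI variables for one request.

    PATH_INFO is whatever follows the script name, percent-decoded, and
    PATH_TRANSLATED maps that remainder onto the document root. Both are
    omitted when nothing follows the script name. Hypothetical helper.
    """
    if request_path != script_name and not request_path.startswith(script_name + "/"):
        raise ValueError("request path does not start with the script name")
    env = {"SCRIPT_NAME": script_name}
    path_info = unquote(request_path[len(script_name):])
    if path_info:
        env["PATH_INFO"] = path_info
        env["PATH_TRANSLATED"] = posixpath.join(docroot, path_info.lstrip("/"))
    return env
```

Calling cgi_path_vars("/srv/gemini", "/cgi", "/cgi/path/to/nowhere") gives
SCRIPT_NAME, PATH_INFO and PATH_TRANSLATED; calling it with "/cgi" alone
gives only SCRIPT_NAME. The script itself never has to strip SCRIPT_NAME
from anything.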
> So I've now heard from multiple folks that we should all just get on
> with these path segment hacks and accept that as the best we can do in
> Gemini.
>
> While I can see that it's technically possible (though arguably ugly) to
> do so, I suppose my question is:
>
> "What exactly does Gemini lose by allowing chained query parameters?
> (with &)"
Nothing as far as I can see, as long as the characters '=' and '&' are
escaped if they appear in the input (to prevent confusion).
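
As a rough illustration of that escaping, here is a Python sketch using
percent-encoding from the standard library. The helper names encode_query
and decode_query are hypothetical, and chained parameters are only the
proposal under discussion here, not something the Gemini spec defines:

```python
from urllib.parse import quote, unquote


def encode_query(pairs):
    """Join (key, value) pairs with '&', percent-encoding each part.

    safe='' makes quote() escape '=', '&' and '/' as well, so a literal
    '=' or '&' inside the input cannot be mistaken for a separator.
    """
    return "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}" for key, value in pairs
    )


def decode_query(query):
    """Split on '&' and '=' first, then percent-decode each part."""
    pairs = []
    for field in query.split("&"):
        if not field:
            continue  # ignore empty fields such as a trailing '&'
        key, _, value = field.partition("=")
        pairs.append((unquote(key), unquote(value)))
    return pairs
```

The order matters on the decoding side: splitting happens before
unescaping, so an escaped %26 or %3D stays part of the value.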
> What am I missing here, folks?
Somebody to do a proof-of-concept probably.
> Any chance of weighing in here, Solderpunk?
Is he still alive?
-spc
[1] https://github.com/spc476/GLV-1.12556