Conversion of HTML to gemtext (was Re: A proposed scheme for parsing preformatted alt text)
Sean Conner
sean at conman.org
Mon Sep 7 23:54:57 BST 2020
It was thus said that the Great Sandra Snan once stated:
>
> While I'm on the topic of one link per line, the whole numbers in the
> paragraphs to refer to line links style is not cool. Its as if you want
> inline links. If you wanted inline links then why didn't you put them in
> the spec?
>
> The spec was designed by someone who wanted text paragraphs followed by
> (or preceded by) lists of links. _Pages_ are hypertext, but the prose
> isn't.
Because it was clear that people *wanted* links in gopher? Because there
are too many variations on Markdown already? Because it's not that easy to
parse Markdown? [1] Solderpunk wanted a way to include links and still make
it easy to parse.
I do the numbering thing because I'm converting HTML to gemtext, and I
borrowed most of the code from my work in converting HTML to text for
gopher. I like HTML for its hypertext capabilities. But I found that it
was harder for me to convert HTML to gemtext than to plain text since
gemtext has *just* enough capabilities to make it seem easier, but not
enough to handle some of the seldom used tags (like <DL><DT><DD>) or even
simple nesting (a <BLOCKQUOTE> within a <BLOCKQUOTE>).
I could serve HTML, but that would go against the grain of Gemini.
So I'm looking at this post of mine:
http://boston.conman.org/2020/07/28.1
and I'm wondering how I would do it differently. I suppose instead of
(showing the rendered output, not the actual gemtext):

  On Saturday, I sent a message [1] to the party responsible for
  slamming my Gemini server [2] (one among several) and I've yet to
  receive any response. I removed the block from the firewall, and I
  haven't seen any requests from said bot. It looks to have been a
  one-off thing at this time.

  [1] /boston/2020/07/25.2
  [2] /boston/2020/07/24.2

  Weird.

  But then again, this is the Intarwebs, where weird things [1] happen
  all the time [2].

  [1] /boston/2006/10/30.1
  [2] /boston/2015/04/29.1

  At this point, I'm hoping it was fixed silently and it won't be an
  issue again.

I could do:

  On Saturday, I sent a message to the party responsible for slamming
  my Gemini server (one among several) and I've yet to receive any
  response. I removed the block from the firewall, and I haven't seen
  any requests from said bot. It looks to have been a one-off thing
  at this time.

  I sent a message
  slamming my Gemini server

  Weird.

  But then again, this is the Intarwebs, where weird things happen
  all the time.

  weird things
  all the time

  At this point, I'm hoping it was fixed silently and it won't be an
  issue again.

I'm not sure how I feel about that. It doesn't look as good to me as the
numbered links. Would you prefer I just serve up text/html? Not use HTML
at all? Change the output to the sample above?
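
To make the two options concrete, here is a minimal Python sketch of both
output styles. It is not the code that generates the pages above; the names
ParagraphLinks and to_gemtext are made up for illustration, it only looks at
<P> and <A> tags, and the exact form of the "=>" link lines in the numbered
style is an assumption.

```python
from html.parser import HTMLParser


class ParagraphLinks(HTMLParser):
    """Collect the text of each <p> plus the links found inside it."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # list of (parts, links); links are (href, label)
        self._parts = None    # parts of the paragraph being read, or None
        self._links = []
        self._href = None     # href of the <a> being read, or None
        self._label = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._parts, self._links = [], []
        elif tag == "a" and self._parts is not None:
            self._href = dict(attrs).get("href") or ""
            self._label = []

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self._links.append((self._href, "".join(self._label)))
            # Remember where a numbered reference would go in the prose.
            self._parts.append(("ref", len(self._links)))
            self._href = None
        elif tag == "p" and self._parts is not None:
            self.paragraphs.append((self._parts, self._links))
            self._parts = None

    def handle_data(self, data):
        if self._parts is None:
            return  # text outside any <p> is ignored in this sketch
        self._parts.append(("text", data))
        if self._href is not None:
            self._label.append(data)


def to_gemtext(html, numbered=True):
    """Emit each paragraph followed by its links, one link per line."""
    parser = ParagraphLinks()
    parser.feed(html)
    parser.close()
    out = []
    for parts, links in parser.paragraphs:
        pieces = []
        for kind, value in parts:
            if kind == "text":
                pieces.append(value)
            elif numbered:
                pieces.append(f" [{value}]")
        # Collapse runs of whitespace left over from the HTML source.
        out.append(" ".join("".join(pieces).split()))
        for n, (href, label) in enumerate(links, 1):
            out.append(f"=> {href} [{n}]" if numbered else f"=> {href} {label}")
        out.append("")
    return "\n".join(out).rstrip("\n") + "\n"
```

Calling to_gemtext(html) gives the numbered style, and
to_gemtext(html, numbered=False) gives prose with no markers, followed by
link lines labelled with the original link text. Nested tags such as a
<BLOCKQUOTE> within a <BLOCKQUOTE> are not handled at all here.
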
And before I go, here are the links to all three versions (HTML, plain
text, gemtext):
http://boston.conman.org/2020/07/28.1
gopher://gopher.conman.org/0Phlog:2020/07/28.1
gemini://gemini.conman.org/boston/2020/07/28.1
-spc
[1] Markdown, as initially defined by John Gruber, also allowed arbitrary
HTML. I think most people either forget that detail, or don't know
about it in the first place.