Spec proposal
Sean Conner
sean at conman.org
Thu Oct 22 23:01:34 BST 2020
It was thus said that the Great Ali Fardan once stated:
> Greetings follows, I just found out about gemini recently and I got
> interested in the project and wanted to be involved, In the process of
> setting up my gempod (that's how you call them?), I wanted to be able
> to have an HTML/HTTP mirror for my gempod and I haven't found a gemtext
> to HTML converter
Given how easy it is, I'm surprised there aren't more. But by searching
the mailing list, I did find references to two Gemini-text-to-HTML
converters:
https://github.com/LukeEmmet/GemiNaut/blob/master/GemiNaut/GmiConverters/GmiToHtml.r3
(written in Rebol, a blast from the past)
https://git.sr.ht/~sotirisp/qute-gemini
(Gemini text to Markdown to HTML in python3)
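To give an idea of why I call it easy, here is a rough sketch in C of a
line-at-a-time converter. Everything in it (gmi_line_to_html, raw_cat,
esc_cat, the 1024-byte URL limit) is made up for illustration and is not
taken from either project above or from your library. It assumes each line
arrives without its trailing newline, and it does not group consecutive list
items into a <ul>.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append src to out without escaping; silently truncates if out is full. */
static void raw_cat(char *out, size_t outsz, const char *src)
{
    size_t len = strlen(out);
    snprintf(out + len, outsz - len, "%s", src);
}

/* Append src to out, escaping the characters HTML treats specially. */
static void esc_cat(char *out, size_t outsz, const char *src)
{
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        switch (*src) {
        case '&': raw_cat(out, outsz, "&amp;");  break;
        case '<': raw_cat(out, outsz, "&lt;");   break;
        case '>': raw_cat(out, outsz, "&gt;");   break;
        case '"': raw_cat(out, outsz, "&quot;"); break;
        default:  raw_cat(out, outsz, one);      break;
        }
    }
}

static const char *skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Append open + escaped text + close to out. */
static void wrap(char *out, size_t outsz,
                 const char *open, const char *text, const char *close)
{
    raw_cat(out, outsz, open);
    esc_cat(out, outsz, text);
    raw_cat(out, outsz, close);
}

/* Convert one gemtext line into one line of HTML.  *in_pre carries the
 * preformatted-block state between calls and must start at 0. */
void gmi_line_to_html(const char *line, int *in_pre, char *out, size_t outsz)
{
    assert(line != NULL && in_pre != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    if (strncmp(line, "```", 3) == 0) {         /* toggle line */
        *in_pre = !*in_pre;
        raw_cat(out, outsz, *in_pre ? "<pre>" : "</pre>");
    } else if (*in_pre) {                       /* verbatim, only escaped */
        esc_cat(out, outsz, line);
    } else if (strncmp(line, "=>", 2) == 0) {   /* link line */
        const char *url   = skip_ws(line + 2);
        size_t      ulen  = strcspn(url, " \t");
        const char *label = skip_ws(url + ulen);
        char        href[1024];                 /* arbitrary limit */

        snprintf(href, sizeof href, "%.*s", (int)ulen, url);
        wrap(out, outsz, "<a href=\"", href, "\">");
        wrap(out, outsz, "", *label ? label : href, "</a>");
    } else if (line[0] == '#') {                /* heading, levels 1 to 3 */
        size_t level = strspn(line, "#");
        char   open[8], close[8];

        if (level > 3)
            level = 3;
        snprintf(open, sizeof open, "<h%d>", (int)level);
        snprintf(close, sizeof close, "</h%d>", (int)level);
        wrap(out, outsz, open, skip_ws(line + level), close);
    } else if (strncmp(line, "* ", 2) == 0) {   /* list item */
        wrap(out, outsz, "<li>", line + 2, "</li>");
    } else if (line[0] == '>') {                /* quote line */
        wrap(out, outsz, "<blockquote>", skip_ws(line + 1), "</blockquote>");
    } else if (line[0] == '\0') {               /* blank line */
        raw_cat(out, outsz, "<br>");
    } else {                                    /* plain text line */
        wrap(out, outsz, "<p>", line, "</p>");
    }
}
```

Call it once per input line with the same in_pre variable and print each
result followed by a newline. A real converter would want to report
truncation instead of silently cutting the output.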
> so I decided to write my own, and in the middle of
> the process I thought if I'm going write a full parser for gemtext,
> I might as well make the code reusable and package it as a library, so
> the project shifted from a gemtext to HTML tool to a gemtext processing
> library, and here I am.
Hello.
> As of now, my implementation is complete, It is almost usable for
> anyone willing to test it, I wrote manpages for all functions currently
> implemented, but not for the data types yet, I'm going to work on that,
> and as part of my project, I want to write a manpage for the text/gemini
> format (gemtext(5)) and I want it to be precise and spec compliant,
> if you don't mind, I'll go ahead and write the manpage as a proposal to
> standardize some of the unclear cases of the spec, if the rest of the
> community agrees, maybe get the spec updated too?
>
> Attached is a tarball of my current implementation (WIP)
And here are some comments from trying it out. I wrote a simple Gemini
text file (with very long lines) and ran your test program over it. In the
output you have some garbage data on the very first line:
00000000: 88 DB CB 23 20 4C 6F 72 65 6D 20 69 70 73 75 6D ...# Lorem ipsum
00000010: 20 64 6F 6C 6F 72 20 73 69 74 20 61 6D 65 74 2C dolor sit amet,
Thoughts: sounds like you have some uninitialized memory. Aside from the
garbage bytes, the output did not match the input as the pre-formatted block
input did not have the ``` guards. And the last blank line was not included
in the output either.
I also ran it under valgrind [1] and found a leak in the happy path:
[spc]lucy:/tmp/libgemtext>valgrind --show-reachable=yes --leak-check=full ./test </tmp/text.gemini >/tmp/t.gmi
==26859== Memcheck, a memory error detector.
==26859== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
==26859== Using LibVEX rev 1575, a library for dynamic binary translation.
==26859== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==26859== Using valgrind-3.1.1, a dynamic binary instrumentation framework.
==26859== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==26859== For more details, rerun with: -v
==26859==
==26859== Conditional jump or move depends on uninitialised value(s)
==26859== at 0x804A2C1: strlcat (strlcat.c:38)
==26859== by 0x8049C7E: _line_append (encode.c:198)
==26859== by 0x8049EF5: gemtext_encode (encode.c:263)
==26859== by 0x804A182: gemtext_encode_fd (encode.c:339)
==26859== by 0x804A1FB: gemtext_encode_file (encode.c:359)
==26859== by 0x804867A: main (test.c:15)
==26859==
==26859== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 12 from 1)
==26859== malloc/free: in use at exit: 7,200 bytes in 1 blocks.
==26859== malloc/free: 86 allocs, 85 frees, 84,247 bytes allocated.
==26859== For counts of detected errors, rerun with: -v
==26859== searching for pointers to 1 not-freed blocks.
==26859== checked 55,588 bytes.
==26859==
==26859==
==26859== 7,200 bytes in 1 blocks are possibly lost in loss record 1 of 1
==26859== at 0x400579F: realloc (vg_replace_malloc.c:306)
==26859== by 0x8049C56: _line_append (encode.c:194)
==26859== by 0x8049D5D: gemtext_encode (encode.c:224)
==26859== by 0x804A182: gemtext_encode_fd (encode.c:339)
==26859== by 0x804A1FB: gemtext_encode_file (encode.c:359)
==26859== by 0x804867A: main (test.c:15)
==26859==
==26859== LEAK SUMMARY:
==26859== definitely lost: 0 bytes in 0 blocks.
==26859== possibly lost: 7,200 bytes in 1 blocks.
==26859== still reachable: 0 bytes in 0 blocks.
==26859== suppressed: 0 bytes in 0 blocks.
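Both reports above go through _line_append: a realloc at encode.c:194, then
strlcat reading uninitialised bytes at encode.c:198. I have not traced it
through your source, so this is a guess, but that pattern is what you
typically see when a freshly realloc'ed buffer is handed to strlcat before
it contains a terminating NUL. One way to sidestep the problem is to track
the length yourself so the code never has to scan the buffer. The struct and
function names below (linebuf, linebuf_append, linebuf_free) are
hypothetical and are not your API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct linebuf {
    char  *data;  /* NULL until the first append, NUL-terminated after */
    size_t len;   /* bytes in use, not counting the NUL */
    size_t cap;   /* bytes allocated */
};

/* Append text to lb.  Returns 0 on success, or -1 if memory ran out, in
 * which case lb is unchanged and still owns its old block. */
int linebuf_append(struct linebuf *lb, const char *text)
{
    assert(lb != NULL && text != NULL);

    size_t add  = strlen(text);
    size_t need = lb->len + add + 1;        /* +1 for the NUL */

    if (need > lb->cap) {
        size_t ncap = lb->cap ? lb->cap : 64;
        while (ncap < need)
            ncap *= 2;

        /* Keep the result in a temporary: overwriting lb->data directly
         * would leak the old block when realloc fails. */
        char *p = realloc(lb->data, ncap);
        if (p == NULL)
            return -1;
        lb->data = p;
        lb->cap  = ncap;
    }

    /* Copy at the known end, NUL included, so no uninitialised byte of
     * the new block is ever read. */
    memcpy(lb->data + lb->len, text, add + 1);
    lb->len += add;
    return 0;
}

/* Release the buffer; the caller does this once the line is written. */
void linebuf_free(struct linebuf *lb)
{
    assert(lb != NULL);
    free(lb->data);
    lb->data = NULL;
    lb->len  = 0;
    lb->cap  = 0;
}
```

Start from a zeroed struct (struct linebuf lb = {0};), and make sure every
exit from the encoder reaches linebuf_free, which is the part the "possibly
lost" report suggests is being missed.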
You will also want to check the non-happy paths for memory leaks. In my
experience, memory leaks are more likely in the non-happy path because
programmers rarely think through the non-happy path, and it's annoying to
write code to properly handle the non-happy paths in C.
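One habit that makes the non-happy paths less painful is to send every
failure to a single cleanup label, so there is exactly one place that frees
things. A small sketch of what I mean, using a made-up function (read_all is
not part of your library) that reads the rest of a stream into memory:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the rest of fp into a NUL-terminated heap buffer.  Returns the
 * buffer (the caller frees it) and stores its length in *out_len, or
 * returns NULL on a read error or if memory runs out. */
char *read_all(FILE *fp, size_t *out_len)
{
    assert(fp != NULL && out_len != NULL);

    char  *buf = NULL;
    size_t len = 0;
    size_t cap = 0;

    for (;;) {
        if (len + 1 >= cap) {               /* keep room for the NUL */
            size_t ncap = cap ? cap * 2 : 4096;
            char  *p    = realloc(buf, ncap);
            if (p == NULL)
                goto fail;                  /* buf is still ours to free */
            buf = p;
            cap = ncap;
        }

        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0) {
            if (ferror(fp))
                goto fail;                  /* read error, not EOF */
            break;
        }
    }

    buf[len] = '\0';
    *out_len = len;
    return buf;

fail:
    free(buf);                              /* free(NULL) is a no-op */
    return NULL;
}
```

To exercise the failure branches under valgrind you need some way to force
them, for example by feeding it a stream that returns a read error.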
But I think it's wonderful that there was only one leak, and possibly an
easy one to fix. The library itself appears easy to use (if you know C).
Good job.
-spc
[1] If you are doing C, and have access to valgrind (it's almost always
installed on every Linux system, or available to be installed), use
it. It is a fantastic tool to find memory leaks and issues with
uninitialized memory. Yes, it's annoying having to track all the
issues down, but I feel it's worth it.