robots.txt for Gemini formalised
Philip Linde
linde.philip at gmail.com
Tue Nov 24 11:31:09 GMT 2020
On Tue, 24 Nov 2020 11:29:02 +0100
marc <marcx2 at welz.org.za> wrote:
> Consider the gemini://example.com/~somebody/personal.gmi -
> if somebody wishes to exclude personal.gmi from being
> crawled they need write access to example.com/robots.txt,
> and how do we go about making sure that ~somebodyelse,
> also on example.com doesn't overwrite robots.txt with
> their own rules ?
How the server produces its response to a robots.txt request is an
implementation detail. A server can easily build that response from
access information stored in files in subdirectories. For example, the
system directory corresponding to /~somebody/ contains a file named
".disallow" whose content is "personal.gmi". When the server builds a
response to /robots.txt, it reads every ".disallow" file and emits a
Disallow line for each entry it finds. This way, individual users on a
multi-user system can decide the access policy for their own content
without needing shared write access to a canonical robots.txt.
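
To make this concrete, here is a minimal sketch of such a generator. It
is only an illustration under stated assumptions, not how any existing
Gemini server works: the function name build_robots_txt, the layout
(one directory per user directly under user_root, served as /~user/),
and the choice to read only the ".disallow" file at the top of each
user's directory are all hypothetical.

```python
import os
import tempfile


def build_robots_txt(user_root, url_prefix="/~"):
    """Assemble a robots.txt body from per-user ".disallow" files.

    Assumption: each child of user_root is one user's public directory,
    served under url_prefix + <user> + "/".
    """
    lines = ["User-agent: *"]
    for user in sorted(os.listdir(user_root)):
        disallow_path = os.path.join(user_root, user, ".disallow")
        if not os.path.isfile(disallow_path):
            continue
        with open(disallow_path, encoding="utf-8") as f:
            for entry in f:
                entry = entry.strip()
                # Skip blank lines, and ignore entries that would reach
                # outside the user's own directory, so one user cannot
                # write rules for another user's content.
                if not entry or entry.startswith("/") or ".." in entry.split("/"):
                    continue
                lines.append(f"Disallow: {url_prefix}{user}/{entry}")
    return "\n".join(lines) + "\n"
```

Because every generated path is prefixed with the owning user's
directory, ~somebodyelse cannot override or remove the rules that
~somebody has set.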
> I have pitched this idea before: I think a footer containing
> the license/rules under which a page can be distributed/cached
> is more sensible than robots.txt. This approach is:
>
> * local to the page (no global /robots.txt)
> * persistent (survives being copied, mirrored & re-exported)
> * sound (one knows the conditions under which this can be redistributed)
What if my document is a binary file of some sort that I cannot add a
footer to? The only ways to address this consistently for all document
types are to:
a) include the information in the response, *distinct* from its body, or
b) provide the information in a sidecar file or sideband communication
   channel.
--
Philip