kota's memex

distillers

go

https://github.com/JohannesKaufmann/html-to-markdown

https://github.com/go-shiori/go-readability

https://github.com/philipjkim/goreadability

https://github.com/markusmobius/go-domdistiller

https://github.com/rubenfonseca/fastimage

https://github.com/rogchap/v8go

python

https://github.com/codelucas/newspaper

js

https://github.com/postlight/mercury-parser

https://github.com/mozilla/readability

https://www.npmjs.com/package/readability-cli

html to md

https://github.com/JohannesKaufmann/html-to-markdown

this one uses goquery?

https://github.com/rsc/tmp/tree/master/md2html

https://github.com/mattn/godown

about

Been wanting to make a nice web portal on gemini for a while. The most important tool is a reader mode/dom distiller, which I have another note detailing. Long story short, I've found two good ones for go. Both are go ports of the two best reader mode tools: firefox's (mozilla/readability) and chrome's old one (dom distiller). The firefox one is more actively developed.

The rough process could go one of two ways:

  1. Accept gemini request, probably via SCGI.
  2. Download html.
  3. "Distill" html with one of those libraries.
  4. Walk the simplified html tree and write gemtext directly.
  5. Send back gemtext.

Alternatively, an easier but probably worse route:

  1. Accept gemini request, probably via SCGI.
  2. Download html.
  3. "Distill" html with one of those libraries.
  4. Convert the html to markdown using any number of good libraries.
  5. Convert the markdown to gemtext using my goldmark renderer.
  6. Send back gemtext.
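
To give a feel for step 5 of the list above, here is a deliberately naive line-by-line sketch. It is not the goldmark renderer mentioned in that step and it only covers a tiny subset of markdown. `MarkdownToGemtext` and `inlineLink` are hypothetical names. It assumes the markdown uses `-` bullets and `[text](url)` inline links, and leaves headings alone since `#`, `##` and `###` mean the same thing in gemtext.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// inlineLink matches markdown inline links of the form [text](url).
var inlineLink = regexp.MustCompile(`\[([^\]]+)\]\(([^)\s]+)\)`)

// MarkdownToGemtext converts a small markdown subset to gemtext.
// Inline links are replaced by their text, and each link is repeated on its
// own "=>" line right after the line it appeared in. Lines inside fenced
// code blocks are passed through untouched.
func MarkdownToGemtext(md string) string {
	var out []string
	inPre := false
	for _, line := range strings.Split(md, "\n") {
		if strings.HasPrefix(line, "```") {
			inPre = !inPre
			out = append(out, "```")
			continue
		}
		if inPre {
			out = append(out, line)
			continue
		}
		var links []string
		for _, m := range inlineLink.FindAllStringSubmatch(line, -1) {
			links = append(links, "=> "+m[2]+" "+m[1])
		}
		line = inlineLink.ReplaceAllString(line, "$1")
		if strings.HasPrefix(line, "- ") {
			line = "* " + line[2:] // gemtext list items start with "* "
		}
		out = append(out, line)
		out = append(out, links...)
	}
	return strings.Join(out, "\n")
}

func main() {
	md := "# Title\n\nRead [the docs](https://example.com/docs) first.\n- item"
	fmt.Println(MarkdownToGemtext(md))
}
```

Emphasis markers, images, nested lists and reference-style links all fall through unchanged, which is a decent argument for doing this on a real markdown AST instead of on raw lines.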