Week 2: Quest to generate documentation in Markdown

28 September 2022

Hello everyone,

This is the first installment of my development journal for the Posh TUI app. I’ll do everything possible to make these notes a bi-weekly occurrence. Writing them also helps me stay accountable and structure my thoughts, to make sure everything is on the right track.

What do I have so far?

I already have a first prototype on my local computer that takes HTML documentation, cleans it up and converts it to Markdown on the fly. Why do I go through all this trouble to have a Markdown document? Well, a standard terminal can’t render HTML properly; it’s most likely to show you the plain-text content of the HTML file.

There is a way to switch the terminal renderer, but that would make the app inaccessible to a lot of people and usable only by major geeks. So switching the terminal renderer is not really a viable option either.

My first prototype offers a nice way to read documentation in the terminal like a book. But it comes with a bunch of serious downsides that don’t allow me to use this approach for a developer documentation browser. Namely:

  • It takes a while to parse HTML, and bigger pages take a noticeable time to load.
  • I can’t really build a search feature on top of that. I need to know exactly which row to scroll down to, but when we convert HTML to Markdown the number of rows changes, and search throws us into an entirely different section.

So there is no way around this: I need to generate Markdown and work with Markdown, no HTML. This initial prototype will be scrapped and needs to be rebuilt from the ground up.
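To illustrate why working on Markdown directly helps with search, here is a minimal sketch in Ruby. It assumes the documentation is already a Markdown string with #-style headings; the name heading_index is hypothetical and not part of the prototype. Because the index is built from the very text that gets rendered, the rows it stores stay valid. For brevity it does not skip code blocks, so a line starting with "# " inside one would be treated as a heading.

```ruby
# Build a simple search index: one entry per Markdown heading,
# remembering the row (1-based line number) where it appears.
def heading_index(markdown)
  index = []

  markdown.each_line.with_index(1) do |line, row|
    # One to six "#" characters, whitespace, then the heading title.
    match = line.match(/\A([#]{1,6})\s+(.+?)\s*\z/)
    next unless match

    index << { level: match[1].length, title: match[2], row: row }
  end

  index
end

sample = <<~MARKDOWN
  # String

  ## upcase

  Returns an uppercase copy.

  ## downcase
MARKDOWN

heading_index(sample).each do |entry|
  puts "#{entry[:row]}: #{entry[:title]}"
end
```

A search hit for "downcase" can then scroll straight to the stored row, with no conversion step in between to shift it.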

But first, I need to get that documentation reliably converted into markdown…

Quest to render markdown documentation

The most important step to get this project off the ground is to have a way to generate Markdown documentation (or plain text) so we can render it in the terminal. If I do not figure this out, there is not much else I can do. My first assumption was that I should be able to generate Markdown from the source, the same way Ruby and Rails generate their documentation now, only tweaking a couple of parameters to produce .md files instead. YARD can be used for that, and it accepts either RDoc or YARD markup.

YARD

While it says on the tin that it supports Markdown, unfortunately the output is only HTML. I played around with multiple parameters, and they don’t affect the output that much.

While asking around, I was met with people who were completely baffled by my request. “You still need to render markdown to something, and it will probably be HTML” was the most common sentiment I got.

At this stage I understood that what seemed like an easy-peasy task might not be as straightforward as I assumed.

RDoc

RDoc doesn’t really claim that it can produce anything other than HTML for documentation. But multiple gems have proved otherwise: there are libraries that can convert RDoc to PDF, and if you study the RDoc source code closely enough, you’ll find that it can spit out a JSON blob of the entire content. That JSON blob could easily be transformed to Markdown, and it could just as easily be used to create an index of the entire content (for search and the navigation section).

It seems that most of Ruby itself, Rails and all the other gems I’ve stumbled upon work with RDoc. Even DASH clearly indicates that all the gem documentation they store is based on RDoc.

So after a lot of study on the matter, I’ve decided to write a generator for RDoc that would spit out Markdown plus an index (as an SQLite database) for me. And I’m planning to open-source it, as a good netizen.
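As a rough sketch of that transformation, the Ruby snippet below turns a small JSON description of a class into Markdown and collects index entries along the way. The JSON shape (name, comment, methods) and the function name render_class are invented for illustration; the real structure RDoc exposes may look quite different, and a real generator would write the index rows to SQLite rather than keep them in an array.

```ruby
require "json"

# ASSUMPTION: this JSON shape is made up for illustration; the real
# structure RDoc exposes may look quite different.
DOC_JSON = <<~JSON
  {
    "name": "String",
    "comment": "A String object holds a sequence of bytes.",
    "methods": [
      { "name": "upcase", "comment": "Returns an uppercase copy." },
      { "name": "downcase", "comment": "Returns a lowercase copy." }
    ]
  }
JSON

# Turn one class description into Markdown text plus index entries.
# Each index entry records the row of a heading, which is what a
# search or navigation feature would later jump to.
def render_class(doc)
  lines = []
  index = []

  add_heading = lambda do |prefix, title|
    lines << "#{prefix} #{title}"
    index << { title: title, row: lines.length }
    lines << ""
  end

  add_heading.call("#", doc["name"])
  lines << doc["comment"] << ""

  doc["methods"].each do |meth|
    add_heading.call("##", meth["name"])
    lines << meth["comment"] << ""
  end

  [lines.join("\n"), index]
end

markdown, index = render_class(JSON.parse(DOC_JSON))
puts markdown
index.each { |entry| puts "#{entry[:title]} -> row #{entry[:row]}" }
```

Generating the Markdown and the index in the same pass is the point of the design: the rows in the index cannot drift away from the rendered text.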

I do have a bit of paranoia, though. What if things are not as easy as they seem, again? That led me to research an alternative approach.

Pandoc

Pandoc is a very popular open source document converter.

I did a quick test, trying to convert the existing Ruby documentation to Markdown, and it worked. But every document contained different artifacts (like issues with links, random HTML tags, or whole sections that are not even needed) that required tweaking the entire process.
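For reference, a conversion like this can be driven from Ruby along the following lines. This is only a sketch: the exact command used in the test above is not recorded here, the helper names are hypothetical, and the choice of gfm (GitHub-Flavored Markdown) as the output format is an assumption. The --from, --to and --output options themselves are standard Pandoc options.

```ruby
# Build the argument list for a Pandoc HTML-to-Markdown conversion.
# ASSUMPTION: "gfm" is just one possible Markdown flavour to target.
def pandoc_command(input_path, output_path)
  ["pandoc", "--from=html", "--to=gfm", "--output=#{output_path}", input_path]
end

# Run the conversion; returns true on success, false otherwise.
# Requires the pandoc executable to be available on PATH.
def convert_to_markdown(input_path, output_path)
  system(*pandoc_command(input_path, output_path)) == true
end

puts pandoc_command("String.html", "String.md").join(" ")
```

Passing the arguments as an array rather than one string avoids shell quoting problems with file names that contain spaces.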

Pandoc offers a way to write filters to deal with that, but I would need to use Lua for them! I kinda decided against it. With a wrapper like pandocomatic it is still a workable solution, but I would rather keep it in reserve in case the rdoc-markdown idea does not work out.

What’s next?

I initially focused on building a CLI app that can render Markdown documentation and present a list of the existing documentation on a system. The latter doesn’t seem useful, and the Markdown presenter might need to wait until Markdown is actually being generated properly.

My next steps would be to:

  • Create the rdoc-markdown generator and open-source it.
  • Create sqlite-index-builder, which works with rdoc and existing Markdown (and not open-source it, because it’s hard to make a generic implementation).
  • Start implementing a basic CLI app. It should validate licenses from Gumroad and preferably be able to parse a Gemfile (to understand the Ruby and gem versions in use).

That’s all for now, folks!
