Generating a Catalog from Markdown

September 23rd, 2018

Originally, this site used a free MongoDB database for storing posts that could be rendered server-side or fetched on the client. Since I wasn't using a CMS, and I'm also slightly allergic to monthly subscriptions, adding posts to the database was tedious and required me to manually create a new entry and escape the markdown content.

I eventually created a script to automate part of the process, but it never felt great, and it was a source of friction when I actually wanted to write.

I considered rewriting my site in GatsbyJS, which allows you to build static sites from React components and has tons of plugins. I decided against it because the amount of time it would take would probably not be worth it, and I would lose the server-side rendering setup, which I enjoy maintaining and learning more about.

My posts are already written in Markdown, so why couldn't I just use those files?

Parsing Markdown Metadata

I needed a way to store the metadata that previously lived in the database, such as the title, URL, and a timestamp, inside the markdown posts themselves. Thankfully, a loose standard exists: YAML front matter.

YAML front matter is a block of YAML at the top of a markdown file, set off by lines of three dashes, that a parser can read separately from the content. It's used by Jekyll, but I could follow the same convention and parse out the data at build time with a CLI. Here's a quick example:

---
title: Awesome Post
path: /blog/awesome-post
---

Content of post.
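
To make the idea concrete, here is a minimal sketch of how a file like the one above could be split into metadata and content. It is not the module's actual implementation: `parseFrontMatter` is a hypothetical helper, and it assumes the front matter holds only flat `key: value` pairs between `---` lines. Real YAML (lists, nesting, quoting) needs a proper YAML parser.

```javascript
// Hypothetical helper, not taken from node-md-meta-cataloger.
// Splits a markdown string into front matter metadata and body content.
// Assumption: metadata is flat `key: value` pairs, not full YAML.
function parseFrontMatter(source) {
  // Opening `---` line, the metadata block, a closing `---` line,
  // then everything after it as the content.
  const pattern = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)([\s\S]*)$/;
  const match = pattern.exec(source);
  if (!match) {
    // No front matter found: treat the whole file as content.
    return { metadata: {}, content: source };
  }

  const metadata = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) continue; // skip blank or malformed lines
    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();
    if (key) metadata[key] = value;
  }

  return { metadata, content: match[2].trim() };
}

const post = [
  '---',
  'title: Awesome Post',
  'path: /blog/awesome-post',
  '---',
  '',
  'Content of post.',
].join('\n');

const parsed = parseFrontMatter(post);
console.log(parsed.metadata);
console.log(parsed.content);
```

Splitting on the first colon only keeps values such as `/blog/awesome-post`, or a title that itself contains a colon, intact.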


After writing the logic to parse the metadata now stored inside my posts, I thought someone else might find it useful, so I released it as a CLI and Node module on npm: node-md-meta-cataloger.

After installing the module, you can generate a JSON catalog containing each post's metadata and content, which can then be used to statically generate pages with something like html-webpack-plugin, or to serve as a mini database.

Let's imagine we have a folder containing two posts at ~/posts with two metadata fields and we run the CLI like so:

> node-md-meta-cataloger --input ~/posts --output ~/posts/catalog.json

The JSON file generated would contain the following:

[
  {
    "content": "markdown content",
    "filename": "post-one.md",
    "filepath": "~/posts/post-one.md",
    "metadata": {
      "title": "Post One",
      "author": "Joshua"
    }
  },
  {
    "content": "markdown content",
    "filename": "post-two.md",
    "filepath": "~/posts/post-two.md",
    "metadata": {
      "title": "Post Two",
      "author": "Joshua"
    }
  }
]
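
As a rough illustration of the "mini database" idea, the sketch below indexes catalog entries by a metadata field. It assumes the catalog is a JSON array of entries with the fields shown above. `indexByMetadata` is a hypothetical helper written for this example, not part of the module, and the entries are inlined here rather than read from `catalog.json` so the snippet runs on its own.

```javascript
// Entries shaped like the example output. In a real build this array would
// come from something like JSON.parse(fs.readFileSync(catalogPath, 'utf8')).
const catalog = [
  {
    content: 'markdown content',
    filename: 'post-one.md',
    filepath: '~/posts/post-one.md',
    metadata: { title: 'Post One', author: 'Joshua' },
  },
  {
    content: 'markdown content',
    filename: 'post-two.md',
    filepath: '~/posts/post-two.md',
    metadata: { title: 'Post Two', author: 'Joshua' },
  },
];

// Hypothetical helper: build a Map from one metadata field to its entry,
// so a post can be looked up without scanning the whole array each time.
function indexByMetadata(entries, field) {
  const index = new Map();
  for (const entry of entries) {
    const key = entry.metadata[field];
    if (key !== undefined) index.set(key, entry);
  }
  return index;
}

const postsByTitle = indexByMetadata(catalog, 'title');
console.log(postsByTitle.get('Post Two').filename);
```

With a `path` field in the front matter, the same helper could map incoming URLs to posts on the server.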

There are a variety of options for sorting or transforming the results, and it can also be used programmatically if you'd rather add it directly to your project than run an additional build step.

It's a simple module, but I'm quite happy with the results. I can now write a post in markdown, push up the new file to GitHub, and the site will be rebuilt with the new post available. Way less friction!

Written by Joshua Pohl
Software Engineer in San Antonio. Find me on Twitter.