Extending Markdown with middleware

12th February 2017

Introduction

Markdown is a great way of writing content and over the years we’ve seen a lot of common extensions to Markdown such as tables, fenced code blocks, footnotes and more.

When Markdown doesn’t support a feature you want, you can simply write HTML, but this can get messy and goes against Markdown’s strength of having highly readable source.

This article explains how you can add your own features to Markdown by extending it with middleware. The unfortunate side effect is that the source content becomes less portable; however, it’s a great way to add functionality in environments that you control.

For the purposes of this example I will be using a mix of standard Markdown and the following features we’ll be adding:

Partials

We’ll add support to import content into our document.

Slugified heading IDs

We’ll change headings to have slugified IDs so you can link to them easily.

Dynamic content

We’ll add support for dynamically replacing placeholders in the content. In this example I’ll use $CURRENT_YEAR, $API_KEY and $USERNAME.

Note: If you get stuck you can view a working example of this code here.

Source document

Here is the Markdown document that we’ll be using in this tutorial.

# Hello world

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

## Partials

```partial
source: /partials/my-document.md
```

## Slugified heading IDs

All of the headings in this document will have IDs

## Dynamic content

The current year is $CURRENT_YEAR.

Here is an example code block:

```
curl -X "POST" "https://example.com/" \
 -H "X-Api-Key: $API_KEY" \
 -d "text=Hello $USERNAME"
```

Setting up the pipeline

We’ll be using a pipeline of filters to write our extensions. Just like middleware, each filter takes an input, makes changes and then returns an output. This keeps the concerns of each filter separate whilst allowing each filter to build on the result of the previous ones.

For the purposes of this example we’ll be using two Ruby libraries: Banzai for the pipeline and Redcarpet to render the Markdown. The method used here can be translated into your language of choice.

# markdown_pipeline.rb
class MarkdownPipeline < Banzai::Pipeline
  def initialize
    super(
      DynamicContentFilter,
      CredentialsFilter,
      PartialFilter,
      MarkdownFilter, # Convert the Markdown to HTML
      HeadingFilter,
    )
  end
end

Setting up filters

Filters are simple. They take an input, make some changes and then return an output that is passed to the next filter or back to the caller once all filters have been run.

input --> filter --> output --> HTML
             ↑_________↓
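
The flow above can be sketched without any libraries. The following is a minimal illustration of the idea, not Banzai’s implementation: SimplePipeline, Upcase and Exclaim are hypothetical names used only here, and a filter is assumed to be any object that responds to call.

```ruby
# A dependency-free sketch of the pipeline idea (not Banzai's actual code).
# Each filter is any object that responds to #call and returns a string.
class SimplePipeline
  def initialize(*filters)
    @filters = filters
  end

  # Feed the input through every filter in order; each filter receives
  # the output of the one before it.
  def call(input)
    @filters.reduce(input) { |output, filter| filter.call(output) }
  end
end

# Two hypothetical filters, written as lambdas for brevity.
Upcase  = ->(text) { text.upcase }
Exclaim = ->(text) { "#{text}!" }

pipeline = SimplePipeline.new(Upcase, Exclaim)
puts pipeline.call("hello world") # prints "HELLO WORLD!"
```

Because each filter only sees the previous filter’s output, the order in which filters are listed matters.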

#1 - DynamicContentFilter

This is the simplest of our filters. Think of it as a programmatic find-and-replace.

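The gsub method returns a copy of the string with all occurrences of the pattern (the first argument) replaced by the second argument. The replacement must be a string, so the year is converted with to_s.

# dynamic_content_filter.rb
class DynamicContentFilter < Banzai::Filter
  def call(input)
    # Time.current comes from ActiveSupport. gsub raises a TypeError if
    # the replacement is an Integer, hence the to_s.
    input.gsub("$CURRENT_YEAR", Time.current.year.to_s)
  end
end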
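
If more placeholders are added later, one option is to keep them in a lookup table and replace them all in a single pass. This is only a sketch under assumptions: replace_placeholders and the PLACEHOLDERS table are hypothetical names, and it uses plain Ruby’s Time.now rather than ActiveSupport’s Time.current so that it runs on its own.

```ruby
# Hypothetical table of placeholders. Each value is a lambda so that it is
# evaluated at render time rather than when the file is loaded.
PLACEHOLDERS = {
  "$CURRENT_YEAR" => -> { Time.now.year.to_s }
}.freeze

# Replace every known placeholder in one pass. Regexp.union escapes the
# "$" for us, so the keys are matched literally.
def replace_placeholders(input, placeholders = PLACEHOLDERS)
  pattern = Regexp.union(placeholders.keys)
  input.gsub(pattern) { |match| placeholders.fetch(match).call }
end

puts replace_placeholders("The current year is $CURRENT_YEAR.")
```

Passing the table as an argument also makes the method easy to test with fixed values.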

#2 - CredentialsFilter

This filter is similar to DynamicContentFilter. We’re doing a search and replace here but this time we’ll take a piece of data from the current user’s session (provided that there is one).

# credentials_filter.rb
class CredentialsFilter < Banzai::Filter
  def call(input)
    # current_user is assumed to be provided by the host application.
    # Non-destructive gsub is used so the caller's string is not mutated.

    # e.g. Either "Hello Adam" or "Hello World"
    output = input.gsub("$USERNAME", current_user ? current_user.first_name : "World")

    # e.g. Either "ABC123" or "EXAMPLE_API_KEY"
    output.gsub("$API_KEY", current_user ? current_user.api_key : "EXAMPLE_API_KEY")
  end
end

You could even extend this further so that the Regex was /\$USER\('(.+?)'\)/, which would allow you to specify which user attribute to return.
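
A sketch of that extension might look like the following. It is an illustration under assumptions, not part of the original filters: the User struct, ALLOWED_ATTRIBUTES and replace_user_attributes are hypothetical names, and the allow-list is an addition so that a document cannot call arbitrary methods on the user.

```ruby
# Hypothetical user object; in a real app this would come from the session.
User = Struct.new(:first_name, :api_key)

# Only these attributes may be referenced from a document (an assumption
# made here so that content cannot call arbitrary methods on the user).
ALLOWED_ATTRIBUTES = %w[first_name api_key].freeze

# Matches $USER('attribute'). The "$" and the parentheses are escaped so
# that they are treated as literal characters.
USER_PATTERN = /\$USER\('(.+?)'\)/

def replace_user_attributes(input, user)
  input.gsub(USER_PATTERN) do |match|
    attribute = Regexp.last_match(1)
    if user && ALLOWED_ATTRIBUTES.include?(attribute)
      user.public_send(attribute).to_s
    else
      match # leave the placeholder untouched
    end
  end
end

adam = User.new("Adam", "ABC123")
puts replace_user_attributes("Hello $USER('first_name')", adam) # prints "Hello Adam"
```

Unknown attributes and anonymous visitors leave the placeholder in place here; substituting a default value instead would be an equally reasonable choice.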

#3 - PartialFilter

In this example we’ll simply use Regex to search for our block. Notice the /m option: it lets . match newlines, so our lazy capture group (.+?) can span multiple lines.

This time we’ll pass a block to the gsub method, which replaces each Regex match with the string returned by the block.

# partial_filter.rb
require "yaml"

class PartialFilter < Banzai::Filter
  def call(input)
    # Regex for the block, capture the contents
    input.gsub(/```partial(.+?)```/m) do

      # Run the captured contents (accessed with $1) through YAML to read the config
      config = YAML.load($1)

      # Take the source key from the config, this is the file we'll be injecting.
      document_path = "./#{config["source"]}"

      # Read the contents of the file and return the result
      File.read(document_path)
    end
  end
end
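
Because the path comes from the document itself, it may be worth restricting partials to a known directory. The following is a sketch under assumptions rather than the filter used above: render_partials and its base_dir parameter are hypothetical, it swaps in YAML.safe_load, and the path check is an addition.

```ruby
require "yaml"

# The fence is built from single backticks only so that this sketch can
# itself sit inside a fenced code block.
FENCE = "`" * 3
PARTIAL_PATTERN = /#{FENCE}partial(.+?)#{FENCE}/m

# `base_dir` is a hypothetical parameter: the directory that partial
# paths are resolved from.
def render_partials(input, base_dir)
  root = File.expand_path(base_dir)

  input.gsub(PARTIAL_PATTERN) do
    # safe_load only builds simple types (hashes, strings, numbers...).
    config = YAML.safe_load(Regexp.last_match(1))

    # Treat the source as relative to base_dir, even with a leading "/".
    relative = config.fetch("source").sub(%r{\A/+}, "")
    path = File.expand_path(relative, root)

    # Refuse paths such as "../secret.md" that escape base_dir.
    unless path.start_with?(root + File::SEPARATOR)
      raise ArgumentError, "partial outside of #{root}: #{path}"
    end

    File.read(path)
  end
end
```

Under these assumptions, render_partials(File.read("./document.md"), ".") would behave like the filter above for the sample document, while rejecting sources that point outside the current directory.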

#4 - MarkdownFilter

This is the penultimate filter. We’ll simply use Redcarpet to convert our Markdown to HTML.

# markdown_filter.rb
class MarkdownFilter < Banzai::Filter
  def call(input)
    markdown.render(input)
  end

  private

  def renderer
    @renderer ||= Redcarpet::Render::HTML.new
  end

  def markdown
    @markdown ||= Redcarpet::Markdown.new(renderer, options)
  end
end

#5 - HeadingFilter

This filter is a little more involved. We want to ensure we’re only adding IDs to syntax that actually ends up as heading tags (i.e. h1, h2, h3 etc.). Because of this, it’s safer to run this filter after the Markdown has been converted to HTML.

We could do this with Regex, but let’s instead use a tool called Nokogiri to parse the HTML and do the heavy lifting.

# heading_filter.rb
class HeadingFilter < Banzai::Filter
  def call(input)
    @input = input
    document.css('h1,h2,h3,h4,h5,h6').each do |heading|
      # parameterize is part of ActiveSupport::Inflector
      # See https://goo.gl/HyZ1L8 for alternative if you do not
      # have this module available.
      heading['id'] = heading.text.parameterize
    end
    @document.to_html
  end

  private

  def document
    @document ||= Nokogiri::HTML::DocumentFragment.parse(@input)
  end
end
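
If ActiveSupport is not available, a rough stand-in for parameterize can be written in plain Ruby. The following is an approximation under assumptions rather than a port: slugify and unique_slug are hypothetical names, it does not transliterate accented characters, and the de-duplication is an addition for documents that repeat a heading.

```ruby
# Rough, ASCII-only stand-in for ActiveSupport's parameterize.
def slugify(text)
  text.downcase
      .gsub(/[^a-z0-9]+/, "-") # collapse every other run of characters into one hyphen
      .gsub(/\A-+|-+\z/, "")   # trim leading and trailing hyphens
end

# Hypothetical helper: append -2, -3, ... when a slug has been used already,
# so that repeated headings still get distinct IDs.
def unique_slug(text, seen)
  base = slugify(text)
  seen[base] += 1
  seen[base] == 1 ? base : "#{base}-#{seen[base]}"
end

seen = Hash.new(0)
puts unique_slug("Slugified heading IDs", seen) # prints "slugified-heading-ids"
puts unique_slug("Dynamic content", seen)       # prints "dynamic-content"
puts unique_slug("Dynamic content", seen)       # prints "dynamic-content-2"
```

In the filter above, heading['id'] could then be set from unique_slug(heading.text, seen), with one seen hash per document.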

Rendering

To render our document all you need to do is call the pipeline with the document source.

# app.rb
document = File.read("./document.md")
content = MarkdownPipeline.new.call(document)

puts content

The final output should look something like this:

<h1 id="hello-world">Hello world</h1>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>

<h2 id="partials">Partials</h2>

<p>I am the contents of <code>/partials/my-document.md</code></p>

<h2 id="slugified-heading-ids">Slugified heading IDs</h2>

<p>All of the headings in this document will have IDs</p>

<h2 id="dynamic-content">Dynamic content</h2>

<p>The current year is 2017.</p>

<p>Here is an example code block:</p>

<pre><code>curl -X "POST" "https://example.com/" \
 -H "X-Api-Key: ABC123" \
 -d "text=Hello Adam"
</code></pre>

Conclusion

This isn’t always the right solution, but if you control your environment and manage your own content through Markdown, adding bespoke functionality can be a practical way of enhancing your document source without adding messy HTML to it. This method is actively being used by our team in building the Nexmo developer portal.

These examples are only simple but the possibilities are endless. Let me know if you build any useful filters on [email protected].

You can find the working source code for this here.