OJFord

Reputation: 11130

Read YAML metadata from a Pandoc markdown file

Is it possible to extract Pandoc's metadata (title, date, etc.) from a markdown file without writing a Haskell filter or parsing the --to=json output?

The JSON output is particularly inconvenient for this, since a two-word title looks like:

$ pandoc -t json posts/test.md | jq '.meta | .title'
{
  "t": "MetaInlines",
  "c": [
    {
      "t": "Str",
      "c": "Test"
    },
    {
      "t": "Space"
    },
    {
      "t": "Str",
      "c": "post"
    }
  ]
}

So even after jq has extracted the title, we still need to reassemble the words, and any emphasis, inline code, or other formatting only makes that more complicated.

Upvotes: 5

Views: 3031

Answers (1)

OJFord

Reputation: 11130

We can use the template variable $meta-json$ for this.

Stick the variable in a file (with an extension, to stop Pandoc looking in its own directories) and then use it with pandoc --template=file.ext.

Pandoc's output is a JSON object with keys "title", "date", "tags", etc. and their respective values from the markdown document, which we can easily parse, filter, and manipulate with jq.

$ echo '$meta-json$' > /tmp/metadata.pandoc-tpl
$ pandoc --template=/tmp/metadata.pandoc-tpl | jq '.title,.tags'
"The Title"
[
  "a tag",
  "another tag"
]
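
The two commands above can also be wrapped in a small shell function, so the template does not linger in /tmp. This is only a sketch: the names write_meta_template and pandoc_meta are hypothetical, and it assumes pandoc and mktemp -d are available on the PATH.

```shell
#!/bin/sh

# Hypothetical helper: write a template containing only $meta-json$ to path $1.
# Single quotes keep the shell from expanding the dollar signs.
write_meta_template() {
    printf '%s\n' '$meta-json$' > "$1"
}

# Hypothetical helper: print the metadata of markdown file $1 as one JSON object.
pandoc_meta() {
    dir=$(mktemp -d) || return 1
    # The template keeps a file extension, as recommended above.
    write_meta_template "$dir/metadata.pandoc-tpl"
    pandoc --template="$dir/metadata.pandoc-tpl" "$1"
    status=$?
    # Remove the temporary directory whether or not pandoc succeeded.
    rm -rf "$dir"
    return "$status"
}
```

With the functions loaded, pandoc_meta posts/test.md | jq -r '.title' should print the title as a single string rather than a list of AST nodes. As far as I can tell from the Pandoc manual, the values in $meta-json$ are rendered in the selected output format, so emphasis and similar formatting in a field come through as markup for that format.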

Upvotes: 10
