potong

Using a Lua filter, how do you convert a Table into JSON or native text?

I want to replace Table blocks within a markdown document with RawBlocks, which will eventually contain LaTeX.

I can find the table blocks in a document using:

function Table(table)

However, I need to convert each table block to JSON or native text, so that it can be fed to:

local function table2latex(t)
    local latex = pandoc.pipe('pandoc',{'-f','native','-t','latex'},t)
    ...

I cannot find any examples of this kind of conversion, having scoured the pandoc documentation and other sites such as GitHub and Stack Overflow.

Is it possible, or will I need to write a bespoke embedded writer?

EDIT

Following up: the answer below led to the following script:

-- file: boxtables.lua
--
-- This filter extracts tables and converts them to latex
-- The latex is then altered by a sed script to "box" each cell
-- using "|" and '\hline' as column and row delimiters.
-- The original document containing the table(s) must have
-- 'header-includes' containing '\usepackage{longtable,booktabs}'
-- The user should have the 'jq' and 'sed' utilities installed and the shell
-- script 'table2latex.sh' executable in the current working directory
--
-- #!/bin/sh
-- latex="$(pandoc -f json -t latex)"
-- tmpl='{blocks:[{t:"RawBlock",c:["latex",$tab]}],"pandoc-api-version":[1,17,3],meta:{}}'
-- jq -n --arg tab "$latex" "$tmpl"
--
-- The lua filter can be invoked:
--    pandoc --lua-filter boxtables.lua -o documentContainingTable.pdf documentContainingTable.md
--
-- N.B. 'header-includes' and other details may be set on the command line or as a YAML header e.g.
--
-- ---
-- geometry: landscape, scale=0.9, centering
-- header-includes:
--   - \usepackage{longtable,booktabs}
-- ---
--
-- named table2latex rather than table, so that Lua's standard
-- 'table' library is not overwritten by a global function
function table2latex (tab)
  -- create a new pandoc document containing only the table
  local dummy_doc = pandoc.Pandoc({tab})
  -- use table2latex.sh to convert the doc to latex
  local tex_doc = pandoc.utils.run_json_filter(dummy_doc, 'table2latex.sh')
  -- only a single block containing the table will be returned
  return tex_doc.blocks
end

function rawblock (rb)
  -- leave raw blocks in other formats (e.g. html) untouched
  if rb.format ~= 'latex' then return nil end
  local oldlatex = rb.text
  local sed = [[/^\\begin{longtable}/{s/@{}/&\n/;:a;ta;/\n@{}/!s/\n\(.\)/|\1\n/;ta;s/\n/|/};s/toprule/hline/;/mid\|bottom/d;/tabularnewline/a\\\hline]]
  local newlatex = pandoc.pipe('sed',{sed},oldlatex)
  return pandoc.RawBlock('latex', newlatex)
end

-- two passes: tables become raw latex first, then the raw latex is boxed
return {{Table = table2latex},{RawBlock = rawblock}}
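
The sed pass could probably be done in Lua as well, which would remove the dependency on GNU sed. The following is only a rough sketch of that idea, not a drop-in replacement: box_longtable is a made-up helper name, and it assumes the column spec consists of plain letters, as in {@{}lll@{}}. Unlike the sed script, it matches \midrule and \bottomrule exactly rather than any line containing "mid" or "bottom".

```lua
-- Sketch only: a Lua stand-in for the sed pass above.
-- Assumption: the column spec is plain letters, e.g. {@{}lll@{}}.
-- box_longtable is a hypothetical helper name, not part of pandoc.
function box_longtable (latex)
  local out = ''
  -- append one newline so that the final line is matched as well
  for line in (latex .. '\n'):gmatch('(.-)\n') do
    if line:find('\\begin{longtable}', 1, true) == 1 then
      -- @{}lll@{} becomes @{}|l|l|l|@{}
      line = line:gsub('@{}(%a+)@{}', function (cols)
        return '@{}' .. cols:gsub('%a', '|%0') .. '|@{}'
      end)
    end
    -- \toprule becomes \hline
    line = line:gsub('\\toprule', '\\hline')
    local is_rule = line:find('\\midrule', 1, true)
      or line:find('\\bottomrule', 1, true)
    if not is_rule then
      out = out .. line .. '\n'
      -- close every table row with a horizontal line
      if line:find('\\tabularnewline', 1, true) then
        out = out .. '\\hline\n'
      end
    end
  end
  -- drop the newline that was appended before the loop
  return out:sub(1, -2)
end
```

The RawBlock handler could then return pandoc.RawBlock('latex', box_longtable(rb.text)) instead of piping the text through sed.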

Answers (1)

tarleb

There is no built-in way to do this, and hence no fully portable one. The best I could come up with is a shell script, invoked as a JSON filter, that converts tables to LaTeX. The following builds on pandoc.utils.run_json_filter, which has been available since pandoc 2.1.1.

-- file: table2latex.lua
local utils = require 'pandoc.utils'

function Table (tab)
  -- create a new pandoc document containing only the table
  local dummy_doc = pandoc.Pandoc({tab})
  -- use table2latex.sh to convert the doc to latex
  local tex_doc = utils.run_json_filter(dummy_doc, 'table2latex.sh')
  -- only a single block containing the table will be returned
  return tex_doc.blocks
end

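The table2latex.sh helper filter lets pandoc render the incoming JSON document as LaTeX, then uses jq to wrap that LaTeX in a new JSON document containing a single RawBlock.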

#!/bin/sh
latex="$(pandoc -f json -t latex)"
tmpl='{blocks:[{t:"RawBlock",c:["latex",$tab]}],"pandoc-api-version":[1,17,3],meta:{}}'
jq -n --arg tab "$latex" "$tmpl"
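
On newer pandoc releases the shell helper should not be needed: pandoc 2.17 added pandoc.write, which renders a document from within a Lua filter. The sketch below assumes such a version; the file name is only illustrative. It also avoids hard-coding pandoc-api-version, which the jq template otherwise has to keep in line with the installed pandoc.

```lua
-- file: table2latex-write.lua (hypothetical name)
-- Assumes pandoc 2.17 or later, where pandoc.write is available.
function Table (tab)
  -- wrap the table in a document of its own
  local doc = pandoc.Pandoc({tab})
  -- render that document with pandoc's LaTeX writer
  local latex = pandoc.write(doc, 'latex')
  -- replace the table with the raw LaTeX
  return pandoc.RawBlock('latex', latex)
end
```

It would be invoked like any other Lua filter, e.g. pandoc --lua-filter table2latex-write.lua, with no need for jq or an executable script in the working directory.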
