user26992

Reputation: 23

Return the text string of el.content in pandoc filter

When writing a pandoc Lua filter, I just want to get the text of el.content, but it returns a table!

The .md file is as follows (just for debugging):

[It's so easy!]{color="red"}. Today is Monday.

I want the string It's so easy! to be printed, so I wrote this code:

function Span(el)
  color  = el.attributes['color']
  strTxt = el.content
  print(strTxt)
end

But that's not what I get! Using el.text doesn't work either!

Upvotes: 2

Views: 1592

Answers (3)

tarleb

Reputation: 22609

The module pandoc.utils contains a function stringify, which converts an element into a string:

function Span(el)
  -- Print just the text in the span; removes all markup.
  print(pandoc.utils.stringify(el))
end

This will print It’s so easy! (note the effect of pandoc's smart handling of apostrophes: a closing curly quote has replaced the straight apostrophe ').

It's also possible to print the output in a specific markup language (requires pandoc 2.17 or later):

function Span(el)
  -- Prints the span's contents as Markdown
  print(pandoc.write(pandoc.Pandoc{pandoc.Plain(el)}, 'markdown'))
end

Consult the Lua filters docs for more info on how to use them.
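
To tie the stringify approach back to the question, here is a minimal sketch that prints the text only for spans carrying a color attribute. The helper name describe_span is made up for illustration, and writing to stderr is my own choice so the debug line does not get mixed into the converted document on stdout.

```lua
-- Hypothetical helper (not part of pandoc): builds a "color: text" line
-- for a span, or returns nil when the span has no color attribute.
local function describe_span(el)
  local color = el.attributes['color']
  if color == nil then return nil end
  -- stringify flattens the span's list of inlines into plain text
  return string.format('%s: %s', color, pandoc.utils.stringify(el.content))
end

function Span(el)
  local line = describe_span(el)
  -- write to stderr so the message stays out of the document output
  if line then io.stderr:write(line, '\n') end
  -- returning nothing leaves the span unchanged
end
```

Run it the usual way with pandoc --lua-filter=filter.lua test.md; spans without a color attribute are silently skipped.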

Upvotes: 4

Carlos Luis Rivera

Reputation: 3643

The following Lua filter colours text that is marked up in Markdown as [It's so **easy**!]{color="red"}, and it works for plain text as well. The idea of specifying a font colour by name or by hex RGB (e.g. [It's so **easy**!]{color="#5588FF"}) was originally implemented here and is not mine. I only modified the original Lua filter so that it can also be applied when creating revealjs or beamer slides.

Span = function(span)
  local color = span.attributes['color']
  -- if there is no color attribute, return the span unchanged
  if color == nil then return span end

  -- transform to <span style="color: red;"></span>
  if FORMAT:match 'html' or FORMAT:match 'revealjs' then
    -- remove color attributes
    span.attributes['color'] = nil
    -- use style attribute instead
    span.attributes['style'] = 'color: ' .. color .. ';'
    -- return full span element
    return span
  elseif FORMAT:match 'latex' or FORMAT:match 'beamer' then
    -- remove color attributes
    span.attributes['color'] = nil
    -- encapsulate in latex code
    if string.sub(color, 1, 1) == "#" and #color == 7 then
      -- TODO: requires xcolor
      local R = tostring(tonumber(string.sub(color, 2, 3), 16))
      local G = tostring(tonumber(string.sub(color, 4, 5), 16))
      local B = tostring(tonumber(string.sub(color, 6, 7), 16))
      table.insert(
        span.content, 1,
        pandoc.RawInline('latex', '\\textcolor[RGB]{'..R..','..G..','..B..'}{')
      )
    elseif string.sub(color, 1, 1) == "#" and #color == 4 then
      -- TODO: requires xcolor
      local R = tostring(tonumber(string.sub(color, 2, 2), 16) * 0x11)
      local G = tostring(tonumber(string.sub(color, 3, 3), 16) * 0x11)
      local B = tostring(tonumber(string.sub(color, 4, 4), 16) * 0x11)
      table.insert(
        span.content, 1,
        pandoc.RawInline('latex', '\\textcolor[RGB]{'..R..','..G..','..B..'}{')
      )
    else
      table.insert(
        span.content, 1,
        pandoc.RawInline('latex', '\\textcolor{'..color..'}{')
      )
    end
    table.insert(
      span.content,
      pandoc.RawInline('latex', '}')
    )
    -- returns only span content
    return span.content
  else
    -- for other format return unchanged
    return span
  end
end
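
The two hex branches above differ only in how the digits are read, so the parsing could be pulled into one helper. This is just a sketch: hex_to_rgb and latex_color_prefix are hypothetical names, not part of pandoc or of the original filter. Unlike the original, it checks that the digits really are hexadecimal and otherwise falls back to treating the value as a colour name.

```lua
-- Hypothetical helper: parses "#RRGGBB" or "#RGB" into three integers,
-- or returns nil when the string is not a hex colour.
local function hex_to_rgb(color)
  local r, g, b = color:match('^#(%x%x)(%x%x)(%x%x)$')
  if not r then
    local r1, g1, b1 = color:match('^#(%x)(%x)(%x)$')
    if not r1 then return nil end
    -- "#RGB" is shorthand for "#RRGGBB": double each digit
    r, g, b = r1 .. r1, g1 .. g1, b1 .. b1
  end
  return tonumber(r, 16), tonumber(g, 16), tonumber(b, 16)
end

-- Hypothetical helper: builds the opening LaTeX code for a colour value.
-- The \textcolor command still requires the xcolor package.
local function latex_color_prefix(color)
  local r, g, b = hex_to_rgb(color)
  if r then
    return string.format('\\textcolor[RGB]{%d,%d,%d}{', r, g, b)
  end
  -- anything else is passed through as a named colour
  return '\\textcolor{' .. color .. '}{'
end
```

In the filter, the three table.insert calls could then collapse into a single one that inserts pandoc.RawInline('latex', latex_color_prefix(color)) at position 1.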

Upvotes: 0

Piglet

Reputation: 28958

I have never used Pandoc before, so I apologize if I'm getting anything wrong here.

I installed Pandoc and created a filter.lua like yours:

function Span(el)
  print(el.content)
end

I created a test.md with your contents

[It's so easy!]{color="red"}. Today is Monday.

And I ran pandoc --lua-filter=filter.lua -f markdown test.md

and it printed

table: 00000000078ba480
<p><span color="red">ItÔÇÖs so easy!</span>. Today is Monday.</p>

Whatever happened to that '...

So I took a look inside that table

function Span(el)
  for k,v in pairs(el.content) do print(k,v) end
end

Which printed

1       table: 0000000007874e90
2       table: 0000000007875010
3       table: 0000000007875050
4       table: 0000000007875090
5       table: 0000000007876190
<p><span color="red">ItÔÇÖs so easy!</span>. Today is Monday.</p>

So that must be the list of Inlines that the manual mentions

Let's look inside!

function Span(el)
  for i, tbl in ipairs(el.content) do
    print(string.format("Table #%d contains: ", i))
    for k, v in pairs(tbl) do
      print(k,v)
    end
  end
end

which prints

Table #1 contains:
text    Itâ?Ts
Table #2 contains:
Table #3 contains:
text    so
Table #4 contains:
Table #5 contains:
text    easy!
<p><span color="red">ItÔÇÖs so easy!</span>. Today is Monday.</p>

So the tables inside that table are most likely Inline objects, and they have a text attribute that holds the words you were looking for.

You see it is pretty simple to examine mysterious tables using a few loops and prints.
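
Building on that, the words can be glued back together by hand. The sketch below assumes each inline exposes its type in a t field ('Str', 'Space', and so on); the empty tables in the output above are presumably the Space elements between the words. The name collect_text is made up, and only a few element types are handled, so for real use pandoc.utils.stringify is the more complete option.

```lua
-- Hypothetical helper: joins the text of a list of inline elements.
-- Only a handful of element types are handled in this sketch.
local function collect_text(inlines)
  local parts = {}
  for _, inline in ipairs(inlines) do
    local tag = inline.t
    if tag == 'Str' then
      -- Str elements carry one word in their text field
      parts[#parts + 1] = inline.text
    elseif tag == 'Space' or tag == 'SoftBreak' then
      parts[#parts + 1] = ' '
    elseif tag == 'Emph' or tag == 'Strong' or tag == 'Span' then
      -- these wrap further inlines, so recurse into their content
      parts[#parts + 1] = collect_text(inline.content)
    end
    -- all other element types are ignored here
  end
  return table.concat(parts)
end

function Span(el)
  print(collect_text(el.content))
end
```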

Upvotes: 1
