mrcook

Reputation: 640

Nokogiri Builder: Replace RegEx match with XML

Using Nokogiri::XML::Builder, I need to generate a node in which a regex match in the text is replaced with other XML.

Currently I'm able to add additional XML inside the node. Here's an example:

def xml
  Nokogiri::XML::Builder.new do |xml|
    xml.chapter {
      xml.para {
        xml.parent.add_child("Testing[1] footnote paragraph.")
        add_footnotes(xml, 'An Entry')
      }
    }
  end.to_xml
end

# further child nodes WILL be added to footnote
def add_footnotes(xml, text)
  xml.footnote text
end

which produces:

<chapter>
  <para>Testing[1] footnote paragraph.<footnote>An Entry</footnote></para>
</chapter>

But I need to run a regex replace on the reference [1], replacing it with the <footnote> XML, to produce output like the following:

<chapter>
  <para>Testing<footnote>An Entry</footnote> footnote paragraph.</para>
</chapter>

I'm assuming here that the add_footnotes method would receive the reference match (e.g. as $1), which would be used to pull the appropriate footnote from a collection.

That method would also add further child nodes, such as the following:

<footnote>
  <para>Words.</para>
  <para>More words.</para>
</footnote>

Can anyone help?

Upvotes: 1

Views: 303

Answers (2)

the Tin Man

Reputation: 160571

Here's a spin on your code that shows how to generate the output. You'll need to refit it to your own code.

require 'nokogiri'

FOOTNOTES = {
  '1' => 'An Entry'
}
child_text = "Testing[1] footnote paragraph."

pre_footnote, footnote_id, post_footnote = /^(.+)\[(\d+)\](.+)/.match(child_text).captures

doc = Nokogiri::XML::Builder.new do |xml|
  xml.chapter {
    xml.para {
      xml.text(pre_footnote)
      xml.footnote FOOTNOTES[footnote_id]
      xml.text(post_footnote)
    }
  }
end
puts doc.to_xml

Which outputs:

<?xml version="1.0"?>
<chapter>
  <para>Testing<footnote>An Entry</footnote> footnote paragraph.</para>
</chapter>

The trick is that you have to grab the text preceding and following your target so you can insert those as text nodes. Then you can figure out what needs to be added.

For clarity in your code, preprocess all the text and get your variables figured out before falling into the XML generator. Don't try to do any calculations inside the Builder block; just reference variables. Think of Builder like a view in an MVC-type application, if that helps.
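When a paragraph contains more than one reference, the same pre-processing idea can be extended by splitting the text into segments before entering the Builder block. The following is a minimal sketch in plain Ruby, not part of the code above; `segments_for` and the `:text` / `:footnote` tags are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper: break text into [:text, string] and [:footnote, id] pairs.
def segments_for(text)
  # Splitting on a pattern with a capture group keeps the matched
  # references in the resulting array instead of discarding them.
  pieces = text.split(/(\[\d+\])/).reject(&:empty?)

  pieces.map do |piece|
    if (m = /\A\[(\d+)\]\z/.match(piece))
      [:footnote, m[1]]   # a reference such as "[1]" becomes its id, "1"
    else
      [:text, piece]      # everything else is plain text
    end
  end
end

segments = segments_for('Testing[1] footnote[2] paragraph.')
segments.each { |kind, value| puts "#{kind}: #{value.inspect}" }
```

Inside the Builder block you would then only loop over the segments, calling xml.text for the `:text` entries and xml.footnote (with the looked-up footnote) for the `:footnote` entries, which keeps the block free of calculations.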

FOOTNOTES could actually be a database lookup, a hash or some other data container.

You should also look at the << method, which lets you inject XML source: you could pre-build the footnote XML, then loop over an array containing the various footnotes and inject them. Often it's easier to pre-process the text, then use gsub to treat things like [1] as placeholders. See "gsub(pattern, hash) → new_str" in the String documentation, along with this example:

'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*')    #=> "h3ll*"

For instance:

require 'nokogiri'

text = 'this is[1] text and[2] text'
footnotes = {
  '[1]' => 'some',
  '[2]' => 'more'
}

# Wrap each footnote's text in <footnote> tags.
footnotes.transform_values! { |v| "<footnote>#{ v }</footnote>" }

# gsub with a hash replaces each match with the value stored under that key.
replacement_xml = text.gsub(/\[\d+\]/, footnotes) # => "this is<footnote>some</footnote> text and<footnote>more</footnote> text"

doc = Nokogiri::XML::Builder.new do |xml|
  xml.chapter {
    xml.para { xml.<<(replacement_xml) }
  }
end
puts doc.to_xml

# >> <?xml version="1.0"?>
# >> <chapter>
# >>   <para>this is<footnote>some</footnote> text and<footnote>more</footnote> text</para>
# >> </chapter>
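One caveat with injecting raw markup through <<: the string is treated as XML source, so characters such as & and < in the paragraph or footnote text should be escaped first. Below is a minimal sketch in plain Ruby, assuming footnotes are stored as arrays of paragraph strings keyed by reference number. The `notes` hash, `escape_xml` and `footnote_xml` are hypothetical names for illustration, not Nokogiri methods.

```ruby
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Escape the characters that would otherwise be read as markup.
def escape_xml(str)
  str.gsub(/[&<>]/, XML_ESCAPES)
end

# Pre-build one <footnote> element containing a <para> per paragraph.
def footnote_xml(paragraphs)
  inner = paragraphs.map { |para| "<para>#{escape_xml(para)}</para>" }.join
  "<footnote>#{inner}</footnote>"
end

# Assumed data shape: reference number => list of paragraphs.
notes = {
  '1' => ['Words.', 'More words.'],
  '2' => ['Q & A']
}

text = 'this is[1] text and[2] text, but [3] is unknown'

# Escape the surrounding text, then swap each known reference for its
# pre-built XML. Unknown references are left in place unchanged.
replacement_xml = escape_xml(text).gsub(/\[(\d+)\]/) do |ref|
  id = Regexp.last_match(1)
  notes.key?(id) ? footnote_xml(notes[id]) : ref
end

puts replacement_xml
```

The resulting string can be passed to xml.<< in the same way as replacement_xml in the example above. The block form of gsub is used here instead of the hash form so that the captured number can drive the lookup.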

Upvotes: 0

Arup Rakshit

Reputation: 118289

You can try the following:

require 'nokogiri'

def xml
  Nokogiri::XML::Builder.new do |xml|
    xml.chapter {
      xml.para {
        xml.parent.add_child("Testing[1] footnote paragraph.")
        add_footnotes(xml, 'add text', '[1]')
      }
    }
  end.to_xml
end

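def add_footnotes(xml, text, ref)
  # Take the text already added to <para>, then clear it so it can be rebuilt.
  string = xml.parent.child.content
  xml.parent.child.content = ""
  # partition splits the string into [before, ref, after] around the first match.
  string.partition(ref).each do |txt|
    next xml.text(txt) if txt != ref
    xml.footnote text
  end
end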

puts xml
# >> <?xml version="1.0"?>
# >> <chapter>
# >>   <para>Testing<footnote>add text</footnote> footnote paragraph.</para>
# >> </chapter>
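This approach relies on how String#partition behaves, so it may help to see that in isolation. A small plain-Ruby sketch (the variable names are only for illustration):

```ruby
# partition returns a three-element array: before, match, after.
parts = 'Testing[1] footnote paragraph.'.partition('[1]')
p parts

# Only the first occurrence is split; later ones stay in the tail.
repeated = 'a[1] b[1]'.partition('[1]')
p repeated

# With no match, the whole string comes back followed by two empty strings.
missing = 'no reference'.partition('[1]')
p missing
```

Because only the first occurrence is split, a paragraph with several references would need the method to be applied repeatedly to the tail, or a different splitting strategy.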

Upvotes: 0
