Cyril Duchon-Doris

Reputation: 13939

Rails Microsoft Word, XML databinding, repeat rows

Those who want to jump straight to my questions can go to the section "Please help with". There you will find the start of my implementation, along with short XML samples.

The story

This is the well-known problem of inserting repeating content, such as table rows, into a Word template, using the Rails framework.

I decided to implement a 'cleaner' solution for replacing some variables in a Word document with rails, using XML databinding. This solution works very well for non-repetitive content, but for repetitive content, a little extra dirty work must be done and I need help with it.

No C#, No Visual, just plain olde ruby on rails & XML

The data-bound document

I have a Word document with some content controls, tagged with "human-readable" text, so my users know what should be inside.

I have used Word 2007 Content Control Toolkit to add some custom XML to a .docx file. Therefore in each .docx I have some customXml/itemsx.xml that contains my custom XML.

I have manually bound this XML to the text content controls in my Word template, using drag & drop in the Word 2007 Content Control Toolkit.

The replacing process with nokogiri

Basically I already have some code that replaces the content of every XML node with the corresponding value from a hash. For example, if I pass this hash to my function:

variables = {
   "some_xml-node" => "some_value"
}

It will properly replace XML in customXml/itemsx.xml of .docx file :

<root> <some> <xml-node>some_value</xml-node></some> </root>

So this is taken care of !

The repetitive content

Now as I said, this works perfectly for non-repetitive content. For repetitive content (in my case I want to repeat some <w:tr> in a document), the solution I'd like to go with, is

  1. Manually insert some tags in word/document.xml of the .docx file (this is dirty, but hell I can't think of anything else) before every <w:tr> that needs to be duplicated
  2. In Rails, parse the XML and locate the <w:tr> that needs duplicating using Nokogiri
  3. Copy the <w:tr> as many times as I need
  4. Look at some text inside this <w:tr> and find the databinding (which looks like <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]"/>)
  5. Replace movie[1] by movie[index]
  6. Repeat for every table that needs <w:tr> duplication
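
Step 5 is plain string work once the w:xpath value is in hand. A minimal sketch in pure Ruby, assuming the repeated element appears as a whole path step such as /movie[1] (the helper name reindex_xpath is made up for this illustration):

```ruby
# Rewrite the positional index of one element inside a data-binding XPath.
# `element` is the repeated element name (e.g. "movie"); `index` is expected
# to be 1-based, because XPath positions start at 1.
def reindex_xpath(xpath, element, index)
  # Match "/movie[<digits>]" only as a whole path step, so that "/movies[1]"
  # is left alone when element is "movie".
  xpath.gsub(%r{/#{Regexp.escape(element)}\[\d+\]}, "/#{element}[#{index}]")
end

original = "/root[1]/movies[1]/movie[1]/name[1]"
puts reindex_xpath(original, "movie", 3)
```

Note that each_with_index counts from 0 while XPath positions count from 1, so the value passed as index would typically be the loop index plus one.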

With this solution I ensure 100% compatibility with my existing system! It is a kind of preprocessing step.

Please help with

  1. Finding an XML comment containing a custom string, and selecting the node just below it (using Nokogiri)
  2. Changing attributes in many sub-nodes of the node found in 1.

XML/Hash samples that could be used (my beginning of implementation after that):

Sample of .docx word/document.xml

<w:document>
  <!-- My_Custom_Tag_ID -->
  <w:tr someparam="something"> 
    <w:td></w:td>
    <w:td><w:sthelse></w:sthelse><w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]"/><w:sth>Value</w:sth></w:td>
    <w:td></w:td>
  </w:tr>
</w:document>

Sample of input parameter repeat_tag hash

repeat_tags_sample = [ 
    {
        "tag" => "My_Custom_Tag_ID", 
        "repeatable-content" => "movie"
    },
    {
        "tag" => "My_Custom_Tag_ID_2", 
        "repeatable-content" => "cartoons"
    }
]

Sample of input parameter contents hash

contents_sample =
    {
        "movies" => [{"name" => "X-Men", 
                  "year" => 1998, 
                  "property-xxx" => 42 
                 }, { "name" => "X-Men-4", 
                  "year" => 2007, 
                  "property-xxx" => 42
                 }],
   "cartoons" => [{"name" => "Tom_Jerry", 
                            "year" => 1995, 
                            "property-yyy" => "cat" 
                           }, { "name" => "Random_name", 
                            "year" => 2008, 
                            "property-yyy" => 42
                           }] 
    }

My beginning of implementation :

    def dynamic_table_content(zip, repeat_tags, contents)
        doc = zip.find_entry("word/document.xml")
        xml = Nokogiri::XML.parse(doc.get_input_stream)

        # repeat_tags_sample = [ {
        #    "tag" => "My_Custom_Tag_ID", 
        #    "repeatable-content" => "movie"},
        #    ...]
        repeat_tags.each do |rpt|

            # The sample hashes use string keys, so they are looked up with strings
            content = contents[rpt["repeatable-content"]]
            # content now looks like [ 
            #  {"name" => "X-Men", 
            #   "year" => 1998, 
            #   "property-xxx" => 42, ...}, 
            #  ...]
            content_name = rpt["repeatable-content"].to_s
            # the 'movie'  of '/root[1]/movies[1]/movie[1]/name[1]' (see below)

            puts "Processing #{rpt["tag"]}, adding #{content_name}s"

            # Word document.xml sample code looks like this :
            # <!-- My_Custom_Tag_ID_inserted_manually -->
            # <w:tr ...> 
            #   ...
            #   <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]"/> 
            #   ...
            # </w:tr>

            # 1. Find a comment containing a custom string, and select the node just below

            # Find starting <w:tr > tag located after <!-- rpt["tag"] -->
            base_tr_node = nil # placeholder: this is the node I do not know how to find
    
            # Duplicate it as many times as we want.
            content.each_with_index do |item, index|
                puts "Adding #{content_name} : #{item}"
    
                # Insert a copy; passing base_tr_node itself would only move the node
                new_tr_node = base_tr_node.add_next_sibling(base_tr_node.dup)
    
                # inside this new node there are many 
                # <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]"/>
                # <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/year[1]"/>
                # ..../movie[1]/property-xxx[1]
                # GOAL : replace every movie[1] by movie[index]

                # 2. Change attributes in many sub-nodes of the node found in 1.

                # Placeholder: I do not know how to do this (see GOAL in the comments above)
                # Maybe, it would be something like 
                # new_tr_node.gsub(/(#{content_name})\[[0-9]+\]/, "\\1[#{index}]")
              # ... But new_tr_node is a nokogiri element so .gsub doesn't exist 
            end
        end
        @replace["word/document.xml"] = xml.serialize :save_with => 0
    end

Upvotes: 0

Views: 719

Answers (1)

Cyril Duchon-Doris

Reputation: 13939

I have looked at the DoPE extension for Word documents. It looks great ! But alas I had already done a lot of work, and just now I (almost) finished building my own preprocessor.

What I needed was more complicated than what I originally asked. But nevertheless, the answers would be :

EDIT : fixed bad regex/xpath

# 1. Find a comment containing a custom string, and select the node just below
comment_nodes = doc.xpath("//comment()")
# Loop like comment_nodes.each do |comment|
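base_tr_node = comment.next_sibling.next_sibling
# next_sibling has to be applied twice, even though the comment is indeed just above the <w:tr> node.
# Most likely the first sibling is the whitespace text node between the two; comment.next_element would skip it.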

# 2. Change attributes in many sub-nodes of the node found in 1. 
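matches = tr_node.search(".//*[name()='w:dataBinding']")
matches.each do |databinding_node| 
    # replace '.*phase[1].*' by '.*phase[index].*'
    # gsub returns a new string, so the result has to be written back to the attribute
    name = comment.text.strip
    databinding_node['w:xpath'] = databinding_node['w:xpath'].gsub("#{name}[1]", "#{name}[#{index}]")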
end

Upvotes: 0
