Rails Microsoft Word, XML databinding, repeat rows

Question

If you want to jump straight to my questions, go to the section "Please help with". There you will find the beginning of my implementation, along with short XML samples.

The story

This is the well-known problem of inserting repeating content, such as table rows, into a Word template, using the Rails framework.

I decided to implement a 'cleaner' solution for replacing some variables in a Word document with Rails, using XML databinding. This solution works very well for non-repetitive content, but repetitive content needs a little extra dirty work, and that is what I need help with.

No C#, No Visual, just plain olde ruby on rails & XML

The data-bound document

I have a Word document with some content controls, tagged with "human-readable" text, so my users know what should be inside.

I have used Word 2007 Content Control Toolkit to add some custom XML to a .docx file. Therefore in each .docx I have some customXml/itemsx.xml that contains my custom XML.

I manually bound this XML to the text content controls in my Word template, using drag & drop in the Word 2007 Content Control Toolkit.

The replacing process with nokogiri

Basically, I already have some code that replaces the content of every XML node with the corresponding value from a hash. For example, if I provide this hash to my function :

variables = {
   "some_xml-node" => "some_value"
}

It will properly replace XML in customXml/itemsx.xml of .docx file :

  <some_xml-node>some_value</some_xml-node>

So this is taken care of !

The repetitive content

Now as I said, this works perfectly for non-repetitive content. For repetitive content (in my case I want to repeat some <w:tr> rows in a document), the solution I'd like to go with is :

1. Manually insert some XML comment tags in word/document.xml of the .docx file (this is dirty, but hell, I can't think of anything else) before every <w:tr> that needs to be duplicated
2. In Rails, parse the XML and locate the <w:tr> that needs duplicating, using Nokogiri
3. Copy the <w:tr> as many times as I need
4. Look at some text inside this <w:tr> and find the databinding (which looks like <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]" .../>)
5. Replace movie[1] by movie[index]
6. Repeat for every table that needs <w:tr> duplication



With this solution I ensure 100% compatibility with my existing system ! It's a kind of preprocessing...
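The index rewrite in the steps above is plain string work on the binding's XPath. A minimal sketch in plain Ruby, where `reindex_xpath` is a hypothetical helper name and the path is the sample binding quoted in this question:

```ruby
# Rewrite the position of one step in a data-binding XPath,
# e.g. ".../movie[1]/name[1]" becomes ".../movie[2]/name[1]".
def reindex_xpath(xpath, content_name, index)
  # Anchor on the leading "/" so that only a whole step named content_name
  # matches; a longer name that merely ends in content_name is left alone.
  xpath.gsub(%r{/#{Regexp.escape(content_name)}\[\d+\]}, "/#{content_name}[#{index}]")
end

path = "/root[1]/movies[1]/movie[1]/name[1]"
puts reindex_xpath(path, "movie", 2)
```

Working on the attribute string rather than on the Nokogiri element sidesteps the fact that elements have no gsub; the result still has to be written back to the attribute.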

Please help with


1. Finding an XML comment containing a custom string, and selecting the node just below it (using Nokogiri)
2. Changing attributes in many sub-nodes of the node found in 1.


XML/Hash samples that could be used (my beginning of implementation after that):

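Sample of .docx word/document.xml

<w:tbl>
  ...
  <!-- My_Custom_Tag_ID -->
  <w:tr>
    ...
    <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]" .../>
    ...
  </w:tr>
  ...
</w:tbl>

Sample of input parameter repeat_tags

repeat_tags_sample = [
    {
        "tag" => "My_Custom_Tag_ID",
        "repeatable-content" => "movie"
    },
    {
        "tag" => "My_Custom_Tag_ID_2",
        "repeatable-content" => "cartoons"
    }
]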


Sample of input parameter contents hash

contents_sample = {
    "movies" => [
        { "name" => "X-Men", "year" => 1998, "property-xxx" => 42 },
        { "name" => "X-Men-4", "year" => 2007, "property-xxx" => 42 }
    ],
    "cartoons" => [
        { "name" => "Tom_Jerry", "year" => 1995, "property-yyy" => "cat" },
        { "name" => "Random_name", "year" => 2008, "property-yyy" => 42 }
    ]
}


My beginning of implementation :

    def dynamic_table_content(zip, repeat_tags, contents)
        doc = zip.find_entry("word/document.xml")
        xml = Nokogiri::XML.parse(doc.get_input_stream)

        # repeat_tags_sample = [ {
        #    "tag" => "My_Custom_Tag_ID",
        #    "repeatable-content" => "movie"},
        #    ...]
        repeat_tags.each do |rpt|

            content = contents[rpt["repeatable-content"]]
            # content now looks like [ 
            #  {"name" => "X-Men", 
            #   "year" => 1998, 
            #   "property-xxx" => 42, ...}, 
            #  ...]
            content_name = rpt["repeatable-content"].to_s
            # the 'movie'  of '/root[1]/movies[1]/movie[1]/name[1]' (see below)

            puts "Processing #{rpt['tag']}, adding #{content_name}s"

            # Word document.xml sample code looks like this :
            # <!-- My_Custom_Tag_ID -->
            # <w:tr>
            #   ...
            #   <w:dataBinding w:xpath="..../movie[1]/property-xxx[1]" .../>
            # GOAL : replace every movie[1] by movie[index]



            # 1. Find the XML comment containing rpt["tag"], and select the node just below it
            #    (this is the first thing I need help with)

            content.each_index do |i|
                index = i + 1 # XPath positions start at 1

                # 2. Change attributes in many sub-nodes of the node found in 1.
                # new_tr_node.change_attributes (see GOAL in previous comments)
                # Maybe, it would be something like
                # new_tr_node.gsub(/(#{content_name})\[(\d+)\]/, "\\1[#{index}]")
                # ... But new_tr_node is a nokogiri element so .gsub doesn't exist
            end
        end

        @replace["word/document.xml"] = xml.serialize :save_with => 0
    end

Cyril Duchon-Doris · Accepted Answer

I have looked at the DoPE extension for Word documents. It looks great ! But alas I had already done a lot of work, and just now I (almost) finished building my own preprocessor.

What I needed was more complicated than what I originally asked. Nevertheless, the answers would be :

EDIT : fixed bad regex/xpath

# 1. Find a comment containing a custom string, and select the node just below
comment_nodes = doc.xpath("//comment()")
# Loop like comment_nodes.each do |comment|
base_tr_node = comment.next_sibling.next_sibling
# For some reason, next_sibling has to be applied twice, even though the comment is indeed just above the <w:tr> node
# (probably because the first sibling is the whitespace text node between the two)

# 2. Change attributes in many sub-nodes of the node found in 1. 
matches = tr_node.search(".//*[name()='w:dataBinding']")
matches.each do |databinding_node| 
    # replace '.*phase[1].*' by '.*phase[index].*'
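    # gsub returns a new string, so the result has to be assigned back to the attribute
    databinding_node['w:xpath'] = databinding_node['w:xpath'].gsub("#{comment.text}[1]", "#{comment.text}[#{index}]")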
end
