frengo

Reputation: 387

Merging MS Word documents with Java

I'm looking for Java libraries that read and write MS Word documents. What I have to do is:

users may make updates to the file.

I've searched and found Apache POI and OpenOffice UNO. The first can easily read a template and replace placeholders with my own data from the DB, but I didn't find anything about merging two or more documents. OpenOffice UNO looks more stable, but also more complex, and I'm not sure it can merge documents either.

Are we looking in the right direction?

Another solution I've thought of is to convert the DOC files to DOCX, since I found more libraries that can help with merging DOCX documents. But how can I do that conversion?

Thanks!

Upvotes: 1

Views: 7462

Answers (3)

victorpacheco3107

Reputation: 862

I've developed the following class (using Apache POI):

import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBody;

public class WordMerge {

    private final OutputStream result;
    private final List<InputStream> inputs;
    private XWPFDocument first;

    public WordMerge(OutputStream result) {
        this.result = result;
        inputs = new ArrayList<>();
    }

    public void add(InputStream stream) throws Exception{            
        inputs.add(stream);
        OPCPackage srcPackage = OPCPackage.open(stream);
        XWPFDocument src1Document = new XWPFDocument(srcPackage);         
        if(inputs.size() == 1){
            first = src1Document;
        } else {            
            CTBody srcBody = src1Document.getDocument().getBody();
            first.getDocument().addNewBody().set(srcBody);            
        }        
    }

    public void doMerge() throws Exception{
        first.write(result);                
    }

    public void close() throws Exception{
        result.flush();
        result.close();
        for (InputStream input : inputs) {
            input.close();
        }
    }   
}

And its use:

// Requires: import java.io.FileInputStream; import java.io.FileOutputStream;
public static void main(String[] args) throws Exception {

    FileOutputStream faos = new FileOutputStream("/home/victor/result.docx");

    WordMerge wm = new WordMerge(faos);

    wm.add( new FileInputStream("/home/victor/001.docx") );
    wm.add( new FileInputStream("/home/victor/002.docx") );

    wm.doMerge();
    wm.close();

}
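
A quick way to sanity-check the output of a merge like this, without any extra library, is to open the resulting .docx as a ZIP archive and look at word/document.xml. The sketch below is not part of the class above; the class name DocxBodyCheck and its methods are made up for illustration, and it assumes the document uses the conventional w: namespace prefix. A well-formed document should contain exactly one w:body element.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: inspects the main part of a .docx using only the JDK.
public class DocxBodyCheck {

    /** Returns the text of word/document.xml, or null if the entry is missing. */
    public static String readDocumentXml(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    // readAllBytes() stops at the end of the current ZIP entry
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    /** Counts non-overlapping occurrences of a literal token in the text. */
    public static int count(String text, String token) {
        int n = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + token.length())) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the merged file, e.g. the result.docx written above
        try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
            String xml = readDocumentXml(in);
            if (xml == null) {
                System.out.println("No word/document.xml found");
                return;
            }
            // Assumes the usual "w:" prefix; other prefixes would need a real XML parser
            System.out.println("w:body elements: " + count(xml, "<w:body>"));
        }
    }
}
```

If the count is not 1, it is worth opening the file in Word as well, since the text-based check is only a rough indicator.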

Upvotes: 1

N Tarun

Reputation: 11

import java.io.File;
import java.util.List;

import javax.xml.bind.JAXBException;

import org.docx4j.dml.CTBlip;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.Part;
import org.docx4j.openpackaging.parts.PartName;
import org.docx4j.openpackaging.parts.WordprocessingML.ImageBmpPart;
import org.docx4j.openpackaging.parts.WordprocessingML.ImageEpsPart;
import org.docx4j.openpackaging.parts.WordprocessingML.ImageGifPart;
import org.docx4j.openpackaging.parts.WordprocessingML.ImageJpegPart;
import org.docx4j.openpackaging.parts.WordprocessingML.ImagePngPart;
import org.docx4j.openpackaging.parts.WordprocessingML.ImageTiffPart;
import org.docx4j.openpackaging.parts.relationships.RelationshipsPart;
import org.docx4j.openpackaging.parts.relationships.RelationshipsPart.AddPartBehaviour;
import org.docx4j.relationships.Relationship;

public class MultipleDocMerge {


    public static void main(String[] args) throws Docx4JException, JAXBException {
        File first = new File("D:\\Mreg.docx");
        File second = new File("D:\\Mreg1.docx");
        File third = new File("D:\\Mreg4&19.docx");
        File fourth = new File("D:\\test12.docx");   
        WordprocessingMLPackage f = WordprocessingMLPackage.load(first);
        WordprocessingMLPackage s = WordprocessingMLPackage.load(second);
        WordprocessingMLPackage a = WordprocessingMLPackage.load(third);
        WordprocessingMLPackage e = WordprocessingMLPackage.load(fourth);

        List body = s.getMainDocumentPart().getJAXBNodesViaXPath("//w:body", false);
        for(Object b : body){
            List filhos = ((org.docx4j.wml.Body)b).getContent();
            for(Object k : filhos)
                f.getMainDocumentPart().addObject(k);
        }

        List body1 = a.getMainDocumentPart().getJAXBNodesViaXPath("//w:body", false);
        for(Object b : body1){
            List filhos = ((org.docx4j.wml.Body)b).getContent();
            for(Object k : filhos)
                f.getMainDocumentPart().addObject(k);
        }

        List body2 = e.getMainDocumentPart().getJAXBNodesViaXPath("//w:body", false);
        for(Object b : body2){
            List filhos = ((org.docx4j.wml.Body)b).getContent();
            for(Object k : filhos)
                f.getMainDocumentPart().addObject(k);
        }


        List<Object> blips = e.getMainDocumentPart().getJAXBNodesViaXPath("//a:blip", false);
        for(Object el : blips){
            try {
                CTBlip blip = (CTBlip) el;
                RelationshipsPart parts = e.getMainDocumentPart().getRelationshipsPart();
                Relationship rel = parts.getRelationshipByID(blip.getEmbed());
                Part part = parts.getPart(rel);
                if (part instanceof ImagePngPart)
                    System.out.println(((ImagePngPart) part).getBytes());
                if (part instanceof ImageJpegPart)
                    System.out.println(((ImageJpegPart) part).getBytes());
                if (part instanceof ImageBmpPart)
                    System.out.println(((ImageBmpPart) part).getBytes());
                if (part instanceof ImageGifPart)
                    System.out.println(((ImageGifPart) part).getBytes());
                if (part instanceof ImageEpsPart)
                    System.out.println(((ImageEpsPart) part).getBytes());
                if (part instanceof ImageTiffPart)
                    System.out.println(((ImageTiffPart) part).getBytes());
                Relationship newrel = f.getMainDocumentPart().addTargetPart(part, AddPartBehaviour.RENAME_IF_NAME_EXISTS);
                blip.setEmbed(newrel.getId());
                f.getMainDocumentPart().addTargetPart(e.getParts().getParts().get(new PartName("/word/" + rel.getTarget())));
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }

        File saved = new File("D:\\saved1.docx");
        f.save(saved);
    }

}

Upvotes: 1

Paul Jowett

Reputation: 6581

You could take a look at Docmosis, since it provides the four features you mentioned (data population, template/document merging, DOC format and a Java interface). It comes in a couple of flavours (download and online service); you could sign up for a free trial of the cloud service to see whether Docmosis does what you want without installing anything, or read the online documentation.

It uses OpenOffice under the hood (you can see this from the installation instructions in the developer guide), which does pretty decent conversions between document formats. The UNO API has some complications, so I would suggest either Docmosis or JODReports to isolate your project from UNO directly.
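
If all you need from the office suite is the DOC to DOCX conversion, one option that avoids UNO entirely is to call the command-line converter as an external process. The sketch below assumes a LibreOffice installation (a fork of OpenOffice, not the same product) with the soffice executable on the PATH; it is not how Docmosis or JODReports work internally, and the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical wrapper around LibreOffice's headless converter.
public class DocToDocx {

    /** Builds the command line; assumes "soffice" (LibreOffice) is on the PATH. */
    public static List<String> buildCommand(Path docFile, Path outDir) {
        return List.of(
                "soffice", "--headless",
                "--convert-to", "docx",
                "--outdir", outDir.toString(),
                docFile.toString());
    }

    /** Runs the conversion and returns the process exit code. */
    public static int convert(Path docFile, Path outDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(docFile, outDir))
                .inheritIO() // show the converter's own messages on our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: input .doc file, args[1]: output directory
        int exit = convert(Path.of(args[0]), Path.of(args[1]));
        System.out.println("soffice exited with code " + exit);
    }
}
```

The converted file is written into the output directory under the input's base name with a .docx extension, after which any of the DOCX-based merge approaches can be applied.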

Hope that helps.

Upvotes: 1
