Sherin

Reputation: 349

Replacing HWPFDocument paragraph text using Java produces strange output

I need to replace the text of a paragraph in a .doc file (HWPFDocument) using Java if the paragraph contains a particular string. The text is replaced, but the output is written in a strange way (see the input and output below). Please help me fix this. Code snippet used:

public static HWPFDocument processChange(HWPFDocument doc)
{
    try
    {
        Range range = doc.getRange();
        for (int i = 0; i < range.numParagraphs(); i++)
        {
            Paragraph paragraph = range.getParagraph(i);
            if (paragraph.text().contains("Place Holder"))
            {
                String text = paragraph.text();
                paragraph.replaceText(text, "*******");

            }
        }
    }
    catch (Exception ex)
    {
        ex.printStackTrace();
    }
    return doc;
}

Input:

Place Holder 
Textvalue1
Textvalue2
Textvalue3

Output:

*******Textvalue1
Textvalue1
Textvalue2
Textvalue3

Upvotes: 3

Views: 2068

Answers (1)

Rainer Schwarze

Reputation: 4745

The HWPF library does not reliably support changing and writing .doc files (at least as of the last time I looked). Some time ago I developed a custom variant of HWPF for a client which, among many other things, provides correct replace and save operations, but that library is not publicly available.

If you absolutely must use .doc files and Java, you may get away with replacing text with a string of exactly the same length, for instance "12345" -> "abc__" (with _ being a space or whatever padding works for you). It might also make sense to use HWPF to find the absolute location of the string to be replaced in the .doc file and then change it in the file directly, without using HWPF.
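As a minimal sketch of the same-length idea, the helper below pads a replacement value so that it is exactly as long as the placeholder it replaces. The class name `SameLengthReplace` and the method `padToLength` are hypothetical names chosen for illustration, not part of HWPF. The sketch only prepares the string; whether a same-length replacement actually survives saving depends on the document and the POI version, so treat it as something to try rather than a guaranteed fix.

```java
// Hypothetical helper (not part of Apache POI): builds a replacement string
// that has exactly the same length as the text it is meant to replace.
final class SameLengthReplace {

    private SameLengthReplace() {
    }

    /**
     * Returns the replacement padded on the right with padChar until it is
     * exactly targetLength characters long.
     * Throws IllegalArgumentException if the replacement is already longer,
     * because it cannot be made to fit without losing characters.
     */
    static String padToLength(String replacement, int targetLength, char padChar) {
        if (replacement.length() > targetLength) {
            throw new IllegalArgumentException(
                    "Replacement is longer than the text it replaces: "
                    + replacement.length() + " > " + targetLength);
        }
        StringBuilder padded = new StringBuilder(targetLength);
        padded.append(replacement);
        while (padded.length() < targetLength) {
            padded.append(padChar);
        }
        return padded.toString();
    }

    public static void main(String[] args) {
        String placeholder = "12345";
        String value = padToLength("abc", placeholder.length(), '_');
        System.out.println(value); // prints "abc__"

        // Assumed usage with HWPF, replacing only the placeholder itself
        // rather than the whole paragraph text:
        //   paragraph.replaceText(placeholder, value);
    }
}
```

Padding with spaces instead of underscores keeps the result visually clean in the document, at the cost of trailing whitespace after the inserted value.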

The Word file format is very complicated, and "doing it right" is not a trivial task. Unless you are willing to spend many man-months, it will also not be possible to fix just part of the library so that saving alone works. Many data structures must be handled very precisely, and a single slip-up makes Word crash on the generated output file.

Upvotes: 3
