Reputation: 980
I'm trying to remove empty paragraphs from a .docx file before parsing its content into XML. How can I achieve this?
Protected Sub removeEmptyParagraphs(ByRef body As DocumentFormat.OpenXml.Wordprocessing.Body)
    Dim colP As IEnumerable(Of Paragraph) = body.Descendants(Of Paragraph)()
    Dim count As Integer = colP.Count
    For Each p As Paragraph In colP
        If (p.InnerText.Trim() = String.Empty) Then
            body.RemoveChild(Of Paragraph)(p)
        End If
    Next
End Sub
Upvotes: 1
Views: 3257
Reputation: 1
This does not delete the empty paragraphs, but it collapses the vertical space they take up, which can also get rid of blank pages caused by runs of empty paragraphs.
IEnumerable<Paragraph> paragraphs =
    myDoc.MainDocumentPart.Document.Body.Elements<Paragraph>();
foreach (Paragraph paragraph in paragraphs)
{
    if (paragraph != null && string.IsNullOrWhiteSpace(paragraph.InnerText))
    {
        // Val expects the style ID (usually "NoSpacing"), not the display name "No Spacing".
        paragraph.ParagraphProperties = new ParagraphProperties(
            new ParagraphStyleId() { Val = "NoSpacing" },
            new SpacingBetweenLines() { After = "0" }
        );
        paragraph.ParagraphProperties.SpacingBetweenLines.AfterLines = 0;
        paragraph.ParagraphProperties.SpacingBetweenLines.BeforeLines = 0;
        paragraph.ParagraphProperties.SpacingBetweenLines.Line = "0";
    }
}
Upvotes: 0
Reputation: 26517
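The problem you might be running into is removing elements from a collection while enumerating it in a For Each block. You could try using LINQ to copy the empty paragraphs into a list first, then remove each one from the document. (RemoveAll is a List(Of T) method, so it is not available on IEnumerable, and calling it on a copied list would not touch the document tree.)
Protected Sub removeEmptyParagraphs(ByRef body As DocumentFormat.OpenXml.Wordprocessing.Body)
    ' ToList() snapshots the matches, so removing nodes does not disturb the enumeration.
    Dim emptyParas As List(Of Paragraph) = body.Descendants(Of Paragraph)() _
        .Where(Function(para) para.InnerText.Trim() = String.Empty).ToList()
    For Each para As Paragraph In emptyParas
        para.Remove() ' detaches the paragraph from whatever parent it has
    Next
End Sub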
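For callers working in C#, a snapshot-then-remove approach can be sketched end to end as below. This is only a minimal sketch: it assumes the Open XML SDK (the DocumentFormat.OpenXml package) is referenced, and the class name EmptyParagraphCleaner, the method CleanFile, and the file path passed to it are hypothetical names chosen for illustration.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper; the class and method names are illustrative only.
public static class EmptyParagraphCleaner
{
    // Removes paragraphs whose text is empty or whitespace-only
    // and returns how many were removed.
    public static int RemoveEmptyParagraphs(Body body)
    {
        // ToList() snapshots the matches before any node is removed,
        // so the tree is not modified while it is being enumerated.
        var empty = body.Descendants<Paragraph>()
            .Where(p => string.IsNullOrWhiteSpace(p.InnerText))
            .ToList();

        foreach (var p in empty)
        {
            p.Remove(); // detaches the paragraph from its actual parent
        }
        return empty.Count;
    }

    // Opens the file for editing, cleans the body, and saves the main part.
    public static int CleanFile(string path)
    {
        using (var doc = WordprocessingDocument.Open(path, true))
        {
            var document = doc.MainDocumentPart.Document;
            int removed = RemoveEmptyParagraphs(document.Body);
            document.Save();
            return removed;
        }
    }
}
```

It could be called as EmptyParagraphCleaner.CleanFile("input.docx") before the content is parsed. Two caveats worth checking against your documents: InnerText only reflects text, so a paragraph that holds nothing but an image would also count as empty here, and Descendants also returns paragraphs inside table cells, where removing the last remaining paragraph may leave a document that Word refuses to open.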
Upvotes: 1