Reputation: 183
I have a Word file with 9 pages.
I use this:
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document wordDoc = wordApp.Documents.Open(file);
Microsoft.Office.Interop.Word.Range docRange = wordDoc.Range();
But this code gives me a range covering every paragraph in the document.
How can I get the range of the text in the first line (or first paragraph) of each page using C# Word interop?
Sorry about my English.
For example, on the first page I want to get the text:
"Apple Inc. is an American multinational technology company headquartered in Cupertino, California,"
or the first paragraph:
"Apple Inc. is an American multinational technology company headquartered in Cupertino, California, that designs, develops, and sells consumer electronics, computer software, and online services. It is considered one of the Big Four technology companies, alongside Amazon, Google, and Microsoft."
On the second page, the text I want is:
Apple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976 to develop and sell
or
Apple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976 to develop and sell Wozniak's Apple I personal computer, though Wayne sold his share back within 12 days.
Upvotes: 2
Views: 1225
Reputation: 540
You can try iterating through all paragraphs and getting the page number of each one. Then select the first paragraph on each page.
using System.Collections.Generic;
using System.Linq;
using Word = Microsoft.Office.Interop.Word;

private List<Paragraph> FindFirstParagraphOfEachPage(string filePath)
{
    Word.Application wordApp = new Word.Application();
    Word.Document wordDoc = wordApp.Documents.Open(filePath);
    try
    {
        var paragraphs = new List<Paragraph>();
        foreach (Word.Paragraph p in wordDoc.Paragraphs)
        {
            paragraphs.Add(new Paragraph()
            {
                // wdActiveEndPageNumber is the page on which the paragraph's range ends.
                PageNumber = (int)p.Range.get_Information(Word.WdInformation.wdActiveEndPageNumber),
                ParagraphText = p.Range.Text
            });
        }
        // Skip empty paragraphs, then keep the first paragraph seen on each page.
        return paragraphs.Where(x => !string.IsNullOrWhiteSpace(x.ParagraphText))
                         .GroupBy(x => x.PageNumber)
                         .Select(x => x.First())
                         .ToList();
    }
    finally
    {
        wordDoc.Close();
        wordApp.NormalTemplate.Saved = true;
        wordApp.Quit();
    }
}
Helper class to store page number and paragraph text.
class Paragraph
{
    public int PageNumber { get; set; }
    public string ParagraphText { get; set; }
}
I am not sure about releasing the objects. It probably will require some edits and testing.
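On releasing the objects, one possible cleanup is sketched below. It is untested and rests on assumptions: WordCleanup and CloseAndRelease are hypothetical names, not part of the interop library, and the sketch assumes the document should be closed without saving. Marshal.ReleaseComObject comes from System.Runtime.InteropServices. Whether releasing these two wrappers is enough to make the Word process exit depends on which other COM references the calling code still holds.

```csharp
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class WordCleanup
{
    // Hypothetical helper: closes the document, quits Word,
    // then releases the COM wrappers held by the caller.
    public static void CloseAndRelease(Word.Application wordApp, Word.Document wordDoc)
    {
        try
        {
            // Assumption: changes are discarded rather than saved.
            wordDoc?.Close(SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
            wordApp?.Quit(SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        finally
        {
            // Release the document first, then the application.
            if (wordDoc != null) Marshal.ReleaseComObject(wordDoc);
            if (wordApp != null) Marshal.ReleaseComObject(wordApp);
        }
    }
}
```

It would be called in place of the Close/Quit lines, for example WordCleanup.CloseAndRelease(wordApp, wordDoc); inside a finally block.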
Upvotes: 1