Reputation: 1
I am currently developing a Word add-in using the Office JS library. I need to get all sentences in the Word document as individual ranges. For this I used getTextRanges() on the body of the document with "." as the delimiter. However, it also splits on paragraph marks, which is not ideal for my use case. All I want is for the document to be divided into ranges where the only delimiter is ".", regardless of whether the ranges then span multiple paragraphs.
Is there a way to ignore paragraph marks with getTextRanges(), or is there another method entirely that I seem to have overlooked?
Thanks.
I have been unable to resolve it.
Upvotes: 0
Views: 61
Reputation: 168
Word.run(async (context) => {
  // 1. Load the plain text of the document body
  const body = context.document.body;
  body.load("text");
  await context.sync();
  // 2. Replace paragraph marks with spaces so they never act as delimiters
  const text = body.text.replace(/[\r\n]+/g, " ");
  // 3. Split on "." only (the delimiter asked for) and drop empty fragments
  const sentenceList = text
    .split(".")
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0);
  // 4. Log each sentence with its index
  sentenceList.forEach((sentence, i) => {
    console.log(i + " " + sentence);
  });
});
This seems to meet your requirements, with one caveat: it gives you the sentences as plain strings, not as Word.Range objects.
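If you need actual ranges rather than strings, one possible approach is to keep using getTextRanges() and then merge the fragments that were cut at a paragraph mark. The sketch below is untested against a real document and rests on two assumptions: that each fragment's text includes its trailing ".", and that a fragment cut at a paragraph mark does not end with ".". The helper names groupFragments and getSentenceRanges are hypothetical; getTextRanges() and Range.expandTo() are the Office JS calls used for the merge.

```javascript
// Group consecutive fragment indexes into [first, last] pairs, where each
// group runs until a fragment that ends with "." (ignoring trailing whitespace
// such as a paragraph mark).
function groupFragments(texts) {
  const groups = [];
  let start = null;
  texts.forEach((text, i) => {
    if (start === null) start = i;
    if (text.trimEnd().endsWith(".")) {
      groups.push([start, i]);
      start = null;
    }
  });
  // Any leftover text without a closing "." becomes its own group.
  if (start !== null) groups.push([start, texts.length - 1]);
  return groups;
}

// Hypothetical wiring into Office JS; call it inside Word.run(async (context) => ...).
async function getSentenceRanges(context) {
  const fragments = context.document.body.getTextRanges(["."], false);
  fragments.load("items/text");
  await context.sync();
  const texts = fragments.items.map((range) => range.text);
  // expandTo returns a new range covering both fragments; the originals are unchanged.
  return groupFragments(texts).map(([first, last]) =>
    first === last
      ? fragments.items[first]
      : fragments.items[first].expandTo(fragments.items[last])
  );
}
```

The returned ranges are proxy objects, so you would still need to load the properties you want and call context.sync() before reading them.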
Upvotes: 0