Reputation: 55
I want to print a sentence in my terminal with some specific words wrapped in curly braces. For instance, if I want the words in the 5th and 7th positions of this sentence to be wrapped:
My important word is here and there.
The output should be:
My important word is {here} and {there}.
I want the solution to be in Python, and in particular with spaCy. So far I have managed to write a program like this:
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('My important word is here and there.')
my_important_words = [4,6]
for token in doc:
    if token.i in my_important_words:
        print("{"+token.text+"}")
    else:
        print(token.text)
But not only does my for loop print the words line by line, it also seems pretty verbose to me. I cannot believe a library like spaCy does not have a straightforward one- or two-liner way to do that.
Any solution?
PS: I know there are fancy displaCy solutions for highlighting words with some labeled property, like this one: Spacy Verb highlight?
But that is not really the same, because 1) my set of words is a list of words/tokens arbitrarily chosen by me, and 2) I do not want displaCy to render HTML. I just want a plain print in my terminal.
Upvotes: 1
Views: 301
Reputation: 879
A two-liner for your use case could be:
import re
import spacy
nlp = spacy.load('en_core_web_lg')
doc = nlp('My important word is here and there.')
my_important_words = [4,6]
# First line: this basically does what you're looking for, but adds an extra space before every punctuation character...
output_string = " ".join([token.text if token.i not in my_important_words else '{'+token.text+'}' for token in doc])
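# Second line: removes the extra space before punctuation described above.
# A raw string avoids invalid-escape warnings, and the hyphen is escaped so that
# ':-?' is not read as a character range inside the class.
output_string = re.sub(r' ([@.#$/:\-?!])', r'\1', output_string)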
# Results
print(output_string)
The previous code prints what you're looking for in the terminal:
My important word is {here} and {there}.
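If you would rather not patch the spacing with a regex afterwards, one alternative is to reuse the whitespace spaCy already records for each token: `token.whitespace_` holds the trailing whitespace from the original text, which is empty before punctuation. Here is a minimal sketch of that idea. `mark_tokens` and `demo` are hypothetical helper names of mine, not part of spaCy, and I use `spacy.blank("en")` only because tokenization is all that is needed here; a loaded model such as `en_core_web_sm` should behave the same way for this sentence.

```python
def mark_tokens(doc, indices, left="{", right="}"):
    """Return the doc's text with the tokens at `indices` wrapped in braces.

    `doc` is expected to be a spaCy Doc (any iterable of objects exposing
    `.i`, `.text` and `.whitespace_` would work). Hypothetical helper,
    not a spaCy function.
    """
    wanted = set(indices)
    return "".join(
        # Wrap the selected tokens, then re-attach each token's original
        # trailing whitespace ("" when punctuation follows directly).
        (left + token.text + right if token.i in wanted else token.text)
        + token.whitespace_
        for token in doc
    )


def demo():
    # Imported here so the helper above has no hard dependency on spaCy.
    import spacy

    nlp = spacy.blank("en")  # tokenizer only, no model download needed
    doc = nlp("My important word is here and there.")
    print(mark_tokens(doc, [4, 6]))
```

Calling `demo()` should print the same sentence as above, with no stray space before the final period, because the spacing comes from the original text rather than from `" ".join`.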
Hope it helps.
Upvotes: 1