Reputation: 1479
I have a string "A.B.C one two three."
I need to tokenize this string into ["A.B.C", "one", "two", "three"], dropping the period at the end of the sentence. I'm having trouble removing that final period without also affecting the periods inside the A.B.C acronym.
Is there a way to remove only the period at the end of a sentence, without affecting acronyms, using Python regexes?
Upvotes: 0
Views: 186
Reputation: 56714
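import re
# A run of letters and periods that must end in a letter, so a trailing period is never captured.
word = re.compile(r'[A-Za-z.]*[A-Za-z]')
word.findall("A.B.C one two three.") # => ['A.B.C', 'one', 'two', 'three']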
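To see where this pattern helps and where it falls short, here is a small sketch that wraps it in a function. The names `WORD` and `tokenize` are made up for illustration; only the pattern itself comes from the answer. Because every match has to end in a letter, an acronym written with its own trailing period (such as "U.S.A.") also loses that last period, and digits are not matched at all.

```python
import re

# Pattern from the answer above: letters and periods, ending in a letter.
# WORD and tokenize are hypothetical names used only for this sketch.
WORD = re.compile(r'[A-Za-z.]*[A-Za-z]')


def tokenize(sentence):
    """Return every run of letters/periods that ends in a letter."""
    return WORD.findall(sentence)


print(tokenize("A.B.C one two three."))  # ['A.B.C', 'one', 'two', 'three']
print(tokenize("U.S.A. is big."))        # ['U.S.A', 'is', 'big']
```

If tokens can contain digits or other characters, the character classes would need to be widened accordingly.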
Upvotes: 2
Reputation: 308
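line = "A.B.C one two three."
# Drop the last character (the final period), then split on spaces.
print(line[:-1].split(' '))  # => ['A.B.C', 'one', 'two', 'three']
Maybe this way as well, assuming the string always ends with a period.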
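Slicing off the last character unconditionally would eat a letter if the sentence happened not to end with a period. A slightly more defensive sketch of the same idea, with `split_sentence` as a made-up name, checks for the period first:

```python
def split_sentence(sentence):
    """Split on whitespace after dropping one trailing period, if present.

    split_sentence is a hypothetical helper name, not part of any library.
    """
    sentence = sentence.strip()
    # Remove the final period only when it is actually there.
    if sentence.endswith('.'):
        sentence = sentence[:-1]
    return sentence.split()


print(split_sentence("A.B.C one two three."))  # ['A.B.C', 'one', 'two', 'three']
print(split_sentence("A.B.C one two three"))   # ['A.B.C', 'one', 'two', 'three']
```

This only handles a single sentence; text with several sentences would need the regex approach or a real sentence splitter.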
Upvotes: 0