Reputation: 743
Given the sentence "I want to eat fish and I want to buy a car. Therefore, I have to make money."
I want to split the sentence into
['I want to eat fish', 'I want to buy a car', 'Therefore, I have to make money']
I am trying to split the sentence
re.split('.|and', sentence)
However, it splits the sentence by '.', 'a', 'n', and 'd'.
How can I split the sentence by '.' and 'and'?
Upvotes: 1
Views: 123
Reputation: 106455
In addition to escaping the dot (.), which matches any non-newline character in regex, you should also match any whitespace around the delimiter so that the split consumes it and the results carry no leading or trailing spaces. To avoid splitting on the trailing dot, end the pattern with a positive lookahead that asserts a following non-whitespace character:
re.split(r'\s*(?:\.|and)\s*(?=\S)', sentence)
This returns:
['I want to eat fish', 'I want to buy a car', 'Therefore, I have to make money.']
Demo: https://replit.com/@blhsing/LimitedVastCookies
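One caveat, as a sketch rather than part of the original answer: a bare "and" in the pattern also matches inside words such as "candy" or "sand". Assuming you only want to split on the standalone word, you could wrap it in word boundaries (\b). The function name split_clauses below is hypothetical and only there for illustration.

```python
import re

def split_clauses(text):
    # Hypothetical helper: split on a literal dot or the whole word "and".
    # \b keeps "and" from matching inside words like "candy" or "sand".
    # The lookahead (?=\S) skips a delimiter that has nothing after it,
    # such as the final dot of the text.
    return re.split(r'\s*(?:\.|\band\b)\s*(?=\S)', text)

sentence = "I want to eat fish and I want to buy a car. Therefore, I have to make money."
print(split_clauses(sentence))
print(split_clauses("I like candy and sand."))
```

For the sentence in the question this should give the same three pieces as the pattern above; the second call is meant to show that "candy" and "sand" are left intact.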
Upvotes: 2
Reputation: 5425
You need to escape the dot (.) in the regex; unescaped, it matches any character.
import re
s = "I want to eat fish and I want to buy a car. Therefore, I have to make money."
re.split(r'\.|and', s)
Result:
['I want to eat fish ',
' I want to buy a car',
' Therefore, I have to make money',
'']
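If the surrounding spaces and the empty last element are unwanted, one possible follow-up (a sketch, not the only way to do it) is to strip each piece and drop the empty ones. The variable names pieces and cleaned are just for illustration.

```python
import re

s = "I want to eat fish and I want to buy a car. Therefore, I have to make money."

# Split on a literal dot or on "and", exactly as above.
pieces = re.split(r'\.|and', s)

# Strip whitespace from each piece and discard pieces that end up empty.
cleaned = [p.strip() for p in pieces if p.strip()]
print(cleaned)
```

This should leave three clean strings with no trailing dot, which is the shape the question asks for.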
Upvotes: 1