Reputation: 1
I want to split a string while preserving the spaces, but without including special characters or numbers.
For example:
sentence = "jak3 love$ $b0x1n%"
list_after_split = ["jak", " ", "love", " ", "bxn"]
I want to use re.split(), but I am not sure what to write as a pattern.
Upvotes: 0
Views: 115
Reputation: 316
If you also want to condense runs of whitespace into a single space:
import re
# String with multiple spaces, tabs, and newlines.
s = 'Jak3 \t love$s \n $D0ax1t3e90r%.'
print(s)
# Jak3 love$s
# $D0ax1t3e90r%.
# First, remove all characters which aren't letters or whitespace.
# Second, condense each run of whitespace into a single space.
# Third, split on the space, keeping it via the capturing group.
cleaned = re.sub(r'[^a-zA-Z\s]+', '', s)
condensed = re.sub(r'\s+', ' ', cleaned)
print(re.split(r'( )', condensed))
# ['Jak', ' ', 'loves', ' ', 'Daxter']
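The three steps can be wrapped in a small function. This is only a sketch: the name split_letters_keep_spaces is made up for illustration, and it assumes that tabs and newlines should count as whitespace and that leading or trailing whitespace should be dropped (neither is stated in the question).

```python
import re

def split_letters_keep_spaces(text):
    """Hypothetical helper: keep ASCII letters, collapse whitespace, split keeping spaces."""
    # Step 1: drop everything that is not an ASCII letter or whitespace.
    letters_only = re.sub(r'[^a-zA-Z\s]+', '', text)
    # Step 2: collapse each whitespace run into one space.
    # Assumption: strip the ends so the result has no empty strings at the edges.
    condensed = re.sub(r'\s+', ' ', letters_only).strip()
    # Step 3: the capturing group makes re.split return the separators too.
    return re.split(r'( )', condensed)

result = split_letters_keep_spaces('Jak3 \t love$s \n $D0ax1t3e90r%.')
print(result)
# ['Jak', ' ', 'loves', ' ', 'Daxter']
```

If accented or non-Latin letters should survive, the character class would need to change; [a-zA-Z] only covers ASCII letters.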
Upvotes: 0
Reputation: 22418
Try filtering the unwanted characters out first:
>>> import re
>>> sentence = "jak3 love$ $b0x1n%"
>>> sentence_filtered = re.sub(r'[^a-zA-Z\s]+', '', sentence)
>>> # Alternative: sentence_filtered = ''.join(ch for ch in sentence if ch.isalpha() or ch.isspace())
>>> sentence_filtered
'jak love bxn'
>>> re.split(r'(\s+)', sentence_filtered)
['jak', ' ', 'love', ' ', 'bxn']
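One edge case worth knowing: when the filtered string starts or ends with whitespace, re.split with a capturing group returns empty strings at the edges. A possible alternative, shown here as a sketch rather than as part of the answer above, is re.findall with a pattern matching either a run of letters or a run of whitespace. The padded input below is a made-up variation of the question's string.

```python
import re

# Hypothetical input: the question's sentence with extra leading/trailing spaces.
sentence = "  jak3 love$ $b0x1n%  "
filtered = re.sub(r'[^a-zA-Z\s]+', '', sentence)

# Splitting keeps the separators, but yields '' before the first and after the last one.
split_tokens = re.split(r'(\s+)', filtered)
print(split_tokens)
# ['', '  ', 'jak', ' ', 'love', ' ', 'bxn', '  ', '']

# findall returns only the matched runs of letters or whitespace, so no empty strings.
found_tokens = re.findall(r'[a-zA-Z]+|\s+', filtered)
print(found_tokens)
# ['  ', 'jak', ' ', 'love', ' ', 'bxn', '  ']
```

For input without leading or trailing whitespace, such as the original sentence, both calls give the same list.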
Upvotes: 3