Reputation: 563
I am practising regex and would like to extract only the letters from each string in this list:
text=['aQx12', 'aub 6 5']
I want to ignore the digits and the whitespace and keep only the letters. The desired output is:
text=['aQx', 'aub']
I tried the code below, but it does not work properly:
import re
text=['aQx12', 'aub 6 5']
r = re.compile("\D")
newlist = list(filter(r.match, text))
print(newlist)
Can someone tell me what I need to fix?
Upvotes: 0
Views: 1431
Reputation: 626747
You can remove any chars other than letters in a list comprehension.
No regex solution:
print( [''.join(filter(str.isalpha, s)) for s in ['aQx12', 'aub 6 5']] )
Here is a regex-based version:
import re
text=['aQx12', 'aub 6 5']
newlist = [re.sub(r'[^a-zA-Z]+', '', x) for x in text]
print(newlist)
# => ['aQx', 'aub']
If you need to handle any Unicode letters, use
re.sub(r'[\W\d_]+', '', x)
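To make the difference between the two patterns concrete, here is a small sketch. The sample strings are hypothetical (they are not from the question) and are written with escapes so the accented letters are unambiguous. It assumes Python 3 `str` patterns, where `\W` and `\d` are Unicode-aware by default.

```python
import re

# Hypothetical sample strings containing non-ASCII letters.
samples = ['na\u00efve 42', 'caf\u00e9_7']

# ASCII-only pattern: accented letters are stripped along with digits and spaces.
ascii_only = [re.sub(r'[^a-zA-Z]+', '', s) for s in samples]

# Unicode-aware pattern: removes non-word characters, digits and underscores,
# so accented letters are kept.
unicode_aware = [re.sub(r'[\W\d_]+', '', s) for s in samples]

print(ascii_only)
print(unicode_aware)
```

If the input is known to be plain ASCII, both patterns should give the same result, so the simpler `[^a-zA-Z]+` is probably enough.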
Upvotes: 1
Reputation: 103774
You can do this without a regex as well:
from string import ascii_letters
text=['aQx12', 'aub 6 5']
print([''.join(c for c in sl if c in ascii_letters) for sl in text])
# => ['aQx', 'aub']
Upvotes: 1
Reputation: 18406
You can use re.findall and then join the matches, instead of using re.match with filter. Also use [a-zA-Z] to match only letters.
>>> [''.join(re.findall('[a-zA-Z]', t)) for t in text]
['aQx', 'aub']
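A possible variation, sketched here as an assumption rather than as part of the answer above: adding `+` to the character class makes `re.findall` return whole runs of letters instead of single characters, and precompiling the pattern avoids repeating it. The third list item is an extra, hypothetical test case.

```python
import re

# The first two strings come from the question; the third is a hypothetical extra.
text = ['aQx12', 'aub 6 5', 'a1b2c3']

# One or more consecutive ASCII letters.
letters = re.compile(r'[a-zA-Z]+')

# Each string yields a list of letter runs.
runs = [letters.findall(t) for t in text]

# Joining the runs gives the letters-only strings.
joined = [''.join(r) for r in runs]

print(runs)
print(joined)
```

The joined result is the same as with the single-character pattern; the only difference is that fewer, longer matches are produced.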
Upvotes: 1
Reputation: 780871
You're testing the entire string, not individual characters. You need to filter the characters in the strings.
Also, \D matches anything that isn't a digit, so it will include whitespace in the result. You want to match only letters, which is [a-z] (combined with re.I so that uppercase letters match too).
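import re
r = re.compile(r'[a-z]', re.I)  # re.I makes [a-z] match both cases
# text is the list from the question; filter() here iterates over the characters of s
newlist = ["".join(filter(r.match, s)) for s in text]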
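To see why the code in the question returns the list unchanged, it may help to run the original filter on its own. In this sketch the string '12ab' is a hypothetical extra item added only to show what the filter actually tests.

```python
import re

text = ['aQx12', 'aub 6 5']
r = re.compile(r"\D")

# r.match only checks whether the string STARTS with a non-digit,
# and filter() keeps or drops each string as a whole.
kept = list(filter(r.match, text))

# A string that starts with a digit is dropped entirely ('12ab' is hypothetical).
kept_with_extra = list(filter(r.match, text + ['12ab']))

print(kept)
print(kept_with_extra)
```

Both original strings start with a letter, so both pass the test and come back untouched. Nothing is removed from inside a string, which is why the filtering has to happen per character.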
Upvotes: 1