Reputation: 2111
I want to extract only the signs and numbers from the data.
Input: <70 (aze) , <0,03 (+) , >0.03 (+)
Output: <70 , <0,03 , >0.03
I tried re.sub, but I can't keep the sign:
re.sub("\D", "", text)
Upvotes: 1
Views: 49
Reputation: 627507
You can use
" , ".join(re.findall(r'[<>]?\d+(?:[.,]\d+)?', text))
See the regex demo. Details:
[<>]? - an optional < or >
\d+ - one or more digits
(?:[.,]\d+)? - an optional occurrence of . or , and then one or more digits.
See the Python demo:
import re
text = '<70 (aze) , <0,03 (+) , >0.03 (+)'
print( " , ".join(re.findall(r'[<>]?\d+(?:[.,]\d+)?', text)) )
# => <70 , <0,03 , >0.03
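To make the three parts of the pattern easier to follow, here is a minimal sketch of the same regex written with re.VERBOSE and wrapped in a helper. The names SIGNED_NUMBER and extract_signed_numbers are illustrative only and do not come from the original answer; the pattern itself is unchanged.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part is annotated.
# SIGNED_NUMBER and extract_signed_numbers are hypothetical names for illustration.
SIGNED_NUMBER = re.compile(r"""
    [<>]?          # optional comparison sign: < or >
    \d+            # integer part: one or more digits
    (?:[.,]\d+)?   # optional fraction introduced by . or ,
""", re.VERBOSE)

def extract_signed_numbers(text, sep=" , "):
    """Return every sign+number token found in text, joined by sep."""
    # findall returns whole matches here because the only group is non-capturing.
    return sep.join(SIGNED_NUMBER.findall(text))

text = '<70 (aze) , <0,03 (+) , >0.03 (+)'
print(extract_signed_numbers(text))
# => <70 , <0,03 , >0.03
```

Compiling the pattern once is just a convenience if the helper is called many times; a string with no digits yields an empty result.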
In Pandas:
df['text'] = df['text'].str.findall(r'[<>]?\d+(?:[.,]\d+)?').str.join(' , ')
Upvotes: 2