Reputation: 17
I have written the following code in Python to "clean" my strings:
df['TextCleaning'] = df['Text'].apply(lambda x: re.findall('[äöüßÖÄa-zA-Z].*[öäüßÖÄÜa-zA-Z0-9]', x)[0])
Now it turns "1.2.1 Hello" (Text) into just "Hello" (TextCleaning). What I want to do now is save the "1.2.1" in its own column. Can you help me?
Upvotes: 0
Views: 133
Reputation: 658
Try changing the regex so that it matches the digits and dots instead of the letters:
out = "1.2.1 Hello "
new = " ".join(re.findall("[0-9.]+", out))
Output
'1.2.1'
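To apply the same idea to a whole DataFrame column, something like the following might work. This is only a sketch: the column name "Number" and the sample rows are made up for illustration, and the pattern assumes the number sits at the start of the string.

```python
import pandas as pd

# Sample data (hypothetical); the last row has no leading number.
df = pd.DataFrame({"Text": ["1.2.1 Hello", "2.45.6 Hello world", "No number"]})

# Capture a leading run of digits separated by dots.
# Rows without such a prefix get NaN in the new column.
df["Number"] = df["Text"].str.extract(r"^\s*(\d+(?:\.\d+)*)", expand=False)
```

With expand=False and a single capture group, str.extract returns a Series, so it can be assigned straight to a new column.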
Upvotes: 0
Reputation: 71610
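You can use pd.Series.str.split with expand=True (n=1 keeps it to a single split, so you always get two columns):
df[['Text', 'TextCleaning']] = df['Text'].str.split(r'(?![öäüßÖÄÜa-zA-Z0-9])\s+(?=[äöüßÖÄa-zA-Z])', n=1, expand=True)
For a row like "1.2.1 Hello", 'Text' then holds "1.2.1" and 'TextCleaning' holds "Hello".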
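If you would rather keep the original 'Text' column untouched, an alternative is str.extract with two named groups, which gives both parts in one pass. This is a sketch under assumptions: the column name "Number" and the sample rows are hypothetical, and the pattern treats a leading run of dot-separated digits as the number.

```python
import pandas as pd

# Sample data (hypothetical); the last row has no leading number.
df = pd.DataFrame({"Text": ["1.2.1 Hello", "2.45.6 Hello world", "No number"]})

# Optional leading number (digits separated by dots), then the remaining text.
pattern = r"^\s*(?P<Number>\d+(?:\.\d+)*)?\s*(?P<TextCleaning>.*\S)?\s*$"

# Named groups become column names; a group that did not match becomes NaN.
parts = df["Text"].str.extract(pattern)
df = df.join(parts)
```

Using join instead of overwriting 'Text' means the original strings stay available for checking the result.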
Upvotes: 1
Reputation: 423
This will work for you
output = "2.1.3 Hello world"
word1 = re.findall(r"\d+\.\d+\.\d+", output)
Output
['2.1.3']
output = "2.45.6 Hello 22.3.9 world"
word = re.findall(r"\d+\.\d+\.\d+", output)
Output
['2.45.6', '22.3.9']
output = "2.6 Hello 3.9 world"
word = re.findall(r"\d+\.\d+", output)
Output
['2.6', '3.9']
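The patterns above each assume a fixed number of dot-separated parts. If the depth of the numbering varies, a single more general pattern could cover all of these cases. This is a sketch, and the helper name find_numbers is made up for illustration:

```python
import re

# One or more digits, optionally followed by more ".digits" groups.
# The group is non-capturing, so findall returns the whole match.
NUMBER = re.compile(r"\d+(?:\.\d+)*")

def find_numbers(text):
    """Return every dotted number found in text, in order."""
    return NUMBER.findall(text)

print(find_numbers("2.1.3 Hello world"))
print(find_numbers("2.45.6 Hello 22.3.9 world"))
print(find_numbers("2.6 Hello 3.9 world"))
```

Note that this also matches plain integers such as "7", which may or may not be what you want.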
Upvotes: 1