Reputation: 1315
I want to split a bunch of strings based on a phrase and get the second element. However, in the cases where the string cannot be split, I want to retain the first element. Here is an example showing my current approach, where I by default always extract the second element:
import pandas as pd
df = pd.DataFrame({"a" : ["this is a (test), it is", "yet another"]})
df["a"].str.split(r"\(test\)", n=1).str[1]
As you can see, this (incorrectly) gives me
0    , it is
1        NaN
Name: a, dtype: object
whereas my desired output should be
0        , it is
1    yet another
Name: a, dtype: object
Upvotes: 1
Views: 673
Reputation: 862601
Add Series.fillna with the original column a, so that rows where the split yields no second element fall back to the original string:
# raw string keeps the regex escapes intact; n must be a keyword argument in pandas 2.0+
df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].fillna(df["a"])
# alternative
# df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].combine_first(df["a"])
print(df)
                         a            b
0  this is a (test), it is      , it is
1              yet another  yet another
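For reference, here is a minimal self-contained sketch of the same idea, together with one possible variation that is not part of the answer above: since the split is limited to one cut, taking the last element with .str[-1] should give the second piece when the phrase is found and the whole string otherwise. The column names b and c are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": ["this is a (test), it is", "yet another"]})

# Split at most once on the literal phrase "(test)".
# The raw string keeps the regex escapes; n is passed by keyword.
parts = df["a"].str.split(r"\(test\)", n=1)

# Approach from the answer: .str[1] is NaN where no split happened,
# and fillna replaces those NaN values with the original string.
df["b"] = parts.str[1].fillna(df["a"])

# Variation (an assumption, not from the answer): with n=1 each list has
# one or two elements, so the last element is either the second piece
# or the unsplit original string.
df["c"] = parts.str[-1]

print(df)
```

The .str[-1] form avoids the intermediate NaN, but it only matches the fillna version because n=1 caps the result at two pieces.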
Upvotes: 3