N08

Reputation: 1315

Get the second string after splitting if it exists, otherwise the first

I want to split a set of strings on a phrase and take the second element. When a string does not contain the phrase and so cannot be split, I want to keep the first (and only) element instead. Here is my current approach, which always extracts the second element:

import pandas as pd
df = pd.DataFrame({"a" : ["this is a (test), it is", "yet another"]})
df["a"].str.split(r"\(test\)", n=1).str[1]

As you can see, this (incorrectly) gives me

0    , it is
1        NaN
Name: a, dtype: object

whereas my desired output should be

0        , it is
1    yet another
Name: a, dtype: object

Upvotes: 1

Views: 673

Answers (1)

jezrael

Reputation: 862601

Chain Series.fillna with the original column a, so that rows where the split yields no second element fall back to the unsplit string:

df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].fillna(df["a"])
# alternative
# df['b'] = df["a"].str.split(r"\(test\)", n=1).str[1].combine_first(df["a"])
print(df)
                         a            b
0  this is a (test), it is      , it is
1              yet another  yet another
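
As a possible variation (not part of the approach above, just a sketch): because the split is limited to one cut with n=1, each resulting list holds either one or two elements, so taking the last element with .str[-1] should give the same result without a fill step. The names parts and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": ["this is a (test), it is", "yet another"]})

# Split at most once on the text "(test)". The parentheses are escaped so
# they match literally when the pattern is interpreted as a regex.
parts = df["a"].str.split(r"\(test\)", n=1)

# Each list has one element (phrase absent) or two (phrase present), so the
# last element is the second piece when it exists, else the whole string.
result = parts.str[-1]

df["b"] = result
print(df)
```

This shortcut relies on n=1. Without the limit, a string containing the phrase several times would split into more than two pieces, and the last element would no longer be everything after the first match.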

Upvotes: 3
