Tanner Ormanoski

Reputation: 35

Can I use a regex condition on a pandas column to check whether it contains a pattern, and then store the match in a new column?

Say I have data like this:

Account
1 Kevin (1234567)
2 Buzz (7896345)
3 Snakes (5438761)
4 Marv
5 Harry (9083213)

I want to check whether an account number exists at the end of the name in the Account column. If it does, split the account number off and put it in a new column; if not, skip that row and move on to the next account.

Something like this, although it does not work:

dataset.loc[dataset.Account.str.contains(r'\(d+')], 'new'=dataset.Account.str.split('',n=1, expand=True)
dataset["Account_Number"] = new[1]

Upvotes: 0

Views: 422

Answers (2)

Andrej Kesely

Reputation: 195543

Try:

df["Account number"] = df["Account"].str.extract(r"\((\d+)\)$")
df["Account"] = df["Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)
print(df)

Prints:

  Account Account number
1   Kevin        1234567
2    Buzz        7896345
3  Snakes        5438761
4    Marv            NaN
5   Harry        9083213
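
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame construction is an assumption reconstructed from the sample data in the question (including the 1 to 5 index), and `expand=False` is added only so that `str.extract` returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed; index 1..5 mirrors the post)
df = pd.DataFrame(
    {
        "Account": [
            "Kevin (1234567)",
            "Buzz (7896345)",
            "Snakes (5438761)",
            "Marv",
            "Harry (9083213)",
        ]
    },
    index=range(1, 6),
)

# Capture the digits inside a trailing "(...)"; rows without a match become NaN
df["Account number"] = df["Account"].str.extract(r"\((\d+)\)$", expand=False)

# Remove the trailing "(digits)" and any whitespace before it from the name
df["Account"] = df["Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)

print(df)
```

No explicit `if` is needed: the regex only matches rows that end with a parenthesised number, so rows like `Marv` are left alone and get a missing value in the new column.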

Upvotes: 1

Naveed

Reputation: 11650

Here is one way to do it:

# split the account on the first "(" and create two columns
df[['Account','Account Number']] = df['Account'].str.split('(', n=1, expand=True)

# drop the space left after the name, then remove the trailing ")" from the number
df['Account'] = df['Account'].str.strip()
df['Account Number'] = df['Account Number'].str.replace(')', '', regex=False).str.strip()
df
        Account     Account Number
0   1   Kevin              1234567
1   2   Buzz               7896345
2   3   Snakes             5438761
3   4   Marv                  None
4   5   Harry              9083213
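
If an explicit condition is preferred, as the question asks, a boolean mask from `str.contains` can play the role of the `if`. This is only a sketch under assumptions: the three-row DataFrame and the `has_number` and `parts` names are hypothetical, and it assumes the number is always preceded by a space and an opening parenthesis.

```python
import pandas as pd

# Small sample in the shape of the question's data (assumed)
df = pd.DataFrame({"Account": ["Kevin (1234567)", "Marv", "Harry (9083213)"]})

# The "if": True where the value ends with a parenthesised run of digits
has_number = df["Account"].str.contains(r"\(\d+\)$", regex=True)

# Split only the matching rows, once, from the right on " ("
parts = df.loc[has_number, "Account"].str.rsplit(" (", n=1, expand=True)

# Assignment aligns on the index, so non-matching rows get a missing value
df["Account Number"] = parts[1].str.rstrip(")")
df.loc[has_number, "Account"] = parts[0]

print(df)
```

Rows where the mask is False are never split, which matches the "if not, skip it" behaviour described in the question.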

Upvotes: 1
