How to remove the string after an integer in a pandas DataFrame (Python)

Question

I want to remove the text "(subject to approval)" that follows the integer 90 in a DataFrame column.

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)
print(df)

df.set_index('Name').loc['a', 'Score']

Zach Flanders · Accepted Answer

You can use a regex to strip every non-numeric character and then cast the column to int.

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)

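# Strip non-digit characters from the string values, then cast the whole column to int.
df['Score'] = df['Score'].replace(r'[^0-9]+', '', regex=True).astype(int)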
print(df)

Output:

  Name  Score
0    a     90
1    b     80
2    c     95
3    d     20
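
One caveat with the pattern above: it deletes every non-digit, so digits inside the trailing text would be merged into the number (a hypothetical value like '90(retake 2)' would become 902). A minimal sketch of an alternative, assuming every score starts with its digits, is to keep only the leading run of digits with str.extract. The variable name leading_digits is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Score': ['90(subject to approval)', 80, 95, 20],
})

# Convert every value to text, then capture only the digits at the start.
leading_digits = df['Score'].astype(str).str.extract(r'^(\d+)', expand=False)

# Cast the captured text back to integers.
# Assumption: every value begins with at least one digit; otherwise the
# capture is NaN and this cast raises an error.
df['Score'] = leading_digits.astype(int)
print(df)
```

If some values might not start with a digit, pd.to_numeric(leading_digits, errors='coerce') is probably the safer way to finish, at the cost of a float column with NaN for those rows.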

If you only want to remove the extra text from rows where Name == 'a', you can use:

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)

df.loc[df['Name'] == 'a', 'Score'] = (
    df
    .loc[df['Name'] == 'a', 'Score']
    .replace(r'[^0-9]+', '', regex=True)
)
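
Note that the selective replacement leaves the string '90' in row 'a', so the column still mixes text and integers. A short sketch of one way to finish the job, assuming the remaining values are all numeric, is to pass the column through pd.to_numeric and then repeat the lookup from the question. The names mask and score_a are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Score': ['90(subject to approval)', 80, 95, 20],
})

# Clean only the rows where Name is 'a'.
mask = df['Name'] == 'a'
df.loc[mask, 'Score'] = df.loc[mask, 'Score'].replace(r'[^0-9]+', '', regex=True)

# The cleaned value is still the string '90'; convert the column to numbers.
df['Score'] = pd.to_numeric(df['Score'])

# Same lookup as in the question, now returning a number.
score_a = df.set_index('Name').loc['a', 'Score']
print(score_a)
```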
