AK007

Reputation: 71

How to remove the string after an integer in a pandas DataFrame

I want to remove the string "(subject to approval)" that follows the integer 90 in a DataFrame.

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)
print(df)

df.set_index('Name').loc['a', 'Score']

Upvotes: 2

Views: 515

Answers (2)

Zach Flanders

Reputation: 1314

You can use a regex to remove all non-numeric characters and then cast the column to int.

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)

# Strip every run of non-digit characters, then cast the whole column to int
df['Score'] = df['Score'].replace(r'[^0-9]+', '', regex=True).astype(int)
print(df)

Output:

  Name  Score
0    a     90
1    b     80
2    c     95
3    d     20
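One caveat with the pattern above: `[^0-9]+` removes every non-digit run, so a digit inside the trailing text would be merged into the score. The sketch below uses a made-up value, '90(2nd review)', which is not in the question, to show the difference when everything from the first non-digit onward is removed instead. It assumes each string starts with the integer you want to keep.

```python
import pandas as pd

# Hypothetical data: the trailing note itself contains a digit.
scores = pd.Series(['90(2nd review)', 80, 95, 20])

# Removing every non-digit run joins the stray "2" onto the score.
merged = scores.replace(r'[^0-9]+', '', regex=True)

# Removing everything from the first non-digit to the end of the string
# keeps only the leading integer; the cast then gives a numeric column.
trimmed = scores.replace(r'\D.*$', '', regex=True).astype(int)

print(merged.tolist())
print(trimmed.tolist())
```

If the notes never contain digits, as in the question's data, both patterns give the same result.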

If you only want to remove the extra text for rows where "Name" == "a", you can use:

import pandas as pd
name_dict = {
            'Name': ['a','b','c','d'],
            'Score': ['90(subject to approval)',80,95,20]
          }
df = pd.DataFrame(name_dict)

df.loc[df['Name'] == 'a', 'Score'] = (
    df
    .loc[df['Name'] == 'a', 'Score']
    .replace(r'[^0-9]+', '', regex=True)
)
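As a possible variation on the row-wise update, the sketch below builds the boolean mask once and reuses it, then casts the column so that the cleaned cell does not stay a string. The variable name `mask` is only illustrative, and the final cast assumes every remaining value in "Score" is a whole number.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Score': ['90(subject to approval)', 80, 95, 20],
})

# Build the row selector once and reuse it on both sides of the assignment.
mask = df['Name'] == 'a'

# Keep only the leading digits of the selected cells and turn them into ints.
df.loc[mask, 'Score'] = (
    df.loc[mask, 'Score']
    .str.extract(r'^(\d+)', expand=False)
    .astype(int)
)

# The column is still object dtype at this point; cast it to a numeric dtype.
df['Score'] = df['Score'].astype(int)
print(df)
```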

Upvotes: 2

Ynjxsjmh

Reputation: 30070

You can use .str.extract to extract the integer part:

# Raw string avoids an invalid-escape warning for \d
df['Score'] = df['Score'].astype(str).str.extract(r'(\d+)')
print(df)

  Name Score
0    a    90
1    b    80
2    c    95
3    d    20
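Note that .str.extract returns strings, so the column above still holds text. If numbers are needed, one option (a sketch, not part of the original answer) is to pass the extracted digits to pd.to_numeric. Here `expand=False` is used so that extract returns a Series rather than a one-column DataFrame, and `digits` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Score': ['90(subject to approval)', 80, 95, 20],
})

# Convert everything to text, then pull out the first run of digits.
digits = df['Score'].astype(str).str.extract(r'(\d+)', expand=False)

# Turn the extracted text into an integer column.
df['Score'] = pd.to_numeric(digits)
print(df.dtypes)
```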

Upvotes: 2
