Hello lad

Reputation: 18790

How to correct typos in a pandas DataFrame

I have a pandas DataFrame like this:

    a      b    c
1   "hi"   1    2
2   "hi"   4    1
3   "Hi"   1    3
4   "hi"   2    1
5   "Hi"   2    1

All "Hi" values should be corrected to "hi". How can I do this cleanly with pandas?

This is a toy example; the real data can be larger.

Upvotes: 1

Views: 5735

Answers (3)

Shashank Agarwal

Reputation: 2804

If you want it to be lowercased, you can do -

df['a'] = df['a'].str.lower()

If you want to replace certain words -

df['a'] = df['a'].str.replace('Hi', 'hi')

Or if the word appears in a phrase, use regex -

df['a'] = df['a'].str.replace(r'\bHi\b', 'hi', regex=True)

The regex option also works when the word appears inside a longer string -

In [12]: df
Out[12]: 
             a  b
0           hi  1
1           hi  2
2       Hi mom  3
3  mom Hi, mom  4
4      mHim Hi  5

In [13]: df['a'] = df.a.str.replace(r'\bHi\b', 'hi', regex=True)

In [14]: df
Out[14]: 
             a  b
0           hi  1
1           hi  2
2       hi mom  3
3  mom hi, mom  4
4      mHim hi  5

Note that every standalone word 'Hi' was replaced with 'hi', but in the last row the 'Hi' inside the word 'mHim' was left unchanged, because \b only matches at word boundaries.
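To compare the three options side by side, here is a minimal sketch. The sample frame is made up for illustration, and regex is passed explicitly because the default for Series.str.replace differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, modelled on the example above.
df = pd.DataFrame({
    "a": ["hi", "Hi", "Hi mom", "mom Hi, mom", "mHim Hi"],
    "b": [1, 2, 3, 4, 5],
})

# Option 1: lowercase every string in the column.
lowered = df["a"].str.lower()

# Option 2: literal substring replacement; this also turns "mHim" into "mhim".
literal = df["a"].str.replace("Hi", "hi", regex=False)

# Option 3: whole-word replacement; the raw string keeps \b as a word boundary
# instead of letting Python read it as a backspace character.
whole_word = df["a"].str.replace(r"\bHi\b", "hi", regex=True)

print(whole_word.tolist())
```

Only the third option leaves "mHim" intact, so it is the one to reach for when the typo can occur as part of another word.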

Upvotes: 3

Haleemur Ali

Reputation: 28253

If the correction is just a matter of lowercasing, you can apply a lambda function to column a of your dataframe that returns the lowercased string.

e.g.

df.a = df.a.apply(lambda x: x.lower())

The apply method can be extended to other, more specific replacements.

e.g.

df.a = df.a.apply(lambda x: 'hi' if x == 'Hi' else x)

Or you can use a function instead of a lambda for more complicated transformations.

def my_replacement_func(x): 
    return x.lower()
df.a = df.a.apply(my_replacement_func)
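When there are several known typos, the function passed to apply can look each value up in a table of corrections. This is only a sketch: the CORRECTIONS dict, the fix_typo name, and the sample values are hypothetical and not part of the question.

```python
import pandas as pd

# Hypothetical lookup table mapping known typos to their corrected values.
CORRECTIONS = {"Hi": "hi", "HI": "hi", "helo": "hello"}


def fix_typo(value):
    # Return the corrected value if the typo is known, else leave it unchanged.
    return CORRECTIONS.get(value, value)


# Made-up sample column for illustration.
df = pd.DataFrame({"a": ["hi", "Hi", "HI", "helo", "bye"]})
df["a"] = df["a"].apply(fix_typo)

print(df["a"].tolist())
```

Values that are not in the table pass through untouched, so the table can grow as new typos turn up.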

Upvotes: 0

Brian from QuantRocket

Reputation: 5468

Use replace:

In [127]: df.loc[:, "a"] = df.a.replace("Hi", "hi")

In [128]: df
Out[128]:
    a  b  c
1  hi  1  2
2  hi  4  1
3  hi  1  3
4  hi  2  1
5  hi  2  1
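One thing worth knowing about this approach: Series.replace compares whole cell values by default, unlike .str.replace, which works on substrings. A small sketch with made-up data shows the difference; a dict can be passed to handle several corrections at once.

```python
import pandas as pd

# Made-up sample data; the third value contains "Hi" only as part of a phrase.
df = pd.DataFrame({"a": ["hi", "Hi", "Hi mom"], "b": [1, 4, 1]})

# Series.replace matches entire cell values, so "Hi mom" is left alone.
df["a"] = df["a"].replace({"Hi": "hi"})

print(df["a"].tolist())
```

If the typo can appear inside longer strings, the substring-based .str.replace is the better fit.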

Upvotes: 0
