Menkes

Reputation: 391

Pandas.replace not removing character from data

I have a pandas data frame with the character " in some places (Python 2.7). I want to remove all of the " from the data. I'm using the following method:

data_frame.replace(to_replace='"', value='')

However, the data frame stays the same and the action doesn't take place. Any advice on what the problem is would be greatly appreciated.

Upvotes: 4

Views: 669

Answers (4)

jrjc

Reputation: 21873

You need to use the str.replace method of a Series, which replaces substrings within each value:

data_frame.foo.str.replace('"', '')

where foo is the name of a column.

> df
    foo
0  "bar"

> df.foo.str.replace('"', '')
0    bar
Name: foo, dtype: object

If you have many columns, you can loop over them, although I guess @jezrael's answer is better in that case:

for s in df:
    if df[s].dtype == "object": # to avoid converting non-string column into string
        df.loc[:,s] = df.loc[:,s].str.replace('"', '')
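
A variant of the loop above, as a minimal sketch. It assumes a reasonably recent pandas in which str.replace accepts regex=False (so the quote is treated as a literal string), and it swaps the dtype == "object" check for pd.api.types.is_string_dtype. The column names and data are made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Hypothetical sample data: one string column with stray quotes, one numeric column.
df = pd.DataFrame({"foo": ['"bar"', 'ba"z'], "n": [1, 2]})

for col in df.columns:
    # Only touch string columns so numeric columns keep their dtype.
    if is_string_dtype(df[col]):
        # regex=False treats '"' as a literal substring, not a pattern.
        df[col] = df[col].str.replace('"', "", regex=False)

cleaned = df["foo"].tolist()  # ['bar', 'baz']
```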

Upvotes: 1

jezrael

Reputation: 862851

You can try replace with regex=True:

import pandas as pd

df = pd.DataFrame({'ItemID': {0: 8988, 1: 8988, 2: 6547, 3: 6547}, 
                   'Description': {0: 'Tall Chair', 1: 'Tall Chair', 2: 'Big" Pillow', 3: 'Big Pillow'}, 
                   'Feedback': {0: 'I hated it""', 1: 'Best chair" ever', 2: 'Soft and amazing', 3: 'Horrific color'}})
print df
   Description          Feedback  ItemID
0   Tall Chair      I hated it""    8988
1   Tall Chair  Best chair" ever    8988
2  Big" Pillow  Soft and amazing    6547
3   Big Pillow    Horrific color    6547

print df.replace({'"': ''}, regex=True)
  Description          Feedback  ItemID
0  Tall Chair        I hated it    8988
1  Tall Chair   Best chair ever    8988
2  Big Pillow  Soft and amazing    6547
3  Big Pillow    Horrific color    6547
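
To show why the call in the question appears to do nothing, here is a small sketch (the sample data is made up). Without regex=True, DataFrame.replace only matches cells whose entire value equals the search string. With regex=True the pattern is matched inside each string.

```python
import pandas as pd

# Hypothetical data: one cell contains a quote, another cell is exactly a quote.
df = pd.DataFrame({"Description": ['Big" Pillow', '"'],
                   "ItemID": [6547, 8988]})

# Whole-cell matching: only the cell that equals '"' exactly is replaced.
whole_cell = df.replace(to_replace='"', value="")

# Regex matching: every occurrence of '"' inside a string is removed.
substring = df.replace({'"': ""}, regex=True)

whole_cell_values = whole_cell["Description"].tolist()  # ['Big" Pillow', '']
substring_values = substring["Description"].tolist()    # ['Big Pillow', '']
```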

Upvotes: 2

Zachary Cross

Reputation: 2318

The replace function returns a new DataFrame with the replaced data. Try:

data_frame = data_frame.replace(to_replace='"', value='')

Upvotes: -1

fernandezcuesta

Reputation: 2448

Either set the inplace flag to True or reassign the output back to data_frame:

data_frame.replace(to_replace='"', value='', inplace=True)

or

data_frame = data_frame.replace(to_replace='"', value='')
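
A short sketch of both options on made-up data. Note that regex=True is an addition borrowed from the other answer and is not part of the calls above. Without it, only cells that are exactly '"' would change, so quotes embedded in longer strings would remain.

```python
import pandas as pd

# Hypothetical sample data with quotes embedded in the strings.
data_frame = pd.DataFrame({"foo": ['"bar"', 'ba"z']})

# Option 1: replace returns a new DataFrame and leaves the original untouched.
result = data_frame.replace(to_replace='"', value="", regex=True)
original_after_copy = data_frame["foo"].tolist()  # ['"bar"', 'ba"z']

# Option 2: inplace=True modifies data_frame itself.
data_frame.replace(to_replace='"', value="", regex=True, inplace=True)
original_after_inplace = data_frame["foo"].tolist()  # ['bar', 'baz']
```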

Upvotes: -1
