Reputation: 5372
Why doesn't a pandas.DataFrame object complain when I rename a column to a name that already exists?
After the rename, referencing that column returns a pandas.DataFrame instead of a pandas.Series, which can cause further errors.
Secondly, is there a suggested way to handle such a situation?
Example:
import pandas as pd
df = pd.DataFrame( {'A' : ['foo','bar'] ,'B' : ['bar','foo'] } )
df.B.map( {'bar':'foo','foo':'bar'} )
# 0 foo
# 1 bar
# Name: B, dtype: object
df.rename(columns={'A':'B'},inplace=True)
Now, the following will fail:
df.B.map( {'bar':'foo','foo':'bar'} )
#AttributeError: 'DataFrame' object has no attribute 'map'
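The failure above comes from the duplicated label: selecting 'B' now matches two columns. A minimal sketch of how this could be confirmed, assuming a reasonably recent pandas version (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'], 'B': ['bar', 'foo']})
df = df.rename(columns={'A': 'B'})

# Both columns are now labelled 'B', so the column labels are no longer unique.
has_unique_columns = df.columns.is_unique

# Labels that occur more than once (the second and later occurrences are flagged).
duplicated_labels = df.columns[df.columns.duplicated()].tolist()

# Selecting a duplicated label returns every matching column, hence a DataFrame.
selection_type = type(df['B']).__name__

print(has_unique_columns, duplicated_labels, selection_type)
```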
Upvotes: 0
Views: 7182
Reputation: 109636
Suppose you have a dictionary mapping old column names to new column names. When renaming your DataFrame, you can use a dictionary comprehension to skip any entry whose new name v is already a column in the DataFrame:
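df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
d = {'a': 'B', 'b': 'B'}
df.rename(columns={k: v for k, v in d.items() if v not in df}, inplace=True)
# Note: the filter only compares against the existing columns. 'B' is not yet
# a column, so both entries pass and the result still has two columns named B.
>>> df
B B
0 1 3
1 2 4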
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
d = {'a': 'b'}
df.rename(columns={k: v for k, v in d.items() if v not in df}, inplace=True)
>>> df
a b
0 1 3
1 2 4
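If you would rather get an error than have the rename silently skipped, one option is to compute the resulting labels first and refuse to proceed when any of them repeat. The following is only a sketch: `rename_strict` is a hypothetical helper, not part of pandas, and it assumes `mapping` is a plain dict.

```python
import pandas as pd

def rename_strict(df, mapping):
    """Rename columns, raising ValueError if the result would contain duplicate labels.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Labels the frame would have after the rename.
    new_labels = pd.Index([mapping.get(col, col) for col in df.columns])
    dupes = new_labels[new_labels.duplicated()].unique().tolist()
    if dupes:
        raise ValueError(f"rename would duplicate column labels: {dupes}")
    return df.rename(columns=mapping)

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

renamed = rename_strict(df, {'a': 'A'})  # no collision, rename goes through

try:
    rename_strict(df, {'a': 'b'})        # 'b' already exists
except ValueError as exc:
    error_message = str(exc)
```

Unlike the comprehension filter, this check also catches two old names that map to the same new name, because it looks at the full set of labels after the rename.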
Upvotes: 3