Erfan

Reputation: 42916

Renaming columns on slice of dataframe not performing as expected

I was trying to clean up the column names of a dataframe, but only for some of the columns.

For some reason, replacing the column names on a slice of the dataframe doesn't work. Why is that?

Let's say we have the following dataframe (copyable code to reproduce the data is at the bottom):

   Value ColAfjkj ColBhuqwa ColCouiqw
0      1        a         e         i
1      2        b         f         j
2      3        c         g         k
3      4        d         h         l

I want to clean up the column names (expected output):

   Value ColA ColB ColC
0      1    a    e    i
1      2    b    f    j
2      3    c    g    k
3      4    d    h    l

Approach 1:

I can get the clean column names like this:

df.iloc[:, 1:].columns.str[:4]

Index(['ColA', 'ColB', 'ColC'], dtype='object')

Or

Approach 2:

s = df.iloc[:, 1:].columns
[col[:4] for col in s]

['ColA', 'ColB', 'ColC']

But when I try to overwrite the column names, nothing happens:

df.iloc[:, 1:].columns = df.iloc[:, 1:].columns.str[:4]

   Value ColAfjkj ColBhuqwa ColCouiqw
0      1        a         e         i
1      2        b         f         j
2      3        c         g         k
3      4        d         h         l

Same for the second approach:

s = df.iloc[:, 1:].columns
cols = [col[:4] for col in s]

df.iloc[:, 1:].columns = cols

   Value ColAfjkj ColBhuqwa ColCouiqw
0      1        a         e         i
1      2        b         f         j
2      3        c         g         k
3      4        d         h         l

This does work, but you have to manually prepend the name of the first column, which is not ideal:

df.columns = ['Value'] + df.iloc[:, 1:].columns.str[:4].tolist()

   Value ColA ColB ColC
0      1    a    e    i
1      2    b    f    j
2      3    c    g    k
3      4    d    h    l

Is there an easier way to achieve this? Am I missing something?


Dataframe for reproduction:

df = pd.DataFrame({'Value':[1,2,3,4],
                   'ColAfjkj':['a', 'b', 'c', 'd'],
                   'ColBhuqwa':['e', 'f', 'g', 'h'],
                   'ColCouiqw':['i', 'j', 'k', 'l']})

Upvotes: 3

Views: 2506

Answers (3)

Taj G

Reputation: 393

I had this problem as well and came up with this solution:

First, create a mask of the columns you want to rename:

mask = df.iloc[:, 1:4].columns

Then use a list comprehension with a conditional to rename just those columns:

df.columns = [x if x not in mask else x[:4] for x in df.columns]
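Run against the sample frame from the question, the two steps can be sketched as follows. This is a minimal sketch that assumes the four-character cutoff from the question and slices the string itself (`x[:4]`) for the masked labels:

```python
import pandas as pd

df = pd.DataFrame({'Value': [1, 2, 3, 4],
                   'ColAfjkj': ['a', 'b', 'c', 'd'],
                   'ColBhuqwa': ['e', 'f', 'g', 'h'],
                   'ColCouiqw': ['i', 'j', 'k', 'l']})

# Labels of the columns to rename (everything after the first column)
mask = df.iloc[:, 1:4].columns

# Truncate only the masked labels and keep the others unchanged
df.columns = [x[:4] if x in mask else x for x in df.columns]

print(df.columns.tolist())  # ['Value', 'ColA', 'ColB', 'ColC']
```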

Upvotes: 2

yatu

Reputation: 88276

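This happens for two reasons. df.iloc[:, 1:] returns a new DataFrame, so assigning to its columns only relabels that temporary object and leaves df unchanged. In addition, a pandas Index is immutable, so you cannot overwrite part of df.columns in place either. If you check the documentation for class pandas.Index, you'll see that it is defined as: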

Immutable ndarray implementing an ordered, sliceable set

So in order to modify it you'll have to create a new list of column names, for instance with:

df.columns = [df.columns[0]] + list(df.iloc[:, 1:].columns.str[:4])

Another option is to use rename with a dictionary mapping the old column names to the new ones. Note that rename returns a new dataframe, so assign the result back:

df = df.rename(columns=dict(zip(df.columns[1:], df.columns[1:].str[:4])))
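As a quick check of the behaviour described in this answer, the sketch below uses the frame from the question. The variable names `sub` and `renamed` are only illustrative and do not come from the original post:

```python
import pandas as pd

df = pd.DataFrame({'Value': [1, 2, 3, 4],
                   'ColAfjkj': ['a', 'b', 'c', 'd'],
                   'ColBhuqwa': ['e', 'f', 'g', 'h'],
                   'ColCouiqw': ['i', 'j', 'k', 'l']})

# iloc returns a new DataFrame; relabelling it does not touch df
sub = df.iloc[:, 1:]
sub.columns = sub.columns.str[:4]
print(sub.columns.tolist())  # ['ColA', 'ColB', 'ColC']
print(df.columns.tolist())   # ['Value', 'ColAfjkj', 'ColBhuqwa', 'ColCouiqw']

# Item assignment on an Index is rejected because Index is immutable
try:
    df.columns[1] = 'ColA'
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

# rename builds a new DataFrame from an old-name -> new-name mapping
renamed = df.rename(columns=dict(zip(df.columns[1:], df.columns[1:].str[:4])))
print(renamed.columns.tolist())  # ['Value', 'ColA', 'ColB', 'ColC']
```

Because rename returns a new object by default, `df` itself keeps its original labels until the result is assigned back.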

Upvotes: 3

Porada Kev

Reputation: 513

To overwrite column names you can use the .rename() method.

For the dataframe in the question, it will look like:

df.rename(columns={'ColAfjkj': 'ColA',
                   'ColBhuqwa': 'ColB',
                   'ColCouiqw': 'ColC'},
          inplace=True)

More information on rename is available in the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html
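rename also accepts a function instead of a dictionary, which avoids typing every name by hand. The sketch below is one possible variant, not part of the original answer. It assumes that every column to shorten starts with 'Col', a rule chosen here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Value': [1, 2, 3, 4],
                   'ColAfjkj': ['a', 'b', 'c', 'd'],
                   'ColBhuqwa': ['e', 'f', 'g', 'h'],
                   'ColCouiqw': ['i', 'j', 'k', 'l']})

# Assumption: only labels starting with 'Col' should be cut to 4 characters
df = df.rename(columns=lambda c: c[:4] if c.startswith('Col') else c)

print(df.columns.tolist())  # ['Value', 'ColA', 'ColB', 'ColC']
```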

Upvotes: 2
