Firuz

Reputation: 75

Renaming column names from a data set in pandas

I am trying to rename the columns of a DataFrame whose names contain spaces. The DataFrame (df) has 45 columns, and most of them have spaces in their names. For instance, df.columns.values[1] = 'Date Release' should become 'Date_Release'. I tried DataFrame.rename() and DataFrame.columns.values[], but neither worked. I would much appreciate help finding out what I did wrong.

for colmns in df:
    if ' ' in colmns:
        colmns_new = '_'.join(colmns.split())
        df = df.rename (columns = {"\"%s\"" %colmns : "\"%s\"" %colmns_new})   
    else:
        print (colmns)    

print (df)

or this one:

for i in range (len(df.columns)):
    old= df.columns.values[i]
    if ' ' in old:
        new = '_'.join(old.split())
        df = df.columns.values[i] = ['%s' % new]
        print ("\"%s\"" % new) 
print (df)

Error: AttributeError: 'list' object has no attribute 'columns'

Upvotes: 3

Views: 3495

Answers (3)

Joe Ferndz

Reputation: 8508

You can just assign df.columns = df.columns.str.replace(' ', '_') to replace each space with an underscore.

Here's an example. Column a1 does not contain a space, but columns b 2 and c 3 do.

>>> import pandas as pd
>>> df = pd.DataFrame({'a1': range(1, 5), 'b 2': list('abcd'), 'c 3': list('pqrs')})
>>> df
   a1 b 2 c 3
0   1   a   p
1   2   b   q
2   3   c   r
3   4   d   s
>>> df.columns = df.columns.str.replace(' ','_')
>>> df
   a1 b_2 c_3
0   1   a   p
1   2   b   q
2   3   c   r
3   4   d   s
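As for what went wrong in the question: the first attempt wraps each key in literal double quotes, so rename() looks for a column called '"Date Release"' (quotes included), finds none, and by default ignores the unknown key without raising. The second attempt uses a chained assignment, which rebinds df to a list before touching df.columns, hence the AttributeError. Below is a minimal sketch of rename() with a plain mapping, reusing the example frame above; the names unchanged, mapping and renamed are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a1': range(1, 5), 'b 2': list('abcd'), 'c 3': list('pqrs')})

# Keys wrapped in literal quotes match no column; rename() ignores
# unknown keys by default, so nothing changes and no error is raised.
unchanged = df.rename(columns={'"b 2"': '"b_2"'})

# Map each real column name containing a space to its underscore version.
mapping = {col: '_'.join(col.split()) for col in df.columns if ' ' in col}
renamed = df.rename(columns=mapping)

print(list(unchanged.columns))
print(list(renamed.columns))
```

rename() returns a new DataFrame here, so the result has to be assigned back (as the question's first attempt already did).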

Upvotes: 2

Vaishali

Reputation: 38415

You can use a regex to replace spaces with underscores.

Here is an example df in which some columns contain spaces:

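import re
import pandas as pd

# Build an empty example frame: 'col 1' ... 'col 9' contain a space, 'col10' does not
cols = ['col {}'.format(i) for i in range(1, 10)] + ['col10']
df = pd.DataFrame(columns=cols)

# Replace each space in every column name with an underscore
df.columns = [re.sub(' ', '_', c) for c in df.columns]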

You get

col_1   col_2   col_3   col_4   col_5   col_6   col_7   col_8   col_9   col10
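If some names might contain runs of spaces, tabs, or leading/trailing whitespace, a slightly broader pattern may be useful. The following is only a sketch: the column names are made up for illustration, and stripping the ends before substituting is an assumption about what you would want.

```python
import re
import pandas as pd

# Hypothetical column names with a double space and padded ends
cols = ['Date Release', 'Total  Gross', ' Studio Name ', 'Rating']
df = pd.DataFrame(columns=cols)

# Trim the ends, then collapse each run of whitespace into one underscore
df.columns = [re.sub(r'\s+', '_', c.strip()) for c in df.columns]

print(list(df.columns))
```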

Upvotes: 1

Mo Huss

Reputation: 464

import pandas as pd
df.columns = [i.replace(' ','_') for i in df.columns]
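One caveat with the comprehension: it calls str.replace on every label, so it would fail with an AttributeError if any column label is not a string (an integer year, for example). A hedged variant that leaves such labels untouched might look like this; the column labels are invented for the example.

```python
import pandas as pd

# Hypothetical frame mixing string labels with an integer label
df = pd.DataFrame(columns=['Date Release', 2020, 'Box Office'])

# Only string labels are rewritten; anything else is passed through as-is
df.columns = [c.replace(' ', '_') if isinstance(c, str) else c
              for c in df.columns]

print(list(df.columns))
```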

Upvotes: 2
