user4069366

Unable to remove unicode char from column names in pandas under Python 2.x

I have read a CSV file into a pandas DataFrame and am trying to remove the Unicode prefix u from the column names, but with no luck.

fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

import pandas

headers = ['time', 'contact', 'address']
fl = pandas.read_csv('file.csv', header=None, names=headers)

Passing the names explicitly still doesn't work:

fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

Renaming the columns doesn't work either:

fl.rename(columns=lambda x: x.encode('ascii', 'ignore'), inplace=True)
fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

Can anybody please tell me why this is happening and how to fix it? Thanks.

Upvotes: 8

Views: 15575

Answers (3)

ashok suthar

Reputation: 19

I faced a similar issue while building an ML pipeline. My feature list contained unicode strings instead of plain names:

features

[u'Customer_id', u'Age',.....]

One way around it is the str() function: create a new list by applying str to each value.

features_new= [str(x) for x in features]

The features_new list now contains plain str values, so it prints without the u prefix. Let me know how it works.
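A minimal sketch of the same idea with the encoding made explicit, since under Python 2 str() raises UnicodeEncodeError when a name contains non-ASCII characters. The helper name to_native_names and its encoding and errors defaults are assumptions for illustration, not part of the answer above.

```python
import sys


def to_native_names(names, encoding="ascii", errors="ignore"):
    """Return the given names as native str objects (hypothetical helper)."""
    if sys.version_info[0] >= 3:
        # Python 3: str is already Unicode text, so there is no u prefix to remove.
        return [str(name) for name in names]
    # Python 2: encode unicode objects to byte strings; pass other values through str().
    return [
        name.encode(encoding, errors) if isinstance(name, unicode) else str(name)  # noqa: F821
        for name in names
    ]


features = [u'Customer_id', u'Age']
features_new = to_native_names(features)
```

With errors="ignore", characters that cannot be encoded are silently dropped, which may make two distinct names collide. This only converts the values in a plain list; how pandas displays an index is a separate matter.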

Upvotes: 0

elPastor

Reputation: 8976

I ran into this today and used df['var'] = df['var'].astype(str), which converts the values in a column (rather than the column names) to str.

Upvotes: 0

paulo.filip3

Reputation: 3297

The u is only a display issue: it marks the names as unicode strings when they are printed and is not part of the names themselves. If you really need to remove it, you can use the following very dirty trick:

from pandas import compat

compat.PY3 = True

df.columns
Index(['time', 'contact', 'address'], dtype='object')
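To illustrate why this is only a display issue, here is a small sketch in plain Python, without pandas, showing that the prefix appears in the repr of a string but not in its content. The variable names are chosen for illustration only.

```python
# Names as they would come back from the index, and as typed without the prefix.
columns = [u'time', u'contact', u'address']
plain = ['time', 'contact', 'address']

# ASCII-only unicode and str values compare equal (in Python 2 and 3 alike).
names_match = (columns == plain)

# They also hash alike, so lookups by either form find the same entry.
positions = {u'time': 0, u'contact': 1, u'address': 2}
contact_position = positions['contact']

# Joining or printing the names shows their content, with no prefix.
header_line = ", ".join(columns)
print(header_line)
```

Printing the names this way is a gentler alternative to patching compat.PY3, which changes pandas internals for the whole process.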

Upvotes: 5
