Moritz

Reputation: 5418

pandas column names to list

According to this thread: SO: Column names to list

It should be straightforward to convert the column names to a list. But if I do:

df.columns.tolist()

I get:

[u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']

I know I could strip the u and the ' characters. But I would like to get the clean names as a list without any workaround. Is that possible?

Upvotes: 14

Views: 47915

Answers (6)

gincard

Reputation: 1904

Or, you could try:

df2 = df.columns.get_values()

which will give you:

array(['q_igg', 'q_hcp', 'c_igg', 'c_hcp'], dtype=object)

then:

df2.tolist()

which gives you:

['q_igg', 'q_hcp', 'c_igg', 'c_hcp']

Upvotes: 23

Omkar Darves

Reputation: 194

This will do the job:

list(df2)

Upvotes: 0

brijesh_patel

Reputation: 41

A simple and easy way (df is the DataFrame variable name):

df.columns.to_list()

This gives a list of all the column names.
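A minimal sketch comparing the spellings mentioned in this thread, assuming Python 3 and a reasonably recent pandas. The empty DataFrame is only an illustration that reuses the column names from the question:

```python
import pandas as pd

# Hypothetical frame that only carries the column names from the question.
df = pd.DataFrame(columns=["q_igg", "q_hcp", "c_igg", "c_hcp"])

names_a = df.columns.to_list()  # spelling used in this answer
names_b = df.columns.tolist()   # spelling used in the question
names_c = list(df)              # iterating a DataFrame yields its column labels

print(names_a)
```

On Python 3 all three should produce the same plain list of str objects, with no u prefix in the printed form.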

Upvotes: 4

PlagTag

Reputation: 6449

As already mentioned, the u means that the strings are Unicode. Anyway, the cleanest way would be to convert the column names to ASCII or something like that.

In [4]: cols
Out[4]: [u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']

In [5]: [i.encode('ascii', 'ignore') for i in cols]
Out[5]: ['q_igg', 'q_hcp', 'c_igg', 'c_hcp']

The problem here is that you would lose any special characters that cannot be encoded in ASCII.

A much dirtier solution would be to fetch the string representation of the list object and just strip the u prefixes. I would not use that, but it might fit your needs in this special case ;-)

In [7]: x = repr(cols)
In [8]: x
Out[8]: "[u'q_igg', u'q_hcp', u'c_igg', u'c_hcp']"
In [9]: x.replace("u'", "'")
Out[9]: "['q_igg', 'q_hcp', 'c_igg', 'c_hcp']"

see: https://docs.python.org/2/library/repr.html

Upvotes: 1

chrisb

Reputation: 52286

If you're just interested in printing the names without any quotes or Unicode indicators, you could do something like this:

In [19]: print "[" + ", ".join(df) + "]"
[q_igg, q_hcp, c_igg, c_hcp]
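The snippet above uses the Python 2 print statement. A rough Python 3 equivalent might look like this, with a plain list (an assumption for illustration) standing in for the DataFrame:

```python
# Stand-in for the column names; with a real DataFrame, ", ".join(df) also works
# as long as every column label is a string.
cols = ["q_igg", "q_hcp", "c_igg", "c_hcp"]

# Joining the strings drops the quotes that the list repr would show.
line = "[" + ", ".join(cols) + "]"
print(line)  # [q_igg, q_hcp, c_igg, c_hcp]
```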

Upvotes: 1

Simeon Visser

Reputation: 122526

The list [u'q_igg', u'q_hcp', u'c_igg', u'c_hcp'] contains Unicode strings: the u prefix indicates that they're Unicode strings, and the ' characters are the quotes that delimit each string. Neither is part of the string itself, so you can use these names in any way you'd like in your code. See Unicode HOWTO for more details on Unicode strings in Python 2.x.
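A small sketch of that point, using a hand-written list in place of the real column index (an assumption for illustration). The prefix and quotes only appear in the repr of a string, not in its contents:

```python
# The u prefix is accepted on both Python 2 and Python 3.3+.
cols = [u"q_igg", u"q_hcp", u"c_igg", u"c_hcp"]
first = cols[0]

print(len(first))    # 5: neither the u nor the quotes count as characters
print(first)         # q_igg
print(repr(first))   # 'q_igg' on Python 3, u'q_igg' on Python 2

# The value compares equal to the plain literal.
assert first == "q_igg"
```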

Upvotes: 3
