P A N

Reputation: 5922

Pandas append list to list of column names

I'm looking for a way to append a list of column names to existing column names in a DataFrame in pandas and then reorder them by col_start + col_add.

The DataFrame already contains the columns from col_start.

Something like:

import pandas as pd

df = pd.read_csv("file.csv")

col_start = ["col_a", "col_b", "col_c"]
col_add = ["Col_d", "Col_e", "Col_f"]
df = pd.concat([df,pd.DataFrame(columns = list(col_add))]) #Add columns
df = df[[col_start.extend(col_add)]] #Rearrange columns

Also, is there a way to capitalize the first letter for each item in col_start, analogous to title() or capitalize()?

Upvotes: 2

Views: 10013

Answers (2)

DavidK

Reputation: 2564

Here is what you want to do:

import pandas as pd

# A first DataFrame
d1 = pd.DataFrame([[1,2,3],[4,5,6]], columns=['col1','col2','col3'])

# A second one
d2 = pd.DataFrame([[8,7,3,8],[4,8,6,8]], columns=['col4','col5','col6', 'col7'])

# Combine d1 and d2 side by side
d = pd.concat((d1,d2), axis=1)

# In this example the two lists of names are taken from d1 and d2
col_start = list(d1.columns)
col_add = list(d2.columns)

# To set the column order, select with the combined list
d = d[col_start + col_add]

If you want to capitalize the values in a column 'col', you can do:

d['col'] = d['col'].str.capitalize()

PS: Update pandas if .str.capitalize() doesn't work.

Alternatively, you can use map:

d['col'] = d['col'].map(lambda x: x.capitalize())
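
As a small sketch of the two approaches above, assuming a hypothetical string column named 'col' with made-up values, both give the same result when nothing is missing:

```python
import pandas as pd

# Hypothetical data for illustration; 'col' is an assumed column name
d = pd.DataFrame({"col": ["alpha", "beta gamma", "DELTA"]})

# Vectorised string method
via_str = d["col"].str.capitalize()

# Element-wise equivalent using map
via_map = d["col"].map(lambda x: x.capitalize())

print(via_str.tolist())  # ['Alpha', 'Beta gamma', 'Delta']
print(via_map.tolist())  # ['Alpha', 'Beta gamma', 'Delta']
```

One difference to keep in mind: .str.capitalize() passes missing values through as NaN, whereas the lambda version would raise an AttributeError on a NaN entry.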

Upvotes: 2

EdChum

Reputation: 393933

Your code is nearly there; just a couple of things:

df = pd.concat([df,pd.DataFrame(columns = list(col_add))])

can be simplified to just this as col_add is already a list:

df = pd.concat([df,pd.DataFrame(columns = col_add)])

You can also just add the two lists together. list.extend modifies the list in place and returns None, so it cannot be used inside the indexer:

df = df[[col_start.extend(col_add)]]

becomes

df = df[col_start+col_add]
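
To see why the original line fails, here is a minimal sketch in plain Python using the lists from the question (the names scratch and result are only for illustration):

```python
col_start = ["col_a", "col_b", "col_c"]
col_add = ["Col_d", "Col_e", "Col_f"]

# + builds a new list and leaves both inputs untouched
combined = col_start + col_add

# extend mutates the list in place and returns None
scratch = list(col_start)  # copy, so col_start itself is not changed
result = scratch.extend(col_add)

print(combined)  # ['col_a', 'col_b', 'col_c', 'Col_d', 'Col_e', 'Col_f']
print(result)    # None
```

So df[[col_start.extend(col_add)]] ends up indexing with [None] rather than with the column names.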

To capitalise the names in your list, use a list comprehension (note that title() also capitalises the letter after the underscore):

In [184]:
col_start = ["col_a", "col_b", "col_c"]
col_start = [x.title() for x in col_start]
col_start

Out[184]:
['Col_A', 'Col_B', 'Col_C']
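
If only the very first letter should change (to match the "Col_d" style used in col_add), capitalize() may be closer to what you want. A quick comparison, with names as an illustrative variable:

```python
names = ["col_a", "col_b", "col_c"]

# title() capitalises the first letter of every word; "_" counts as a word break
titled = [x.title() for x in names]

# capitalize() upper-cases only the first character and lower-cases the rest
capitalized = [x.capitalize() for x in names]

print(titled)       # ['Col_A', 'Col_B', 'Col_C']
print(capitalized)  # ['Col_a', 'Col_b', 'Col_c']
```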

EDIT

To avoid the KeyError on the capitalised column names, capitalise after calling concat, not before. The columns have a vectorised str.title method:

In [187]:
df = pd.DataFrame(columns = col_start + col_add)
df

Out[187]:
Empty DataFrame
Columns: [col_a, col_b, col_c, Col_d, Col_e, Col_f]
Index: []

In [188]:
df.columns = df.columns.str.title()
df.columns

Out[188]:
Index(['Col_A', 'Col_B', 'Col_C', 'Col_D', 'Col_E', 'Col_F'], dtype='object')

Upvotes: 4
