Natasha

Reputation: 1521

Adding new columns to a dataframe

I have a list containing the column names of a dataframe. I want to add these empty columns to a dataframe that already exists.

col_names = ["a", "b", "e"]
df = pd.DataFrame()
df = ...  # stores some content

I understand that a single new column can be added in the following manner, and I could do the same for each of the other columns:

df['e'] = None

However, I'd like to know how to add all of these new columns at once.

Upvotes: 0

Views: 109

Answers (5)

Andy L.

Reputation: 25239

assign with dict.fromkeys also works

In [219]: df = pd.DataFrame()

In [220]: df.assign(**dict.fromkeys(col_names))

Out[220]:
Empty DataFrame
Columns: [a, b, e]
Index: []

This also works for adding empty (None-valued) columns to an existing dataframe.

sample df

np.random.seed(20)
df = pd.DataFrame(np.random.randint(0, 4, 3*2).reshape(3,2), columns=['col1','col2'])

Out[240]:
   col1  col2
0     3     2
1     3     3
2     0     2

df = df.assign(**dict.fromkeys(col_names))

Out[242]:
   col1  col2     a     b     e
0     3     2  None  None  None
1     3     3  None  None  None
2     0     2  None  None  None
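One detail worth spelling out: assign returns a new dataframe rather than changing df in place, which is why the result is assigned back to df above. A minimal sketch of that behaviour, using small made-up data and a hypothetical name `result`:

```python
import pandas as pd

col_names = ["a", "b", "e"]
df = pd.DataFrame({"col1": [3, 3, 0], "col2": [2, 3, 2]})

# dict.fromkeys(col_names) builds {"a": None, "b": None, "e": None}
new_cols = dict.fromkeys(col_names)

# assign returns a new DataFrame; df itself is left unchanged
result = df.assign(**new_cols)

print(list(df.columns))
print(list(result.columns))
```

If the original frame should carry the new columns, rebind it with df = df.assign(...).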

Upvotes: 1

wwnde

Reputation: 26676

Please try reindex

# To add to an existing dataframe
# Note: fill_value='None' fills the new columns with the string 'None', not the None object
df = df.reindex(list(df.columns) + col_names, axis='columns', fill_value='None')

     Name         Weight     a     b     e
0    John        Average  None  None  None
1    Paul  Below Average  None  None  None
2  Darren  Above Average  None  None  None
3    John        Average  None  None  None
4  Darren  Above Average  None  None  None
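The quoted 'None' above is a string, so the new columns hold text rather than missing values. A small sketch of the difference, assuming you would rather have real missing values; the two-row frame and the names `missing` and `as_text` are made up for illustration:

```python
import pandas as pd

col_names = ["a", "b", "e"]
df = pd.DataFrame({"Name": ["John", "Paul"],
                   "Weight": ["Average", "Below Average"]})

all_cols = list(df.columns) + col_names

# Default fill_value: the new columns contain NaN (missing values)
missing = df.reindex(columns=all_cols)

# fill_value="None": the new columns contain the four-character string "None"
as_text = df.reindex(columns=all_cols, fill_value="None")

print(missing["a"].isna().all())
print(as_text["a"].isna().any())
```

Which one you want depends on whether later code checks the columns with isna() or compares against text.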

Upvotes: 0

Joe Ferndz

Reputation: 8508

You can simply assign None to the list of columns: df[col_list] = None. Here's an example of how you can do it.

import pandas as pd
df = pd.DataFrame({'col1':['river','sea','lake','pond','ocean'],
                   'year':[2000,2001,2002,2003,2004],
                   'col2':['apple','peach','banana','grape','cherry']})
print(df)

This creates a dataframe with 3 columns and 5 rows. The output of df is:

    col1  year    col2
0  river  2000   apple
1    sea  2001   peach
2   lake  2002  banana
3   pond  2003   grape
4  ocean  2004  cherry

Now I want to add columns ['a','b','c','d','e'] to the df. I can do it by just assigning None to the column list.

temp_cols = ['a','b','c','d','e']
df[temp_cols] = None
print(df)

The updated dataframe will have:

    col1  year    col2     a     b     c     d     e
0  river  2000   apple  None  None  None  None  None
1    sea  2001   peach  None  None  None  None  None
2   lake  2002  banana  None  None  None  None  None
3   pond  2003   grape  None  None  None  None  None
4  ocean  2004  cherry  None  None  None  None  None

Upvotes: 1

You can use the same syntax as adding a single new column:

df[col_names] = None
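One thing to watch: if any name in col_names is already a column, this assignment overwrites its data with None. A minimal sketch of a guard that only adds the names that are not present yet; the sample data and the helper list `missing` are hypothetical:

```python
import pandas as pd

col_names = ["a", "b", "e"]
# Made-up frame where column "a" already exists and holds data
df = pd.DataFrame({"a": [1, 2], "col1": ["river", "sea"]})

# Keep only the names that are not columns yet
missing = [c for c in col_names if c not in df.columns]

# Add the missing columns in place, leaving existing data untouched
if missing:
    df[missing] = None

print(df)
```

Unlike assign or reindex, this modifies df in place, so there is nothing to assign back.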

Upvotes: 3

akuiper

Reputation: 214927

When you create the data frame, you can pass the col_names to the columns parameter:

import pandas as pd

col_names = ["a", "b", "e"]
df = pd.DataFrame(columns=col_names)

print(df)
# Empty DataFrame
# Columns: [a, b, e]
# Index: []

print(df.columns)
# Index(['a', 'b', 'e'], dtype='object')
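If you also know how many rows you need up front, the same constructor accepts an index, which gives empty columns that already have rows. A small sketch, assuming three rows and a hypothetical name `empty_rows`:

```python
import pandas as pd

col_names = ["a", "b", "e"]

# Passing an index as well creates the rows, with every cell missing (NaN)
empty_rows = pd.DataFrame(index=range(3), columns=col_names)

print(empty_rows.shape)
print(empty_rows.isna().all().all())
```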

Upvotes: 1
