Reputation: 5367
I have an empty dataframe
.
df=pd.DataFrame(columns=['a'])
for some reason I want to generate df2, another empty dataframe, with two columns 'a' and 'b'.
If I do
df.columns=df.columns+'b'
it does not work (I get the columns renamed to 'ab') and neither does the following
df.columns=df.columns.tolist()+['b']
How to add a separate column 'b' to df, and df.emtpy
keep on being True
?
Using .loc is also not possible
df.loc[:,'b']=None
as it returns
Cannot set dataframe with no defined index and a scalar
Upvotes: 40
Views: 76205
Reputation: 662
You can simply use the following syntax
import pandas as pd
df = pd.DataFrame(columns=['A', 'B', 'C'])
df[['D', 'E', 'F']] = None
print(df)
This creates an empty dataframe with columns from 'A' to 'F' with below result
>>Empty DataFrame
>>Columns: [A, B, C, D, E, F]
>>Index: []
Upvotes: 0
Reputation: 164853
This is one way:
df2 = df.join(pd.DataFrame(columns=['b']))
The advantage of this method is you can add an arbitrary number of columns without explicit loops.
In addition, this satisfies your requirement of df.empty
evaluating to True
if no data exists.
Upvotes: 6
Reputation: 323396
You can use concat
:
df=pd.DataFrame(columns=['a'])
df
Out[568]:
Empty DataFrame
Columns: [a]
Index: []
df2=pd.DataFrame(columns=['b', 'c', 'd'])
pd.concat([df,df2])
Out[571]:
Empty DataFrame
Columns: [a, b, c, d]
Index: []
Upvotes: 4
Reputation: 29635
If you just do df['b'] = None
then df.empty
is still True
and df is:
Empty DataFrame
Columns: [a, b]
Index: []
EDIT:
To create an empty df2
from the columns of df
and adding new columns, you can do:
df2 = pd.DataFrame(columns = df.columns.tolist() + ['b', 'c', 'd'])
Upvotes: 21
Reputation: 1691
Here are few ways to add an empty column to an empty dataframe:
df=pd.DataFrame(columns=['a'])
df['b'] = None
df = df.assign(c=None)
df = df.assign(d=df['a'])
df['e'] = pd.Series(index=df.index)
df = pd.concat([df,pd.DataFrame(columns=list('f'))])
print(df)
Output:
Empty DataFrame
Columns: [a, b, c, d, e, f]
Index: []
I hope it helps.
Upvotes: 55
Reputation: 59579
If you want to add multiple columns at the same time you can also reindex.
new_cols = ['c', 'd', 'e', 'f', 'g']
df2 = df.reindex(df.columns.union(new_cols), axis=1)
#Empty DataFrame
#Columns: [a, c, d, e, f, g]
#Index: []
Upvotes: 10