Reputation: 181
I'm confused about how to append one column to another in pandas.
Here is what I'm trying to do:
from pandas import DataFrame, concat
df1 = DataFrame({'a':[1,2], 'b':[3,4]})
concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
This returns what I want: a Series with my 4 values. What I don't understand is why I can't assign it to column 'a'.
>>> from pandas import DataFrame, concat
>>> df1 = DataFrame({'a':[1,2], 'b':[3,4]})
>>> concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
0 1
1 2
2 3
3 4
dtype: int64
>>> df1['a'] = concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
>>> df1
a b
0 1 3
1 2 4
By the way, is there any way to make this more readable? I'm confused about how it should work. Note that I don't need column 'b' afterward.
Thanks for your help :)
Sam
Upvotes: 0
Views: 60
Reputation: 13257
A pandas Series doesn't have columns. To select a column as a DataFrame, use df[['a']] instead of df['a']. To rename a column, rename needs the axis or columns argument:
pd.concat([df1[['a']], df1[['b']].rename(columns={'b':'a'})]).reset_index(drop=True)
output:
a
0 1
1 2
2 3
3 4
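To illustrate the point about rename, here is a minimal sketch, assuming the same df1 as in the question. It shows that a dict passed to Series.rename is applied to index labels rather than to the Series name, so the rename in the question changes nothing. The variable names are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# A dict passed to Series.rename maps index labels, not the Series name.
# The index labels here are 0 and 1, so {'b': 'a'} matches nothing.
renamed_by_dict = df1['b'].rename({'b': 'a'})
print(renamed_by_dict.name)

# A scalar passed to Series.rename sets the Series name instead.
renamed_by_scalar = df1['b'].rename('a')
print(renamed_by_scalar.name)

# On a DataFrame, columns= is what renames the column label.
renamed_frame = df1[['b']].rename(columns={'b': 'a'})
print(list(renamed_frame.columns))
```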
That is how I would produce your output while staying close to your code. But I wouldn't use that code myself.
I would use the following instead:
pd.concat([df1['a'], df1['b']]).to_frame('a').reset_index(drop=True)
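Since column 'b' isn't needed afterward, one option is to rebind df1 to the new four-row frame instead of assigning into the old one. The sketch below is a variant of the line above, assuming the same df1; it uses ignore_index=True in place of reset_index(drop=True), which should give the same row labels.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Stack the two columns; ignore_index=True renumbers the rows 0..3.
stacked = pd.concat([df1['a'], df1['b']], ignore_index=True)

# Turn the Series into a one-column DataFrame named 'a' and rebind df1.
df1 = stacked.to_frame('a')
print(df1)
```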
Upvotes: 1
Reputation: 262284
When you assign, you are not creating new rows/indices (except with a single value, which is not the case here).
pd.concat((df1['a'], df1['b'].rename({'b':'a'}))).reset_index(drop=True)
Gives you:
0 1
1 2
2 3
3 4
dtype: int64
Pandas aligns the indices before assignment, so only the values whose index labels match the existing df index (here 0 and 1) are used; the rest are discarded.
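The alignment can be seen in a small sketch, assuming the same df1 as in the question. The reindex step in the second half is one possible workaround and not something proposed above: it creates rows 2 and 3 first so that all four values have a matching label.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
longer = pd.concat([df1['a'], df1['b']]).reset_index(drop=True)  # index 0..3

# Column assignment aligns on the index: df1 only has labels 0 and 1,
# so the values at labels 2 and 3 are dropped and df1 keeps two rows.
df1['a'] = longer
print(df1)

# Reindexing a fresh copy first creates rows 2 and 3 (filled with NaN),
# so the assignment can place all four values.
df2 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}).reindex(longer.index)
df2['a'] = longer
print(df2)
```

Note that in df2 column 'b' holds NaN in the two new rows, which is fine here since 'b' is not needed afterward.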
Upvotes: 1