Reputation: 1177
I'm concatenating two pandas DataFrames that have exactly the same columns but different numbers of rows. I'd like to stack the first DataFrame on top of the second.
When I do the following, I get many NaN values in some of the columns. I tried the fix from this post, which uses .reset_index, but I'm still getting NaN values. My DataFrames have the following columns:
The first one, rem_dup_pre, and the second one, rem_dup_po, have shapes (54178, 11) and (83502, 11), respectively.
I've tried this:
concat_mil = pd.concat([rem_dup_pre.reset_index(drop=True), rem_dup_po.reset_index(drop=True)], axis=0)
and I get NaN values, for example in 'Station Type', where previously there were no NaN values in either rem_dup_pre or rem_dup_po.
How can I simply concat them without NaN values?
Upvotes: 0
Views: 1254
Reputation: 8508
Here's how I did it and I don't get any additional NaNs.
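import pandas as pd
import numpy as np

# First frame: six rows, with some NaNs already present in 'b' and 'c'
df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                    'b': ['a', 'b', 'c', 'd', np.nan, np.nan],
                    'c': ['x', np.nan, np.nan, np.nan, 'y', 'z']})
# Second frame: three rows of random integers, same column names
df2 = pd.DataFrame(np.random.randint(0, 10, (3, 3)), columns=list('abc'))
print(df1)
print(df2)
# Stack df1 on top of df2, then renumber the index from 0 to 8
df = pd.concat([df1, df2]).reset_index(drop=True)
print(df)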
The output of this is (the df2 values are random, so they will differ between runs):
DF1:
   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z
DF2:
   a  b  c
0  4  8  4
1  8  4  4
2  2  8  1
DF: after concat
   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z
6  4    8    4
7  8    4    4
8  2    8    1
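If you still see new NaNs with your own data, one possible cause (an assumption on my part, since I can't see your column labels) is that the column names are not quite identical, for example a trailing space in one of them. With axis=0, pd.concat aligns on column labels, so any label present in only one frame becomes its own column padded with NaN. Here is a minimal sketch with made-up frames named pre and post (hypothetical stand-ins for rem_dup_pre and rem_dup_po) showing how to check for that and fix it:

```python
import pandas as pd

# Hypothetical frames: the second has a trailing space in one column name.
pre = pd.DataFrame({"Station Type": ["A", "B"], "Value": [1, 2]})
post = pd.DataFrame({"Station Type ": ["C"], "Value": [3]})

# Labels present in only one frame; an empty set means the columns truly match.
mismatched = set(pre.columns) ^ set(post.columns)
print(mismatched)

# Concatenating as-is creates two separate columns, each padded with NaN.
bad = pd.concat([pre, post], ignore_index=True)
print(bad.isna().sum())

# Normalising the labels first lets the rows stack cleanly.
pre.columns = pre.columns.str.strip()
post.columns = post.columns.str.strip()
good = pd.concat([pre, post], ignore_index=True)
print(good.isna().sum().sum())
```

If the set of mismatched labels is empty for your real frames, then the cause is something else and this check at least rules out the column names.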
Upvotes: 1