Katie Melosto

Reputation: 1177

pandas concat two dataframes of different row size without nan values

I'm concatenating two pandas dataframes that have exactly the same columns but different numbers of rows. I'd like to stack the first dataframe on top of the second.

When I do the following, I get many NaN values in some of the columns. I've tried the fix from this post, using .reset_index, but I'm still getting NaN values. My dataframes have the following columns:

[image: list of the dataframe columns]

The first one, rem_dup_pre, and the second one, rem_dup_po, have shapes (54178, 11) and (83502, 11) respectively.

I've tried this:

concat_mil = pd.concat([rem_dup_pre.reset_index(drop=True), rem_dup_po.reset_index(drop=True)], axis=0)

and I get NaN values, for example in 'Station Type', where previously there were no NaN values in either rem_dup_pre or rem_dup_po:

[image: concatenated result showing NaN values in 'Station Type']

How can I simply concat them without NaN values?

Upvotes: 0

Views: 1254

Answers (1)

Joe Ferndz

Reputation: 8508

Here's how I did it, and I don't get any additional NaNs.

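import pandas as pd
import numpy as np

# df1 already contains some NaN values of its own
df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                    'b': ['a', 'b', 'c', 'd', np.nan, np.nan],
                    'c': ['x', np.nan, np.nan, np.nan, 'y', 'z']})
# df2 has the same columns but fewer rows, filled with random integers
df2 = pd.DataFrame(np.random.randint(0, 10, (3, 3)), columns=list('abc'))
print(df1)
print(df2)

# Stack df1 on top of df2, then renumber the rows 0..n-1
df = pd.concat([df1, df2]).reset_index(drop=True)
print(df)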

The output of this is (df2 is random, so its values will differ from run to run):

DF1:

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z

DF2:

   a  b  c
0  4  8  4
1  8  4  4
2  2  8  1

DF: after concat

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z
6  4    8    4
7  8    4    4
8  2    8    1
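
If the real frames still produce new NaNs, one possible cause is that the column labels are not exactly identical in both frames. pd.concat aligns on column labels, so a label that exists in only one frame becomes its own column and is filled with NaN for the other frame's rows. This is only a guess, since the actual labels are shown in a screenshot. The sketch below uses made-up frames (pre and po are hypothetical stand-ins, and the trailing space in one label is an assumption) to show how to spot such a mismatch and normalise the labels before concatenating.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames; note the trailing space
# in the second frame's 'Station Type ' label (an assumed cause).
pre = pd.DataFrame({"Station Type": ["A", "B"], "Value": [1, 2]})
po = pd.DataFrame({"Station Type ": ["C"], "Value": [3]})

# Labels that appear in only one of the two frames.
mismatched = set(pre.columns) ^ set(po.columns)
print(mismatched)

# Naive concat: the two 'Station Type' variants become separate
# columns, each padded with NaN for the other frame's rows.
naive = pd.concat([pre, po], ignore_index=True)
print(naive.isna().sum())

# Strip whitespace from the labels, then concat again.
po_clean = po.rename(columns=lambda c: c.strip())
fixed = pd.concat([pre, po_clean], ignore_index=True)
print(fixed)
```

Passing ignore_index=True to pd.concat renumbers the rows in the same call, which gives the same index as calling .reset_index(drop=True) on the result.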

Upvotes: 1
