Chia Yi

Reputation: 562

Concatenate dataframes while making sure boolean value is not converted into integer

I have df1 and df2 that I want to combine into one dataframe in a for loop. The two dataframes have the same columns. df1 looks like this:

id booleanValue
0     True
1     False

df2 looks like this

id booleanValue
2     True
3     np.nan

I did

total_df = pd.DataFrame()
total_df = pd.concat([total_df, df1], ignore_index=True, sort=False)

I was hoping to get

id booleanValue
0     True
1     False
2     True
3     NaN

but I got

id booleanValue
0     0.0
1     1.0
2     0.0
3     0.0

Is there a way to concatenate so that the boolean values don't get converted into integers and np.nan stays np.nan?

Upvotes: 1

Views: 1341

Answers (1)

jezrael

Reputation: 862731

Your solution works fine; you only need to upgrade pandas, because the nullable boolean data type is available from pandas 1.0.0 onward:

df1['booleanValue'] = df1['booleanValue'].astype('boolean')
df2['booleanValue'] = df2['booleanValue'].astype('boolean')

total_df = pd.concat([df1, df2], ignore_index=True, sort=False)
print(total_df.dtypes)
id                int64
booleanValue    boolean
dtype: object

print(total_df)
   id  booleanValue
0   0          True
1   1         False
2   2          True
3   3          <NA>
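
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The way df1 and df2 are built is an assumption (the question only shows their printed form); the conversion and concat calls are the ones from above.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frames shown in the question.
df1 = pd.DataFrame({"id": [0, 1], "booleanValue": [True, False]})
df2 = pd.DataFrame({"id": [2, 3], "booleanValue": [True, np.nan]})

# Convert both columns to the nullable boolean dtype (pandas 1.0.0+),
# so the missing value is stored as <NA> instead of forcing another dtype.
df1["booleanValue"] = df1["booleanValue"].astype("boolean")
df2["booleanValue"] = df2["booleanValue"].astype("boolean")

total_df = pd.concat([df1, df2], ignore_index=True, sort=False)
print(total_df.dtypes)
print(total_df)
```

Converting before the concat keeps both inputs on the same dtype, so pandas has no reason to pick a common fallback type for the combined column.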

If you do not convert to the nullable boolean dtype, the combined column gets object dtype:

total_df = pd.concat([df1, df2], ignore_index=True, sort=False)
print(total_df)
   id booleanValue
0   0         True
1   1        False
2   2         True
3   3          NaN

print(total_df.dtypes)
id               int64
booleanValue    object
dtype: object
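
Since the question builds the result inside a for loop, one possible variation is to collect the frames in a list, concatenate once, and convert the object column to the nullable boolean dtype at the end. This is only a sketch: `frames_source` is a hypothetical stand-in for whatever the loop actually produces.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the data produced on each loop iteration.
frames_source = [
    {"id": [0, 1], "booleanValue": [True, False]},
    {"id": [2, 3], "booleanValue": [True, np.nan]},
]

frames = []
for data in frames_source:
    # Collect each frame instead of concatenating onto an empty DataFrame.
    frames.append(pd.DataFrame(data))

# One concat call for all frames, then a single conversion of the
# resulting object column to the nullable boolean dtype.
total_df = pd.concat(frames, ignore_index=True, sort=False)
total_df["booleanValue"] = total_df["booleanValue"].astype("boolean")
print(total_df.dtypes)
```

Concatenating once at the end also avoids copying the growing frame on every iteration.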

Upvotes: 1
