Reputation: 569
Hi there, I have two large pandas DataFrames, df and df1.
df looks something like this; each cell holds either a value or 0.
df:
Amazon Apple Blackberry Yahoo Google
1/1/2000 0 13 0 42 0
1/1/2001 0 41 0 53 0
1/1/2002 34 42 0 64 0
1/1/2003 45 63 0 74 0
1/1/2004 43 74 24 75 0
1/1/2005 0 89 25 86 25
Now I have df1, which shares some column names and row index values with df, e.g.:
df1:
Amazon Apple Blackberry Yahoo Google
1/1/2000 0 0 0 53 53
1/1/2001 24 0 53 53 42
1/1/2002 42 31 53 53 74
1/1/2003 52 43 53 0 89
1/1/2004 52 53 0 0 99
1/1/2005 24 53 0 0 100
Now I want to use df and keep all of its values. However, wherever there is a 0 in df, I want to look up df1 and copy its value into df, provided the column name and index date are the same.
E.g. the output would be something like this:
dfoutput
Amazon Apple Blackberry Yahoo Google
1/1/2000 0 13 0 42 53
1/1/2001 24 41 53 53 42
1/1/2002 34 42 53 64 74
1/1/2003 45 63 53 74 89
1/1/2004 43 74 24 75 99
1/1/2005 24 89 25 86 25
If there is no match for df in df1, the value remains 0 in df. In reality, df and df1 differ slightly in the larger dataset.
Thanks.
Upvotes: 1
Views: 39
Reputation: 862921
You can use combine_first (or fillna) after replacing 0 with NaN:
import numpy as np
# treat 0 as missing in both frames, then fill df's gaps from df1
df2 = df.replace(0, np.nan).combine_first(df1.replace(0, np.nan))
# alternatively
# df2 = df.replace(0, np.nan).fillna(df1.replace(0, np.nan))
print(df2)
Amazon Apple Blackberry Yahoo Google
1/1/2000 NaN 13 NaN 42 53.0
1/1/2001 24.0 41 53.0 53 42.0
1/1/2002 34.0 42 53.0 64 74.0
1/1/2003 45.0 63 53.0 74 89.0
1/1/2004 43.0 74 24.0 75 99.0
1/1/2005 24.0 89 25.0 86 25.0
The same result can be obtained with mask instead of replace:
df2 = df.mask(df == 0).combine_first(df1.mask(df1 == 0))
# alternatively
# df2 = df.mask(df == 0).fillna(df1.mask(df1 == 0))
print(df2)
Amazon Apple Blackberry Yahoo Google
1/1/2000 NaN 13 NaN 42 53.0
1/1/2001 24.0 41 53.0 53 42.0
1/1/2002 34.0 42 53.0 64 74.0
1/1/2003 45.0 63 53.0 74 89.0
1/1/2004 43.0 74 24.0 75 99.0
1/1/2005 24.0 89 25.0 86 25.0
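Since the question mentions that df and df1 differ slightly in the real data, it may be worth noting that combine_first and fillna are not interchangeable in that case. The sketch below uses small made-up frames (the values, the extra 1/1/2003 row, and the extra Google column in df1 are assumptions for illustration) to show that fillna keeps only df's rows and columns, while combine_first returns the union of both frames' labels.

```python
import pandas as pd

# Hypothetical frames: df1 has one extra row and one extra column.
df = pd.DataFrame(
    {"Amazon": [0, 34], "Apple": [13, 0]},
    index=["1/1/2000", "1/1/2002"],
)
df1 = pd.DataFrame(
    {"Amazon": [24, 42, 7], "Apple": [5, 31, 8], "Google": [53, 74, 9]},
    index=["1/1/2000", "1/1/2002", "1/1/2003"],
)

left = df.mask(df == 0)     # zeros in df become NaN
right = df1.mask(df1 == 0)  # zeros in df1 become NaN

filled = left.fillna(right)            # only df's labels survive
combined = left.combine_first(right)   # union of rows and columns

print(filled.shape)    # (2, 2)
print(combined.shape)  # (3, 3)
```

If the goal is to keep df's shape exactly, the fillna variant is probably the safer choice; combine_first suits the case where the extra rows and columns of df1 should be carried over as well.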
Finally, replace NaN with 0 and cast to int:
df2 = df2.fillna(0).astype(int)
print(df2)
Amazon Apple Blackberry Yahoo Google
1/1/2000 0 13 0 42 53
1/1/2001 24 41 53 53 42
1/1/2002 34 42 53 64 74
1/1/2003 45 63 53 74 89
1/1/2004 43 74 24 75 99
1/1/2005 24 89 25 86 25
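For anyone who wants to reproduce this, here is a minimal end-to-end sketch that rebuilds the sample frames from the question and chains the steps above. It assumes the dates are plain string labels (the real data may well use a DatetimeIndex) and uses the fillna variant so that df's rows and columns are kept as they are.

```python
import numpy as np
import pandas as pd

# Assumption: dates are kept as plain strings, as in the printed sample.
dates = ["1/1/2000", "1/1/2001", "1/1/2002", "1/1/2003", "1/1/2004", "1/1/2005"]
cols = ["Amazon", "Apple", "Blackberry", "Yahoo", "Google"]

df = pd.DataFrame(
    [[0, 13, 0, 42, 0],
     [0, 41, 0, 53, 0],
     [34, 42, 0, 64, 0],
     [45, 63, 0, 74, 0],
     [43, 74, 24, 75, 0],
     [0, 89, 25, 86, 25]],
    index=dates, columns=cols,
)
df1 = pd.DataFrame(
    [[0, 0, 0, 53, 53],
     [24, 0, 53, 53, 42],
     [42, 31, 53, 53, 74],
     [52, 43, 53, 0, 89],
     [52, 53, 0, 0, 99],
     [24, 53, 0, 0, 100]],
    index=dates, columns=cols,
)

# 1. treat 0 as missing, 2. fill from df1 where labels match,
# 3. put 0 back where nothing was found, 4. restore integer dtype
df2 = (
    df.replace(0, np.nan)
      .fillna(df1.replace(0, np.nan))
      .fillna(0)
      .astype(int)
)
print(df2)
```

The result should match the dfoutput table in the question, including the cells that stay 0 because both frames have 0 there.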
Upvotes: 2