Reputation: 39
I am relatively new to python/pandas. I have two dataframes that I want to merge together. The first dataframe look like:
var1 var2
EA-0123 A01
EC-0124 A03
EC-0124 A03
ED-0125 A02
ED-0125 A01
and the second:
var1 A01 A02 A03
EA-0123 0 0 0
EB-0124 0 0 0
EC-0124 0 0 0
ED-0125 0 0 0
I want to count the amount of times var1 has a certain value for var2 and get an output that would look like:
var1 A01 A02 A03
EA-0123 1 0 0
EB-0124 0 0 0
EC-0124 0 0 2
ED-0125 1 1 0
The second dataframe contains data that does not always appear in the first dataframe but I would like to keep those rows, even if the columns all indicate "0".
I tried to make an if elif else loop but couldn't get it to work (maybe due to the fact that there are two dataframes?), and couldn't find any other problem like this..
Upvotes: 0
Views: 48
Reputation: 862761
I think need crosstab
with reindex
:
df = pd.crosstab(df1['var1'], df1['var2']).reindex(df2['var1'], fill_value=0)
If need better performance use DataFrameGroupBy.size
with unstack
:
df = (df1.groupby(['var1', 'var2']).size()
.unstack(fill_value=0)
.reindex(df2['var1'], fill_value=0))
print (df)
var2 A01 A02 A03
var1
EA-0123 1 0 0
EB-0124 0 0 0
EC-0124 0 0 2
ED-0125 1 1 0
Upvotes: 3