avibrun
avibrun

Reputation: 39

Value count dataframe with different dataframes and result in different columns (pandas)

I am relatively new to python/pandas. I have two dataframes that I want to merge together. The first dataframe look like:

var1      var2
EA-0123   A01
EC-0124   A03
EC-0124   A03
ED-0125   A02
ED-0125   A01

and the second:

var1      A01     A02     A03
EA-0123   0       0       0
EB-0124   0       0       0
EC-0124   0       0       0
ED-0125   0       0       0

I want to count the amount of times var1 has a certain value for var2 and get an output that would look like:

    var1      A01     A02     A03
    EA-0123   1       0       0
    EB-0124   0       0       0
    EC-0124   0       0       2
    ED-0125   1       1       0

The second dataframe contains data that does not always appear in the first dataframe but I would like to keep those rows, even if the columns all indicate "0".

I tried to make an if elif else loop but couldn't get it to work (maybe due to the fact that there are two dataframes?), and couldn't find any other problem like this..

Upvotes: 0

Views: 48

Answers (1)

jezrael
jezrael

Reputation: 862761

I think need crosstab with reindex:

df = pd.crosstab(df1['var1'], df1['var2']).reindex(df2['var1'], fill_value=0)

If need better performance use DataFrameGroupBy.size with unstack:

df = (df1.groupby(['var1', 'var2']).size()
        .unstack(fill_value=0)
        .reindex(df2['var1'], fill_value=0))

print (df)
var2     A01  A02  A03
var1                  
EA-0123    1    0    0
EB-0124    0    0    0
EC-0124    0    0    2
ED-0125    1    1    0

Upvotes: 3

Related Questions