Daniel Ortega
Daniel Ortega

Reputation: 435

Group results of dataframe by several columns and count the different unique values per each column

I have a dataframe that has this structure:

Group    Subgroup    Results1       Results2
  A         s1          OK             NOK
  A         s1          OK             OK
  A         s2          NOK            NOK
  A         s2          OK             NOK
  B         s3          OK             Not tested
  B         s3          Not tested     NOK
  B         s4          OK             NOK

And I want to get the count of each Results column in some kind like:

                            Results1                    Results2
                     OK    NOK    Not tested         OK    NOK    Not tested
    A       s1       2     1      0                  1     2      0
    A       s2       1     3      ...                ...   ...    ...
    B       s3       ...   ...    ...                ...   ...    ...
    B       s4       ...   ...    ...                ...   ...    ...

Is there any way to achieve this with Pandas?

Upvotes: 1

Views: 33

Answers (1)

jezrael
jezrael

Reputation: 862681

Use DataFrame.melt with crosstab and DataFrame.rename_axis for remove columns names:

df = df.melt(['Group','Subgroup'])

df = (pd.crosstab([df['Group'], df['Subgroup']], [df['variable'], df['value']])
        .rename_axis([None, None], axis=1))
print (df)
               Results1               Results2              
                    NOK Not tested OK      NOK Not tested OK
Group Subgroup                                              
A     s1              0          0  2        1          0  1
      s2              1          0  1        2          0  0
B     s3              0          1  1        1          1  0
      s4              0          0  1        1          0  0

Upvotes: 1

Related Questions