Group By Multiple Columns Into One Table

Question

I have a dataset that looks like this:

ID   HEMO_Counts   PLATE_Counts  ESR_Counts  CRP_Counts  COVID POS
0          23                       2          6          1
1                                                         0
2          54          23                                 0
3          234                      4          9          1
4                      75           6                     0
5          64                                 16          1

I want to create a table that has each Lab Value (HEMO_Counts, PLATE_Counts, ESR_Counts, CRP_Counts) as the index and have the counts of each lab value split into two columns by COVID POS

LAB     0    1
Hemo    1    3
Plate   2    0
ESR     1    2
CRP     0    3

I wrote the following code but the output is incorrect:

LABS_VAR=['HEMO_Counts','PLATE_Counts','ESR_Counts','CRP_Counts']

for lab in LABS_VAR:
    TEST=pd.pivot_table(df_merged2,values=[lab],  columns='Covid_pos', aggfunc=lambda x: len(x.unique()))
    TESTING = pd.DataFrame(TEST.to_records()).rename(columns={'index':'Lab'})
    DF=TESTING.append(TESTING)
    
Labs_Count_Results = pd.DataFrame(DF)

Am I on the right track or is there an easier way to do this?

Shubham Sharma · Accepted Answer

First filter columns containing _Counts then groupby these columns on COVID POS and agg using count, finally using rstrip remove the _Counts part from the index:

d = df.filter(like='_Counts').groupby(df['COVID POS'])\
      .count().T.rename_axis(columns='LAB')
      
d.index = d.index.str.rstrip('_Counts')

LAB    0  1
HEMO   1  3
PLATE  2  0
ESR    1  2
CRP    0  3

Group By Multiple Columns Into One Table

Answers (1)

Related Questions