Reputation: 859
I have a dataset that looks like this:
ID HEMO_Counts PLATE_Counts ESR_Counts CRP_Counts COVID POS
0 23 2 6 1
1 0
2 54 23 0
3 234 4 9 1
4 75 6 0
5 64 16 1
I want to create a table that has each Lab Value (HEMO_Counts, PLATE_Counts, ESR_Counts, CRP_Counts) as the index and have the counts of each lab value split into two columns by COVID POS
LAB 0 1
Hemo 1 3
Plate 2 0
ESR 1 2
CRP 0 3
I wrote the following code but the output is incorrect:
LABS_VAR=['HEMO_Counts','PLATE_Counts','ESR_Counts','CRP_Counts']
for lab in LABS_VAR:
TEST=pd.pivot_table(df_merged2,values=[lab], columns='Covid_pos', aggfunc=lambda x: len(x.unique()))
TESTING = pd.DataFrame(TEST.to_records()).rename(columns={'index':'Lab'})
DF=TESTING.append(TESTING)
Labs_Count_Results = pd.DataFrame(DF)
Am I on the right track or is there an easier way to do this?
Upvotes: 1
Views: 85
Reputation: 71707
First filter
columns containing _Counts
then groupby
these columns on COVID POS
and agg using count
, finally using rstrip
remove the _Counts
part from the index:
d = df.filter(like='_Counts').groupby(df['COVID POS'])\
.count().T.rename_axis(columns='LAB')
d.index = d.index.str.rstrip('_Counts')
LAB 0 1
HEMO 1 3
PLATE 2 0
ESR 1 2
CRP 0 3
Upvotes: 1