Reputation: 61
I am looking for a way to tabulate the pandas value counts per column into a summary table. I've found a way to acheive what I want but Pandas must have a better way to do this.
the data frame has mutliple test steps with 'P' 'F' or ' ' data for each test run.
step1 = list('PPFP PFP ')
step2 = list('PFFP FPF')
step3 = list(' PPPFFPFP')
step4 = list(' PPFPF PP')
df = pd.DataFrame({'step1': step1,'step2':step2, 'step3':step3,'step4':step4})
step1 step2 step3 step4
0 P P
1 P F P P
2 F F P P
3 P P P F
4 F P
5 P F F
6 F F P
7 P P F P
8 F P P
the output I am looking for is:
step1 step2 step3 step4
P 5 3 5 5
F 2 4 3 2
2 2 1 2
I've been able to solve this by looping through each column, doing value_counts then appending it to an output array but this seems clunky.
df2 = pd.DataFrame(index=['P', 'F', ' '])
for i in range(len(df.columns)):
df2[df.columns.tolist()[i]] = df.iloc[:, i].value_counts(dropna=False)
Is there a more elegant way to accomplish this?
Upvotes: 1
Views: 1953
Reputation: 863611
Use DataFrame.apply
with value_counts
:
df2 = df.apply(pd.value_counts)
print (df2)
step1 step2 step3 step4
2 2 1 2
F 2 4 3 2
P 5 3 5 5
For change order of rows add DataFrame.reindex
by list of all values in index in list in expected order:
df2 = df.apply(pd.value_counts).reindex([' ','P','F'])
print (df2)
step1 step2 step3 step4
2 2 1 2
P 5 3 5 5
F 2 4 3 2
Upvotes: 4