Reputation: 205
I have a Df which looks like this:
tests Machine results
111 A OK
111 A OK
111 A OK
111 A NOK
111 B OK
222 A OK
333 A OK
333 B OK
444 A OK
222 A NOK
222 A OK
111 B OK
I want to have a matrix/pivot table where rows should be tests and columns should be machines and values should be ratio/percentage of tests that are OK to the overall count of tests for each machine.
results should be:
OK/total cnt A B
111 75% 100% #test 111 passed 3 out of 4 tests in machine A and all tests in B
222 50% 0% #test 222 passed 1 out of 2 tests in machine A and 0 tests prfrmd in B
333 100% 100% #test 333 passed all tests in A and B machines
Please help me on this.
Upvotes: 1
Views: 1146
Reputation: 28709
Get column for rows that are equal to OK
, then run a crosstab to get your average
df = df.assign(passed=df.results.eq("OK"))
(
pd.crosstab(df.tests, df.Machine, df.passed, aggfunc="mean")
.rename_axis(columns=None, index="OK/total cnt")
.mul(100, fill_value=0)
)
A B
OK/total cnt
111 75.000000 100.0
222 66.666667 0.0
333 100.000000 100.0
444 100.000000 0.0
Upvotes: 2
Reputation: 863166
Create new column by DataFrame.assign
with compare for OK
by Series.eq
, then pivoting by default mean
method in DataFrame.pivot_table
, multiple by 100
and last if necessary convert index to column
:
df = (df.assign(res = df['results'].eq('OK'))
.pivot_table(index='tests',
columns='Machine',
values='res',
fill_value=0)
.mul(100)
.rename_axis(index='OK/total cnt', columns=None)
.reset_index())
print (df)
OK/total cnt A B
0 111 75.000000 100
1 222 66.666667 0
2 333 100.000000 100
3 444 100.000000 0
Upvotes: 1