dj_paige

Reputation: 353

Using tapply with a function that has multiple outputs

I have some data where we apply multiple tests (called parameters) to different "die", and each "die" can either pass or fail a given test.

Here is a small portion of a dataframe named alldie:

    die                        parameter firstfailure
1     1 Resistance_Test DevID (Ohms) 428        FALSE
2     1         Diode_Test SUBLo (V) 353        FALSE
3     1        Gate_Test V1_WELL (V) 361        FALSE
4     1        Gate_Test V2_WELL (V) 360        FALSE
5     1        Gate_Test V3_WELL (V) 361        FALSE
6     1  Class_Test Cluster Class2 (#) 6        FALSE
7     1   Class_Test Column Class1 (#) 2         TRUE
8     1  Class_Test Cluster Class1 (#) 2           NA

If I provided the full dataset, you'd see multiple die (numbered 1,2,3,...), many more different parameters, and under firstfailure, you would see FALSE (die passed) or TRUE (die failed) and occasionally NA if the test wasn't performed.

I thought I could compute the number of die going through each test (parameter), the number that passed, and the proportion that passed, by writing a function and then using tapply:

ly <- function(data) {
  ndie <- sum(!is.na(data))
  npass <- ndie - sum(data,na.rm = TRUE)
  yield <- npass / ndie
  c(npass,ndie,yield)
}
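
As a quick sanity check of what ly returns, here is a minimal sketch on a made-up vector (the input values are illustrative only, not taken from the real data). The function is repeated so the snippet runs on its own:

```r
# Same function as above, repeated so this snippet is self-contained
ly <- function(data) {
  ndie <- sum(!is.na(data))                # die that were actually tested
  npass <- ndie - sum(data, na.rm = TRUE)  # tested die that did not fail
  yield <- npass / ndie
  c(npass, ndie, yield)
}

# Hypothetical input: two passes, one failure, one test not performed
res <- ly(c(FALSE, FALSE, TRUE, NA))
print(res)
```

The NA entry is excluded from ndie, so the yield is computed over tested die only.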

This does the calculations I want, but produces output that is difficult to use:

lim_yld <- tapply(alldie$firstfailure, alldie$parameter, ly)

then lim_yld looks like (first few rows only, and also tapply puts the parameters in alphabetical order)

$`Class_Test Cluster Class1 (#) 2`
[1] 76 76  1

$`Class_Test Cluster Class2 (#) 6`
[1] 89 89  1

$`Class_Test Column Class1 (#) 2`
[1] 76.0000000 89.0000000  0.8539326

Questions:

  1. How can I get the data into a dataframe that is more readable? Something like this:

                           Parameter Npass Ndie Proportion
     Class_Test Cluster Class1 (#) 2    76   76  1.0000000
     Class_Test Cluster Class2 (#) 6    89   89  1.0000000
      Class_Test Column Class1 (#) 2    76   89  0.8539326

  2. How can I sort the parameters in this dataframe in the original order?

Thanks!

Upvotes: 1

Views: 241

Answers (1)

Dave2e

Reputation: 24069

How about this as a solution? Take the result of the tapply and convert it to a dataframe. Then add the column headings and parameter names:

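# Flatten the list and refill it row by row: one row of 3 values per parameter
df <- as.data.frame(matrix(unlist(lim_yld), ncol = 3, byrow = TRUE))
# Column names follow the order returned by ly()
names(df) <- c("npass", "ndie", "yield")
# Prepend the parameter names carried by the tapply result
df <- cbind(parameter = names(lim_yld), df)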
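
A possible variation, sketched here on toy data: if the function returns a named vector, do.call(rbind, ...) stacks the per-parameter results into a matrix and the column names come along for free. The toy alldie below and its shortened parameter names are assumptions made for illustration, and ly is modified from the question only by naming its return values.

```r
# Toy stand-in for alldie (hypothetical values and parameter names)
alldie <- data.frame(
  die = c(1, 1, 1, 2, 2, 2),
  parameter = c("Gate", "Diode", "Class", "Gate", "Diode", "Class"),
  firstfailure = c(FALSE, FALSE, TRUE, FALSE, TRUE, NA),
  stringsAsFactors = FALSE
)

# Variant of ly() that names its outputs
ly <- function(data) {
  ndie <- sum(!is.na(data))
  npass <- ndie - sum(data, na.rm = TRUE)
  c(npass = npass, ndie = ndie, yield = npass / ndie)
}

lim_yld <- tapply(alldie$firstfailure, alldie$parameter, ly)

# Stack the per-parameter vectors; vector names become column names,
# parameter names become row names
m <- do.call(rbind, lim_yld)
df <- data.frame(parameter = rownames(m), m, stringsAsFactors = FALSE)
rownames(df) <- NULL
print(df)
```

This avoids hard-coding the column names separately from the function that produces the values.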

As the comments above mention, this is not very generic with respect to the column names, but it does align with what your function returns. It appears that tapply is returning the list in reverse order, but just in case, this should work:

# Reorder rows by where each parameter first appears in alldie
df <- df[order(match(df$parameter, unique(alldie$parameter))), ]
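
Another option, rather than reordering afterwards, is to fix the grouping order before calling tapply. tapply returns groups in factor-level order, so a factor whose levels follow first appearance keeps the original order. The sketch below uses made-up data and a simplified summary function (count of tested die only); the names param and n_tested are illustrative.

```r
# Toy stand-in for alldie (hypothetical values and parameter names)
alldie <- data.frame(
  parameter = c("Gate", "Diode", "Class", "Gate", "Diode", "Class"),
  firstfailure = c(FALSE, FALSE, TRUE, FALSE, TRUE, NA),
  stringsAsFactors = FALSE
)

# Levels in order of first appearance instead of alphabetical order
param <- factor(alldie$parameter, levels = unique(alldie$parameter))

# Simplified summary: number of die actually tested per parameter
n_tested <- tapply(alldie$firstfailure, param, function(x) sum(!is.na(x)))
print(names(n_tested))
```

The same param factor could be passed to tapply together with ly, in which case lim_yld would already be in the original order.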

Upvotes: 1
