Reputation: 850
Sample dataframe:
df <- data.frame(c('ab','cd','..'),c('ab','..','ab'),c('..','cd','cd'))
I'm trying to get the proportion of ab's for each column and each row, excluding ..'s from both the numerator and the denominator.
Proportion of ab's = (number of ab's) / (number of symbols other than ..)
For example, for column 1 (values ab, cd, and ..), the proportion of ab's is 0.5.
What I have so far:
fun <- function(x) {
  length(which(x == 'ab')) / length(which(x != '..'))
}
byColumn <- sapply(df[, 1:ncol(df)], fun)
byRow <- sapply(df[1:nrow(df),],fun)
Expected result:
byColumn <- c(0.5,1.0,0.0)
byRow <- c(1.0,0.0,0.5)
Actual result:
byColumn <- c(0.5,1.0,0.0)
byRow <- c(0.5,1.0,0.0)
But byRow isn't working... it seems to be the same output as byColumn?
Upvotes: 1
Views: 1691
Reputation: 13570
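You can keep your function. sapply iterates over the columns of a data frame, so selecting rows with df[1:nrow(df), ] still applies fun column by column. For byRow, use the same code that works for byColumn, but transpose the data frame first: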
byColumn <- sapply(df[, 1:ncol(df)], fun)
byRow <- sapply(as.data.frame(t(df)), fun)  # after t(), the columns are the original rows
Output:
# By column
col1 col2 col3
0.5 1.0 0.0
# By row
V1 V2 V3
1.0 0.0 0.5
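As an alternative to transposing, base R's apply takes a MARGIN argument (1 for rows, 2 for columns), so the same function can be run in either direction. This is only a minimal sketch: the column names c1, c2, c3 are made up for the example, stringsAsFactors = FALSE is set so the values stay character on older R versions, and fun is rewritten with sum() as an equivalent of the length(which(...)) form.

```r
# Sample data from the question; column names c1..c3 are hypothetical
df <- data.frame(c1 = c('ab', 'cd', '..'),
                 c2 = c('ab', '..', 'ab'),
                 c3 = c('..', 'cd', 'cd'),
                 stringsAsFactors = FALSE)

# Proportion of 'ab' among the entries that are not '..'
fun <- function(x) sum(x == 'ab') / sum(x != '..')

# MARGIN = 2 applies fun to each column, MARGIN = 1 to each row
byColumn <- unname(apply(df, 2, fun))  # 0.5 1.0 0.0
byRow    <- unname(apply(df, 1, fun))  # 1.0 0.0 0.5
```

Note that apply converts the data frame to a matrix first, which is fine here because every column holds the same type.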
Upvotes: 1
Reputation: 92282
I would define the function as follows (you can play around with the settings):
Propfunc <- function(x, dim = "col", equal = "ab", ignore = "..") {
  # proportion of `equal` among the entries that are not `ignore`
  if (dim == "col") return(unname(colSums(x == equal) / colSums(x != ignore)))
  if (dim == "row") return(rowSums(x == equal) / rowSums(x != ignore))
  stop("Unknown dim")
}
Propfunc(df)
## [1] 0.5 1.0 0.0
Propfunc(df, dim = "row")
## [1] 1.0 0.0 0.5
Propfunc(df, dim = "blabla")
## Error in Propfunc(df, dim = "blabla") : Unknown dim
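If you would rather have R validate dim for you, one possible variant uses match.arg, which raises an error for any value not in the allowed set. This is a rough sketch and not a replacement for the function above: the name prop_by is made up, the sample data is repeated so the snippet runs on its own, and as.matrix is used on the assumption that all columns hold the same type.

```r
# Sample data from the question, repeated so the snippet is self-contained
df <- data.frame(c('ab', 'cd', '..'),
                 c('ab', '..', 'ab'),
                 c('..', 'cd', 'cd'),
                 stringsAsFactors = FALSE)

# prop_by is a hypothetical name; dim is checked by match.arg
prop_by <- function(x, dim = c("col", "row"), equal = "ab", ignore = "..") {
  dim <- match.arg(dim)                    # errors on anything but "col"/"row"
  m <- as.matrix(x)                        # assumes columns share one type
  sums <- if (dim == "col") colSums else rowSums
  unname(sums(m == equal) / sums(m != ignore))
}

prop_by(df)               # 0.5 1.0 0.0
prop_by(df, dim = "row")  # 1.0 0.0 0.5
```

A row or column that contains only the ignored symbol gives 0/0, so the result there is NaN.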
Upvotes: 3