Reputation: 140
I have the following data frame, and I want to count the occurrences in each row (identified by the first column) and append the result as another column, say "freq", to the data frame:
df:
gene a b c
abc 1 NA 1
bca NA 1 1
cba 1 2 1
My real df is much bigger; this is only a small example, so the solution needs to scale.
The desired data frame is:
gene a b c freq
abc 1 NA 1 2
bca NA 1 1 2
cba 1 2 1 3
The code I have tried is:
g <- df %>% mutate(numtwos = rowSums(. > 0))
or
df$freq <- apply(df , 1, function(x) length(which(x>0)))
But it is not working: even when a row should have, for example, 150 repetitions, I get only 2 for every row.
Any help or other point of view is welcome!
Thanks
Upvotes: 1
Views: 40
Reputation: 886938
We can first convert the "Na" strings to NA, then take the row sums:
library(dplyr)
df %>%
mutate_at(vars(a:c), ~ as.numeric(na_if(., "Na"))) %>%
mutate(freq = rowSums(select(., a:c), na.rm = TRUE))
# gene a b c freq
#1 abc 1 NA 1 2
#2 bca NA 1 1 2
#3 cba 1 1 1 3
Here, the values are all 1s, so the sum is the same as the count of non-NA values:
df %>%
mutate_at(vars(a:c), ~ as.numeric(na_if(., "Na"))) %>%
mutate(freq = rowSums(!is.na(select(., a:c))))
Data:
df <- structure(list(gene = c("abc", "bca", "cba"),
    a = c("1", "Na", "1"),
    b = c("Na", "1", "1"),
    c = c(1L, 1L, 1L)),
  class = "data.frame", row.names = c(NA, -3L))
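If the values are not all 1s (as with the 2 in the question's cba row), the sum and the count of non-NA values differ, and the desired output matches the count. A minimal base R sketch, assuming the missing entries are real NA values rather than "Na" strings and that every column except gene should be counted (the name value_cols is only illustrative):

```r
# Data as shown in the question, with real NA values (assumption)
df <- data.frame(
  gene = c("abc", "bca", "cba"),
  a = c(1, NA, 1),
  b = c(NA, 1, 2),
  c = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Every column except the identifier column is counted
value_cols <- setdiff(names(df), "gene")

# !is.na() gives a logical matrix; rowSums() counts the TRUEs per row
df$freq <- rowSums(!is.na(df[value_cols]))

print(df)
```

Because rowSums() works on the whole matrix at once, this should scale to many rows better than a row-wise apply().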
Upvotes: 2
Reputation: 53
I haven't used R for a while, so I won't paste in the code, but you can create a new df by grouping the initial one by gene, and then merge/join it to your initial df in another line of code.
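A minimal sketch of that group-and-join idea in base R, assuming "freq" means the number of rows per gene. The sample data (with a repeated gene) and the name gene_counts are made up for illustration:

```r
# Hypothetical sample data in which the gene "abc" appears twice
df <- data.frame(
  gene = c("abc", "bca", "abc", "cba"),
  a = c(1, NA, 1, 1),
  stringsAsFactors = FALSE
)

# Step 1: build a small data frame with one row per gene and its row count
gene_counts <- as.data.frame(table(gene = df$gene),
                             responseName = "freq",
                             stringsAsFactors = FALSE)

# Step 2: join the counts back onto the original rows by gene
result <- merge(df, gene_counts, by = "gene")

print(result)
```

Note that merge() sorts the result by the key column, so the row order may differ from the original df.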
Upvotes: 0