Suicide Bunny

Reputation: 938

How to get item frequency in a row of an R data frame?

I have a data frame that looks like this:

GeneID   person1  person2 ... person100  homo1 homo2 heter homo1count homo2count hetercount
1        AA       AC           AA         AA   CC    AC     25         50        25
2  .....
3  .....

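How can I compute the counts (25, 50, 25 in the example above), i.e. for each row, the number of persons whose genotype matches homo1, homo2 and heter?

I tried to use apply like this: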

g <- function(df, AA) {
  x = table(df)
  AA = x[which(names(x) == df$homo1)]
  }
x = apply(temp,1,g)

But it didn't work: df$homo1 is always a list.

Thanks!

Upvotes: 0

Views: 104

Answers (1)

Neal Fultz

Reputation: 9696

This is easier if you pivot to long format first, then aggregate. Something like this:

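library(reshape2)  # melt()
library(dplyr)     # group_by(), summarise(), %>%

# Simulated data: n genes, 4 persons, genotypes drawn at random
g <- c('AC', 'AA', 'CC')
n <- 30

df <- data.frame(gene_id=1:n, person1=sample(g,n,replace=TRUE),
                               person2=sample(g,n,replace=TRUE),
                               person3=sample(g,n,replace=TRUE),
                               person4=sample(g,n,replace=TRUE),
                               homo1=sample(g,n,replace=TRUE),
                               homo2=sample(g,n,replace=TRUE),
               stringsAsFactors=FALSE)

# Pivot the person columns to long format (one row per gene/person pair),
# count the values equal to homo1 and homo2 within each gene,
# then merge the counts back onto the original wide data frame.
df %>% melt(c("gene_id", "homo1", "homo2")) %>%
       group_by(gene_id) %>%
       summarise(homo1count=sum(homo1==value),
                 homo2count=sum(homo2==value) ) %>%
       merge(x=df)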

EDIT: sample output (first 8 rows; the exact values differ between runs because the data is sampled at random):

   gene_id person1 person2 person3 person4 homo1 homo2 homo1count homo2count
1        1      AA      CC      AC      AA    AC    CC          1          1
2        2      AC      AA      CC      CC    CC    AA          2          1
3        3      AC      CC      CC      AA    CC    AA          2          1
4        4      AC      AC      AC      AA    AA    AA          1          1
5        5      CC      AC      AA      AC    AA    AC          1          2
6        6      CC      AC      CC      CC    AA    AA          0          0
7        7      AA      AA      AC      AA    CC    CC          0          0
8        8      AA      AC      AA      CC    AC    CC          1          1
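
If you would rather not reshape at all, the same counts can probably be had in base R by comparing the person columns against each reference column and summing per row. This is only a sketch under some assumptions: the toy data below is made up, the person columns are assumed to all start with "person", and count_matches is a hypothetical helper name, not something from a package. It also adds the heter count from the question, which the example above leaves out.

```r
# Made-up toy data with the same layout as the question:
# one row per gene, one column per person, plus reference genotype columns.
df <- data.frame(
  gene_id = 1:3,
  person1 = c("AA", "AC", "CC"),
  person2 = c("AC", "AC", "CC"),
  person3 = c("AA", "CC", "AC"),
  person4 = c("CC", "AC", "CC"),
  homo1   = c("AA", "AA", "CC"),
  homo2   = c("CC", "CC", "AA"),
  heter   = c("AC", "AC", "AC"),
  stringsAsFactors = FALSE
)

# Assumption: every genotype column name starts with "person".
person_cols <- grep("^person", names(df), value = TRUE)
genotypes <- as.matrix(df[person_cols])

# Hypothetical helper. Comparing a matrix with a vector of length nrow()
# recycles the vector down each column, so every cell is compared with
# the reference value of its own row; rowSums() then counts the matches.
count_matches <- function(ref) rowSums(genotypes == ref)

df$homo1count <- count_matches(df$homo1)
df$homo2count <- count_matches(df$homo2)
df$hetercount <- count_matches(df$heter)

print(df)
```

For the toy data this gives homo1count 2, 0, 3, homo2count 1, 1, 0 and hetercount 1, 3, 1. It also sidesteps the problem with apply in the question: apply passes each row to the function as a plain character vector, so $ cannot be used on it.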

Upvotes: 2
