How to add a row based on the other rows?

Question

I have a data frame with many rows and columns, for example:

treatment   gene1   gene2   gene3   …
A   0   3   0   …
A   0   0   0   …
A   0   0   0   …
A   1   1   0   …
A   0   0   0   …
B   0   1   1   …
B   0   5   2   …
B   0   0   3   …
B   0   0   0   …
…   …   …   …   …

I would like to build a new data frame using the following rule: if all the values of a gene within a treatment are 0, the value of that gene for that treatment is 0 (for example, gene1 for treatment B); otherwise it is 1 (for example, gene1 for treatment A). The new data frame would look like this:

treatment   gene1   gene2   gene3   …
A   1   1   0   …
B   0   1   1   …
…   …   …   …   …

Thank you very much for your help.

tmfmnk · Accepted Answer

With dplyr, you can do:

library(dplyr)

df %>%
 group_by(treatment) %>%
 summarise_all(~ as.integer(any(. != 0)))  # 1 if any value in the group is non-zero

  treatment gene1 gene2 gene3
1 A             1     1     0
2 B             0     1     1

The same with base R:

aggregate(. ~ treatment, FUN = function(x) as.integer(any(x != 0)), data = df)
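
For a quick reproducible check, here is a minimal base R sketch. The data frame is an assumption: it contains only the rows and the three gene columns visible in the question (the columns and rows elided with "…" are left out). The comparison `x != 0` is written out so that `any()` receives a logical vector instead of coercing numbers.

```r
# Sample data: only the rows and gene columns shown in the question
# (anything elided with "…" is not reproduced here).
df <- data.frame(
  treatment = c(rep("A", 5), rep("B", 4)),
  gene1 = c(0, 0, 0, 1, 0,  0, 0, 0, 0),
  gene2 = c(3, 0, 0, 1, 0,  1, 5, 0, 0),
  gene3 = c(0, 0, 0, 0, 0,  1, 2, 3, 0)
)

# For each treatment and each gene: 1 if any value is non-zero, otherwise 0.
res <- aggregate(. ~ treatment, data = df,
                 FUN = function(x) as.integer(any(x != 0)))
print(res)
```

On this sample, `res` should have one row per treatment, matching the table requested in the question. Keep in mind that the formula interface of `aggregate()` drops rows containing `NA` by default (`na.action = na.omit`), so real data with missing values may need separate handling.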
