Reputation: 159
I have a data frame with many rows and columns, for example:
treatment gene1 gene2 gene3 …
A 0 3 0 …
A 0 0 0 …
A 0 0 0 …
A 1 1 0 …
A 0 0 0 …
B 0 1 1 …
B 0 5 2 …
B 0 0 3 …
B 0 0 0 …
… … … …
I would like to build a new data frame based on this rule: if all values of a gene within a treatment are 0, the value of that gene for that treatment is 0 (for example, gene1 for treatment B); otherwise it is 1 (for example, gene1 for treatment A). The new data frame would look like this:
treatment gene1 gene2 gene3 …
A 1 1 0 …
B 0 1 1 …
… … … …
Thank you very much for your help.
Upvotes: 1
Views: 44
Reputation: 887851
An option with base R
+(rowsum(df[-1], df$treatment) > 0)
# gene1 gene2 gene3
#A 1 1 0
#B 0 1 1
Data:
df <- structure(list(treatment = c("A", "A", "A", "A", "A", "B", "B",
"B", "B"), gene1 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L), gene2 = c(3L,
0L, 0L, 1L, 0L, 1L, 5L, 0L, 0L), gene3 = c(0L, 0L, 0L, 0L, 0L,
1L, 2L, 3L, 0L)), class = "data.frame", row.names = c(NA, -9L
))
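The one-liner above packs three steps into a single expression. Here is a sketch that unpacks it, assuming the same example data and assuming all gene values are non-negative counts (with negative values, a group sum could be 0 even though some entries are nonzero). The intermediate names `sums`, `present`, and `result` are only illustrative.

```r
# Same example data as above, rebuilt so the snippet runs on its own.
df <- data.frame(
  treatment = rep(c("A", "B"), times = c(5, 4)),
  gene1 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L),
  gene2 = c(3L, 0L, 0L, 1L, 0L, 1L, 5L, 0L, 0L),
  gene3 = c(0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L)
)

# Step 1: sum every gene column within each treatment group.
sums <- rowsum(df[-1], df$treatment)

# Step 2: a positive sum means at least one nonzero value in the group
# (valid only under the non-negative assumption stated above).
present <- sums > 0

# Step 3: unary plus turns the logical TRUE/FALSE matrix into 1/0.
result <- +present
result
```

The result is a matrix with the treatments as row names rather than a `treatment` column, which matches the commented output shown in the answer.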
Upvotes: 0
Reputation: 40171
With dplyr, you can do:
library(dplyr)  # provides %>%, group_by() and summarise_all()

df %>%
  group_by(treatment) %>%
  summarise_all(list(~ as.integer(any(.))))
treatment gene1 gene2 gene3
<fct> <int> <int> <int>
1 A 1 1 0
2 B 0 1 1
The same with base R:
aggregate(. ~ treatment, FUN = function(x) as.integer(any(x)), data = df)
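Both `any()` calls above rely on integer columns being coerced to logical (0 becomes FALSE, anything else TRUE). If the gene columns happen to be stored as doubles, `any(x)` emits a coercion warning, so comparing against 0 explicitly may be a safer variant. A minimal sketch, assuming the example data from the question rebuilt as doubles; the name `res` is only illustrative.

```r
# Example data from the question, stored as doubles instead of integers.
df <- data.frame(
  treatment = rep(c("A", "B"), times = c(5, 4)),
  gene1 = c(0, 0, 0, 1, 0, 0, 0, 0, 0),
  gene2 = c(3, 0, 0, 1, 0, 1, 5, 0, 0),
  gene3 = c(0, 0, 0, 0, 0, 1, 2, 3, 0)
)

# x != 0 is already logical, so any() needs no coercion.
res <- aggregate(. ~ treatment, data = df,
                 FUN = function(x) as.integer(any(x != 0)))
res
```

Unlike a sum-based check, this version tests each value directly, so it also behaves as intended if a column ever contains negative numbers.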
Upvotes: 1