Reputation: 15
I'm trying to summarize rows with the same ID to one row including values from all columns including NAs.
My dataset looks like this:
# id Date Red Green Blue Black
#1 1 01-01-21 TRUE NA NA NA
#2 1 01-01-21 NA TRUE NA NA
#3 2 05-01-21 TRUE NA NA NA
#4 3 07-01-21 NA NA TRUE NA
#5 3 07-01-21 TRUE NA NA NA
#6 3 07-01-21 NA NA NA TRUE
My result should look something like this
# id Date Red Green Blue Black
#1 1 01-01-21 TRUE TRUE NA NA
#3 2 05-01-21 TRUE NA NA NA
#4 3 07-01-21 TRUE NA TRUE TRUE
I already tried the following, but "TRUE" is replaced with "1" and NAs are replaced with "0". Additionally, my "Date" column is erased.
test <- test %>%
group_by(id) %>%
summarize(across(Red:Black, ~sum(.x, na.rm=TRUE)))
Thank you for your help!
Upvotes: 0
Views: 680
Reputation: 8880
library(tidyverse)
df <-
structure(
list(
id = c(1L, 1L, 2L, 3L, 3L, 3L),
Date = c(
"01.01.2021",
"01.01.2021",
"05.01.2021",
"07.01.2021",
"07.01.2021",
"07.01.2021"
),
Red = c(TRUE, NA, TRUE, NA, TRUE, NA),
Green = c(NA, TRUE, NA, NA, NA, NA),
Blue = c(NA, NA, NA, TRUE, NA, NA),
Black = c(NA, NA, NA, NA, NA, TRUE)
),
class = "data.frame",
row.names = c(NA, -6L))
df %>%
pivot_longer(-c(id, Date), values_drop_na = T) %>%
pivot_wider(id_cols = c(id, Date), names_from = name, values_from = value)
#> # A tibble: 3 x 6
#> id Date Red Green Blue Black
#> <int> <chr> <lgl> <lgl> <lgl> <lgl>
#> 1 1 01.01.2021 TRUE TRUE NA NA
#> 2 2 05.01.2021 TRUE NA NA NA
#> 3 3 07.01.2021 TRUE NA TRUE TRUE
Created on 2021-08-11 by the reprex package (v2.0.1)
or
library(tidyverse)
df %>%
group_by(id) %>%
fill(everything(), .direction = "downup") %>%
distinct() %>%
ungroup()
#> # A tibble: 3 x 6
#> id Date Red Green Blue Black
#> <int> <chr> <lgl> <lgl> <lgl> <lgl>
#> 1 1 01.01.2021 TRUE TRUE NA NA
#> 2 2 05.01.2021 TRUE NA NA NA
#> 3 3 07.01.2021 TRUE NA TRUE TRUE
Created on 2021-08-11 by the reprex package (v2.0.1)
Upvotes: 1