Reputation: 600
I have this very simple dataset.
id <- c(12, 13, 26, 45, 55, 66)
group <- c("one", "one", "two", "two", "three", "three")
txt <- c("quick", "brown", NA, "fox", NA, NA)
df <- data.frame(id, group, txt)
df
> df
id group txt
1 12 one quick
2 13 one brown
3 26 two <NA>
4 45 two fox
5 55 three <NA>
6 66 three <NA>
I want to aggregate by group. I am trying this,
df |>
dplyr::group_by(group) |>
dplyr::summarise(txt = paste(txt, collapse = ", "))
# A tibble: 3 × 2
group txt
<chr> <chr>
1 one quick, brown
2 three NA, NA
3 two NA, fox
As we see the NA
are outed as characters. Secondly, three comes after one in the group column. I want to have something like this,
group txt
1 one quick, brown
2 two fox
3 three <NA>
Any idea how to fix this simple problem?
Upvotes: 1
Views: 71
Reputation: 270045
Check if all values are NA and if so return NA and otherwise remove the NA's and use toString on the remaining elements.
library(dplyr)
df %>%
summarize(txt = if (all(is.na(txt))) NA else toString(na.omit(txt)), .by=group)
A variation of that approach which is slightly more compact is to compute toString as above but then use Find to return it or NULL if it is the empty string. ... %||% NA
then converts NULL to NA.
df %>%
summarize(txt = Find(nzchar, toString(na.omit(txt))) %||% NA, .by = group)
Upvotes: 0
Reputation: 7979
Base R
aggregate(txt~group, df, na.action = 'na.pass',
FUN = \(i) {i = na.omit(unique(i)); if (length(i)) i else NA})[
order(unique(df$group)), ]
group txt
1 one quick, brown
3 two fox
2 three NA
Upvotes: 2
Reputation: 389205
You may either handle it yourself by code -
library(dplyr)
df |>
dplyr::summarise(txt = if(all(is.na(txt))) NA else toString(na.omit(txt)), .by = group)
group txt
#1 one quick, brown
#2 two fox
#3 three <NA>
Or use hablar::max_
which does the same thing.
df |>
dplyr::summarise(txt = toString(hablar::max_(txt)), .by = group)
Upvotes: 2