Reputation: 671
I am trying to combine multiple columns into a single cell for each row and then remove missing values.
Sample data:
df <- data.frame(a = c("a", "b", "c", "d"),
                 b = c(NA, "a", "b", "c"),
                 c = c("a", "b", "e", "g"))
Attempt:
df %>% rowwise() %>%
mutate(collapse=as.character(paste(a,b,c, collapse=",")),
collapse_nona=na.omit(collapse))
Output:
# A tibble: 4 x 5
a b c collapse collapse_nona
* <fct> <fct> <fct> <chr> <chr>
1 a NA a a NA a,b a b,c b e,d c… a NA a,b a b,c b e,d …
2 b a b a NA a,b a b,c b e,d c… a NA a,b a b,c b e,d …
3 c b e a NA a,b a b,c b e,d c… a NA a,b a b,c b e,d …
4 d c g a NA a,b a b,c b e,d c… a NA a,b a b,c b e,d …
1) I am not successfully creating cells with values for each row (the whole column appears in collapse).
2) Cells in the collapse column do not behave like a vector.
Desired output
a b c collapse collapse_nona
* <fct> <fct> <fct> <chr> <chr>
1 a NA a a NA a a a
2 b a b b a b b a b
3 c b e c b e c b e
4 d c g d c g d c g
Thank you
Upvotes: 2
Views: 2250
Reputation: 361
I think this does it. You could play around with the sep argument in str_c.
library(dplyr)
library(stringr)
df %>%
  mutate(collapse = str_c(str_replace_na(a), str_replace_na(b), str_replace_na(c), sep = " "),
         collapse_nona = str_c(str_replace_na(a, ""), str_replace_na(b, ""), str_replace_na(c, ""), sep = " "))
a b c collapse collapse_nona
1 a <NA> a a NA a a a
2 b a b b a b b a b
3 c b e c b e c b e
4 d c g d c g d c g
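One thing to watch with replacing NA by an empty string: the separator is still inserted around it, so row 1 ends up with two spaces between the values even though a printed table may hide that. Below is a minimal base R sketch of the effect and one way to tidy it up. The helper blank_na and the two-row vectors are made up for illustration and are not part of the answer above.

```r
# Two-row stand-in for the question's data (values are illustrative)
a  <- c("a", "b")
b  <- c(NA, "a")
cc <- c("a", "b")

# blank_na is a hypothetical helper for this sketch: NA becomes ""
blank_na <- function(x) ifelse(is.na(x), "", x)

# The separator is still inserted around the empty string
joined <- paste(blank_na(a), blank_na(b), blank_na(cc), sep = " ")

# Collapse runs of whitespace to one space, then trim the ends
squished <- trimws(gsub("\\s+", " ", joined))
```

If you are already using stringr, str_squish does a similar cleanup in one call.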
Upvotes: 0
Reputation: 2698
I think the core issue is that you don't want collapse, you want sep. With sep, paste already works element-wise across the columns, so the rowwise calculation is unnecessary. Also, paste turns NA into the string "NA", so you cannot remove it with na.omit.
library(dplyr)
df %>%
  mutate(collapse = paste(a, b, c, sep = " "),
         collapse_nona = gsub("NA", "", collapse))
a b c collapse collapse_nona
1 a <NA> a a NA a a a
2 b a b b a b b a b
3 c b e c b e c b e
4 d c g d c g d c g
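To make the sep versus collapse distinction concrete, here is a small base R sketch on the question's data. It assumes character columns (stringsAsFactors = FALSE is added here for that reason), and the variable names by_row, one_string and still_there are only for illustration.

```r
df <- data.frame(a = c("a", "b", "c", "d"),
                 b = c(NA, "a", "b", "c"),
                 c = c("a", "b", "e", "g"),
                 stringsAsFactors = FALSE)

# sep joins element-wise, giving one string per row
by_row <- paste(df$a, df$b, df$c, sep = " ")

# collapse additionally glues those row strings into a single string
one_string <- paste(df$a, df$b, df$c, collapse = ",")

# paste has already turned NA into the text "NA",
# so na.omit finds nothing to drop
still_there <- na.omit(by_row)
```

Here by_row has four elements, one per row, while one_string is the single long value that filled every cell in the question's output.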
Upvotes: 2
Reputation: 887118
With unite, there is an na.rm option, which is FALSE by default:
library(tidyr)
library(dplyr)
df %>%
  mutate_all(as.character) %>%
  unite(collapse, a, b, c, remove = FALSE, sep = " ") %>%
  unite(collapse_nona, a, b, c, remove = FALSE, sep = " ", na.rm = TRUE) %>%
  select(names(df), everything())
# a b c collapse collapse_nona
#1 a <NA> a a NA a a a
#2 b a b b a b b a b
#3 c b e c b e c b e
#4 d c g d c g d c g
Or with paste and str_remove_all (from stringr). Note that paste/str_c are vectorized, so there is no need to loop over each row with rowwise:
library(stringr)
df %>%
  mutate(collapse = paste(a, b, c),
         collapse_nona = str_remove_all(collapse, "\\sNA|NA\\s"))
# a b c collapse collapse_nona
#1 a <NA> a a NA a a a
#2 b a b b a b b a b
#3 c b e c b e c b e
#4 d c g d c g d c g
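A caveat on scrubbing "NA" out of the pasted string: the pattern cannot tell a missing value from real text that happens to contain NA next to a space. A minimal base R sketch with made-up values (first and second are not from the question) shows the effect using the same pattern as above:

```r
# Illustrative values: "RNA" is genuine data, NA is a missing value
first  <- c("a", "RNA")
second <- c(NA, "b")

pasted <- paste(first, second)

# Same pattern as in the str_remove_all call above
scrubbed <- gsub("\\sNA|NA\\s", "", pasted)
```

The first element is cleaned as intended, but the second loses part of its real content. If your data can contain such values, the approaches that drop NA before joining (unite with na.rm = TRUE, or the pmap variant) avoid the problem.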
Another option is pmap to loop over each row, remove the NA elements with na.omit, and then paste or str_c (from stringr):
library(dplyr)
library(stringr)
library(purrr)
df %>%
  mutate_all(as.character) %>%
  mutate(collapse_nona = pmap_chr(., ~ c(...) %>%
                                    na.omit %>%
                                    str_c(collapse = " ")))
# a b c collapse_nona
#1 a <NA> a a a
#2 b a b b a b
#3 c b e c b e
#4 d c g d c g
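If you would rather not pull in purrr, roughly the same row-by-row idea can be sketched in base R. This assumes every column is character (hence stringsAsFactors = FALSE), and the names m and collapse_nona are just for this example. Within a single row, collapse is the right argument, because each row's values are being glued into one string.

```r
df <- data.frame(a = c("a", "b", "c", "d"),
                 b = c(NA, "a", "b", "c"),
                 c = c("a", "b", "e", "g"),
                 stringsAsFactors = FALSE)

# Work on a character matrix so each row is a plain character vector
m <- as.matrix(df)

# For each row: drop the NA entries, then collapse the rest into one string
collapse_nona <- unname(apply(m, 1, function(r) {
  paste(r[!is.na(r)], collapse = " ")
}))

df$collapse_nona <- collapse_nona
```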
Upvotes: 4