Reputation:
I have this vector of eligible columns for my script:
cols <- c("country", "phone", "car")
And this data frame:
test <-
data.frame(
id = c(1, 2, 3),
country = c("us", NA, "uk"),
phone = c(1, 1, NA),
car = c(NA, 0, 1)
)
The goal is to create a new column, res, whose value depends only on the columns listed in the cols variable. If all of those columns are NA for a given id, res should be the string "nothing". If only some of them are non-NA, res should list the names of those columns. If none of them is NA, res should be the string "all".
result <-
data.frame(
id = c(1, 2, 3),
country = c("us", NA, NA),
phone = c(1, 1, NA),
car = c(NA, NA, NA),
res = c("country, phone", "phone", "nothing")
)
So far I can only do it with the case_when() function, which means writing out every combination by hand:
mutate(
  res = case_when(
    !is.na(country) & is.na(phone) & is.na(car) ~ "country",
    TRUE ~ "?"
  )
)
Upvotes: 2
Views: 166
Reputation: 388982
The two data frames you have shared (test and result) contain different values, so we will start from result and remove its res column.
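library(dplyr)

# Drop the expected-output column so we start from the raw data
result$res <- NULL

result %>%
  # Convert every column to character so they can share one value column
  mutate_all(as.character) %>%
  tidyr::pivot_longer(cols = all_of(cols)) %>%
  group_by(id) %>%
  # Collect the names of the non-NA columns for each id
  summarise(res = toString(name[!is.na(value)])) %>%
  # Turn id back into a number so it can be joined to the original data
  type.convert(as.is = TRUE) %>%
  left_join(result, by = 'id') %>%
  mutate(res = case_when(res == '' ~ 'nothing',
                         stringr::str_count(res, ',') ==
                           (length(cols) - 1) ~ 'all',
                         TRUE ~ as.character(res)))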
# A tibble: 3 x 5
# id res country phone car
# <dbl> <chr> <fct> <dbl> <lgl>
#1 1 country, phone us 1 NA
#2 2 phone NA 1 NA
#3 3 nothing NA NA NA
We reshape the data to long format and collect, for each id, the names of the columns that hold a non-NA value. We then change the res column to "all" or "nothing" if all or none of the columns match, respectively.
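To make the reshaping step easier to follow, here is a minimal sketch that stops after the summarise() call so the intermediate objects can be inspected. It assumes dplyr 1.0 or later (for across()) and rebuilds result from the question without the res column; the names long and per_id are only illustrative.

```r
library(dplyr)
library(tidyr)

cols <- c("country", "phone", "car")
result <- data.frame(
  id = c(1, 2, 3),
  country = c("us", NA, NA),
  phone = c(1, 1, NA),
  car = c(NA, NA, NA)
)

# Long format: one row per (id, column) pair, with columns id, name and value
long <- result %>%
  mutate(across(all_of(cols), as.character)) %>%
  pivot_longer(cols = all_of(cols))

# For each id, keep the names of the columns whose value is not NA.
# An id with no non-NA values gets an empty string.
per_id <- long %>%
  group_by(id) %>%
  summarise(res = toString(name[!is.na(value)]))
```

Printing long and per_id shows where the empty string for id 3 comes from, which is the value the case_when() step later turns into "nothing".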
Upvotes: 0
Reputation: 7941
You can do this in base R (rather than dplyr) using the code:
result$res <- apply(result[, cols], 1, function(x) paste(cols[!is.na(x)], collapse = ", "))
result$res[result$res == ""] <- "nothing"
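The code above does not produce the "all" label the question asks for. One possible way to add it, as a sketch rather than part of the original answer, is to count the non-NA columns per row with rowSums(). The helper name label_non_na is hypothetical, and the example runs on the test data frame from the question.

```r
# Hypothetical helper: label each row by which of the columns in `cols`
# hold a non-NA value.
label_non_na <- function(df, cols) {
  # Logical matrix: TRUE where the value is present
  present <- !is.na(df[, cols, drop = FALSE])
  n_present <- rowSums(present)
  # Comma-separated names of the non-NA columns, one string per row
  res <- apply(present, 1, function(p) paste(cols[p], collapse = ", "))
  res[n_present == 0] <- "nothing"
  res[n_present == length(cols)] <- "all"
  unname(res)
}

cols <- c("country", "phone", "car")
test <- data.frame(
  id = c(1, 2, 3),
  country = c("us", NA, "uk"),
  phone = c(1, 1, NA),
  car = c(NA, 0, 1)
)
test$res <- label_non_na(test, cols)
```

Working on the logical matrix from is.na() avoids the character conversion that apply() performs on a mixed-type data frame, and the two rowSums() comparisons cover the "nothing" and "all" cases without string matching.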
Upvotes: 3