dfrankow

Reputation: 21387

Subset a data frame using a vector, including NAs

I'd like to get a subset of data frame rows where one of its columns is equal to the values in a vector, where either the vector or the data frame column can have NAs. I'd like to include a match on NAs, but only if both column and vector have an NA. (FYI, this is for a labeller function which gets a vector as a parameter.)

Some data:

df1 <- data.frame(x_var_value=c('a', 'b', NA), num=c(1,2,3))
v2 <- c('a', 'b')

Some attempts that don't work at selecting the rows of df1 where x_var_value matches a value in v2:

df1[df1$x_var_value == v2]$x_var_value

df1[(df1$x_var_value == v2) | (is.na(df1$x_var_value) & is.na(v2))]$x_var_value

library(tidyverse)
df1 %>% filter(x_var_value == v2)

I should be able to use the answers from here or here, but somehow it eludes me.

EDIT: I think the labeller function probably wants output in the same order as input. If so, I need the match in v2 order.

EDIT 2: I also don't know if the labeller function will ever get passed variable values more than once. Probably not?

Upvotes: 1

Views: 69

Answers (3)

akrun

Reputation: 887118

Another option with data.table

library(data.table)
setDT(df1)[as.character(x_var_value) %chin% v2]

Upvotes: 0

Gavin Kelly

Reputation: 2414

I think the problem is that == v2 compares x_var_value against v2 element by element (recycling the shorter vector) rather than testing membership. What you need is to check whether each value is in v2, not whether it is equal to v2:

df1[df1$x_var_value %in% v2,]

If v2 contains an NA, e.g. v2 <- c('a', NA), then the result will include the rows that have NA in that column.
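To make the NA behaviour concrete, here is a small sketch using the data from the question. The names res_no_na and res_with_na are just illustrative, not anything from the original post.

```r
# Same data as in the question
df1 <- data.frame(x_var_value = c('a', 'b', NA), num = c(1, 2, 3))

# == returns NA for the NA row, whereas %in% always returns TRUE or FALSE
eq_result <- df1$x_var_value == 'a'
in_result <- df1$x_var_value %in% c('a', NA)

# v2 without an NA: the NA row is dropped
res_no_na <- df1[df1$x_var_value %in% c('a', 'b'), ]

# v2 with an NA: the NA row is kept, because %in% treats NA as matching NA
res_with_na <- df1[df1$x_var_value %in% c('a', NA), ]
```

Note that the rows come back in df1 order, not v2 order, so this on its own may not be enough if the labeller needs its output in the same order as its input.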

Upvotes: 1

Allan Cameron

Reputation: 173803

Aren't you just looking for match?

df1[match(v2, df1$x_var_value),]
#>   x_var_value num
#> 1           a   1
#> 2           b   2
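As a rough sketch of how this behaves with the two things the question cares about (NA in v2, and v2 order), assuming a reordered v2 and a hypothetical v3 containing a value that isn't in the column:

```r
df1 <- data.frame(x_var_value = c('a', 'b', NA), num = c(1, 2, 3))

# v2 in a different order from df1, and containing an NA
v2 <- c(NA, 'b', 'a')

# For each element of v2, match() gives the index of its first match in the
# column; NA in v2 matches NA in the column
idx <- match(v2, df1$x_var_value)

# Rows come back in v2 order
res <- df1[idx, ]

# A value of v3 (hypothetical) that is absent from the column gives NA in the
# index vector, which would produce an all-NA row; drop those indices if that
# is not wanted
v3 <- c('b', 'z')
idx3 <- match(v3, df1$x_var_value)
res3 <- df1[idx3[!is.na(idx3)], ]
```

One caveat: match() only returns the first match for each element, so if the column could contain the same value more than once, only the first such row is returned.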

Upvotes: 0
