Reputation: 241
I'm trying to select rows in a data.table. I need the values in variable dt$s to start with any of the strings in vector y:
library(data.table)
dt <- data.table(x = 1:5, s = c("a", "ab", "b.c", "db", "d"))
y <- c("a", "b")
Desired result:
   x   s
1: 1   a
2: 2  ab
3: 3 b.c
I would use dt[s %in% y] for a full match, and %like% with a pattern such as "^a" for a partial match against a single string, but I'm not sure how to get a strict "starts with" match against a character vector.
My real dataset and character vector are both quite large, so I'd appreciate an efficient solution.
Thanks.
Upvotes: 2
Views: 1480
Reputation: 887118
Using glue and dplyr's filter:
library(glue)
library(dplyr)
library(stringr)
dt %>%
  filter(str_detect(s, glue("^({str_c(y, collapse = '|')})")))
#   x   s
#1: 1   a
#2: 2  ab
#3: 3 b.c
Upvotes: 1
Reputation: 388982
You can create a pattern dynamically from y.
library(data.table)
pat <- sprintf('^(%s)', paste0(y, collapse = '|'))
pat
#[1] "^(a|b)"
and use it to subset the data.
dt[grepl(pat, s)]
#   x   s
#1: 1   a
#2: 2  ab
#3: 3 b.c
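One caveat with a pasted pattern is that every element of y is interpreted as a regular expression, so a prefix such as "b.c" would also match "bxc". If the prefixes are meant literally, a possible alternative is base R's startsWith() (available since R 3.3.0), which compares plain prefixes without regex. The following is a minimal sketch, not a benchmarked solution; starts_with_any is a hypothetical helper name introduced here for illustration, and it makes one pass over the column per prefix, so it is worth timing against grepl on the real data.

```r
library(data.table)

dt <- data.table(x = 1:5, s = c("a", "ab", "b.c", "db", "d"))
y <- c("a", "b")

# starts_with_any() is a hypothetical helper, not part of data.table.
# It returns TRUE for each element of x that begins with at least one
# of the literal strings in prefixes. startsWith() does no regex
# interpretation, so characters such as "." are matched as-is.
starts_with_any <- function(x, prefixes) {
  hit <- logical(length(x))        # all FALSE to begin with
  for (p in prefixes) {
    hit <- hit | startsWith(x, p)  # accumulate matches prefix by prefix
  }
  hit
}

res <- dt[starts_with_any(s, y)]
res
```

With the example data this keeps the rows whose s is "a", "ab", or "b.c". Note that startsWith() returns NA for NA inputs, so missing values in s may need handling before subsetting.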
Upvotes: 1