Reputation: 361
I have a data frame with a column that contains some elements that are lists. I would like to find out which rows of the data frame contain a keyword in that column.
The data frame, df, looks a bit like this:
idstr tag
1 wl
2 other.to
3 other.from
4 c("wl","other.to")
5 wl
6 other.wl
7 c("ll","other.to")
The goal is to assign all of the rows with 'wl' in their tag to a new data frame. In this example, I would want a new data frame that looks like:
idstr tag
1 wl
4 c("wl","other.to")
5 wl
I tried something like this:
df_wl <- df[which(is.element('wl',df$tag)),]
but this only returns the first row of the data frame (whether or not it contains 'wl'). I think the trouble lies in iterating through the rows and applying the "is.element" function. Here are two calls to the function and their results:
is.element('wl', df$tag[[4]])  # TRUE
is.element('wl', df$tag[4])    # FALSE
How do you suggest I iterate through the data frame to assign df_wl its proper values?
PS: Here's the dput:
structure(list(idstr = 1:7, tag = structure(c(6L, 5L, 4L, 2L, 6L, 3L, 1L), .Label = c("c(\"ll\",\"other.to\")", "c(\"wl\",\"other.to\")", "other.wl", "other.from", "other.to", "wl"), class = "factor")), .Names = c("idstr", "tag"), row.names = c(NA, -7L), class = "data.frame")
Upvotes: 1
Views: 1390
Reputation: 99331
Based on your dput data, this may work. The regular expression (^wl$)|(\"wl\") matches wl from beginning to end, or any occurrence of "wl" (wrapped in double quotes).
df[grepl("(^wl$)|(\"wl\")", df$tag),]
# idstr tag
# 1 1 wl
# 4 4 c("wl","other.to")
# 5 5 wl
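For what it's worth, here is a minimal sketch of why the original attempt returned only one row, and of an alternative that may fit if tag is really a list column rather than the factor shown in the dput. The data frame is rebuilt by hand from the question's values; df_list and has_wl are hypothetical names introduced only for illustration.

```r
# Rebuild the question's data frame; as in the dput output, tag is a factor.
df <- data.frame(
  idstr = 1:7,
  tag = factor(c("wl", "other.to", "other.from", "c(\"wl\",\"other.to\")",
                 "wl", "other.wl", "c(\"ll\",\"other.to\")"))
)

# is.element(x, y) tests whether each element of x occurs in y, so a
# length-one x gives a single logical, and which() of it is at most 1.
length(is.element("wl", df$tag))   # 1

# grepl() is vectorised over the column: one logical per row.
keep <- grepl("(^wl$)|(\"wl\")", df$tag)
df_wl <- df[keep, ]
df_wl$idstr                        # 1 4 5

# Assumption: if tag were a genuine list column (hypothetical df_list),
# membership can be tested per element without a regular expression.
df_list <- data.frame(idstr = 1:3)
df_list$tag <- list("wl", c("wl", "other.to"), "other.wl")
has_wl <- vapply(df_list$tag, function(x) "wl" %in% x, logical(1))
df_list$idstr[has_wl]              # 1 2
```

The regex route depends on the tags being stored as text, while the vapply route depends on each cell holding an actual character vector, so which one applies depends on how the real data was read in.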
Upvotes: 2