Reputation: 743
I'm trying to do positive subsetting using a combination of dplyr::select() and dplyr::contains(), with the goal of subsetting by multiple string matches.
Minimal working example: starting from df1, negative subsetting gives me df2, as expected. In contrast, when I attempt positive subsetting of df1, I get df3 (no columns), whereas I expected something like df4. Thanks for any help.
library(dplyr)
df1 <- data.frame("ppt_paint"=c(45,98,23),"het_heating"=c(1,1,2) ,"orm_wood"=c("QQ","OA","BB"), "hours"=c(4,6,4), "distance"=c(23,65,21))
df2 <- df1 %>% select(-contains("ppt_")) %>% select(-contains("het_")) %>% select(-contains("orm_"))
df3 <- df1 %>% select(contains("ppt_")) %>% select(contains("het_")) %>% select(contains("orm_"))
df4 <- data.frame("ppt_paint"=c(45,98,23),"het_heating"=c(1,1,2) ,"orm_wood"=c("QQ","OA","BB"))
Upvotes: 2
Views: 1965
Reputation: 7443
Think about what happens after df1 %>% select(contains("ppt_")), and have a look at the resulting data.frame. As asked, it retains only the one column whose name contains "ppt_". The subsequent select() calls cannot work as you expect because the other columns are no longer there, no matter what you feed to select().
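To make this concrete, here is a minimal sketch that inspects the intermediate results of the chained calls. The names step1 and step2 are only illustrative and do not appear in the question.

```r
library(dplyr)

# Same data as df1 in the question
df1 <- data.frame(ppt_paint = c(45, 98, 23), het_heating = c(1, 1, 2),
                  orm_wood = c("QQ", "OA", "BB"), hours = c(4, 6, 4),
                  distance = c(23, 65, 21))

# After the first select(), only the "ppt_" column is left
step1 <- df1 %>% select(contains("ppt_"))
names(step1)

# The second select() searches step1, which has no "het_" column,
# so the result has zero columns
step2 <- step1 %>% select(contains("het_"))
ncol(step2)
```

Each select() in the chain only sees the columns that survived the previous one, which is why chaining works for dropping columns but not for keeping several different ones.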
You can keep the same idea but combine your three keys in a single select() call:
df1 %>% select(contains("ppt_"), contains("het_"), contains("orm_"))
  ppt_paint het_heating orm_wood
1        45           1       QQ
2        98           1       OA
3        23           2       BB
Alternatively, you can do it with matches(), which accepts regular expressions:
df1 %>% select(matches("ppt_|het_|orm_"))
  ppt_paint het_heating orm_wood
1        45           1       QQ
2        98           1       OA
3        23           2       BB
By the way, you can also use matches() to shorten your negative subsetting:
df1 %>% select(-matches("ppt_|het_|orm_"))
  hours distance
1     4       23
2     6       65
3     4       21
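If you would rather stay with contains(), a possible variant is to pass all keys as one character vector. This is a sketch that assumes a reasonably recent tidyselect (1.0.0 or later, as far as I know), where contains() takes the union of the matches when given more than one string. The names keys, df_pos and df_neg are only illustrative.

```r
library(dplyr)

# Same data as df1 in the question
df1 <- data.frame(ppt_paint = c(45, 98, 23), het_heating = c(1, 1, 2),
                  orm_wood = c("QQ", "OA", "BB"), hours = c(4, 6, 4),
                  distance = c(23, 65, 21))

# Literal substrings to look for in the column names
keys <- c("ppt_", "het_", "orm_")

# Positive subsetting: keep every column whose name contains any key
# (assumes contains() accepts a vector of length > 1)
df_pos <- df1 %>% select(contains(keys))
names(df_pos)

# Negative subsetting: drop the same columns
df_neg <- df1 %>% select(-contains(keys))
names(df_neg)
```

Unlike matches(), contains() treats each key as a literal string, so characters that are special in regular expressions need no escaping.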
Upvotes: 1