Reputation: 163
I do not understand why these two dplyr selection helpers behave differently. Why does matches() include a "wrong" variable?
library(dplyr)

# create dataframe
df <- data.frame(agr_barriers.crop_disease = 1,
                 agr_barriers.lack_mats = 1,
                 agr_barriers.sickness = 1,
                 agr_barriers_mats.dontknow = 1)

# 1: select with contains()
df2 <- df %>%
  select(contains("agr_barriers."))
colnames(df2)

# 2: select with matches()
df3 <- df %>%
  select(matches("agr_barriers."))
colnames(df3)
Upvotes: 0
Views: 40
Reputation: 11584
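Because, as the documentation says, "matches(): Matches a regular expression." In a regular expression an unescaped dot (.) matches any single character, so "agr_barriers." also matches the "agr_barriers_" at the start of agr_barriers_mats.dontknow. contains(), by contrast, looks for the literal string, dot included.
For example: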
> df <- data.frame(agr_barriers.crop_disease = 1,
+ agr_barriers.lack_mats = 1,
+ agr_barrierss.sickness = 1,
+ agr_barriers_mats.dontknow = 1)
> ###1 select dataframe
> df2 <- df %>%
+ select(contains("agr_barriers."))
> colnames(df2)
[1] "agr_barriers.crop_disease" "agr_barriers.lack_mats"
> ###2 select dataframe
> df3 <- df %>%
+ select(matches("agr_barriers."))
> colnames(df3)
[1] "agr_barriers.crop_disease" "agr_barriers.lack_mats" "agr_barrierss.sickness" "agr_barriers_mats.dontknow"
>
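I added an extra 's' to the third column name in df. With select(contains("agr_barriers.")) the result no longer includes the sickness column, as colnames(df2) shows, because the literal string "agr_barriers." is not in that name. matches() still returns it, because the dot matches the extra 's'.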
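If you want to keep matches() but have the dot treated literally, one option is to escape it in the pattern; another is to use starts_with(), which, like contains(), does not interpret its argument as a regular expression. A minimal sketch, assuming the original df from the question (the names escaped and prefixed are only for illustration):

```r
library(dplyr)

# Same columns as in the question
df <- data.frame(agr_barriers.crop_disease = 1,
                 agr_barriers.lack_mats = 1,
                 agr_barriers.sickness = 1,
                 agr_barriers_mats.dontknow = 1)

# Escaped dot: the regex now requires a literal "." after "agr_barriers"
escaped <- df %>%
  select(matches("agr_barriers\\."))

# Literal prefix match, no regular expression involved
prefixed <- df %>%
  select(starts_with("agr_barriers."))

colnames(escaped)
colnames(prefixed)
```

Both selections should drop agr_barriers_mats.dontknow and keep the other three columns.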
Upvotes: 3