Reputation: 169
I have a data frame df1 with IDs:
df1 <- read.table(text="ID
8765
1879
8706
1872
0178
0268
0270
0269
0061
0271", header=T)
and a second data frame, df2, with these column names:
> names(df2)
[1] "TW_3784.IT" "TW_3970.IT" "TW_1879.IT" "TW_0178.IT" "SF_0271.IT" "TW_3782.IT"
[7] "TW_3783.IT" "TW_8765.IT" "TW_8706.IT" "SF_0268.IT" "SF_0270.IT" "SF_0269.IT"
[13] "SF_0061.IT"
What I need is to keep only the columns of df2 whose names partially match an ID in df1. I tried:
df3 = df2 %>%
dplyr::select(df2 , dplyr::contains(df1$ID))
which gives the error:
Error in dplyr::contains(df1$ID) : is_string(match) is not TRUE
df3 = df2[,grepl(df1$ID, names(df2))]
which gives the warning:
In grepl(df1$ID, names(df2)) :
argument 'pattern' has length > 1 and only the first element will be used
Upvotes: 0
Views: 1230
Reputation: 12074
Here's a solution that uses the dplyr package.
df2 %>% select(matches(paste(df1$ID, collapse = "|")))
This pastes together the IDs from df1 with | as a separator (meaning logical OR), like this:
"8765|1879|8706|1872|178|268|270|269|61|271"
This is needed because matches() treats the string as a regular expression and looks for column names that match one or another of these numbers; those columns are then selected. dplyr is needed for select, matches, and %>%.
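The same collapsed pattern also works with base R's grepl, which avoids the "pattern has length > 1" warning from the question's second attempt. Here is a minimal sketch, assuming a toy df2 built only from the column names shown in the question (the cell values are placeholders, not real data):

```r
# IDs as read by read.table in the question (integers, leading zeros dropped)
df1 <- data.frame(ID = c(8765L, 1879L, 8706L, 1872L, 178L, 268L, 270L, 269L, 61L, 271L))

# Toy df2: one row of placeholder zeros under the question's column names
cols <- c("TW_3784.IT", "TW_3970.IT", "TW_1879.IT", "TW_0178.IT", "SF_0271.IT",
          "TW_3782.IT", "TW_3783.IT", "TW_8765.IT", "TW_8706.IT", "SF_0268.IT",
          "SF_0270.IT", "SF_0269.IT", "SF_0061.IT")
df2 <- as.data.frame(matrix(0, nrow = 1, ncol = length(cols),
                            dimnames = list(NULL, cols)))

# One regular expression with the IDs joined by "|" (alternation)
pattern <- paste(df1$ID, collapse = "|")

# grepl now receives a single pattern; drop = FALSE keeps a data frame
df3 <- df2[, grepl(pattern, names(df2)), drop = FALSE]
names(df3)
```

One caveat: because the leading zeros are gone, a short ID such as 61 matches anywhere in a name, so a hypothetical column like "TW_8761.IT" would be kept as well. With the column names in the question this does not happen.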
Upvotes: 1
Reputation: 4970
As there is a clear pattern in the column names, you can use substr to extract each 4-digit ID. Convert it to numeric to remove the leading zeros. Use which to identify the column numbers that you want to keep.
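# Here df2 stands in for names(df2), i.e. a character vector of column names
df2 <- c("TW_3784.IT", "TW_3970.IT", "TW_1879.IT", "TW_0178.IT", "SF_0271.IT", "TW_3782.IT")
# Characters 4 to 7 hold the ID; keep the positions whose ID appears in df1
numbers <- which(as.numeric(substr(df2, 4, 7)) %in% df1[,1])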
Next, you can use these column numbers to subset your data frame: df[, numbers].
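Put together on an actual data frame, the steps above might look like the sketch below. The data frame df and its placeholder values are assumptions for illustration; only the column names and IDs come from the question.

```r
# IDs as read in the question (integers, so leading zeros are gone)
df1 <- data.frame(ID = c(8765L, 1879L, 8706L, 1872L, 178L, 268L, 270L, 269L, 61L, 271L))

# Hypothetical data frame with two rows of placeholder zeros
cols <- c("TW_3784.IT", "TW_3970.IT", "TW_1879.IT", "TW_0178.IT", "SF_0271.IT", "TW_3782.IT")
df <- as.data.frame(matrix(0, nrow = 2, ncol = length(cols),
                           dimnames = list(NULL, cols)))

# Extract characters 4 to 7 of each name and convert to numeric ("0178" -> 178)
ids <- as.numeric(substr(names(df), 4, 7))

# Positions of the columns whose ID appears in df1
numbers <- which(ids %in% df1$ID)

# drop = FALSE keeps a data frame even if only one column matches
df_kept <- df[, numbers, drop = FALSE]
names(df_kept)
```

This relies on every column name having the ID in exactly positions 4 to 7, as in the names shown in the question.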
Upvotes: 1
Reputation: 2021
In df1, the ID column is read as integer rather than text (which is also why the leading zeros are dropped):
str(df1)
'data.frame': 10 obs. of 1 variable:
$ ID: int 8765 1879 8706 1872 178 268 270 269 61 271
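Convert it to character so that the IDs are strings. Note that is_string() expects a single string, so the vector of IDs may still have to be collapsed into one pattern and passed to matches().
df1$ID <- as.character(df1$ID)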
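Converting after the fact does not bring back the leading zeros. If they matter, one option is to read the IDs as character in the first place with the colClasses argument of read.table and then compare the 4-character ID exactly. This is a sketch with a shortened ID list; the column "TW_8761.IT" is a made-up name, added only to show that it is not matched.

```r
# Read the IDs as character so "0178" and "0061" keep their leading zeros
df1 <- read.table(text = "ID
8765
0178
0061", header = TRUE, colClasses = "character")
str(df1$ID)

# Three names from the question plus one hypothetical near-miss
cols <- c("TW_8765.IT", "TW_0178.IT", "SF_0061.IT", "TW_8761.IT")

# Exact comparison of the 4-character ID in positions 4 to 7
keep <- substr(cols, 4, 7) %in% df1$ID
cols[keep]
```

Exact comparison of the full 4-character ID avoids partial matches, at the cost of assuming the ID always sits in positions 4 to 7 of the column name.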
Upvotes: 0