matteo

Reputation: 4873

Use dplyr to select columns

I'm trying to use dplyr's select() function to extract columns from a data frame, using column names stored in a character vector.

Here is the data frame:

dput(df1)
structure(list(Al = c(30245, 38060, 36280, 24355, 27776, 35190, 
38733.8, 36400, 29624, 33699.75), As = c(9, 8.75, 13.5, 7.75, 
7.6, 8.33, 8, 8.75, 7.4, 8.25), Cd = c(0.15, 0.13, 0.15, 0.1, 
0.16, 0.13, 0.24, 0.15, 0.22, 0.13), Cr = c(108.5, 111.75, 104.5, 
81.25, 93.2, 109.75, 105, 104, 87.8, 99.75), Hg = c(0.25, 0.35, 
0.48, 1.03, 1.12, 0.2, 1.14, 0.4, 2, 0.48)), row.names = c(NA, 
10L), class = "data.frame", .Names = c("Al", "As", "Cd", "Cr", 
"Hg"))

And here is the character vector I want to use as a filter:

dput(vec_fil)
c("Elemento", "As", "Cd_totale", "Cr_totale", "Cu_totale", "Hg", 
"Ni_totale", "Pb_totale", "Zn_totale", "Composti_organostannici", 
"PCB_totali", "Sommatoria_DDD", "Sommatoria_DDE", "Sommatoria_DDT", 
"Clordano", "Dieldrin", "Endrin", "Esaclorocicloesano", "Eptacloro_epossido", 
"Sommatoria_IPA", "Acenaftene", "Antracene", "Benzo.a.antracene", 
"Benzo.a.pirene", "Crisene", "Dibenzo.ac._.ah.antracene", "Fenantrene", 
"Fluorantene", "Fluorene", "Naftalene", "Pirene")

As you can see, vec_fil contains many names that don't match any column of df1, so I get this error:

require("dplyr")
df2 <- select(df1, one_of(vec_fil))
Error: Each argument must yield either positive or negative integers

Is there a way to keep only the columns whose names match the filter vector in the new data frame?

Upvotes: 5

Views: 4888

Answers (3)

Espanta

Reputation: 1140

I'm late to the party, but no one has explained the reason for the error, so I will.

You have used one_of() from the dplyr package incorrectly. According to the package documentation, it selects all the variables named in the vector:

one_of("x", "y", "z"): selects variables provided in a character vector.

It does not let you select only the subset of those variables that happen to exist, even though the function's name suggests it does.

In your case, vec_fil contains column names that do not exist in the data frame, so select() throws an error. Use one_of() only when you have a long list of column names that you don't want to type by hand, so that you can read them directly from a vector.

Hope this helps in your future work.
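For readers on a newer dplyr: one_of() has since been superseded by all_of() (strict) and any_of() (tolerant). The sketch below assumes dplyr 1.0.0 or later, and uses a shortened stand-in for df1 and vec_fil rather than the full objects from the question.

```r
library(dplyr)

# Shortened stand-in for df1 from the question (first three rows only)
df1 <- data.frame(
  Al = c(30245, 38060, 36280),
  As = c(9, 8.75, 13.5),
  Cd = c(0.15, 0.13, 0.15),
  Cr = c(108.5, 111.75, 104.5),
  Hg = c(0.25, 0.35, 0.48)
)

# Shortened stand-in for vec_fil: most of these names are absent from df1
vec_fil <- c("Elemento", "As", "Cd_totale", "Cr_totale", "Hg", "Pirene")

# any_of() keeps the names that exist in df1 and skips the rest
df2 <- select(df1, any_of(vec_fil))
names(df2)
# [1] "As" "Hg"
```

all_of(vec_fil) would still fail here, because it requires every name to be present.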

Upvotes: 5

Mamoun Benghezal

Reputation: 5314

You can try this in base R:

df1[, names(df1) %in% vec_fil]

And if you want to use dplyr:

select(df1, which(names(df1) %in% vec_fil))
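One caveat about the base R form, shown as a small sketch: when only a single column matches, `[` with a comma simplifies the result to a plain vector unless drop = FALSE is given. The two-row df1 and the vector one_match below are made up for illustration.

```r
# Tiny stand-in data frame; one_match is a hypothetical filter with a single hit
df1 <- data.frame(Al = c(30245, 38060), As = c(9, 8.75), Hg = c(0.25, 0.35))
one_match <- c("Elemento", "As", "Pirene")

# Only "As" matches, so the default drops the result to a numeric vector
res_vec <- df1[, names(df1) %in% one_match]

# drop = FALSE keeps a one-column data frame
res_df <- df1[, names(df1) %in% one_match, drop = FALSE]

is.data.frame(res_vec)  # FALSE
is.data.frame(res_df)   # TRUE
```

The dplyr version does not have this issue, since select() always returns a data frame.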

Upvotes: 7

konvas

Reputation: 14346

Just drop the variable names that are not in your data frame using intersect():

select(df1, one_of(intersect(vec_fil, names(df1))))
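A side effect worth knowing, sketched in base R with made-up stand-ins for df1 and vec_fil: intersect() returns the common names in the order of its first argument, so the columns come out in the filter vector's order, whereas an %in% test on names(df1) keeps the data frame's own order.

```r
# Stand-in objects; the filter deliberately lists Hg before As
df1 <- data.frame(Al = c(30245, 38060), As = c(9, 8.75), Hg = c(0.25, 0.35))
vec_fil <- c("Hg", "Elemento", "As")

# Common names, in the order they appear in vec_fil
common <- intersect(vec_fil, names(df1))
common
# [1] "Hg" "As"

# Columns follow the filter's order here...
by_filter <- df1[common]

# ...and the data frame's order here
by_frame <- df1[names(df1) %in% vec_fil]
names(by_frame)
# [1] "As" "Hg"
```

Which order is preferable depends on whether the filter vector or the original data frame should dictate the layout of the result.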

Upvotes: 3
