Reputation: 59
I have a data frame with columns named z_1, z_2, up to z_200. For ease of presentation, the following example shows only z_1:
df <- data.frame(x=1:5, y=2:6, z_1=3:7, u=4:8)
df
i=1
tmp <- paste("z",i,sep="_")
subset(df, select=-c(tmp))
The code above will run inside a loop over i to pick out the columns that need to be removed from the data frame.
When I run it, I get the error "Error in -c(tmp) : invalid argument to unary operator".
Thank you for your help
Upvotes: 1
Views: 3226
Reputation: 263311
If you want to use subset and have a large number of similarly named columns to include or exclude, I usually think about using grepl to construct a logical vector of matches to the column names (or you could just as easily construct a numeric vector of positions). Negating the result removes the matching columns:
df <- data.frame(x=1:5, y=2:6, z_1=3:7, u=4:8)
df
i=1
tmp <- paste("z",i,sep="_")
subset(df, select= !grepl("^z", names(df) ) )
  x y u
1 1 2 4
2 2 3 5
3 3 4 6
4 4 5 7
5 5 6 8
With negation, this pattern removes all of the columns whose names start with "z"; without negation, it keeps only those columns. Or you can use grep with value = TRUE in combination with character values:
subset(df, select= c("x", grep("^z", names(df), value=TRUE ) ) )
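For the loop described in the question, it may be simpler to build all of the names first and drop them in one step. The following is only a sketch: the extra z_2 and z_3 columns and the names drop_idx, drop_names, kept, and no_z are made up for illustration and are not part of the original example.

```r
# Example data; z_2 and z_3 are added here only for illustration
df <- data.frame(x = 1:5, y = 2:6, z_1 = 3:7, z_2 = 4:8, z_3 = 5:9, u = 4:8)

# Hypothetical set of indices whose z_ columns should be dropped
drop_idx <- c(1, 3)
drop_names <- paste("z", drop_idx, sep = "_")   # "z_1" "z_3"

# Keep every column whose name is not in drop_names
kept <- subset(df, select = !(names(df) %in% drop_names))
names(kept)   # "x" "y" "z_2" "u"

# Or drop every column named z_<digits> with a pattern anchored at both ends
no_z <- df[, !grepl("^z_[0-9]+$", names(df)), drop = FALSE]
names(no_z)   # "x" "y" "u"
```

Anchoring the pattern with ^ and $ keeps it from matching an unrelated column that merely starts with "z".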
Upvotes: 1
Reputation: 47541
Try:
df[names(df)!=tmp]
Your code fails because unary minus is not defined for character vectors: -c(tmp), where tmp is a character string, raises the "invalid argument to unary operator" error instead of returning anything. This way of excluding columns works with numeric values only.
Alternatively this would also work:
subset(df, select=-which(names(df)==tmp))
This works because which returns a numeric index, and a numeric index can be negated.
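One caveat if this goes into a loop: when the name is not present, which returns integer(0), and negating that drops every column rather than none. Here is a small sketch of the error and of a logical mask that avoids the problem; missing_name and the n_* variables are hypothetical names used only for this illustration.

```r
df <- data.frame(x = 1:5, y = 2:6, z_1 = 3:7, u = 4:8)
tmp <- paste("z", 1, sep = "_")

# Unary minus on a character vector is an error; capture the message
msg <- tryCatch(-c(tmp), error = function(e) conditionMessage(e))
msg   # "invalid argument to unary operator"

# which() with no match returns integer(0); -integer(0) is still integer(0),
# so the selection keeps zero columns instead of all of them
missing_name <- "z_999"   # hypothetical name that is not in df
n_which <- ncol(df[, -which(names(df) == missing_name), drop = FALSE])
n_which   # 0

# A logical mask gives the expected result whether or not the name exists
n_mask_missing <- ncol(df[, !(names(df) %in% missing_name), drop = FALSE])
n_mask_present <- ncol(df[, !(names(df) %in% tmp), drop = FALSE])
n_mask_missing   # 4
n_mask_present   # 3
```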
Upvotes: 1