wenge

Reputation: 59

Accessing column names in a data frame

I have a data frame with columns named z_1, z_2, up to z_200. For simplicity, the example below shows only z_1.

df <- data.frame(x=1:5, y=2:6, z_1=3:7, u=4:8) 
df
i=1
tmp <- paste("z",i,sep="_")
subset(df, select=-c(tmp))

The code above will run inside a loop over i to remove certain columns from the data frame.

When I run it, I get the error "Error in -c(tmp) : invalid argument to unary operator".

Thank you for your help

Upvotes: 1

Views: 3226

Answers (2)

IRTFM

Reputation: 263311

If you want to use subset and have a large number of similarly named columns to include or exclude, I usually use grepl to construct a logical vector of matches against the column names (you could just as easily construct a numeric vector with grep). Negating the result removes the matching columns:

df <- data.frame(x=1:5, y=2:6, z_1=3:7, u=4:8) 
df
i=1
tmp <- paste("z",i,sep="_")
subset(df, select= !grepl("^z", names(df) ) )
  x y u
1 1 2 4
2 2 3 5
3 3 4 6
4 4 5 7
5 5 6 8

With negation, this removes all of the columns whose names start with "z"; without it, it keeps only those columns. Or you can use grep with value=TRUE in combination with character values:

subset(df, select= c("x", grep("^z", names(df), value=TRUE ) ) )
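Note that "^z" matches any column name that starts with "z". As a minimal sketch, assuming the columns to drop are named exactly z_ followed by digits, a tighter pattern avoids removing unrelated columns (the zebra column and the is_z variable are hypothetical, added only for illustration):

```r
# Example data: zebra is a hypothetical column that starts with "z" but should be kept
df <- data.frame(x = 1:5, y = 2:6, z_1 = 3:7, z_2 = 4:8, zebra = 5:9, u = 6:10)

# TRUE only for names of the form z_<digits>, so "zebra" is not matched
is_z <- grepl("^z_[0-9]+$", names(df))

# Negate the logical vector to drop the matched columns
kept <- subset(df, select = !is_z)
names(kept)
```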

Upvotes: 1

Sacha Epskamp

Reputation: 47541

Try:

df[names(df)!=tmp]

Your code fails because tmp holds a character string, and unary minus is not defined for character vectors, so -c(tmp) raises the "invalid argument to unary operator" error. Excluding columns with a minus sign only works with numeric indices.

Alternatively this would also work:

subset(df, select=-which(names(df)==tmp))

This works because which returns numeric indices, which can be negated.
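Since the question mentions doing this in a loop, here is a hedged sketch of dropping several columns in one step instead. It assumes the names to remove can all be built with paste up front; drop_names and result are hypothetical names used only for illustration:

```r
# Example data with two of the z_ columns present
df <- data.frame(x = 1:5, y = 2:6, z_1 = 3:7, z_2 = 4:8, u = 5:9)

# Names to drop, built the same way as tmp but for all values of i at once
drop_names <- paste("z", 1:200, sep = "_")

# Keep columns whose names are not in drop_names;
# names in drop_names that do not exist in df are simply ignored
result <- df[!(names(df) %in% drop_names)]
names(result)
```

One reason to prefer the logical comparison here: if no column matches, which returns an empty vector, and negating an empty index selects no columns at all, whereas the logical form leaves the data frame unchanged.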

Upvotes: 1
