oercim

Reputation: 1848

Extracting columns of a specific type and specifically named columns from a data frame in R

Suppose I have a data frame where some columns are of factor type and there is a column named "index" which is not a factor. I want to extract the factor columns together with the "index" column.

For example, let

df <- data.frame(a = runif(10), b = as.factor(sample(10)), index = as.numeric(1:10))

So df is:

         a  b index
0.16187501  5     1
0.75214741  8     2
0.08741729  3     3
0.58871514  2     4
0.18464752  9     5
0.98392420  1     6
0.73771960 10     7
0.97141474  6     8
0.15768011  7     9
0.10171931  4    10

The desired output (let it be a data frame called df1) is:

df1:

   b index
   5     1
   8     2
   3     3
   2     4
   9     5
   1     6
  10     7
   6     8
   7     9
   4    10

which consists of the factor column and the column named "index".

I use this code:

  vars<-apply(df,2,function(x) {(is.factor(x)) || (names(x)=="index")})

  df1<-df[,vars]

However, this code does not work. How can I get df1 using an apply-type function in R? I would be very grateful for any help. Thanks a lot.

Upvotes: 1

Views: 66

Answers (1)

eipi10

Reputation: 93761

You could do:

df[ , sapply(df, is.factor) | grepl("index", names(df))]

I think two things went wrong with your method: First, apply converts the data frame to a matrix, which doesn't store values as factors. Also, in a matrix, every value has to be of the same mode (character, numeric, etc.). In this case, everything gets coerced to character, so there's no factor to find.
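To see the coercion for yourself, here is a small sketch using the df from the question. The `set.seed` call is my addition, only there to make the example reproducible:

```r
set.seed(1)  # assumption: seed added only for reproducibility
df <- data.frame(a = runif(10), b = as.factor(sample(10)), index = as.numeric(1:10))

# apply() first converts the data frame to a matrix, roughly like this
m <- as.matrix(df)
is.character(m)           # TRUE: the factor column forces a character matrix

# So no column looks like a factor from inside apply()
apply(df, 2, is.factor)   # FALSE for a, b and index

# sapply() treats the data frame as a list of columns, so types survive
sapply(df, is.factor)     # TRUE only for b
```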

Second, the column name isn't accessible within apply (AFAIK), so names(x) returns NULL and names(x)=="index" returns logical(0).
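A quick check of that second point, plus a variant that matches the column name exactly instead of using grepl (which would also pick up a column called, say, "index2"). The variable name `keep` is just mine, and the seed is again only for reproducibility:

```r
set.seed(1)  # assumption: seed added only for reproducibility
df <- data.frame(a = runif(10), b = as.factor(sample(10)), index = as.numeric(1:10))

# Inside apply(), each column arrives as a plain vector with no names attached
apply(df, 2, function(x) is.null(names(x)))   # TRUE for every column

# Test the column types and the column names separately, then combine
keep <- sapply(df, is.factor) | names(df) == "index"
df1 <- df[, keep, drop = FALSE]   # drop = FALSE keeps a data frame even for one column
names(df1)                        # "b" "index"
```

If you need several named columns, `names(df) %in% c("index", ...)` should work the same way in place of the `==` comparison.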

Upvotes: 2
