N.Varela

Reputation: 910

R: How to subset columns based on values of the first row?

I want to select a subset of columns according to a certain value in the first row. Here is an example:

df <- data.frame( region = c("A", sample(1:5,3)),
                  region = c("B", sample(1:5,3)),
                  region = c("C", sample(1:5,3)),
                  region = c("A", sample(1:5,3)) )

> df
  region region.1 region.2 region.3
1      A        B        C        A
2      5        5        3        3
3      2        1        5        4
4      4        2        1        5

I want to select all columns that have an A in the first row. I can't do this by index number because my dataset has more than 3000 columns, and the column names are also important, which is why I'm using the first row as a second header. For this example, the result should be:

  region  region.3
1      A         A
2      5         3
3      2         4
4      4         5 

Also, how can I prevent R from automatically numbering duplicate column names (region.1, region.2, ...)? Thanks for your ideas.

Upvotes: 0

Views: 4382

Answers (1)

Jilber Urbina

Reputation: 61154

You can use logical indexing on the first row (the values below differ from yours because sample() draws at random):

> df[, df[1, ] == "A"]
  region region.3
1      A        A
2      3        1
3      2        5
4      1        4
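As a minimal sketch of the same idea, the comparison can be turned into a plain logical vector first. The fixed values below are illustrative stand-ins for the random draws, and `drop = FALSE` is my addition (not part of the answer above) so that the result stays a data frame even when only one column matches:

```r
# Fixed example data (illustrative values instead of sample())
df <- data.frame(region = c("A", "5", "2", "4"),
                 region = c("B", "5", "1", "2"),
                 region = c("C", "3", "5", "1"),
                 region = c("A", "3", "4", "5"),
                 stringsAsFactors = FALSE)

# Logical vector: TRUE where the first-row value equals "A"
keep <- unname(unlist(df[1, ])) == "A"

# drop = FALSE keeps a data frame even if a single column is selected
res <- df[, keep, drop = FALSE]
print(res)
```

Storing `keep` separately also makes it easy to reuse the same selection on other objects with the same column order.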

For your second question, pass check.names=FALSE to data.frame() so that duplicate column names are kept as they are:

> data.frame( region = c("A", sample(1:5,3)),
+             region = c("B", sample(1:5,3)),
+             region = c("C", sample(1:5,3)),
+             region = c("A", sample(1:5,3)), check.names=FALSE )
  region region region region
1      A      B      C      A
2      5      5      4      2
3      2      1      5      5
4      4      2      2      4
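One caveat when combining the two steps: as far as I know, subsetting a data frame with `[` can make duplicated column names unique again. A hedged sketch that reassigns the original names after subsetting, so the result is the same whether or not that renaming happens (the data values and the names `df2`, `keep`, `res2` are illustrative):

```r
# Same illustrative data, this time keeping duplicate names
df2 <- data.frame(region = c("A", "5", "2", "4"),
                  region = c("B", "5", "1", "2"),
                  region = c("C", "3", "5", "1"),
                  region = c("A", "3", "4", "5"),
                  check.names = FALSE, stringsAsFactors = FALSE)

# Columns whose first-row value is "A"
keep <- unname(unlist(df2[1, ])) == "A"

# Subset, then restore the original (duplicated) names explicitly
res2 <- df2[, keep, drop = FALSE]
names(res2) <- names(df2)[keep]
print(names(res2))
```

Keep in mind that with duplicated names, name-based access such as `res2$region` only reaches the first matching column, so positional or logical indexing is the safer choice.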

Upvotes: 3
