user1357015

Reputation: 11696

R which statement not selecting string appropriately

I'm trying to cut my matrix down to specific rows. The question is best explained through the output below:

Browse[2]> structure[which(structure$atom == "CA"),]

     recordName serial atom 
  1:       ATOM      2   CA 
  2:       ATOM     10   CA 
  3:       ATOM     18   CA 
  4:       ATOM     24   CA 
  5:       ATOM     31   CA 
 ---                        
572:       ATOM   4353   CA 
573:       ATOM   4358   CA 
574:       ATOM   4368   CA 
575:       ATOM   4377   CA 
576:       ATOM   4389   CA 

Browse[2]> structure[which(structure$atom == atom),]

      recordName serial atom 
   1:       ATOM      1    N 
   2:       ATOM      2   CA 
   3:       ATOM      3    C 
   4:       ATOM      4    O 
   5:       ATOM      5   CB 
  ---                        
4392:       ATOM   4394  ND1 
4393:       ATOM   4395  CD2 
4394:       ATOM   4396  CE1 
4395:       ATOM   4397  NE2 
4396:       ATOM   4398  OXT 

Browse[2]> atom
[1] "CA"

My question is: why do I get a different selection of rows when I use the variable atom instead of the literal "CA"? As you can see, the variable itself is equal to "CA".

Thank you for your help!

Upvotes: 2

Views: 223

Answers (2)

Ricardo Saporta

Reputation: 55390

@matthewlundberg gave you the proper explanation. As a workaround, use get() to look the variable up outside the table (this assumes atom is defined in the global environment):

 structure[which(structure$atom == get("atom", envir=globalenv())),]

On a side note, there is a lot of superfluous syntax in your statement: which is not needed, there is no need to reference the data.table itself within the i argument, and the trailing comma is unnecessary.

That is, use

 structure[atom == get("atom", envir=globalenv())]
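The Browse[2]> prompt in the question suggests the code may be running inside a function, in which case atom would not live in the global environment. A minimal sketch of the same get() idea for that case follows; select_atom, tbl, here and demo are hypothetical names used only for illustration, and the data is made up.

```r
library(data.table)

# Hypothetical helper: `atom` is a function argument, not a global.
select_atom <- function(tbl, atom) {
  # Capture the function's own frame, which holds the `atom` argument.
  # `here` is not a column name, so data.table finds it in this frame.
  here <- environment()
  # Left-hand `atom` is the column; get() fetches the argument instead.
  tbl[atom == get("atom", envir = here)]
}

# Made-up data standing in for the asker's table.
demo <- data.table(serial = 1:4, atom = c("N", "CA", "C", "CA"))
res <- select_atom(demo, "CA")
```

Here res should contain only the rows whose atom column equals "CA".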

Upvotes: 1

Matthew Lundberg

Reputation: 42669

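data.table evaluates names in the environment of the table first. A name that matches a column therefore resolves to that column, not to a variable of the same name defined elsewhere.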

Example:

> x <- data.table(a=1:5, b=11:15)
> x[a==1]
   a  b
1: 1 11
> a <- 1
> x[x$a==a]
   a  b
1: 1 11
2: 2 12
3: 3 13
4: 4 14
5: 5 15

As MrFlick indicates, the last statement is equivalent to x[a==a]. Both a's refer to the column in x, so the comparison is TRUE for every row.

Note that which is neither necessary nor helpful for this operation, and for data.table the trailing comma is not required to select rows.
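Applied to the question, two simple ways around the name clash can be sketched as follows. The table below is a made-up stand-in for the asker's data, and structure_dt, target_atom and keep are hypothetical names chosen for illustration.

```r
library(data.table)

# Made-up stand-in for the asker's table.
structure_dt <- data.table(
  recordName = "ATOM",
  serial     = 1:5,
  atom       = c("N", "CA", "C", "O", "CB")
)

atom <- "CA"

# Inside [.data.table, `atom` resolves to the column on both sides,
# so the column is compared with itself and every row is kept.
all_rows <- structure_dt[atom == atom]

# Option 1: give the variable a name that is not a column name.
target_atom <- "CA"
by_rename <- structure_dt[atom == target_atom]

# Option 2: build the logical index outside the data.table call,
# where `atom` still means the character variable.
keep <- structure_dt$atom == atom
by_index <- structure_dt[keep]
```

Renaming the variable is probably the least surprising choice, since the expression then reads the same whether or not it is evaluated inside the table.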

Upvotes: 3
