egor7

Reputation: 4941

data.frame tags usage failed

I've created 3 vectors:

v1 = c(1,2,3)
v2 = c(11,22,33)
v3 = c(111,222,333)

Then I've made a frame from them:

> df = data.frame(vec1 = v1, vec2 = v2, vec3 = v3)
> df
  vec1 vec2 vec3
1    1   11  111
2    2   22  222
3    3   33  333

It seems the column names are no longer assigned automatically; they are now vec1, vec2, vec3.

After this I want to get a frame row where vec2 is equal to 11:

> df[vec2 == 11,]
Error in `[.data.frame`(df, vec2 == 11, ) : object 'vec2' not found

But the following code works:

> df[v2 == 11,]
  vec1 vec2 vec3
1    1   11  111

I think this is wrong. I don't understand why R uses the old vector names instead of the tags vec1, vec2, vec3.

Is this a bug in my version of R?

R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin12.2.0/x86_64 (64-bit)

Upvotes: 1

Views: 81

Answers (4)

musically_ut

Reputation: 34288

You can use that syntax if you attach df first:

df = data.frame(vec1 = v1, vec2 = v2, vec3 = v3)
attach(df)
df[vec2 == 11,]

will output:

  vec1 vec2 vec3
1    1   11  111

Although this saves typing when working at the console, it should generally be avoided in scripts, as the Google R style guide recommends.
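One practical reason to be wary of attach(), sketched here as an illustration and not as a quote from the style guide: attach() puts a copy of the data frame on the search path, so later changes to df are not visible through the bare column names. The names attached_vec2 and df_vec2 are made up for this example, and the sketch assumes no other object called vec2 exists in the workspace.

```r
df <- data.frame(vec1 = c(1, 2, 3), vec2 = c(11, 22, 33), vec3 = c(111, 222, 333))

attach(df)                # puts a copy of df's columns on the search path
df$vec2 <- df$vec2 * 2    # modify the data frame itself

attached_vec2 <- vec2     # bare name resolves to the attached (stale) copy
df_vec2 <- df$vec2        # the data frame holds the updated values

detach(df)                # clean up the search path after attach()
```

After this runs, attached_vec2 and df_vec2 differ, which is the kind of silent mismatch that makes attach() risky in longer scripts.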

Upvotes: 0

Hristo Iliev

Reputation: 74375

It's not a bug but a misunderstanding: df[v2 == 11,] works only because the free-standing vector v2 still exists in your workspace. Delete it with rm(v2) and df[v2 == 11,] fails as well. To subset a data frame by column name, use subset():

> subset(df, vec2 == 11)
  vec1 vec2 vec3
1    1   11  111

subset() can also extract specific columns, e.g.

> subset(df, vec2 == 11, select = vec1:vec2)
  vec1 vec2
1    1   11
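The rm(v2) claim can be checked with a small sketch like the one below. It rebuilds the data from the question; the names res, failed and rows are only there for illustration.

```r
v2 <- c(11, 22, 33)
df <- data.frame(vec1 = c(1, 2, 3), vec2 = v2, vec3 = c(111, 222, 333))

rm(v2)  # remove the free-standing vector; the column df$vec2 is unaffected

# The call that "worked" in the question now fails, because v2 no longer exists
res <- tryCatch(df[v2 == 11, ], error = function(e) e)
failed <- inherits(res, "error")

# subset() evaluates the condition inside df, so the column name is found
rows <- subset(df, vec2 == 11)
```

Here failed should be TRUE, while rows still holds the first row of df.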

Upvotes: 2

juba

Reputation: 49033

When you use the following syntax:

df[vec2 == 11,]

R tries to select rows of df based on the values of a vector named vec2. But there is no such vector: vec2 exists only as a column of your data frame. So the syntax you are looking for is:

df[df$vec2 == 11,]

The following works only because the vector v2 was defined earlier in your R session:

df[v2 == 11,]
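To make the mechanism concrete, here is a minimal sketch that splits the subsetting into two steps. The variable names keep and selected are illustrative only.

```r
df <- data.frame(vec1 = c(1, 2, 3), vec2 = c(11, 22, 33), vec3 = c(111, 222, 333))

keep <- df$vec2 == 11    # logical vector with one element per row of df
selected <- df[keep, ]   # rows where keep is TRUE; the empty second index keeps all columns
```

The row index is just an ordinary logical vector, so R does not care where it came from. That is why a separate vector such as v2 appeared to work in the question.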

Upvotes: 2

Arun

Reputation: 118799

Either use:

df[df$vec2 == 11, ]

or

df[with(df, vec2 == 11), ]

Your second attempt, df[v2 == 11,], worked because v2 == 11 evaluates to TRUE, FALSE, FALSE, so the first row was printed. However, vec2 is not a variable in your workspace; it is a column of a data.frame. So you have to identify it as such with df$vec2 (or use with()).
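As a quick check that these forms are interchangeable, the sketch below compares their results on the data from the question. The df[["vec2"]] form is an extra variant not mentioned in the answer, and the result names are illustrative.

```r
df <- data.frame(vec1 = c(1, 2, 3), vec2 = c(11, 22, 33), vec3 = c(111, 222, 333))

by_dollar <- df[df$vec2 == 11, ]          # column accessed with $
by_with   <- df[with(df, vec2 == 11), ]   # condition evaluated inside df
by_name   <- df[df[["vec2"]] == 11, ]     # column accessed by name as a string

same <- identical(by_dollar, by_with) && identical(by_dollar, by_name)
```

All three build the same logical index, so same should be TRUE. The with() form tends to pay off when the condition mentions several columns.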

Upvotes: 2
