Reputation: 4941
I've created 3 vectors:
v1 = c(1,2,3)
v2 = c(11,22,33)
v3 = c(111,222,333)
Then I made a data frame from them:
> df = data.frame(vec1 = v1, vec2 = v2, vec3 = v3)
> df
  vec1 vec2 vec3
1    1   11  111
2    2   22  222
3    3   33  333
It seems the column names are no longer the automatic ones; they are now vec1, vec2, vec3.
After this I want to get the row of the data frame where vec2 is equal to 11:
> df[vec2 == 11,]
Error in `[.data.frame`(df, vec2 == 11, ) : object 'vec2' not found
But the following code works:
> df[v2 == 11,]
  vec1 vec2 vec3
1    1   11  111
I think this is wrong. I don't understand why R uses the old vector names instead of the tags vec1, vec2, vec3.
Is this a bug in my version of R?
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin12.2.0/x86_64 (64-bit)
Upvotes: 1
Views: 81
Reputation: 34288
You can use that syntax if you attach df first:
df = data.frame(vec1 = v1, vec2 = v2, vec3 = v3)
attach(df)
df[vec2 == 11,]
will output:
  vec1 vec2 vec3
1    1   11  111
Although this saves typing when working at the console, it should generally be avoided in scripts, as the Google R style guide recommends.
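One reason for that advice can be sketched as follows, assuming a fresh session: attach() puts a copy of the data frame on the search path, so later changes to df are not visible through the bare column name. The names stale and fresh are only illustrative.

```r
v1 <- c(1, 2, 3)
v2 <- c(11, 22, 33)
v3 <- c(111, 222, 333)
df <- data.frame(vec1 = v1, vec2 = v2, vec3 = v3)

attach(df)            # copies df's columns onto the search path
df$vec2[1] <- 99      # modify the data frame after attaching

stale <- vec2[1]      # read from the attached copy, not from df
fresh <- df$vec2[1]   # read from the data frame itself
print(c(stale = stale, fresh = fresh))

detach(df)            # clean up the search path
```

The two values differ, which is the kind of silent mismatch that makes attach() risky outside of quick interactive work.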
Upvotes: 0
Reputation: 74375
It's not a bug but rather a misinterpretation: delete v2 using rm(v2) and df[v2 == 11,] would fail. You can use subset() to subset a data frame by column name:
> subset(df, vec2 == 11)
  vec1 vec2 vec3
1    1   11  111
subset() also supports extraction of specific columns, e.g.
> subset(df, vec2 == 11, select = vec1:vec2)
  vec1 vec2
1    1   11
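The rm(v2) point can be checked with a short sketch like the one below. It assumes a fresh session; msg and hit are illustrative names, and tryCatch() is used only so the script keeps running after the expected error.

```r
v1 <- c(1, 2, 3)
v2 <- c(11, 22, 33)
v3 <- c(111, 222, 333)
df <- data.frame(vec1 = v1, vec2 = v2, vec3 = v3)

rm(v2)  # remove the free-standing vector; the column df$vec2 is untouched

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    df[v2 == 11, ]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Subsetting by column name does not depend on v2 existing
hit <- subset(df, vec2 == 11)
print(hit)
```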
Upvotes: 2
Reputation: 49033
When you use the following syntax:
df[vec2 == 11,]
R tries to select rows of df based on the values of a vector named vec2. But there is no such vector: there is only a column of your data frame with this name. So the syntax you are looking for is:
df[df$vec2 == 11,]
The following works only because the vector v2 was defined earlier in your R session:
df[v2 == 11,]
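Relying on the free-standing vector is also fragile, because it only matches the data frame while both are in the same order. A minimal sketch, assuming the vectors from the question (df_rev, wrong and right are illustrative names):

```r
v1 <- c(1, 2, 3)
v2 <- c(11, 22, 33)
v3 <- c(111, 222, 333)
df <- data.frame(vec1 = v1, vec2 = v2, vec3 = v3)

# Reorder the rows so that they no longer line up with v2
df_rev <- df[order(-df$vec1), ]

wrong <- df_rev[v2 == 11, ]            # mask follows v2's original order
right <- df_rev[df_rev$vec2 == 11, ]   # mask follows the data frame's own column

print(wrong)
print(right)
```

Here the v2 mask selects whatever row happens to come first in df_rev, while the df_rev$vec2 mask selects the row whose vec2 is actually 11.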
Upvotes: 2
Reputation: 118799
Either use:
df[df$vec2 == 11, ]
or
df[with(df, vec2 == 11), ]
Your second attempt worked because v2 == 11 evaluates to TRUE, FALSE, FALSE, so the first row was printed. However, vec2 is not a variable that has been set. It is a column of a data.frame, so you have to identify it as such with df$vec2 (or use with).
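As a quick check of the above, the two suggested forms build the same logical mask, which is then used to index the rows. This is a small sketch assuming the vectors from the question; mask_dollar, mask_with and rows are illustrative names.

```r
v1 <- c(1, 2, 3)
v2 <- c(11, 22, 33)
v3 <- c(111, 222, 333)
df <- data.frame(vec1 = v1, vec2 = v2, vec3 = v3)

mask_dollar <- df$vec2 == 11          # column referenced explicitly
mask_with   <- with(df, vec2 == 11)   # column looked up inside df

print(mask_dollar)                    # one logical value per row
stopifnot(identical(mask_dollar, mask_with))

rows <- df[mask_dollar, ]             # keep rows where the mask is TRUE
print(rows)
```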
Upvotes: 2