Reputation: 18613
This snippet:
names<-c("Alice","Bob","Charlie")
ages<-c(25,24,25)
friends<-data.frame(names,ages)
a25 <- friends[friends$ages==25,]
a25
table(a25$names)
gives me this output
    names ages
1   Alice   25
3 Charlie   25

  Alice     Bob Charlie
      1       0       1
Now, why is "Bob" in the output when the data frame a25
does not include "Bob"? I would have expected output like this (from the table
command):
Alice Charlie
1 1
What am I missing?
My environment:
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
Upvotes: 1
Views: 186
Reputation: 193527
This question appears to have an answer in the comments. This answer shares one additional approach and consolidates the suggestions from the comments.
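The problem you describe is as follows: there is no "Bob" in your a25$names variable, but when you use table, "Bob" shows up. This happens because names is a factor (in your version of R, data.frame converts strings to factors by default), and subsetting a factor retains all the levels of the original column, including those that no longer occur.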
table(a25$names)
#
# Alice Bob Charlie
# 1 0 1
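To see where the extra level comes from, it can help to inspect the factor directly. The following is a minimal sketch rather than a definitive diagnosis. It sets stringsAsFactors = TRUE explicitly, on the assumption that you may be running R 4.0.0 or later, where data.frame no longer converts strings to factors by default (in R 2.15.2 that conversion was the default).

```r
names <- c("Alice", "Bob", "Charlie")
ages <- c(25, 24, 25)
# stringsAsFactors = TRUE reproduces the pre-4.0.0 default behaviour (assumption
# added for this sketch; it is not in the original snippet)
friends <- data.frame(names, ages, stringsAsFactors = TRUE)
a25 <- friends[friends$ages == 25, ]

# The subset has two rows, but the factor still carries all three levels
lv <- levels(a25$names)
print(lv)

# table() reports a count for every level, including levels with no occurrences
counts <- table(a25$names)
print(counts["Bob"])
```

If levels() lists a value that does not appear in the rows, table() will report it with a count of zero.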
Fortunately, there's a function called droplevels that takes care of situations like this:
table(droplevels(a25$names))
#
# Alice Charlie
# 1 1
The droplevels function can work on a data.frame too, allowing you to do the following:
a25alt <- droplevels(friends[friends$ages==25,])
a25alt
# names ages
# 1 Alice 25
# 3 Charlie 25
table(a25alt$names)
#
# Alice Charlie
# 1 1
As mentioned in the comments, also look at as.character and factor:
table(as.character(a25$names))
table(factor(a25$names))
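Both alternatives work for a similar reason: as.character discards the factor structure entirely, while factor rebuilds the levels from the values actually present. A brief sketch of the difference follows, with stringsAsFactors = TRUE set explicitly (an assumption added here so the example behaves the same on R 4.0.0 and later):

```r
friends <- data.frame(names = c("Alice", "Bob", "Charlie"),
                      ages = c(25, 24, 25),
                      stringsAsFactors = TRUE)
a25 <- friends[friends$ages == 25, ]

# as.character() returns a plain character vector with no levels attached
chr <- as.character(a25$names)
print(is.factor(chr))

# factor() re-creates the factor using only the values that occur
fac <- factor(a25$names)
print(levels(fac))

# Either way, table() no longer has a "Bob" category to report
print(names(table(chr)))
```

Which one to prefer depends on whether you still need a factor afterwards: factor keeps the column categorical, as.character does not.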
Upvotes: 1