Reputation: 23921
I am very new to R. I have data that looks like this:
> head(NB)
a s e i
9011 20-30 F Others 10-50K
9012 GT 45 M Others 10-50K
I classify it with naiveBayes like this:
c = i ~ a + s + e
cl = naiveBayes(c, head(NB,1500), laplace = 0)
Then I predict its outcome on the new data like this:
> p <- predict(cl, tail(NB, 500), type = c("class", "raw"), threshold = 0.001)
I want to look at the prediction for each data point in p and see how well it matches the actual value of i, but I can't figure out what p actually represents. It seems to have no rows and no columns, yet it plots as a histogram that appears to show predictions from the data.
> nrow(p)
NULL
> ncol(p)
NULL
> str(p)
Factor w/ 3 levels "10-50K","50-80K",..: 1 1 1 1 1 1 1 1 1 1 ...
What is going on? How do I find out what it predicts for, say, the 3rd value in p? Why doesn't p have any rows or columns?
Upvotes: 0
Views: 131
Reputation: 88
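p is a factor, which in R is a one-dimensional vector of categorical values. Vectors do not have rows or columns, only a length, which is why nrow(p) and ncol(p) return NULL. Typing length(p) will give you the number of predictions. Each element of p is one of "10-50K", "50-80K", or a third value. To see the values that occur in p, type unique(p); to see every possible value, type levels(p).
To get the third element of p, access it as you would with any other vector: p[3]. To see all of p, use print(p). If you want to count the predictions that match your original data, compare p against the same rows you predicted on: sum(p == tail(NB, 500)$i). Comparing against the whole column NB$i would not line up, because p only covers the last 500 rows. Have a look here for more info: http://www-users.cs.york.ac.uk/~jc/teaching/arin/R_practical/.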
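The points above can be tried out on a small toy example. This is only a sketch: the two factors below are hypothetical stand-ins (actual plays the role of tail(NB, 500)$i and p plays the role of what predict() returned), and "level3" is a placeholder for the third level, which the str() output in the question does not show.

```r
# Hypothetical stand-in data, not the asker's NB data frame.
# "level3" is a placeholder name for the third, unshown level.
lv     <- c("10-50K", "50-80K", "level3")
actual <- factor(c("10-50K", "50-80K", "10-50K", "level3", "10-50K"), levels = lv)
p      <- factor(c("10-50K", "10-50K", "10-50K", "level3", "50-80K"), levels = lv)

# A factor is one-dimensional: it has a length but no dim attribute,
# so nrow(p) and ncol(p) are both NULL.
is.null(dim(p))
length(p)

# Prediction for the third data point.
third <- p[3]

# Element-wise comparison against the actual labels of the same rows.
n_correct <- sum(p == actual)   # number of matching predictions
accuracy  <- mean(p == actual)  # fraction of matching predictions

# Cross-tabulation of predicted vs. actual classes.
confusion <- table(predicted = p, actual = actual)
print(confusion)
```

With the real data, the same comparison would use tail(NB, 500)$i in place of actual, assuming p was predicted on exactly those rows.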
Upvotes: 1