Reputation: 465
I need to cluster a simple but fairly large data set. It has 3 columns and about 120k rows, and all the data is numeric. I tried to use rpart but got this lovely error:
Error in rep(1, numclass^2) : invalid 'times' argument
In addition: Warning message:
In matrix(rep(1, numclass^2) - diag(numclass), numclass) :
NAs introduced by coercion
The call itself is nothing unusual:
fit <- rpart(respVar ~ Var1 + Var2, data = varData, method = "class")
I have no problem with 1k rows. It is somewhat slow with 10k rows, but still works. There are no NA values in the data set. I am currently running this on a MacBook Air, but will also try it on a Mac Mini.
I suspect it is a memory issue, but the warning message concerns me. Is there a workaround to get the clustering to work?
Upvotes: 0
Views: 1160
Reputation: 9
I ran into the same problem, but after searching around, I haven't found any solutions.
One workaround that worked for me was changing method="class" to method="anova" (switching from classification to regression).
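For reference, here is a minimal sketch of that change. The data frame below is made up to stand in for varData (the real data is not shown in the question), so only the rpart call itself mirrors the original:

```r
library(rpart)

# Hypothetical data standing in for varData: three numeric columns.
set.seed(42)
varData <- data.frame(Var1 = rnorm(500), Var2 = rnorm(500))
varData$respVar <- 2 * varData$Var1 - varData$Var2 + rnorm(500, sd = 0.1)

# method = "anova" fits a regression tree, so the response is treated as a
# continuous number rather than as one class per distinct value.
fit <- rpart(respVar ~ Var1 + Var2, data = varData, method = "anova")
print(fit$method)
```

Whether a regression tree is what you want depends on what respVar means; if it is really a label, this only sidesteps the error.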
How many levels are there in your response variable? If it has a lot of levels, you could try method="anova".
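A quick way to check the number of levels, sketched on made-up data (the varData below is a hypothetical stand-in, and the idea that rpart squares the class count is only inferred from the rep(1, numclass^2) in the error message):

```r
# Hypothetical stand-in for the asker's data: a numeric response with many
# distinct values.
set.seed(1)
varData <- data.frame(
  respVar = round(runif(1000), 3),
  Var1 = rnorm(1000),
  Var2 = rnorm(1000)
)

# With method = "class", each distinct value of the response becomes a class.
n_levels <- length(unique(varData$respVar))
print(n_levels)

# Assumption: the failing call is rep(1, numclass^2), so compare the square
# of the class count with the largest integer R can represent.
print(n_levels^2 > .Machine$integer.max)
```

If the last line prints TRUE on the real data, the number of classes is the likely culprit rather than the number of rows.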
Upvotes: 0
Reputation: 1585
Yes, I think so.
You get the same error when you call rep with a huge count, for example:
> x <- rep(0,120000*12000000)
Error in rep(0, 120000 * 1.2e+07) : invalid 'times' argument
In addition: Warning message:
NAs introduced by coercion
But this is just a guess; I don't know for sure.
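If that guess is right, the limit would be R's 32-bit integer range rather than RAM. A small sketch of the arithmetic, assuming rpart really does pass the squared class count to rep as the error message suggests (max_classes is just an illustrative name):

```r
# Largest value R can hold in a 32-bit integer.
print(.Machine$integer.max)

# Largest class count whose square still fits in that range.
max_classes <- floor(sqrt(.Machine$integer.max))
print(max_classes)

# 120,000 distinct response values (the worst case for 120k rows) would
# push the square past the limit.
print(120000^2 > .Machine$integer.max)
```

That would also be consistent with 1k and 10k rows working, since they cannot produce that many distinct values.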
Upvotes: 1