Reputation: 761
I'm using cuML for stochastic gradient descent. I used sklearn's train_test_split to generate the splits for train_X, train_y ... from a cuDF dataframe.
The following code (I removed the hyperparameters which aren't relevant to this question):
from cuml.solvers import SGD as cumlSGD
cu_sgd = cumlSGD(eta0=0.005)
cu_sgd.fit(train_X, train_y)
Throws the following error on the cu_sgd.fit line: 'nvstrings' object has no attribute 'to_gpu_array'
How can I get around this issue?
Upvotes: 0
Views: 120
Reputation: 3919
The solution is to first convert any column in train_X or train_y that has the string dtype to the category dtype. Strings can't be converted with to_gpu_array because they are not fixed-width. After the conversion the model no longer sees the actual string values, but they can be reconstructed from the categories, and cu_sgd.fit should work fine.
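A minimal sketch of that conversion, written with pandas so it runs without a GPU. cuDF mirrors the calls used here (astype("category"), .cat.codes, .cat.categories), but I have not checked them against every cuDF release, so treat that as an assumption. The helper name encode_string_columns and the sample data are hypothetical. Taking the integer codes explicitly is also my assumption: depending on the cuML version, the category dtype alone may not be enough for fit.

```python
import pandas as pd  # stand-in for cudf so the sketch runs on a CPU-only machine


def encode_string_columns(df):
    """Replace each string column with its integer category codes.

    Returns the encoded frame plus a dict of column name -> list of
    categories, which is what lets the strings be reconstructed later.
    """
    encoded = df.copy()
    categories = {}
    for col in df.columns:
        if pd.api.types.is_string_dtype(df[col]):
            as_cat = df[col].astype("category")
            categories[col] = list(as_cat.cat.categories)
            encoded[col] = as_cat.cat.codes  # fixed-width integers
    return encoded, categories


# Hypothetical data standing in for the real train_X
train_X = pd.DataFrame({"color": ["red", "blue", "red"],
                        "size": [1.0, 2.0, 3.0]})
encoded_X, categories = encode_string_columns(train_X)

# Reconstruct the original strings from the stored categories
restored = [categories["color"][code] for code in encoded_X["color"].tolist()]
```

Keep the categories dict from the training split and reuse the same mapping on the test split, otherwise the same string can end up with different codes in each.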
Upvotes: 1