Reputation: 341
I am building a CNN and get the following error when I run:
from tensorflow.keras import utils
trainY=utils.to_categorical(trainY)
ValueError: setting an array element with a sequence.
My trainY holds the labels, and it looks like this:
labels
array([list(['noise']), list(['noise']), list(['noise', 'point_source']),
list(['noise']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise']), list(['noise']),
list(['noise', 'point_source']), list(['noise']),
list(['noise', 'point_source']), list(['noise', 'point_source']),
list(['noise']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise']), list(['noise']),
Any suggestions how to fix this? Many thanks!
Reputation: 36624
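You can do that with sklearn.preprocessing.MultiLabelBinarizer. to_categorical expects one integer class index per sample, but each of your samples carries a variable-length list of string labels, which makes this a multi-label problem.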
import numpy as np
labels = np.array([list(['noise']), list(['noise']), list(['noise', 'point_source']),
list(['noise']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise']), list(['noise']),
list(['noise', 'point_source']), list(['noise']),
list(['noise', 'point_source']), list(['noise', 'point_source']),
list(['noise']), list(['noise', 'point_source']),
list(['noise', 'point_source']), list(['noise']), list(['noise'])],
dtype=object)  # recent NumPy versions require dtype=object for ragged lists
That reproduces what you had. Now binarize the labels like this:
from sklearn.preprocessing import MultiLabelBinarizer
as_list = [list(i) for i in labels]
mlb = MultiLabelBinarizer()
ohe = mlb.fit_transform(as_list) # you might need to add .astype(float)
This is what you'll end up with (the columns follow mlb.classes_, here 'noise' then 'point_source'):
array([[1, 0],
[1, 0],
[1, 1],
[1, 0],
[1, 1],
[1, 1],
[1, 1],
[1, 1], ...
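To check which column corresponds to which label and to confirm the encoding is reversible, here is a minimal sketch. It assumes a small hypothetical three-sample list in place of the full label array, and the names ohe, train_y, and decoded are only illustrative:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical three-sample stand-in for the labels above (plain list of lists)
labels = [["noise"], ["noise", "point_source"], ["noise"]]

mlb = MultiLabelBinarizer()
ohe = mlb.fit_transform(labels)        # integer indicator matrix, one column per class

# classes_ gives the column order (sorted label names)
print(mlb.classes_)

# Cast to float if the loss function expects float targets
train_y = ohe.astype("float32")

# Map the indicator rows back to label tuples to sanity-check the encoding
decoded = mlb.inverse_transform(ohe)
print(decoded)
```

Because a sample can have more than one label, targets like these are usually paired with a sigmoid output layer and a binary cross-entropy loss rather than softmax with categorical cross-entropy. Treat that as a common convention, not something the traceback itself dictates.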