How to format scikit-learn output data?

Question

I'm currently learning to build a machine learning application, and the output of one method has really stumped me; I have never seen output like this.

Code:

import numpy as np
from sklearn import tree

def IsCloseTogether(data):
    amount_of_data = len(data)  # data is an array loaded with examples
    # Reshape to (n_samples, n_features) so it works with clf.fit
    local_feature = np.reshape(data, (amount_of_data, -1))
    labels = [1, 0, 0, 0, 1, 1]  # 1 means it matches, 0 means it doesn't (supervised learning)
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(local_feature, labels)
    # The number strings to classify as "similar" or "different"
    prediction = clf.predict([["111011101"], ["101"]])
    return prediction

After printing it I get this output:

[1 0]

Although the numbers themselves make sense, I would ideally like the elements to show up as actual list elements, like:

['1','0']

I've tried using .join, but the output is not a string, so I can't get it to work. Any idea how to format this output?

desertnaut · Accepted Answer

clf.predict returns a NumPy array:

from sklearn import tree
X = [[0, 0], [1, 1]]
Y = [0, 1]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)

print(clf.predict(X))
# [0 1]

type(clf.predict(X))
# numpy.ndarray
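
The [1 0] in the question is simply how NumPy prints an array: elements separated by spaces, without commas. A minimal sketch of the difference, using a hand-built array named pred as a stand-in for the return value of clf.predict (an assumption, so the snippet runs without a trained model):

```python
import numpy as np

# Stand-in for the array returned by clf.predict (assumed values)
pred = np.array([1, 0])

array_text = str(pred)          # how print(pred) renders the NumPy array
list_text = str(pred.tolist())  # how print renders the equivalent Python list

print(array_text)  # [1 0]
print(list_text)   # [1, 0]
```

So the values are already there; only the printed representation differs from that of a Python list.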

To print it the way you want, convert the array elements to strings and collect them in a list; a single list comprehension does both:

pred = clf.predict(X)
[str(item) for item in pred]
# ['0', '1']
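
For labels with more than one digit, how the conversion is done matters. A minimal sketch, with a hand-built array of hypothetical labels (10 and 2) standing in for clf.predict output: converting with astype(str) and tolist() keeps each label whole, whereas calling ",".join on a single label string puts a comma between its characters.

```python
import numpy as np

# Stand-in for clf.predict output; the labels 10 and 2 are hypothetical
pred = np.array([10, 2, 10])

# Convert each element to a string, then to a plain Python list
as_strings = pred.astype(str).tolist()
print(as_strings)  # ['10', '2', '10']

# Joining a single string inserts the separator between its characters
joined = [",".join(item) for item in pred.astype(str)]
print(joined)  # ['1,0', '2', '1,0']
```

With single-digit labels such as 0 and 1 both forms give the same result, so the difference only shows up once a label has two or more characters.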
