Amin

Reputation: 43

Machine learning in Python with scikit-learn

I built a database of car information that includes the make, mileage, year, and price, like this:

[[('Volkswagen Polo', 82000, 2010, 43000)], [('Porsche 911', 2500, 2018, 349000)], [('Volvo S60', 89000, 2015, 98000)], [('BMW 1', 127467, 2012, 97000)]]

I'm learning machine learning and I want to use a decision tree.

I want to take the car make, mileage, and year and predict the price. I have tried many approaches, and each time I ran into an error. For example:

ValueError: could not convert string to float: 'Volkswagen Polo'
or
ValueError: Found array with dim 3. Estimator expected <= 2.
or
TypeError: fit() takes 2 positional arguments but 3 were given

I've tried the code below:

cursor = cnx.cursor()
cursor.execute('SELECT * FROM cars_2')
my_result = cursor.fetchall()
x = []
y = []
for item in my_result:
    x.append([item[1:4]])
    y.append([item[4]])
le = preprocessing.LabelEncoder()
le.fit(x, y)

or

cursor = cnx.cursor()
cursor.execute('SELECT * FROM cars_2')
my_result = cursor.fetchall()
x = []
y = []
for item in my_result:
    x.append([item[1:4]])
    y.append([item[4]])
clf = tree.DecisionTreeClassifier()
clf = clf.fit(x, y)

Upvotes: 0

Views: 85

Answers (1)

Ivaylo Strandjev

Reputation: 71009

The LabelEncoder only operates on the target classes, so you should not pass x to it (see here). It also seems you are using the wrong index for the target classes: y.append([item[4]]) should be y.append([item[0]]).
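
For illustration, here is a minimal sketch of one way the three errors could be avoided. It is not taken from the question's code: it assumes each row is a flat (make, mileage, year, price) tuple like the sample data (a table with a leading id column would shift every index by one), it swaps in OrdinalEncoder to turn the make string into a number, and it uses DecisionTreeRegressor instead of DecisionTreeClassifier because price is a continuous value. The names rows, makes, features and new_car are made up for the example.

```python
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeRegressor

# Assumed row layout, copied from the sample data: (make, mileage, year, price).
rows = [
    ('Volkswagen Polo', 82000, 2010, 43000),
    ('Porsche 911', 2500, 2018, 349000),
    ('Volvo S60', 89000, 2015, 98000),
    ('BMW 1', 127467, 2012, 97000),
]

# Encoders for features expect a 2-D input: one inner list per row.
makes = [[row[0]] for row in rows]
encoder = OrdinalEncoder()
make_codes = encoder.fit_transform(makes)  # strings become float codes

# Build a flat 2-D feature list: [make_code, mileage, year] per car.
# Wrapping each row in an extra list (x.append([item[1:4]])) is what
# produces the "Found array with dim 3" error.
features = [
    [code[0], row[1], row[2]]
    for code, row in zip(make_codes, rows)
]

# The target is a plain 1-D list of prices.
prices = [row[3] for row in rows]

model = DecisionTreeRegressor(random_state=0)
model.fit(features, prices)

# Predict for a car with a make the encoder has already seen.
new_car = [[encoder.transform([['Volvo S60']])[0][0], 89000, 2015]]
predicted = model.predict(new_car)
print(predicted)
```

An ordinal code imposes an arbitrary order on the makes, which a tree can usually work around; one-hot encoding is a common alternative when that order is a concern. Note also that a make not seen during fitting would make encoder.transform raise an error with the default settings.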

Upvotes: 1
