Reputation: 2890
I'm relatively new to Python, so I'm not sure how to achieve this. I was following an online tutorial on plotting a decision tree using the Iris dataset (for classification). However, I'm trying to plot a single tree from a random forest regression model instead.
Here's a snip of the data I'm using:
Here's the code I was using:
# Import Libraries and Load Data
import pandas as pd
data = pd.read_csv("/Users/.../Desktop/cars_test.csv")
import matplotlib.pyplot as plt
import numpy as np
cars = data
# Model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(n_estimators=10)
# Train
model.fit(cars.data, cars.target)
# Extract single tree for analysis
estimator = model.estimators_[5]
However, I'm getting an error that I'm not sure how to fix:
AttributeError                            Traceback (most recent call last)
<ipython-input-27-37164305d7fe> in <module>()
     10 
     11 # Train
---> 12 model.fit(cars.data, cars.target)
     13 
     14 # Extract single tree for analysis

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'data'
Any suggestions as to what I'm doing wrong?
Upvotes: 0
Views: 593
Reputation: 9968
You need to adapt the code to deal with your own data. Note that the DataFrame you loaded doesn't have target or data attributes; those exist on the object returned by scikit-learn's bundled dataset loaders, which is presumably what the Iris tutorial used. This means extracting the matrix of input data (X) and the response variable (y) from your original dataset. I'm making a few assumptions here, but you can adapt accordingly.
# Import Libraries and Load Data
import pandas as pd
data = pd.read_csv("/Users/.../Desktop/cars_test.csv")
import matplotlib.pyplot as plt
import numpy as np
cars = data
# Model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(n_estimators=10)
# Assumption: 'th_km_per_year' is the target column and every other column is a numeric feature
X = cars.loc[:, cars.columns != 'th_km_per_year'].values
y = cars['th_km_per_year'].values
# Train
model.fit(X, y)
# Extract single tree for analysis
estimator = model.estimators_[5]
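To actually look at the extracted tree, something like the following might work. It is only a sketch: the DataFrame here is a made-up stand-in for cars_test.csv (only the 'th_km_per_year' column name comes from the code above; 'age_years', 'engine_cc' and all values are hypothetical), and max_depth=3 is just there to keep the output short. It uses export_text from sklearn.tree, which needs scikit-learn 0.21 or later.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_text

# Hypothetical stand-in for cars_test.csv: only 'th_km_per_year' appears in
# the original post; the other column names and all values are made up.
rng = np.random.RandomState(0)
cars = pd.DataFrame({
    "age_years": rng.randint(1, 15, size=50),
    "engine_cc": rng.randint(900, 3000, size=50),
    "th_km_per_year": rng.uniform(5, 40, size=50),
})

target = "th_km_per_year"
feature_names = [c for c in cars.columns if c != target]

# A DataFrame has no .data / .target attributes, so select the feature
# columns and the target column explicitly.
X = cars[feature_names].values
y = cars[target].values

# Shallow trees (assumed setting) keep the printed rules readable.
model = RandomForestRegressor(n_estimators=10, max_depth=3, random_state=0)
model.fit(X, y)

# Each fitted member of the forest is a DecisionTreeRegressor.
estimator = model.estimators_[5]

# Text rendering of the split rules of that single tree.
tree_text = export_text(estimator, feature_names=feature_names)
print(tree_text)
```

If you want a drawn figure rather than text, sklearn.tree.plot_tree(estimator, feature_names=feature_names) followed by plt.show() should give you one in the same scikit-learn versions, though I haven't tried it against your data.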
Upvotes: 1