coelidonum

Reputation: 543

dtreeviz: AttributeError: 'DataFrame' object has no attribute 'dtype' (Python, scikit-learn)

I am trying to visualize a decision tree with dtreeviz:

import pandas as pd
from sklearn import preprocessing, tree
from dtreeviz.trees import dtreeviz

I have a pandas df like:

df1:

id | age | gender | platform | Customer 
1  | 34  | M      | Web      | User 
2  | 37  | F      | App      | Customer

I create dummy variables for the features and the target:

X = df1[['age', 'gender', 'portfolio_type', 'platform']]
X = pd.get_dummies(data=X, drop_first=True)

Y = df1[[ 'Customer']]
Y = pd.get_dummies(data=Y, drop_first=True)

Then I create the train and test sets:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.4, random_state=101)

If I create a decision tree like this, it works:

import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import tree
from dtreeviz.trees import *


#fit the classifier
clf = tree.DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)

tree.plot_tree(clf)


viz.view()

It also works if I do this:

tree.plot_tree(clf,
               feature_names = X.columns, 
               class_names= df['Customer'],
               rounded=True, 
               filled = True,
               fontsize=7
               );


But if I try to use dtreeviz, I get an error:

viz = dtreeviz(classifier, 
               X[["age",    "gender_M", "portfolio_type_esg",   "platform_web"]], 
               Y,
               target_name='Customer',
               feature_names = X.columns, 
               class_names= list(set(df['Customer']))
              )  
              
viz.view()



AttributeError: 'DataFrame' object has no attribute 'dtype'

Why is this happening, and what can I do about it?

Upvotes: 0

Views: 921

Answers (1)

Alexander L. Hayes

Reputation: 4273

I cannot reproduce this. dtreeviz==1.4.1 at least appears to work when scikit-learn classifiers are fit on dataframes.

MRE:

from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import fetch_california_housing
from dtreeviz.trees import dtreeviz

housing = fetch_california_housing(as_frame=True)
regr = DecisionTreeRegressor(max_depth=2).fit(housing.data, housing.target)

viz = dtreeviz(regr,
               housing.data,               # pandas.DataFrame
               housing.target,             # pandas.Series
               target_name="MedHouseVal",
               feature_names=list(housing.data.columns))
viz.view()
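
One difference worth checking, although this thread does not confirm it as the cause: the MRE above passes the target as a pandas.Series, while the question builds Y from df1[['Customer']] and pd.get_dummies, which yields a pandas.DataFrame. A DataFrame has a .dtypes attribute but no .dtype, which matches the wording of the error. The sketch below uses only pandas and made-up values for the Customer column (the data is an assumption) to show the difference and one way to get a Series from a single-column DataFrame:

```python
import pandas as pd

# Hypothetical stand-in for the question's df1; the values are invented.
df1 = pd.DataFrame({"Customer": ["User", "Customer", "User", "Customer"]})

# Same construction as in the question: the result is a one-column DataFrame.
Y = pd.get_dummies(data=df1[["Customer"]], drop_first=True)
print(type(Y).__name__)      # DataFrame
print(hasattr(Y, "dtype"))   # False: DataFrames expose .dtypes, not .dtype

# Select the single column to get a Series, which does have .dtype.
y = Y.iloc[:, 0]
print(type(y).__name__)      # Series
print(hasattr(y, "dtype"))   # True
```

If this is what triggers the error, passing y (the Series) instead of Y to dtreeviz would be the thing to try. That is an assumption to verify against your installed dtreeviz version, not a confirmed fix.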

Upvotes: 1
