I'm using logistic regression to plot a ROC curve. I'm using this code to load the data:
Diabetes=pd.read_csv('datasource/ScoringDatasheet.csv', sep=';')
Then I select the input and output columns with iloc, like this:
inputData=Diabetes.iloc[:,:60]
outputData=Diabetes.iloc[:,60]
Then I use logistic regression to analyze the data and plot the ROC curve:
from sklearn.linear_model import LogisticRegression
logit1=LogisticRegression()
logit1.fit(inputData,outputData)
logit1.score(inputData,outputData)
np.mean(logit1.predict(inputData)==outputData)
trueInput=Diabetes.ix[Diabetes['Outcome']==1].iloc[:,:62]
trueOutput=Diabetes.ix[Diabetes['Outcome']==1].iloc[:,62]
np.mean(logit1.predict(trueInput)==trueOutput)
falseInput=Diabetes.ix[Diabetes['Outcome']==0].iloc[:,:62]
falseOutput=Diabetes.ix[Diabetes['Outcome']==0].iloc[:,62]
np.mean(logit1.predict(falseInput)==falseOutput)
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score
confusion_matrix(logit1.predict(inputData),outputData)
fpr, tpr,_=roc_curve(logit1.predict(inputData),outputData,drop_intermediate=False)
import matplotlib.pyplot as plt
plt.figure()
plt.plot(fpr, tpr, color='red', lw=2, label='ROC curve')
plt.plot([0, 1], [0, 1], color='blue', lw=2, linestyle='--')
plt.xlabel('False Positive ')
plt.ylabel('True Positive ')
plt.title('ROC curve')
plt.show()
roc_auc_score(logit1.predict(inputData),outputData)
coef_DF=pd.DataFrame(data={'Variable':list(inputData),
'value':(logit1.coef_[0])})
coef_DF_standardised=pd.DataFrame(data={'Variable':list(inputData),
'value':(logit1.coef_[0])*np.std(inputData,axis=0)/np.std(outputData)})
import matplotlib.pyplot as plt
plt.figure()
plt.scatter(inputData.iloc[:,1],inputData.iloc[:,5],c=logit1.predict_proba(inputData)[:,1],alpha=0.4)
plt.xlabel('Glucose level ')
plt.ylabel('BMI ')
plt.show()
plt.figure()
plt.scatter(inputData.iloc[:,1],inputData.iloc[:,5],c=outputData,alpha=0.4)
plt.xlabel('Glucose level ')
plt.ylabel('BMI ')
plt.show()
But when I run my code, I get the following error:
Traceback (most recent call last):
  File "index.py", line 13, in <module>
    logit1.fit(inputData,outputData)
  File "C:\Users\kulkaa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\linear_model\logistic.py", line 1221, in fit
    check_classification_targets(y)
  File "C:\Users\kulkaa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\multiclass.py", line 171, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'
According to this link, I should convert the floats to categorical values if I'm using a classifier. But I'm using regression here, so how can I fix this error?
Part of the data set I'm using is given below:
Pat_ID Demo1 Demo2 Demo3 Demo4 Demo5 Demo6 DisHis1 DisHis1Times DisHis2 DisHis2Times DisHis3 DisHis3Times DisHis4 DisHis5 DisHis6 DisHis7 DisStage1 DisStage2 LungFun1 LungFun2 LungFun3 LungFun4 LungFun5 LungFun6 LungFun7 LungFun8 LungFun9 LungFun10 LungFun11 LungFun12 LungFun13 LungFun14 LungFun15 LungFun16 LungFun17 LungFun18 LungFun19 LungFun20 Dis1 Dis1Treat Dis2 Dis2Times Dis3 Dis3Times Dis4 Dis4Treat Dis5 Dis5Treat Dis6 Dis6Treat Dis7 RespQues1 ResQues1a ResQues1b ResQues1c ResQues2a SmokHis1 SmokHis2 SmokHis3 SmokHis4
6 0 0.430159833 0.596541787 0.323296661 0 0.867768595 0 0 0 0 0 0 0 0 0 0 0.8 0.714285714 0.447443182 0.280725319 0.392405063 0.315347722 0.442765731 0.35344 0.306497788 0.078249895 0.230895645 0 0.175430575 0.776595745 0.194322248 0.123935854 0.792696843 0.873987854 0.803933254 0.528064786 1 0.1 0 0 0 0 0.333333333 0.15 0 0 0 0 0.333333333 1 0 0.273565574 0.1074 0.7282 0.0469 0.3 0.082352941 0.085237258 0.724137931 0.145833333
9 0 0.218902015 0.484149856 0.177957923 0 0.225895317 0 0 0 0 0 0 0 0 0 0 0.6 0.142857143 0.899147727 0.441235729 0.620253165 0.708333333 0.69303235 0.55904 0.532922703 0.263357173 0.718707204 0.729159016 0.65096784 0.64893617 0.385594463 0.234804989 0.613921643 0.409665992 0.483313468 0.115610165 0 0.5 0 0 0 0 1 0 1 0 1 0 0.333333333 1 0 0.456557377 0.1791 0.7896 0.3212 0.2 0.176470588 0.144991213 0.620689655 0
15 0 0.628908965 0.433717579 0.594093804 1 0.363636364 0 0 0 0 0 0 0 0 0 0 0 0.142857143 0.970170455 0.396910678 0.746835443 0.575239808 0.478848205 0.36944 0.565368266 0.309002945 0.569433032 0.463643041 0.425392471 0.787234043 0.427004516 0.290833498 0.652339293 0.484311741 0.511323004 0.138788048 0 0.6 0 0 0 0 1 0 0 0 0 0 0.333333333 1 0 0.396413934 0.2596 0.8032 0.1836 0.2 0.058823529 0.052724077 0.637931034 0.0625
25 1 0.236275191 0.268011527 0.280254777 0 0.388429752 0 0 0 0 0 0 0 0 0 0 0.6 0 0.721590909 0.39758227 0.53164557 0.363309353 0.394063278 0.31088 0.224863364 0.096339924 0.321007943 0.351817848 0.361377839 0.521276596 0.213208986 0.059196199 0.728413846 0.497975709 0.62932062 0.147165596 1 0.6 0 0 1 0 0.333333333 0.05 0 0 0 0 0 0 0 0 0 0 0 0.1 0.176470588 0.118629174 0.517241379 0.104166667
27 1 0.397498263 0.327089337 0.425786528 0 0.063360882 0 0 0 0 0 0 0 0 0 0 0 0 0.950284091 0.358629953 0.82278481 0.580035971 0.462851049 0.33696 0.40426824 0.508834666 0.594631608 0.491737055 0.431489102 0.819148936 0.372514517 0.373589388 0.623430962 0.422823887 0.489272944 0.114493158 0 0.9 0 0 0.333333333 0.020833333 0.333333333 0.05 0 0 0 0 0 0 0 0.058709016 0 0.1847 0 0 0.176470588 0.087873462 0.396551724 0.0625
28 1 0.510771369 0.452449568 0.468249373 0 0.027548209 0 0 0 0 0 0 1 0 0 0 0 0.142857143 0.928977273 0.392209537 0.746835443 0.648081535 0.547813722 0.4232 0.46777132 0.379259571 0.675431389 0.581894969 0.502362445 0.79787234 0.351398909 0.388437933 0.597565614 0.441548583 0.472586412 0.122591455 0 0.9 0 0 0 0 0 0 0 0 0 0 0 0 1 0.480840164 0.5239 0.5354 0.4146 0.1 0.411764706 0.156414763 0.272413793 0
36 1 0.385684503 0.341498559 0.405134144 0 0.195592287 0 0 0 0 0 0 0 0 0 0 0.6 0.142857143 0.737215909 0.36937542 0.594936709 0.43735012 0.455563455 0.33952 0.259651254 0.165124106 0.447274719 0.432611091 0.384545039 0.691489362 0.212387823 0.159176401 0.647014074 0.504807692 0.511918951 0.148841106 0 0.8 1 0 0.333333333 0.041666667 0.333333333 0.1 0.333333333 0 1 0 1 0 1 0.453790984 0.5014 0.5946 0.3379 0.2 0.117647059 0.077768014 0.515517241 0
Upvotes: 1
Views: 174
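sklearn.linear_model.LogisticRegression is a classifier, not a regressor, despite its name (see http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html). Its fit method therefore expects discrete class labels as the target, and the continuous float values in outputData are what trigger this ValueError.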
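As a rough illustration, the sketch below reproduces the error on synthetic data and shows one possible way around it. Everything here is an assumption made for the example: the random arrays stand in for inputData and outputData, and the 0.5 cut-off used to turn the floats into two classes is arbitrary, not something taken from your data set. It also passes the true labels first and the predicted probabilities second to roc_curve and roc_auc_score, which is the argument order those functions document.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-ins for inputData / outputData (assumption, not the real data).
rng = np.random.default_rng(0)
X = rng.random((100, 3))
y_continuous = rng.random(100)  # floats in [0, 1), like a normalized column

# scikit-learn classifies the target type before fitting a classifier.
target_kind = type_of_target(y_continuous)
print(target_kind)

# Fitting a classifier on a continuous target raises the ValueError.
fit_error = None
try:
    LogisticRegression().fit(X, y_continuous)
except ValueError as err:
    fit_error = str(err)
print(fit_error)

# One possible fix: derive class labels from the floats.
# The 0.5 threshold is a hypothetical choice for illustration only.
y_binary = (y_continuous > 0.5).astype(int)
clf = LogisticRegression().fit(X, y_binary)

# ROC needs the true labels first and a continuous score second;
# predict_proba gives the probability of the positive class.
scores = clf.predict_proba(X)[:, 1]
fpr, tpr, _ = roc_curve(y_binary, scores)
auc = roc_auc_score(y_binary, scores)
print(auc)
```

If the target really is a continuous quantity rather than a class, a regressor such as sklearn.linear_model.LinearRegression may be the better fit, but then a ROC curve no longer applies, because ROC is defined for binary class labels.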
Upvotes: 1