Chob

Reputation: 93

Basic Decision Tree things in Python

I am studying the basics of decision trees in a machine learning book. This example appears in the book to illustrate a few concepts (I know this is not a decision tree itself, but there are some lines I do not understand).

import numpy as np
import matplotlib.pyplot as plt

def gini(p):
    return p * (1 - p) + (1 - p) * (1 - (1 - p))

def entropy(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def error(p):
    return 1 - np.max([p, 1 - p])

x = np.arange(0.0, 1.0, 0.01)
ent = [entropy(p) if p != 0 else None for p in x]
sc_ent = [e * 0.5 if e else None for e in ent]
err = [error(i) for i in x]
fig = plt.figure()
ax = plt.subplot(111)

for i, lab, ls, c in zip([ent, sc_ent, gini(x), err],
                         ['Entropy', 'Entropy (scaled)', 'Gini Impurity', 'Misclassification Error'],
                         ['-', '-', '--', '-.'],
                         ['black', 'lightgray', 'red', 'green']):
    line = ax.plot(x, i, label=lab, linestyle=ls, lw=2, color=c)

ax.legend(loc='upper center', bbox_to_anchor=(0.5, 1.15), ncol=4, fancybox=True, shadow=False)
ax.axhline(y=0.5, linewidth=1, color='k', linestyle='--')
ax.axhline(y=1.0, linewidth=1, color='k', linestyle='--')
plt.ylim([0, 1.1])
plt.xlabel('p(i=1)')
plt.ylabel('Impurity index')
plt.show()

Could someone explain what happens in these three lines?

ent = [entropy(p) if p != 0 else None for p in x]
sc_ent = [e * 0.5 if e else None for e in ent]
err = [error(i) for i in x]

Upvotes: 1

Views: 83

Answers (1)

Igor Rivin

Reputation: 4864

x is an array of numbers between 0 and 1. The entropy of 0 is undefined (the term 0 * log2(0) evaluates to NaN), so the first line substitutes None at p == 0 and computes entropy(p) for every other value.

The second line takes the list of entropies and halves every numerical value, passing None through unchanged. The third line builds a list of misclassification errors, one for each element of x.
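A minimal standalone sketch of those first two comprehensions, using a few hand-picked probabilities instead of the full np.arange grid:

```python
import numpy as np

def entropy(p):
    # binary entropy; the 0 * log2(0) term makes it undefined at p == 0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

x = [0.0, 0.25, 0.5]

# substitute None where entropy is undefined (p == 0)
ent = [entropy(p) if p != 0 else None for p in x]

# halve every numeric entropy; leave the None placeholders alone
sc_ent = [e * 0.5 if e else None for e in ent]

print(ent)     # [None, 0.8112..., 1.0]
print(sc_ent)  # [None, 0.4056..., 0.5]
```

One caveat: `if e` tests truthiness, not `e is not None`, so an entropy of exactly 0.0 would also be mapped to None. That never happens here, because entropy is 0 only at p = 0 (already None) and p = 1 (not in the grid).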

Upvotes: 1
