Reputation: 71
Using the sklearn package, I would like to find the Gini coefficient of each feature on the paths of a given class, for example in the iris data. Something like: Iris-virginica, petal length gini: 0.4, petal width gini: 0.4.
Upvotes: 6
Views: 18537
Reputation: 3781
Not sklearn, but this is based on the Lorenz curve definition and should do the trick:
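import numpy as np

def gini(x):
    # Accept lists as well as arrays; x.mean() needs an ndarray.
    x = np.asarray(x, dtype=float)
    # Sum of |x_i - x_j| over all ordered pairs, normalised by 2 * n^2 * mean.
    return np.sum(np.abs(np.subtract.outer(x, x))) / (2 * len(x)**2 * x.mean())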
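A quick way to sanity-check a pairwise formula like this is to try the two extreme cases: identical values should give 0, and a single item holding the whole total should give (n - 1) / n. The sketch below restates the function so it runs on its own; the variable names `equal` and `concentrated` are just for illustration.

```python
import numpy as np


def gini(x):
    """Gini coefficient as half the relative mean absolute difference."""
    x = np.asarray(x, dtype=float)
    # Sum of |x_i - x_j| over all ordered pairs (i, j); builds an n-by-n array.
    pairwise = np.abs(np.subtract.outer(x, x)).sum()
    return pairwise / (2 * len(x) ** 2 * x.mean())


equal = gini([5, 5, 5, 5])         # identical values: no inequality
concentrated = gini([0, 0, 0, 1])  # one item holds the whole total
```

Note that the outer difference allocates an n-by-n array, so this approach is probably best kept to inputs of a few thousand values.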
Upvotes: 0
Reputation: 136437
You can calculate the gini coefficient with Python+numpy like this:
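from typing import List
from itertools import combinations

import numpy as np

def gini(x: List[float]) -> float:
    x = np.array(x, dtype=np.float64)
    n = len(x)
    # combinations() yields each unordered pair once, so this is half of
    # the sum over all ordered pairs (i, j).
    diffs = sum(abs(i - j) for i, j in combinations(x, r=2))
    # Hence the denominator is n**2 * mean rather than 2 * n**2 * mean.
    return diffs / (n**2 * x.mean())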
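To get back to the original question (one value per feature within one class), a function like this can be applied column by column to the rows of each class. Below is a minimal sketch, assuming the Gini coefficient (the inequality measure) is what is wanted, rather than the Gini impurity that sklearn's decision trees use. It is plain Python and swaps in the sort-based rank formula, which gives the same value as the pairwise sum in O(n log n). The names `gini_sorted` and `gini_per_feature` and the sample rows are made up for illustration; they are not real iris measurements.

```python
from collections import defaultdict


def gini_sorted(values):
    """Gini coefficient via the sorted-rank formula (O(n log n))."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a non-zero sum")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def gini_per_feature(rows, labels, feature_names):
    """Return {label: {feature: gini}} for rows grouped by class label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: {
            name: gini_sorted([row[k] for row in members])
            for k, name in enumerate(feature_names)
        }
        for label, members in grouped.items()
    }


# Hypothetical measurements (petal length, petal width), not real iris rows.
rows = [(1.0, 0.2), (3.0, 0.2), (4.0, 1.0), (4.0, 3.0)]
labels = ["a", "a", "b", "b"]
result = gini_per_feature(rows, labels, ["petal length", "petal width"])
```

With real data, `rows` and `labels` could come from `sklearn.datasets.load_iris()` (its `data` and `target` attributes), which is the only part sklearn would need to play here.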
Upvotes: 6