hY8vVpf3tyR57Xib

Reputation: 3915

Adding a 'rest'-group with Pandas value_counts()

I just started working with the pandas library to analyze large datasets. I am analyzing credit card data that has the property issuercountrycode, which can take 117 different values. To visualize which issuercountrycode values occur in my dataset, I currently use the following code to generate a pie chart.

import matplotlib.pyplot as plt

df['issuercountrycode'].value_counts().plot(kind='pie')
plt.show()

This results in the following pie chart:

Example of my piechart

As you can see, this isn't ideal because many values occur only rarely. Is there a way in pandas, when using the value_counts() function, to set a threshold and put all values whose count falls below it into a 'rest' group? Are these types of operations even possible in pandas?

Upvotes: 2

Views: 908

Answers (1)

jezrael

Reputation: 862511

You need to count the values, then use boolean indexing and sum:

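thresh = 2
a = df['issuercountrycode'].value_counts()
# keep only the values that occur more than thresh times
b = a[a > thresh].copy()
# fold the counts of all remaining values into a single 'rest' entry
b['rest'] = a[a <= thresh].sum()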

Sample:

import numpy as np
import pandas as pd

np.random.seed(10)
L = list('abcdef')
df = pd.DataFrame({'issuercountrycode': np.random.choice(L, size=15)})

thresh = 2
a = df['issuercountrycode'].value_counts()
b = a[a > thresh].copy()
b['rest'] = a[a <= thresh].sum()
print(b)
b       5
f       3
a       3
rest    4
Name: issuercountrycode, dtype: int64

b.plot.pie()

graph
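
If you need this in more than one place, the same steps can be wrapped in a small helper. This is only a sketch of the approach above: the function name value_counts_with_rest, its label parameter, and the country codes in the example data are made up for illustration and are not part of pandas or of the original dataset.

```python
import pandas as pd


def value_counts_with_rest(series, thresh, label="rest"):
    """Count values, folding those with count <= thresh into one `label` entry.

    Hypothetical helper, not a pandas function.
    """
    counts = series.value_counts()
    # values frequent enough to keep their own slice
    keep = counts[counts > thresh].copy()
    # combined count of everything at or below the threshold
    rest_total = counts[counts <= thresh].sum()
    if rest_total > 0:
        keep[label] = rest_total
    return keep


# made-up example data: NL x4, GB x3, and three codes that occur once
codes = pd.Series(
    ["NL", "NL", "NL", "NL", "GB", "GB", "GB", "US", "DE", "FR"],
    name="issuercountrycode",
)
grouped = value_counts_with_rest(codes, thresh=2)
print(grouped)
```

The result is an ordinary Series, so grouped.plot.pie() works the same way as b.plot.pie() above. The 'rest' entry is only added when something actually falls at or below the threshold, which avoids an empty slice in the chart.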

Upvotes: 2
