Reputation: 3315

Pandas Group by a range

I have data like

{a : 100, b:102, c:500, d:99, e:78, f:88}

I want to group it into ranges with an interval of 100.

Example:

{ 100: 2, 0: 3, 500:1 }

That is, in plain English:

2 occurrences of a number between 100..199
1 occurrence of a number between 500..599
3 occurrences of a number between 0..99

How can I express this in pandas?

Upvotes: 1

Answers (2)

Quang Hoang

Reputation: 150735

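Grouping by a range is usually done with pd.cut:

import numpy as np
import pandas as pd

# the last key must be distinct ('f'); a repeated 'd' would overwrite 99 with 88
d = {'a' : 100, 'b':102,'c':500, 'd':99, 'e':78, 'f':88}
bins = np.arange(0,601,100)
pd.cut(pd.Series(d), bins=bins, labels=bins[:-1], right=False).value_counts(sort=False)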

Output:

0      3
100    2
200    0
300    0
400    0
500    1
dtype: int64
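
The right=False argument is what makes 100 land in the 100..199 bin rather than in 0..99. A minimal sketch of the difference, assuming the same six values as above (the names left_closed and right_closed are only for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series({'a': 100, 'b': 102, 'c': 500, 'd': 99, 'e': 78, 'f': 88})
bins = np.arange(0, 601, 100)

# right=False: bins are closed on the left, [0, 100), [100, 200), ...
left_closed = pd.cut(s, bins=bins, labels=bins[:-1], right=False)

# default right=True: bins are closed on the right, (0, 100], (100, 200], ...
right_closed = pd.cut(s, bins=bins, labels=bins[:-1])

# the value 100 is labelled 100 in the first case and 0 in the second
print(left_closed['a'], right_closed['a'])
```

With the default, a value sitting exactly on a bin edge is counted in the lower bin, which is not what the question asks for.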

Update

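Actually, pd.cut seems like overkill here. Your case is simpler, because floor division by 100 gives the bin number directly (sort_index keeps the result ordered by bin):

(pd.Series(d)//100).value_counts().sort_index()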

Output:

0    3
1    2
5    1
dtype: int64
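
The index here is the bin number (0, 1, 5), not the start of the range. If the dictionary from the question is wanted, one possible sketch is to multiply back by 100 before counting (the names starts and counts are illustrative, and the data assumes six distinct keys):

```python
import pandas as pd

d = {'a': 100, 'b': 102, 'c': 500, 'd': 99, 'e': 78, 'f': 88}
s = pd.Series(d)

# floor each value to the start of its 100-wide range: 102 -> 100, 99 -> 0
starts = s // 100 * 100

# count values per range start, ordered by range, as plain Python ints
counts = {int(k): int(v) for k, v in starts.value_counts().sort_index().items()}
print(counts)  # {0: 3, 100: 2, 500: 1}
```

Unlike the pd.cut version, this only reports ranges that actually contain a value.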

Upvotes: 3

jezrael

Reputation: 862406

This solution uses the maximum value of the Series to build the bins, passes every bin edge except the last (b[:-1]) to cut as labels, and then counts the values per bin with GroupBy.size:

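import numpy as np
import pandas as pd

d = {'a' : 100, 'b':102, 'c':500, 'd':99, 'e':78, 'f':88}

s = pd.Series(d)

# exclusive upper edge of the bin that contains the maximum (600 here)
max1 = int(s.max() // 100 + 1) * 100
b = np.arange(0, max1 + 100, 100)
print(b)
[  0 100 200 300 400 500 600]

# observed=False keeps the empty bins in the result
d1 = s.groupby(pd.cut(s, bins=b, labels=b[:-1], right=False), observed=False).size().to_dict()
print(d1)
{0: 3, 100: 2, 200: 0, 300: 0, 400: 0, 500: 1}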
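
The result above includes the empty ranges, while the example in the question lists only ranges that contain at least one value. A small sketch of dropping the zero counts, starting from the dictionary printed above (the name non_empty is illustrative):

```python
# dictionary as printed by the groupby/cut solution
d1 = {0: 3, 100: 2, 200: 0, 300: 0, 400: 0, 500: 1}

# keep only the ranges that actually contain a value
non_empty = {start: n for start, n in d1.items() if n > 0}
print(non_empty)  # {0: 3, 100: 2, 500: 1}
```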

Upvotes: 2