Counting frequency of values by date using pandas - Part II

Question

I have dataset (dataset1) that looks as follows:

Date        Company     Weekday

2015-01-01  Company1     Monday

2015-01-02  Company1     Tuesday

2015-01-03  Company1     Wednesday

2015-01-04  Company1     Thursday

2015-12-09  Company2     Monday

2015-12-10  Company2     Tuesday
………………………………………………………………………

2016-01-08  Company3     Wednesday

2016-01-09  Company3     Thursday

I then apply the following code:

dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)

Once the above code has been applied, I get the following results:

Index                        0

('Company1', Monday)        80

('Company1', Tuesday)       80

('Company1', Wednesday)     79
………………………………………………………………….

('Company3', Tuesday)       34

I am trying to isolate all dataset2 entries with a count values above 50, but I get all kinds of errors when I try the following:

dataset2=dataset2.loc[dataset2[0]>50]

Can anyone offer an opinion?

jezrael · Accepted Answer

Working with Series, so need:

dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)
dataset2 = dataset2[dataset2 > 50]

Another solution is add Series.reset_index with parameter name for DataFrame and then filter by column count:

dataset2 = (dataset1.groupby(['Company','Weekday'])
                    .size()
                    .sort_values(ascending=False)
                    .reset_index(name='count'))

dataset2 = dataset2[dataset2['count'] > 50]

Counting frequency of values by date using pandas - Part II

Answers (1)

Related Questions