marshallbanana
marshallbanana

Reputation: 465

Pandas group and sort by index count

Say I have this dataframe

d = {     'Path'   : ['abc', 'abc', 'ghi','ghi', 'jkl','jkl'],
          'Detail' : ['foo', 'bar', 'bar','foo','foo','foo'],
          'Program': ['prog1','prog1','prog1','prog2','prog3','prog3'],
          'Value'  : [30, 20, 10, 40, 40, 50],
          'Field'  : [50, 70, 10, 20, 30, 30] }


df = DataFrame(d)
df.set_index(['Path', 'Detail'], inplace=True)
df
               Field Program  Value
Path Detail                      
abc  foo        50   prog1     30
     bar        70   prog1     20
ghi  bar        10   prog1     10
     foo        20   prog2     40
jkl  foo        30   prog3     40
     foo        30   prog3     50

I can group and sort it by any column...

df_count = df.groupby('Program')
df_count.apply(lambda x: x.sort())

Program Path    Detail  Field   Program Value
prog1   abc     foo       50    prog1   30
        ghi     foo       20    prog1   40
        jkl     bar       10    prog1   30
prog2   abc     bar       70    prog2   20
prog3   ghi     foo       10    prog3   60
        jkl     foo       30    prog3   50

But what I REALLY want is to sort the programs by their counts

df['Program'].value_counts()

prog1    3
prog3    2
prog2    1
dtype: int64

Something like

df_count.apply(lambda x: x.sort('Programs'.value_counts()))

The ultimate goal is to plot it such that the bars are in ascending or descending order. How can I do that?

Upvotes: 3

Views: 1472

Answers (1)

EdChum
EdChum

Reputation: 393943

You could just add the count as a column and then you can sort by it:

In [20]:
df['count'] = df['Program'].map(df['Program'].value_counts())
df

Out[20]:
             Field Program  Value  count
Path Detail                             
abc  foo        50   prog1     30      3
     bar        70   prog1     20      3
ghi  bar        10   prog1     10      3
     foo        20   prog2     40      1
jkl  foo        30   prog3     40      2
     foo        30   prog3     50      2

In [23]:
df.sort('count', ascending=False)

Out[23]:
             Field Program  Value  count
Path Detail                             
abc  foo        50   prog1     30      3
     bar        70   prog1     20      3
ghi  bar        10   prog1     10      3
jkl  foo        30   prog3     40      2
     foo        30   prog3     50      2
ghi  foo        20   prog2     40      1

Upvotes: 1

Related Questions