Pandas Groupby and sorting columns simultaneously

Question

I have a dataframe dfas:

  Election Year     Votes   Vote %      Party              Region   
0   2000            42289   29.40   Janata Dal (United)     A
1   2000            27618   19.20   Rashtriya Janata Dal    A
2   2000            20886   14.50   Bahujan Samaj Party     B 
3   2000            17747   12.40   Congress                B
4   2000            14047   19.80   Independent             C
5   2000            17047   10.80   JLS                     C
6   2005            8358    15.80   Janvadi Party           A
7   2005            4428    13.10   Independent             A
8   2005            1647    1.20    Independent             B
9   2005            1610    11.10   Independent             B
10  2005            1334    15.06   Nationalist             C
11  2005            1834    18.06   NJM                     C
12  2010            21114   20.80   Independent             A
13  2010            1042    10.5    Bharatiya Janta Dal     A
14  2010            835     0.60    Independent             B
15  2010            14305   15.50   Independent             B
16  2010            22211   17.70   Congress                C
16  2010            20011   14.70   INC                     C

How can I get the list of the regions that has two or more parties getting more than 10% vote % every election year?

I have used the following code to sort "Vote %" in descending order after grouping by "Election year" and "Region" and to then compare the top 2 Vote% every year, but it is giving an error.

df1 = df.groupby(['Election Year','Region'])sort_values('Vote %', ascending = False).reset_index()

How to correct the error as I want to get the top 2 "Party" of each region in each year after the sorting and then see if they get more than 10% vote every election year?

Desired output:

Election Year    Region    Vote %
  2000             A        29.40
  2000             A        19.40
  2000             C        19.80
  2000             C        10.80
  2005             A        15.80
  2005             A        13.10
  2005             C        15.06
  2005             C        18.06
  2010             A        20.80
  2010             A        10.5
  2010             C        17.70
  2010             C        14.70

Output contains only regions having more than 10% vote every year and Election year and region name in sorted in ascending order. So, here only Region "A" and "C" will be there in the output.

Pandas Groupby and sorting columns simultaneously

Answers (1)

Related Questions