ketan
ketan

Reputation: 2904

How to read pivot table from excel document in python pandas?

I have one excel document which contains sport column, in which sports name and sports persons names are available. If I clicked on sports name sports persons names are disappears i.e. sports persons names are children's of the sports name.

Please look at the data below: enter image description here

If I clicked on cricket then ramesh, suresh,mahesh names are disappears i.e. cricket is the parent of ramesh, suresh and mahesh like same football is the parent of pankaj, riyansh, suraj.

I want to read this excel document and convert in the python pandas Dataframe. I tried to read it with pandas pivot_table but I'm not getting any success.

I tried to read this excel sheet and converted into a dataframe.

df = pd.read_excel("sports.xlsx",skiprows=7,header=0)
d = pd.pivot_table(df,index=["sports"])
print d

But I'm getting all the sports values in single column I want to split it by sports name and it's corresponding sports persons name.

Expected Output:

sports_name player_name   age address
cricket     ramesh        20  aaa
cricket     suresh        21  bbb
cricket     mahesh        22  ccc
football    pankaj        24  eee
football    riyansh       25  fff
football    suraj         26  ggg
basketball  rajesh        28  iii
basketball  abhijeet      29  jjj

Upvotes: 2

Views: 19512

Answers (1)

Shijo
Shijo

Reputation: 9721

pandas.pivot_table is there to support data analysis and helps you to create pivot tables similar to excel, not to read excel pivot tables.

Create a spreadsheet-style pivot table as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame

Example from Documentation

>>> df
   A   B   C      D
0  foo one small  1
1  foo one large  2
2  foo one large  2
3  foo two small  3
4  foo two small  3
5  bar one large  4
6  bar one small  5
7  bar two small  6
8  bar two large  7


>>> table = pivot_table(df, values='D', index=['A', 'B'],
...                     columns=['C'], aggfunc=np.sum)
>>> table
          small  large
foo  one  1      4
     two  6      NaN
bar  one  5      4
     two  6      7

Now to help you on the problem, I created a sample data set and a pivot table.

Then read the excel sheet into pandas dataframe. This dataframe contains nans to be replaced using df.fillna(method='ffill')

enter image description here

df = pd.read_excel(pviotfile,skiprows=12,header=0)
df=df.fillna(method='ffill')
print (df)

output

       Sports     Name  Address  Age
0  basketball  Abhijit  129 ABC   20
1  basketball   Rajesh  128 ABC   20
2     Cricket   Mahesh  123 ABC   20
3     Cricket   Ramesh  126 ABC   20
4     Cricket   Suresh  124 ABC   20
5    Football   Riyash  125 ABC   20
6    Football    suraj  127 ABC   20

Upvotes: 4

Related Questions