Reputation: 2904
I have one excel document which contains sport column, in which sports name and sports persons names are available. If I clicked on sports name sports persons names are disappears i.e. sports persons names are children's of the sports name.
Please look at the data below:
If I clicked on cricket then ramesh, suresh,mahesh names are disappears i.e. cricket is the parent of ramesh, suresh and mahesh like same football is the parent of pankaj, riyansh, suraj.
I want to read this excel document and convert in the python pandas Dataframe. I tried to read it with pandas pivot_table but I'm not getting any success.
I tried to read this excel sheet and converted into a dataframe.
df = pd.read_excel("sports.xlsx",skiprows=7,header=0)
d = pd.pivot_table(df,index=["sports"])
print d
But I'm getting all the sports values in single column I want to split it by sports name and it's corresponding sports persons name.
Expected Output:
sports_name player_name age address
cricket ramesh 20 aaa
cricket suresh 21 bbb
cricket mahesh 22 ccc
football pankaj 24 eee
football riyansh 25 fff
football suraj 26 ggg
basketball rajesh 28 iii
basketball abhijeet 29 jjj
Upvotes: 2
Views: 19512
Reputation: 9721
pandas.pivot_table is there to support data analysis and helps you to create pivot tables similar to excel, not to read excel pivot tables.
Create a spreadsheet-style pivot table as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame
Example from Documentation
>>> df
A B C D
0 foo one small 1
1 foo one large 2
2 foo one large 2
3 foo two small 3
4 foo two small 3
5 bar one large 4
6 bar one small 5
7 bar two small 6
8 bar two large 7
>>> table = pivot_table(df, values='D', index=['A', 'B'],
... columns=['C'], aggfunc=np.sum)
>>> table
small large
foo one 1 4
two 6 NaN
bar one 5 4
two 6 7
Now to help you on the problem, I created a sample data set and a pivot table.
Then read the excel sheet into pandas dataframe. This dataframe contains nans to be replaced using df.fillna(method='ffill')
df = pd.read_excel(pviotfile,skiprows=12,header=0)
df=df.fillna(method='ffill')
print (df)
output
Sports Name Address Age
0 basketball Abhijit 129 ABC 20
1 basketball Rajesh 128 ABC 20
2 Cricket Mahesh 123 ABC 20
3 Cricket Ramesh 126 ABC 20
4 Cricket Suresh 124 ABC 20
5 Football Riyash 125 ABC 20
6 Football suraj 127 ABC 20
Upvotes: 4