Reputation: 481
How do I roll a Pandas Dataframe into a matrix whose element's row and column numbers depend on other two columns? I have a Pandas Dataframe that looks like the following:
year,weekofyear,flow
1923,1,12.88
1923,2,3.1
1923,3,3.18
1923,4,10.95
...
1923,50,7.51
1923,51,5.82
1923,52,11.15
1924,1,10.88
1924,2,13.21
1924,3,15.33
...
2015,50,13.93
2015,51,13.93
2015,52,13.93
2015,53,7.96
I want it to turn it into a matrix with flows organized in years x weekofyear. This would result in a matrix with 91 rows (one per year, 2015-1924=91) and 52 columns (it's fine to ignore the 53rd week of the year) looking like the one below:
| 0 1 2 3 ... 50 51 52
-------------------------------------------------
1923 | 12.88,3.1,3.18,10.95, ..., 7.51,5.82,11.15
1924 | 10.88,13.21,15.33, ...
...
2015 | ...,13.93,13.93,13.93,7.96
How should I proceed?
Upvotes: 1
Views: 111
Reputation: 382
Using a numpy matrix:
import pandas as pd
import numpy as np
df = pd.DataFrame() # Your dataframe
matrix = np.zeros((91,53))
for _, row in df.iterrows():
# iterate through the rows in your dataframe
year = row['year']-1924
week = row['weekofyear']-1
flow = row['flow']
matrix[year][week] = flow
def get_flow(year,week):
return matrix[year-1924][week-1]
Upvotes: 1