Reputation: 21971
I would like to transform the foll. dataframe:
index YR BIOM RWT site
0 0 2008 0.53 0.20 1
1 1 2009 3.23 1.18 1
2 2 2010 11.51 3.94 1
3 3 2011 18.14 5.82 1
4 4 2012 22.88 6.73 1
5 5 2013 26.65 7.20 1
6 0 2008 0.39 0.15 10
7 1 2009 2.43 0.90 10
8 2 2010 8.95 3.09 10
9 3 2011 16.63 5.38 10
10 4 2012 24.36 7.23 10
11 5 2013 29.72 8.10 10
I would like to extract the BIOM
values for the 2 unique site
i.e. 1 and 10 for the YR
2008 and 2013 so that I get something like this:
index BIOM_2008 BIOM_2013 site
0 0.53 26.65 1
1 26.65 29.72 10
This is what I am doing:
lst_yrs = [2008, 2013]
sub_df = df[['YR', 'BIOM', 'site']]
for yr in lst_yrs:
sub_df['BIOM'+str(yr)] = sub_df.loc['YR' == yr]
Not sure how to get the for loop right.
Upvotes: 1
Views: 1452
Reputation: 13955
I am not sure you need a loop here. You can simply create a slice of the data you want, set the index and then unstack as follows:
import pandas as pd
DF = pd.DataFrame({
'site' : [1, 1, 1, 1, 1, 1, 10, 10, 10],
'BIOM' : [0.53, 3.23, 11.51, 18.14, 22.88, 26.65, 0.39, 2.43, 8.95],
'YR' : ['2008', '2009', '2010', '2011', '2012', '2013', '2008', '2009', '2010']
})
slice = DF[(DF['site'].isin([1, 10]) & DF['YR'].isin(['2008', '2013']))]
result = slice.set_index(['site','YR']).unstack('YR')
Which gives the following:
BIOM
YR 2008 2013
site
1 0.53 26.65
10 0.39 NaN
In response to your comment, to flatten the hierarchical column index do the following:
result.columns = [' '.join(col).strip() for col in result.columns.values]
site BIOM 2008 BIOM 2013
0 1 0.53 26.65
1 10 0.39 NaN
Possibly this is now more complex than your original loop idea but I think it uses the pandas functionality in a more sophisticated way.
Upvotes: 2