Reputation: 254
I am struggling to reindex a MultiIndex. Example code below:
import numpy as np
import pandas as pd
rng = pd.date_range('2000-01-01 00:00', '2004-12-31 23:00', freq='h')
ts = pd.Series([h.dayofyear for h in rng], index=rng)
daygrouped = ts.groupby(lambda x: x.dayofyear)
daymean = daygrouped.mean()
myindex = np.arange(1,367)
myindex = np.concatenate((myindex[183:],myindex[:183]))
daymean.reindex(myindex)
gives (as expected):
184 184
185 185
186 186
187 187
...
180 180
181 181
182 182
183 183
Length: 366, dtype: int64
BUT if I create a MultiIndex:
hourgrouped = ts.groupby([lambda x: x.dayofyear, lambda x: x.hour])
hourmean = hourgrouped.mean()
myindex = np.arange(1,367)
myindex = np.concatenate((myindex[183:],myindex[:183]))
hourmean.reindex(myindex, level=1)
I get:
1    1       1
     2       1
     3       1
     4       1
...
366  20    366
     21    366
     22    366
     23    366
Length: 8418, dtype: int64
Any ideas on my mistake? - Thanks.
Bevan
Upvotes: 1
Views: 73
Reputation: 139142
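First, you have to specify level=0 instead of 1 (it is the first level, and levels are zero-based). With level=1, the day numbers in myindex were matched against the hour level, so hour 0 was dropped and only hours 1 to 23 were kept; that is why the result has 366 * 23 = 8418 rows.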
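To confirm which level holds which key, the level values can be inspected directly. This is a minimal sketch that rebuilds hourmean from the question; grouping by arrays instead of lambdas and passing a Timedelta as freq are my own substitutions, assumed to be equivalent to the original code:

```python
import numpy as np
import pandas as pd

# Hourly timestamps covering 2000-2004, as in the question
rng = pd.date_range("2000-01-01 00:00", "2004-12-31 23:00",
                    freq=pd.Timedelta(hours=1))
day_key = np.asarray(rng.dayofyear)
hour_key = np.asarray(rng.hour)
ts = pd.Series(day_key, index=rng)

# Array keys instead of lambdas (assumed equivalent to the question's groupby)
hourmean = ts.groupby([day_key, hour_key]).mean()

days = hourmean.index.get_level_values(0).unique()   # level 0: day of year
hours = hourmean.index.get_level_values(1).unique()  # level 1: hour of day
print(days.min(), days.max())
print(hours.min(), hours.max())
```

Level 0 spans the day numbers and level 1 spans the hours, so a list of day numbers belongs with level=0.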
But, there is still a problem: the reindexing works, but does not seem to preserve the order of the provided index in the case of a MultiIndex:
In [54]: hourmean.reindex([5,4], level=0)
Out[54]:
4  0     4
   1     4
   2     4
   3     4
   4     4
...
   20    4
   21    4
   22    4
   23    4
5  0     5
   1     5
   2     5
   3     5
   4     5
...
   20    5
   21    5
   22    5
   23    5
dtype: int64
So selecting a subset of the index works, but the result keeps the order of the original index rather than the order of the newly provided one.
This is possibly a bug with reindex on a certain level (I opened an issue to discuss this: https://github.com/pydata/pandas/issues/8241)
A solution for now to reindex your series is to create a MultiIndex and reindex with that (so not on a specified level, but with the full index, which does preserve the order). This is very easy with MultiIndex.from_product, as you already have myindex:
In [79]: myindex2 = pd.MultiIndex.from_product([myindex, range(24)])
In [82]: hourmean.reindex(myindex2)
Out[82]:
184  0     184
     1     184
     2     184
     3     184
     4     184
     5     184
     6     184
     7     184
     8     184
     9     184
     10    184
     11    184
     12    184
     13    184
     14    184
...
183  9     183
     10    183
     11    183
     12    183
     13    183
     14    183
     15    183
     16    183
     17    183
     18    183
     19    183
     20    183
     21    183
     22    183
     23    183
Length: 8784, dtype: int64
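Put together as a self-contained script, the workaround might look like the sketch below. The array-based groupby keys, the Timedelta freq, and the name result are illustrative choices of mine rather than part of the original code; on recent pandas versions the means come back as floats rather than the int64 shown above.

```python
import numpy as np
import pandas as pd

# Rebuild the hourly series and the (dayofyear, hour) means from the question
rng = pd.date_range("2000-01-01 00:00", "2004-12-31 23:00",
                    freq=pd.Timedelta(hours=1))
day_key = np.asarray(rng.dayofyear)
hour_key = np.asarray(rng.hour)
ts = pd.Series(day_key, index=rng)
hourmean = ts.groupby([day_key, hour_key]).mean()

# Desired day order: 184..366 followed by 1..183
myindex = np.arange(1, 367)
myindex = np.concatenate((myindex[183:], myindex[:183]))

# Build the full target index (every day paired with every hour)
# and reindex against it, without using the level argument
myindex2 = pd.MultiIndex.from_product([myindex, range(24)])
result = hourmean.reindex(myindex2)

print(result.index[0], result.index[-1])
print(len(result))
```

Because the target here is a complete MultiIndex, the result follows the order of myindex2, starting at day 184 and ending at day 183.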
Upvotes: 1