I have a pandas.DataFrame with a MultiIndex, thus:
       val
a dog    1
  cat    2
b fox    3
  rat    4
I want a Series indexed by level 0 whose entries are the lists of the index values at level 1, so:
a    [dog, cat]
b    [fox, rat]
The following works, but it is quite slow and inelegant:
fff = df.groupby(level=0)['val'].agg(lambda x:[i[1] for i in list(x.index.values)])
So I am hoping there is a better way.
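For anyone who wants to reproduce this, here is a minimal sketch of how the frame above could be built. It assumes both index levels are unnamed, which is how the display above reads. The tuples and values are copied from the example.

```python
import pandas as pd

# Two-level MultiIndex with unnamed levels, matching the frame shown above
index = pd.MultiIndex.from_tuples(
    [("a", "dog"), ("a", "cat"), ("b", "fox"), ("b", "rat")]
)
df = pd.DataFrame({"val": [1, 2, 3, 4]}, index=index)
print(df)
```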
Upvotes: 4
To get another order-of-magnitude speedup over Wen's answer, we can iterate over the index in plain Python and collect the values in a dict:
index_as_dict = {}
# each element of the MultiIndex is a (level_0, level_1) tuple
for k, v in df.index.ravel():
    index_as_dict.setdefault(k, []).append(v)
pd.Series(index_as_dict)
from io import StringIO
import pandas as pd

df = pd.read_fwf(StringIO(u"""
level_0  level_1  val
a        dog      1
a        cat      2
b        fox      3
b        rat      4"""), header=1).set_index(['level_0', 'level_1'])
print(df)

def method1():
    # Wen's answer: reset_index + groupby
    return df.reset_index(level=1).groupby(level=0)['level_1'].apply(list)

def method2():
    # plain Python loop over the index tuples
    index_as_dict = {}
    for k, v in df.index.ravel():
        index_as_dict.setdefault(k, []).append(v)
    return pd.Series(index_as_dict)

print(method1())
print(method2())

from timeit import timeit
print(timeit(method1, number=50))
print(timeit(method2, number=50))
                 val
level_0 level_1
a       dog        1
        cat        2
b       fox        3
        rat        4
level_0
a    [dog, cat]
b    [fox, rat]
Name: level_1, dtype: object
a    [dog, cat]
b    [fox, rat]
dtype: object
0.0760027870983045
0.006749932432252637
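As a variation on the loop above, here is a sketch that iterates over the MultiIndex directly rather than calling ravel(). Iterating a MultiIndex yields one tuple per row, so the loop body is unchanged. The frame is rebuilt inline with unnamed levels so the snippet runs on its own. The names grouped and result are only illustrative, and no timing is claimed for this variant.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "dog"), ("a", "cat"), ("b", "fox"), ("b", "rat")]
)
df = pd.DataFrame({"val": [1, 2, 3, 4]}, index=index)

grouped = {}
# Iterating a two-level MultiIndex yields (level_0, level_1) tuples
for outer, inner in df.index:
    grouped.setdefault(outer, []).append(inner)

result = pd.Series(grouped)
print(result)
```

Insertion order of the dict follows the order in which level-0 keys first appear in the index, so the result is not sorted unless the index already is.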
Upvotes: 1
Use reset_index and groupby:
df.reset_index(level=1).groupby(level=0)['level_1'].apply(list)
Out[21]:
a    [dog, cat]
b    [fox, rat]
Name: level_1, dtype: object
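The one-liner above relies on the column name 'level_1' that reset_index generates for an unnamed index level. If the levels are named, that column will be called something else. A sketch of a variant that avoids the generated name by using get_level_values is below. The frame is rebuilt inline, and labels and result are illustrative names, not part of the original answer.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "dog"), ("a", "cat"), ("b", "fox"), ("b", "rat")]
)
df = pd.DataFrame({"val": [1, 2, 3, 4]}, index=index)

# Level-1 labels become the values, level-0 labels become the index
labels = pd.Series(
    df.index.get_level_values(1).to_numpy(),
    index=df.index.get_level_values(0),
)

# Collect the level-1 labels of each level-0 group into a list
result = labels.groupby(level=0).apply(list)
print(result)
```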
Upvotes: 2