Reputation: 160
I have a DataFrame with MultiIndex columns, built as follows:
import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
# Transpose so the MultiIndex ends up on the columns
s = pd.DataFrame(np.random.randn(8), index=index).T
which looks like this:
first        bar                 baz                 foo                 qux
second       one       two       one       two       one       two       one       two
0      -0.144135  0.625481 -2.139184 -1.066893 -0.123791 -1.058165  0.495627 -0.654353
The documentation says to index such a frame in the following way:
df.loc[:, (slice("bar", "two"), ...)]
So I tried:
s.loc[:, (slice("bar", "two"):(slice("baz", "two")))]
which gives me a SyntaxError:
Cell In[98], line 3
s.loc[:, (slice("bar", "two"):(slice("baz", "two")))]
^
SyntaxError: invalid syntax
In my specific use case (albeit beyond the scope of this question), the level-1 labels are timestamps (years), but I figure the answer should be the same. What is the proper way to select a range of columns when the columns are a MultiIndex?
Upvotes: 3
Views: 88
Reputation: 14414
As per the documentation, you have a few options to return this slice:
Option 1: hierarchical index using tuples (docs section)
s.loc[:, ('bar', 'two'):('baz', 'two')]
Here we reference start, ('bar', 'two'), and stop, ('baz', 'two'), simply as tuples, with a colon (:) in between to select the range between the specified columns.
Option 2: using slicers (docs section, cf. slice)
s.loc[:, slice(('bar', 'two'), ('baz', 'two'))]
The signature is slice(start, stop[, step]), so ('bar', 'two') gets passed as start and ('baz', 'two') as stop.
Option 3: using pd.IndexSlice
idx = pd.IndexSlice
s.loc[:, idx['bar', 'two']:idx['baz', 'two']]
Similar to option 1: start + : + stop.
All three of these result in:
# using `np.random.seed(0)` for reproducibility
first        bar       baz
second       two       one       two
0       0.400157  0.978738  2.240893
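To check that the three options really are interchangeable, here is a minimal self-contained sketch. It rebuilds the frame from the question with np.random.seed(0) and compares the results; the variable names by_tuples, by_slice and by_indexslice are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # same seed as the output shown above

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])
s = pd.DataFrame(np.random.randn(8), index=index).T

idx = pd.IndexSlice

# Option 1: plain tuples on either side of the colon
by_tuples = s.loc[:, ("bar", "two"):("baz", "two")]
# Option 2: an explicit slice object with tuple start and stop
by_slice = s.loc[:, slice(("bar", "two"), ("baz", "two"))]
# Option 3: pd.IndexSlice, which here simply hands back the tuples
by_indexslice = s.loc[:, idx["bar", "two"]:idx["baz", "two"]]

# All three select the same columns; both endpoints are included
print(by_tuples.equals(by_slice) and by_tuples.equals(by_indexslice))
print(list(by_tuples.columns))
```

Note that label-based range slices like these expect the MultiIndex to be sorted, which the columns in this example already are.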
Upvotes: 2
Reputation: 149
If you want to get the columns from ("bar", "two") through ("baz", "two"), the following code works.
s.loc[:, ("bar", "two"):("baz", "two")]
The result looks like this:
first        bar       baz
second       two       one       two
0       0.625481 -2.139184 -1.066893
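The question mentions that the real second level holds years as timestamps. The same tuple range should carry over; the sketch below assumes a hypothetical frame whose second column level is built from pd.Timestamp values (the level name "year", the dates, and the integer data are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: second level is a timestamp per year
years = pd.to_datetime(["2000-01-01", "2001-01-01"])
columns = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], years], names=["first", "year"]
)
# Integer data 0..7 so the selected columns are easy to recognise
df = pd.DataFrame(np.arange(8).reshape(1, 8), columns=columns)

# Start and stop are tuples again, now with a Timestamp in the second position
start = ("bar", pd.Timestamp("2001-01-01"))
stop = ("baz", pd.Timestamp("2001-01-01"))
subset = df.loc[:, start:stop]

print(subset)
```

Under these assumptions the slice covers (bar, 2001), (baz, 2000) and (baz, 2001), just like the "one"/"two" case.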
Upvotes: 3