BhishanPoudel

Reputation: 17144

How to select pandas series rows based on length of index name?

I have a pandas Series like the one shown below. How can I select only the rows whose index label is longer than 3 characters?

import pandas as pd
s = pd.Series([1,2,3,4,5], index=['a','bb','ccc','dddd','eeeee'])

Required output:

dddd     4
eeeee    5

My attempt:

s[len(s.index.name)>3]

Upvotes: 0

Views: 483

Answers (4)

RomanPerekhrest

Reputation: 92854

Here is one more approach to add to the collection, based on pandas.Series.filter:

In [216]: s.filter(regex='.{4,}')                                                                               
Out[216]: 
dddd     4
eeeee    5
dtype: int64
  • '.{4,}' - regex pattern to match only labels (of the index) that contain at least 4 characters

A simpler equivalent pattern is '.' * 4, i.e. '....'.
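As a minimal sketch of why both patterns select the same rows: `Series.filter(regex=...)` keeps a label when the pattern is found anywhere in it, so any run of four characters is enough. The variable names `by_quantifier` and `by_dots` are only illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'bb', 'ccc', 'dddd', 'eeeee'])

# Keep labels in which the regex finds a match (search, not full match)
by_quantifier = s.filter(regex='.{4,}')

# Four literal dots: any four consecutive characters, same selection
by_dots = s.filter(regex='.' * 4)

print(by_quantifier)
print(by_quantifier.equals(by_dots))
```

Note that `.` does not match a newline by default, so labels containing newlines could behave differently.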


Here are the execution time measurements:

In [217]: %timeit s[s.index.str.len()>3]                                                                        
254 µs ± 691 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [218]: %timeit s[[len(i)>3 for i in s.index]]                                                                
84.5 µs ± 375 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [219]: %timeit s[s.index.str.get(3).notnull()]                                                               
258 µs ± 1.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [220]: %timeit s.filter(regex='.{4,}')                                                                       
170 µs ± 480 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
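These timings appear to use the five-element `s` from the question, so the ranking may not hold for larger data. A rough sketch for re-running the comparison with the standard `timeit` module follows; the 20,000-label series `big` is a hypothetical input chosen only for illustration, and no particular result is implied.

```python
import timeit
import pandas as pd

# Hypothetical larger input: unique string labels '0' .. '19999' (lengths 1 to 5)
labels = [str(i) for i in range(20000)]
big = pd.Series(range(len(labels)), index=labels)

# The four approaches from this thread, applied to the larger series
candidates = {
    'str.len': lambda: big[big.index.str.len() > 3],
    'list comprehension': lambda: big[[len(i) > 3 for i in big.index]],
    'str.get': lambda: big[big.index.str.get(3).notnull()],
    'filter': lambda: big.filter(regex='.{4,}'),
}

# Total seconds for 10 calls of each approach
timings = {name: timeit.timeit(fn, number=10) for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s for 10 runs')
```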

Upvotes: 4

rafaelc

Reputation: 59274

Using str.get, which returns NaN where a label has no character at position 3:

s[s.index.str.get(3).notnull()]

dddd     4
eeeee    5
dtype: int64
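To make the intermediate step visible, here is a small sketch that splits the one-liner apart. The names `fourth_char` and `mask` are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'bb', 'ccc', 'dddd', 'eeeee'])

# Character at position 3 of each label; missing (NaN) for labels shorter than 4
fourth_char = s.index.str.get(3)

# True where a fourth character exists
mask = fourth_char.notnull()

result = s[mask]
print(result)
```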

Upvotes: 4

Scott Boston

Reputation: 153460

Use list comprehension:

s[[len(i)>3 for i in s.index]]

Output:

dddd     4
eeeee    5
dtype: int64

Upvotes: 3

BhishanPoudel

Reputation: 17144

You can try:

s[s.index.str.len()>3]

This gives:
dddd     4
eeeee    5
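For context, the attempt in the question fails because `s.index.name` is the name of the index as a whole (here `None`), not the individual labels, so `len(s.index.name)` raises a `TypeError`. A short sketch of the working version, with the mask built step by step (`lengths` and `mask` are illustrative names):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'bb', 'ccc', 'dddd', 'eeeee'])

# The index has no name, which is why len(s.index.name) cannot work
print(s.index.name)

# Length of every label, computed element-wise
lengths = s.index.str.len()

# Boolean mask: True for labels longer than 3 characters
mask = lengths > 3

result = s[mask]
print(result)
```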

Upvotes: 3
