Extract rows range with .between(), and specific columns, from Pandas DataFrame?

Question

I just got tripped on this: consider this example:

>>> import pandas as pd
>>>
df = pd.DataFrame({
  "key":[1,3,6,10,15,21],
  "columnA":[10,20,30,40,50,60],
  "columnB":[100,200,300,400,500,600],
  "columnC":[110,202,330,404,550,606],
})

>>> df
   key  columnA  columnB  columnC
0    1       10      100      110
1    3       20      200      202
2    6       30      300      330
3   10       40      400      404
4   15       50      500      550
5   21       60      600      606

So, I want to extract data from this table, where key column (assume it grows monotonically) lies between two values (say 2 and 15) - but only for some specific columns (say, "columnA" and "columnC". Hopefully, this can be done in a one-liner.

Now, if I want to use the .between() method - it basically returns true/false for all rows:

>>> df['key'].between(2, 16)
0    False
1     True
2     True
3     True
4     True
5    False

So, to actually extract those rows, I need to put the above command in square brackets:

>>> df[df['key'].between(2, 16)]
   key  columnA  columnB  columnC
1    3       20      200      202
2    6       30      300      330
3   10       40      400      404
4   15       50      500      550

Excellent, this is what I need - I just need to limit the columns; so I try this:

>>> df[df['key'].between(2, 16), ["columnA"]]
Traceback (most recent call last):
  File "", line 1, in 
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:/msys64/mingw64/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 116, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(0    False
1     True
2     True
3     True
4     True
5    False
Name: key, dtype: bool, ['columnA'])' is an invalid key

Uh... no dice.

So, how can I do the above, and limit specific columns?

UJIN · Accepted Answer

You can just use the standard way of slicing DataFrames:

df[df['key'].between(2,16)][['key','columnA','columnC']]

Extract rows range with .between(), and specific columns, from Pandas DataFrame?

Answers (2)

Related Questions