Reputation: 459
I want to slice a pandas DataFrame using values held in a couple of pandas Series.
Specifically, I need the rows that lie between the values in those Series.
For example:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(10, 5), columns=list('abcde'))
df_info = pd.DataFrame(data={'beginRows': [2, 7], 'endRows': [4, 9]})
For each row of df_info, I need the rows of df between beginRows and endRows, inclusive.
Technically, I can do this as:
df_result = df[df.index.isin(np.r_[2:4+1,7:9+1])]
I am not sure how to build that list of ranges from the df_info dataframe so that I can pass it to np.r_.
Thank you.
Upvotes: 1
Views: 406
Reputation: 164773
You can pass slice objects:
slice1 = slice(2, 4+1)
slice2 = slice(7, 9+1)
df_result = df[df.index.isin(np.r_[slice1, slice2])]
Given your input df_info:
s1, s2 = [slice(i, j+1) for i, j in df_info.values]
df_result = df[df.index.isin(np.r_[s1, s2])]
Or, for an arbitrary number of slices, you can pass a tuple to np.r_.__getitem__:
slices = tuple(slice(i, j+1) for i, j in df_info.values)
df_result = df[df.index.isin(np.r_.__getitem__(slices))]
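For reference, here is a self-contained sketch of the arbitrary-number-of-slices approach using the sample data from the question. It makes two choices that are not in the snippets above and should be read as assumptions: the columns are selected by name, so the result does not depend on the column order of df_info, and the tuple is passed through square brackets, since indexing np.r_ with a tuple should be the same call as np.r_.__getitem__ with that tuple. The names bounds and rows are only illustrative.

```python
import numpy as np
import pandas as pd

# Same shapes as in the question; the random values themselves do not matter.
df = pd.DataFrame(np.random.rand(10, 5), columns=list('abcde'))
df_info = pd.DataFrame({'beginRows': [2, 7], 'endRows': [4, 9]})

# Assumption: pick the columns by name so column order in df_info is irrelevant.
bounds = df_info[['beginRows', 'endRows']].to_numpy()

# One slice per row of df_info; the +1 makes the end row inclusive.
slices = tuple(slice(int(i), int(j) + 1) for i, j in bounds)

# Indexing with a tuple is equivalent to np.r_.__getitem__(slices).
rows = np.r_[slices]

df_result = df[df.index.isin(rows)]
print(df_result.index.tolist())  # [2, 3, 4, 7, 8, 9]
```

This relies on df having the default integer index, as in the question; with a different index, the isin check would compare against index labels rather than row positions.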
Upvotes: 2