KateP

Reputation: 29

Select specific value in one column and get n rows before/after from another column in pandas

I have a dataframe with two columns: trialType and diameter. I need to find every row where trialType == 'start', and then get the diameter for the 2500 rows before and after each of those rows, inclusive. I tried the following:

idx = df.loc(df[df['trialType']=='start'])
df.iloc[idx - 2500 : idx + 2500]

My goal is to have a dataframe with only those relevant rows (2500 rows on either side of each 'start' trial). Below is an example with far fewer rows:

trialType diameter
start     3.15
          3.17
          3.18
start     3.14
          3.13
          3.13

Upvotes: 2

Views: 1031

Answers (2)

Muftawo Omar

Reputation: 68

I would change only the first line, to:

idx = df.index[df['trialType']=="start"].tolist()[0]

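This returns the index of the first row where the condition is true.

The second line should then work as is: df.iloc[idx - 2500 : idx + 2500]. The end of an iloc slice is exclusive, so use idx + 2501 if the row 2500 positions after the match must be included.

You can run this code to try it: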

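import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'Z', 'C', 'C', 'D'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})
df
# Index label of the first row where team == "B"
idx = df.index[df['team']=="B"].tolist()[0]
# Two rows before the match through one row after it (the slice end is exclusive)
df.iloc[idx - 2 : idx + 2]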

Output:

  team  points  rebounds
1    A       7         8
2    A       7        10
3    B       9         6
4    Z      12         6
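
The snippet above handles only the first match. A minimal sketch of how the same idea might be extended to every 'start' row is below. The helper name rows_around and its parameters mask and n are hypothetical (not from the question), and the small dataframe is made-up sample data with n=1 standing in for 2500. It works on row positions, clamps the window so a negative start does not wrap around, and merges overlapping windows.

```python
import numpy as np
import pandas as pd

def rows_around(df, mask, n):
    """Return rows within n positions of any row where mask is True (hypothetical helper)."""
    positions = np.flatnonzero(mask.to_numpy())
    keep = np.zeros(len(df), dtype=bool)
    for pos in positions:
        lo = max(pos - n, 0)             # clamp so a negative start does not wrap around
        hi = min(pos + n + 1, len(df))   # +1 makes the window include position pos + n
        keep[lo:hi] = True
    return df.iloc[keep]

# Made-up sample data; n=1 stands in for 2500
df = pd.DataFrame({
    "trialType": ["start", "", "", "", "", "start", "", "", "", ""],
    "diameter": [3.15, 3.17, 3.18, 3.14, 3.13, 3.13, 3.12, 3.16, 3.11, 3.10],
})
result = rows_around(df, df["trialType"] == "start", n=1)
print(result)
```

Because the mask is built by position, this should behave the same whether or not the index is the default 0, 1, 2, ... range.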

Upvotes: 0

MYK

Reputation: 2997

Interesting.

What about:

idx = df.loc[lambda x: x['trialType']=='start'].index

rows = df.loc[idx]

a = df.shift( 2500).loc[idx]
b = df.shift(-2500).loc[idx]

You can then combine them however you find best.

pd.concat([a,rows,b])
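
To make it concrete what the shift approach returns, here is a small sketch on made-up data, with n = 2 standing in for 2500 (the variable names before and after are mine). Each shifted lookup yields a single row per 'start': the values from exactly n rows earlier or later, labelled with the 'start' row's own index. It does not return every row in between.

```python
import pandas as pd

# Made-up sample data; n = 2 stands in for 2500
df = pd.DataFrame({
    "trialType": ["", "", "start", "", ""],
    "diameter": [3.15, 3.17, 3.18, 3.14, 3.13],
})
n = 2

idx = df.index[df["trialType"] == "start"]
rows = df.loc[idx]

before = df.shift(n).loc[idx]    # values from n rows earlier, labelled with the start row's index
after = df.shift(-n).loc[idx]    # values from n rows later, labelled with the start row's index

combined = pd.concat([before, rows, after])
print(combined)
```

Note that combined repeats the same index label for the before, start, and after rows, so you may want to add a key (for example via the keys argument of pd.concat) to tell them apart.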

You could also do:


idx = df.loc[lambda x: x['trialType']=='start'].index
df.loc[lambda x: (x.index-2500).isin(idx)
               | x.index.isin(idx)
               | (x.index+2500).isin(idx)]

Note that you will have to modify the code above if your index is not sequential (0, 1, 2, 3, etc.).
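
One possible way to make that modification, as a sketch only: do the arithmetic on row positions instead of index labels. The data and the variable names pos, start_pos and keep below are made up for illustration, and n = 2 stands in for 2500.

```python
import numpy as np
import pandas as pd

# Made-up sample data with deliberately non-sequential index labels
df = pd.DataFrame(
    {"trialType": ["", "", "start", "", "", "", "start", ""],
     "diameter": [3.15, 3.17, 3.18, 3.14, 3.13, 3.13, 3.12, 3.16]},
    index=[10, 20, 30, 40, 50, 60, 70, 80],
)
n = 2  # stands in for 2500

pos = np.arange(len(df))                                             # 0-based position of every row
start_pos = np.flatnonzero((df["trialType"] == "start").to_numpy())  # positions of the 'start' rows

# Keep a row if it is a start row, or sits exactly n positions after or before one
keep = (np.isin(pos - n, start_pos)
        | np.isin(pos, start_pos)
        | np.isin(pos + n, start_pos))
result = df[keep]
print(result)
```

Calling df.reset_index(drop=True) first and then using the original snippet would be another option, at the cost of losing the original labels.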

Upvotes: 2
