Reputation: 197
Here is my table.
Day Close
1 1
2 2
3 6
4 4
5 5
6 4
7 5
And I want out put like this.
Day Close Previous_nearest_larger_row
1 1 Nan
2 2 0
3 6 0
4 4 1
5 5 2
6 4 1
7 5 2
If the current row is the largest, I want the result is 0.
Upvotes: 0
Views: 172
Reputation: 120429
The code uses conditions and loops so it's probably non-optimal for a large dataframe but it works!
sr = df["Close"]
distance = []
for idx, val in sr.iteritems():
msk = sr[idx::-1] - val >= 0
if msk.sum():
dist = idx - msk.idxmax()
else:
dist = 0
distance.append(dist)
df["Previous_nearest_larger_row"] = distance
>>> df
Day Close Previous_nearest_larger_row
0 1 1 0
1 2 2 0
2 3 6 0
3 4 4 1
4 5 5 2
5 6 4 1
6 7 5 2
Example for idx=4
and val=5
Slice the series from the index excluded ([1:]
) to the beginning (-1
), then subtract the current value and finally, replace values where the condition is true (>= 0
):
>>> sr[idx::-1][1:]
3 4 # distance from idx == 1
2 6 # distance from idx == 2
1 2 # distance from idx == 3
0 1 # distance from idx == 4
Name: Close, dtype: int64
>>> sr[idx::-1][1:] - val
3 -1
2 1 # 6 - val >= 0
1 -3
0 -4
Name: Close, dtype: int64
>>> sr[idx::-1][1:] - val >= 0
3 False
2 True # idx - 2 = 2 <- distance
1 False
0 False
Name: Close, dtype: bool
If there is at least a True value (msk.sum()
), compute the distance between current index and the index of the first True value (msk.idxmax()
):
>>> idx - msk.idxmax()
2
A bit more complex than the previous solution but maybe more scalable:
v = df["Close"].astype(float).values
s = v.size
i = np.arange(s)
v = np.full((s, s), v)
v = v - v.T
v[np.triu_indices(s)] = np.nan
r = i - (s - np.argmax(v[:, ::-1] >= 0, axis=1) - 1)
r[r < 0] = 0
>>> r
array([0, 0, 0, 1, 2, 1, 2])
Upvotes: 2