Reputation: 2365
I have a pandas DataFrame with a column like this:
df = pd.DataFrame({'A':[0,0,15,0,0,0,0,0,0,5]})
    A
0   0
1   0
2  15
3   0
4   0
5   0
6   0
7   0
8   0
9   5
Now, given an index (let's say 5), I want to find the nearest non-zero number in the column (here index 2, value 15) and return the number of hops it takes to get there from the given index. In this example the answer would be +3, since it goes from index 2 to index 5. If the given index is 7, the answer would be -2, since the nearest non-zero value is 5 at index 9.
Upvotes: 1
Views: 771
Reputation: 294508
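Use numpy.flatnonzero to return the positions where the array has non-zero values.
Subtract the index you are referencing from those positions to get the direction and distance. With this subtraction, a negative value means the non-zero entry lies before the reference index, which is the opposite of the sign convention in the question.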
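import numpy as np
d = np.flatnonzero(df.A.values) - 5  # signed offsets from index 5 to each non-zero position
i = d[np.abs(d).argmin()] + 5        # position of the nearest non-zero value
df.iloc[i]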
A 15
Name: 2, dtype: int64
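To get the signed hop count the question asks for, the same idea can be wrapped in a small helper. This is only a sketch: the name `nearest_nonzero_hops` is hypothetical, it assumes `pos` is a positional (0-based) index, and it returns `None` when the column has no non-zero values.

```python
import numpy as np
import pandas as pd

def nearest_nonzero_hops(series, pos):
    """Return (hops, value) for the non-zero entry nearest to position pos.

    hops = pos - nearest_position, so it is positive when the nearest
    non-zero entry lies before pos and negative when it lies after.
    """
    positions = np.flatnonzero(series.to_numpy())
    if positions.size == 0:
        return None  # no non-zero entries at all
    hops = pos - positions
    best = np.abs(hops).argmin()  # on ties, the earlier position wins
    return int(hops[best]), series.iloc[positions[best]]

df = pd.DataFrame({'A': [0, 0, 15, 0, 0, 0, 0, 0, 0, 5]})
print(nearest_nonzero_hops(df.A, 5))
print(nearest_nonzero_hops(df.A, 7))
```

For the example column this should give 3 hops (value 15) for index 5 and -2 hops (value 5) for index 7, matching the sign convention in the question.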
Upvotes: 1
Reputation: 21274
With a boolean vector of zero/non-zero, you can calculate the distance from a given index to the non-zero values using subtract(). That will give you the "hops", and idxmin() can locate the actual value.
def get_closest(df, target):
    # index labels of the rows where A is non-zero
    not_zero = pd.Series(df.index[df["A"].ne(0)])
    dist = not_zero.subtract(target).abs()
    minidx = not_zero.loc[dist.idxmin()]
    # positive when the nearest value lies before target, negative when after
    steps = dist.min() if minidx < target else -dist.min()
    print("Closest steps to non-zero:", steps)
    print("Closest non-zero value:", df.A[minidx])
get_closest(df, 5)
# Closest steps to non-zero: 3
# Closest non-zero value: 15
get_closest(df, 7)
# Closest steps to non-zero: -2
# Closest non-zero value: 5
Upvotes: 1
Reputation: 57105
Start by finding the indexes of all non-zero elements:
nonzeros = df[df.A != 0].index
Calculate the distances from your row to all of them:
anchor = 5
dists = anchor - nonzeros
Find the smallest distance (by the absolute value):
nhops = min(dists, key=abs)
Altogether, in one line:
nhops = min((anchor - df[df.A != 0].index), key=abs)
#3
The index of the closest non-zero value can be calculated by recombining nhops and anchor:
min_index = anchor - nhops
#2
Upvotes: 2
Reputation: 323356
IIUC, use nonzero with argmin:
a = df.A.to_numpy().nonzero()[0]
a[np.argmin(abs(a - 5))]
Out[950]: 2
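One thing to watch with this approach: the absolute value has to be taken before argmin, not after. Taking argmin over the signed differences simply picks the leftmost non-zero position, which happens to be right for index 5 but not for index 7. A minimal sketch comparing the two orderings (the variable names `signed_first` and `abs_first` are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 0, 15, 0, 0, 0, 0, 0, 0, 5]})
a = np.flatnonzero(df.A.to_numpy())  # positions of the non-zero values
anchor = 7

# argmin over signed differences: selects the most negative offset
signed_first = a[abs(np.argmin(a - anchor))]
# argmin over absolute distances: selects the nearest position
abs_first = a[np.argmin(np.abs(a - anchor))]

print(signed_first, abs_first)
```

With anchor 7, the first expression lands on position 2 while the second lands on position 9, which is the nearest non-zero entry.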
Upvotes: 1