Reputation: 23907
Let's say I have a list L=[1.1, 1.8, 4.4, 5.2]
. For some integer, n
, I want to know whether L
has a value val
with n-1<val<n+1
, and if so I want to know the index of val
.
The best I can do so far is to define a generator
x = (index for index,val in enumerate(L) if n-1<val<n+1)
and check to see whether it has an appropriate value using try... except
. So let's assume I'm looking for the smallest n>=0 for which such a value exists...
L=[1.1, 1.8, 4.4, 5.2]
n=0
while True:
x = (index for index,val in enumerate(L) if n-1<val<n+1)
try:
index=next(x)
break
except StopIteration:
n+=1
print n,index
1 0
In reality, I'm doing a more complicated task. I'll want to be able to take an n, find the first index, and if it doesn't exist, I need to do something else.
This doesn't seem like particularly clean code to me. Is there a better way? I feel like numpy probably has the answer, but I don't know it well enough.
Upvotes: 2
Views: 6720
Reputation: 2879
Now,when I think,that I finally understand your task : just find the minimum value in the array and it's index - n will be equal to cell(mininum). Or even simpler :
n,index = int(min(L)),L.index(min(L))
Upvotes: 1
Reputation: 56714
If L is sorted, you could use bisect.bisect_left
to find the index i for which all L[< i] < n <= all L[>= i].
Then
if n - L[i-1] < 1.0:
val = L[i-1]
elif L[i] - n < 1.0:
val = L[i]
else:
val = None # no such value found
Edit: Depending on your data, what you want to accomplish, and how much time you want to spend writing a clever algorithm, sorting may or may not be a good solution for you; and before I see too many more O(n)s waved around, I would like to point out that his actual problem seems to involve repeatedly probing for various values of n - which would pretty rapidly amortize the initial sorting overhead - and that his suggested algorithm above is actually O(n**2).
@AntoinePelisse: by all means, let's do some profiling:
from bisect import bisect_left, bisect_right
from functools import partial
import matplotlib.pyplot as plt
from random import randint, uniform
from timeit import timeit
#blues
density_col_lin = [
(0.000, 0.502, 0.000, 1.000),
(0.176, 0.176, 0.600, 1.000),
(0.357, 0.357, 0.698, 1.000),
(0.537, 0.537, 0.800, 1.000)
]
# greens
density_col_sor = [
(0.000, 0.502, 0.000, 1.000),
(0.176, 0.600, 0.176, 1.000),
(0.357, 0.698, 0.357, 1.000),
(0.537, 0.800, 0.537, 1.000)
]
def make_data(length, density):
max_ = length / density
return [uniform(0.0, max_) for _ in range(length)], max_
def linear_probe(L, max_, probes):
for p in range(probes):
n = randint(0, int(max_))
for index,val in enumerate(L):
if n - 1.0 < val < n + 1.0:
# return index
break
def sorted_probe(L, max_, probes):
# initial sort
sL = sorted((val,index) for index,val in enumerate(L))
for p in range(probes):
n = randint(0, int(max_))
left = bisect_right(sL, (n - 1.0, max_))
right = bisect_left (sL, (n + 1.0, 0.0 ), left)
if left < right:
index = min(sL[left:right], key=lambda s:s[1])[1]
# return index
def main():
densities = [0.8, 0.2, 0.08, 0.02]
probes = [1, 3, 10, 30, 100]
lengths = [[] for d in densities]
lin_pts = [[[] for p in probes] for d in densities]
sor_pts = [[[] for p in probes] for d in densities]
# time each function at various data lengths, densities, and probe repetitions
for d,density in enumerate(densities):
for trial in range(200):
print("{}-{}".format(density, trial))
# length in 10 to 5000, with log density
length = int(10 ** uniform(1.0, 3.699))
L, max_ = make_data(length, density)
lengths[d].append(length)
for p,probe in enumerate(probes):
lin = timeit(partial(linear_probe, L, max_, probe), number=5) / 5
sor = timeit(partial(sorted_probe, L, max_, probe), number=5) / 5
lin_pts[d][p].append(lin / probe)
sor_pts[d][p].append(sor / probe)
# plot the results
plt.figure(figsize=(9.,6.))
plt.axis([0, 5000, 0, 0.004])
for d,density in enumerate(densities):
xs = lengths[d]
lcol = density_col_lin[d]
scol = density_col_sor[d]
for p,probe in enumerate(probes):
plt.plot(xs, lin_pts[d][p], "o", color=lcol, markersize=4.0)
plt.plot(xs, sor_pts[d][p], "o", color=scol, markersize=4.0)
plt.show()
if __name__ == "__main__":
main()
which results in
x-axis is number of items in L, y-axis is amortized time per probe; green dots are sorted_probe(), blue are linear_probe().
Conclusions:
Upvotes: 4
Reputation: 20583
I have an interesting thought, by using defaultdict
and build the index with the values (n-1)
and (n+1)
, it will require looping the list once, and thereafter just compare the key/values, like this:
from collections import defaultdict
L = [1.1, 1.8, 4.4, 5.2]
x = defaultdict(dict)
for idx, item in enumerate(L):
x[int(item)] = {int(item-1): item-1, int(item+1): item+1, 'index':idx}
Usage:
n = 5
x[n].get(n-1) < n < x[n].get(n+1) and x[n]['index']
Out[8]: 3
n = 2
x[n].get(n-1) < n < x[n].get(n+1) and x[n]['index']
Out[10]: False
Explanation:
As if:
1) True
and Index returns Index
2) False
and Index will return False
Because you are going to input n
as an integer, if first part is True
it will return the second part index
value. If first part fails, it returns False
.
This will return the LAST occurrence of n
, in case you need the FIRST occurrence of n
, just reversed the list and Index:
...
l = len(L)
for idx, item in enumerate(reversed(L)):
x[int(item)] = {int(item-1): item-1,
int(item+1): item+1,
'index': l-idx-1}
...
Upvotes: 2
Reputation: 15369
Here's a solution that doesn't rely on try
... except
and is relatively easy-to-read. As a result it "feels cleaner" to me, but that's always going to have an element of subjectivity to it.
def where_within_range( sequence, lower, upper ):
for index, value in enumerate( sequence ):
if lower < value < upper: return index
L = [ 1.1, 1.8, 4.4, 5.2 ]
import itertools
for n in itertools.count():
index = where_within_range( L, n - 1, n + 1 )
if index != None: break
print n, index
If you'd prefer to avoid the repeated-function-call overhead you could instead do it as follows, which once again uses the StopIteration
exception but by using itertools.count
and a return
statement, (again, "somehow") ends up seeming cleaner to me. Perhaps it's because there's only one statement in each part of the try
...except
... clause (there's not much rational basis for that feeling, admittedly).
import itertools
def find_joel_root( sequence ):
for n in itertools.count():
solutions = ( index for index, value in enumerate( sequence ) if n - 1 < value < n + 1 )
try: return n, next( solutions )
except StopIteration: pass
L = [ 1.1, 1.8, 4.4, 5.2 ]
n, index = find_joel_root( L )
print n, index
Upvotes: 2
Reputation: 5877
l
can be a list or numpy array:
next(((i,v) for i,v in enumerate(l) if n-1<v<n+1))
uses generators and stops on the first value.
Upvotes: 1