Reputation: 179
I am trying to iterate over a large list, but it takes a long time. Is there any way to iterate over it more quickly, or is Python just not built for this? My code snippet is:
for i in THREE_INDEX:
    if check_balanced(rc, pc):
        print('balanced')
    else:
        rc, pc = equation_suffix(rc, pc, i)
Here THREE_INDEX has a length of 117649, and the loop takes around 4-5 minutes to finish. Is there any way to make it quicker?
The equation_suffix function:
def equation_suffix(rn, pn, suffix_list):
    len_rn = len(rn)
    react_suffix = suffix_list[: len_rn]
    prod_suffix = suffix_list[len_rn:]
    for re in enumerate(rn):
        rn[re[0]] = add_suffix(re[1], react_suffix[re[0]])
    for pe in enumerate(pn):
        pn[pe[0]] = add_suffix(pe[1], prod_suffix[pe[0]])
    return rn, pn
check_balanced function:
def check_balanced(rl, pl):
    total_reactant = []
    total_product = []
    reactant_name = []
    product_name = []
    for reactant in rl:
        total_reactant.append(separate_num(separate_brackets(reactant)))
    for product in pl:
        total_product.append(separate_num(separate_brackets(product)))
    for react in total_reactant:
        for key in react:
            val = react.get(key)
            val_dict = {key: val}
            reactant_name.append(val_dict)
    for prod in total_product:
        for key in prod:
            val = prod.get(key)
            val_dict = {key: val}
            product_name.append(val_dict)
    reactant_name = flatten_dict(reactant_name)
    product_name = flatten_dict(product_name)
    for elem in enumerate(reactant_name):
        val_r = reactant_name.get(elem[1])
        val_p = product_name.get(elem[1])
        if val_r == val_p:
            if elem[0] == len(reactant_name) - 1:
                return True
        else:
            return False
Upvotes: 2
Views: 4113
Reputation: 155363
General for loops aren't an issue, but using them to build (or rebuild) lists is usually slower than using list comprehensions (or in some cases, map/filter, though those are advanced tools that are often a pessimization).
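To make the difference concrete, here is a minimal, self-contained timing sketch; the build_with_loop/build_with_listcomp names and the i * i work are made up for illustration and are not from your code:

from timeit import timeit

def build_with_loop(n):
    out = []
    for i in range(n):
        out.append(i * i)  # one attribute lookup + method call per element
    return out

def build_with_listcomp(n):
    return [i * i for i in range(n)]  # the whole loop runs as a single specialized construct

# Exact numbers depend on the machine; the listcomp version is typically noticeably faster
print(timeit(lambda: build_with_loop(117649), number=100))
print(timeit(lambda: build_with_listcomp(117649), number=100))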
Your functions could be made significantly simpler this way, and they'd get faster to boot. Example rewrites:
def equation_suffix(rn, pn, suffix_list):
    prod_suffix = suffix_list[len(rn):]
    # Change `rn =` to `rn[:] =` if you must modify the caller's list as in your
    # original code, not just return the modified list (which would be fine in your original code)
    rn = [add_suffix(r, suffix) for r, suffix in zip(rn, suffix_list)]  # No need to slice suffix_list; zip'll stop when rn exhausted
    pn = [add_suffix(p, suffix) for p, suffix in zip(pn, prod_suffix)]
    return rn, pn
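A quick check of the zip behaviour that comment relies on (throwaway values, just for illustration):

print(list(zip(['a', 'b'], [1, 2, 3, 4])))  # [('a', 1), ('b', 2)] - stops at the shorter input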
def check_balanced(rl, pl):
    # These can be generator expressions, since they're iterated once and thrown away anyway
    total_reactant = (separate_num(separate_brackets(reactant)) for reactant in rl)
    total_product = (separate_num(separate_brackets(product)) for product in pl)
    reactant_name = []
    product_name = []
    # Use .items() to avoid repeated lookups, and concat simple listcomps to reduce calls to append
    for react in total_reactant:
        reactant_name += [{key: val} for key, val in react.items()]
    for prod in total_product:
        product_name += [{key: val} for key, val in prod.items()]
    # These calls are suspicious, and may indicate optimizations to be had on prior lines
    reactant_name = flatten_dict(reactant_name)
    product_name = flatten_dict(product_name)
    for i, (elem, val_r) in enumerate(reactant_name.items()):
        if val_r == product_name.get(elem):
            if i == len(reactant_name) - 1:
                return True
        else:
            # I'm a little suspicious of returning False the first time a single
            # key's value doesn't match. Either it's wrong, or it indicates an
            # opportunity to write short-circuiting code that doesn't have
            # to fully construct reactant_name and product_name when much of the time
            # there will be an early mismatch
            return False
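To sketch that last idea (and this is only a sketch, reusing your separate_num, separate_brackets, and flatten_dict helpers and assuming flatten_dict returns a plain dict of element counts): the final comparison can be an all() over a generator expression, which stops at the first mismatching element instead of walking the whole dict:

def check_balanced(rl, pl):
    reactant_counts = flatten_dict(
        [{key: val} for reactant in rl
         for key, val in separate_num(separate_brackets(reactant)).items()])
    product_counts = flatten_dict(
        [{key: val} for product in pl
         for key, val in separate_num(separate_brackets(product)).items()])
    # all() returns False at the first element whose counts differ, so an
    # unbalanced equation is rejected without comparing every element
    return all(product_counts.get(elem) == count
               for elem, count in reactant_counts.items())

Note this still builds both count dicts up front; avoiding even that would mean pushing the short-circuit into the parsing itself.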
I'll also note that using enumerate without unpacking the result is going to get worse performance and more cryptic code. In this case (and many others), enumerate isn't needed, as listcomps and genexprs can accomplish the same result without knowing the index. But when it is needed, always unpack: for i, elem in enumerate(...): and then using i and elem separately will always run faster than for packed in enumerate(...): and using packed[0] and packed[1] (and if you have more useful names than i and elem, it'll be much more readable to boot).
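For example, with a hypothetical names list:

names = ['H', 'O', 'N']  # made-up data

# Unpacked: each value gets a proper name and there's no indexing in the loop body
for i, name in enumerate(names):
    print(i, name)

# Packed: every use pays for an extra index operation, and it reads worse
for packed in enumerate(names):
    print(packed[0], packed[1])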
Upvotes: 1
Reputation: 575
I believe the reason why "iterating" the list takes a long time is the methods you are calling inside the for loop. I took out the methods to test just the speed of the iteration, and it appears that iterating through a list of size 117649 is very fast. Here is my test script:
import time
start_time = time.time()
new_list = [(1, 2, 3) for i in range(117649)]
end_time = time.time()
print(f"Creating the list took: {end_time - start_time}s")
start_time = time.time()
for i in new_list:
    pass
end_time = time.time()
print(f"Iterating the list took: {end_time - start_time}s")
Output is:
Creating the list took: 0.005337953567504883s
Iterating the list took: 0.0035648345947265625s
Edit: time() returns seconds.
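If you want to confirm where those 4-5 minutes actually go, profiling the original loop should point to the slow calls. A hypothetical sketch, assuming THREE_INDEX, rc, pc, check_balanced and equation_suffix are defined at module level as in the question:

import cProfile

def run(three_index, rc, pc):
    # Same loop as in the question, wrapped in a function so it can be profiled
    for i in three_index:
        if check_balanced(rc, pc):
            print('balanced')
        else:
            rc, pc = equation_suffix(rc, pc, i)

# Sort by cumulative time to see which helper dominates
cProfile.run('run(THREE_INDEX, rc, pc)', sort='cumulative')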
Upvotes: 2