Reputation: 478
Hi, I have a list of lists and I need to compare one value from each list with values extracted from an XML file. The structure is similar to this:
[('example', '123', 'foo', 'bar'), ('example2', '456', 'foo', 'bar'), ...]
I need to compare the second value of each list with the values in the XML:
for item in main_list:
    for child in xml_data:
        if item[4] == child.get('value'):
            print(item[4])
The problem is that main_list has a huge number of entries (1000+), and multiplying that by the number of values from the XML (100+) results in a lot of iterations, which makes this method inefficient.
Is there a way to do this efficiently?
Regards.
Upvotes: 1
Views: 208
Reputation: 89077
A membership check on a set will be significantly faster than manually iterating and checking:
children = {child.get('value') for child in xml_data}
for item in main_list:
    if item[4] in children:
        print(item[4])
Here we construct the set with a simple set comprehension.
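For a version you can run as-is, here is a minimal sketch of the same idea. The XML text, the `entry` tag name and the sample tuples are invented for illustration, since the question does not show the real data. It also assumes the compared field sits at index 1 (the "second value" the question describes), not index 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real XML layout is not shown in the question.
XML_TEXT = """
<root>
    <entry value="123"/>
    <entry value="789"/>
</root>
"""

main_list = [
    ("example", "123", "foo", "bar"),
    ("example2", "456", "foo", "bar"),
]

# Iterating over an Element yields its direct children.
xml_data = ET.fromstring(XML_TEXT.strip())

# Build the lookup set once: one pass over the XML values.
children = {child.get("value") for child in xml_data}

# Each membership test is O(1) on average, so this is one pass over main_list.
matches = [item[1] for item in main_list if item[1] in children]
print(matches)  # ['123']
```

The total work is roughly len(xml_data) + len(main_list) steps instead of their product.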
Note that it may be worth swapping what data is in the set - if main_list is longer, it will be more efficient to make the set of that data.
items = {item[4] for item in main_list}
for child in xml_data:
    value = child.get('value')
    if value in items:
        print(value)
These both also only do the processing on the data once, rather than each time a check is made.
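If you want to check the difference on your own machine, a rough timing sketch along these lines should do. The data is synthetic (plain strings with sizes taken from the question), and the function names are just placeholders, so treat the numbers as indicative only.

```python
import timeit

# Synthetic stand-ins for the real data (sizes taken from the question).
main_values = [str(i) for i in range(1000)]
xml_values = [str(i) for i in range(0, 200, 2)]  # 100 values

def nested_loops():
    # Compares every pair: len(main_values) * len(xml_values) checks.
    return [m for m in main_values for x in xml_values if m == x]

def set_lookup():
    # Builds the set once, then does one hash lookup per main value.
    lookup = set(xml_values)
    return [m for m in main_values if m in lookup]

print("nested:", timeit.timeit(nested_loops, number=20))
print("set:   ", timeit.timeit(set_lookup, number=20))
```

Both functions return the same matches here because the synthetic XML values contain no duplicates.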
Note that a set will not handle duplicate values or order on the set side - if that is important, this isn't a valid solution. This version will only use the order/duplicates from the data you are iterating over. If that isn't valid, then you can still process the data beforehand, and use itertools.product() to iterate a little quicker.
import itertools

# Extract the compared values once, up front.
items = [item[4] for item in main_list]
children = [child.get('value') for child in xml_data]
for item, child in itertools.product(items, children):
    if item == child:
        print(item)
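To make the duplicate behaviour concrete, here is a small sketch with made-up values comparing the two approaches. Note that product() still looks at every (item, child) pair, so it keeps the quadratic number of comparisons.

```python
import itertools

items = ["a", "b", "a"]     # hypothetical values from main_list
children = ["a", "a", "c"]  # hypothetical values from the XML

# Set membership: one result per occurrence on the iterated side only.
child_set = set(children)
via_set = [item for item in items if item in child_set]

# product(): one result per matching (item, child) pair.
via_product = [
    item
    for item, child in itertools.product(items, children)
    if item == child
]

print(via_set)      # ['a', 'a']
print(via_product)  # ['a', 'a', 'a', 'a']
```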
As Karl Knechtel points out, if you really don't care about order or duplicates at all, you can just do a set intersection:
for item in ({child.get('value') for child in xml_data} &
             {item[4] for item in main_list}):
    print(item)
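A quick illustration of what the intersection gives you, again with invented values: each common value appears once, and the iteration order of the resulting set is arbitrary.

```python
main_values = ["123", "456", "123"]  # hypothetical, with a duplicate
xml_values = ["789", "123", "456"]

common = set(main_values) & set(xml_values)

# Sets are unordered, so sort if a stable output order is wanted.
print(sorted(common))  # ['123', '456']
```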
Upvotes: 6