mart1n

Reputation: 6213

Create a set from items in two lists

I have the following code:

ids = set()
for result in text_results:
    ids.add(str(result[5]))
for result in doc_results:
    ids.add(str(result[4]))

Both text_results and doc_results are lists that contain other lists as items, as you might have already guessed. Is there a more efficient way to do this, using a nifty one-liner rather than two for loops?

Upvotes: 2

Views: 725

Answers (5)

Steve Jessop

Reputation: 279255

I would probably write:

ids = set(str(result[5]) for result in text_results)
ids.update(str(result[4]) for result in doc_results)

As for efficiency, if you want to squeeze every possible bit of performance then you first need a realistic dataset, then you can try things like map (or itertools.imap in Python 2) and operator.itemgetter, to see what's faster.
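As a rough sketch of the map and operator.itemgetter variant mentioned above (the sample rows are made up for illustration, and whether this is actually faster depends on your data):

```python
from operator import itemgetter

# Hypothetical sample data: the id sits in column 5 of text rows
# and in column 4 of doc rows, as in the question.
text_results = [[0, 1, 2, 3, 4, "a"], [0, 1, 2, 3, 4, "b"]]
doc_results = [[0, 1, 2, 3, "b"], [0, 1, 2, 3, "c"]]

# itemgetter(5) builds a callable equivalent to lambda row: row[5].
ids = set(map(str, map(itemgetter(5), text_results)))
ids.update(map(str, map(itemgetter(4), doc_results)))

print(sorted(ids))
```

In Python 3, map is lazy, so no intermediate list is built here.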

If you absolutely must have a one-liner:

ids = set(itertools.chain((str(result[5]) for result in text_results), (str(result[4]) for result in doc_results)))

Although, if you want a one-liner it's also worth optimizing for conciseness so that your one-liner will be readable, and then seeing whether performance is adequate:

ids = set([str(x[5]) for x in text_results] + [str(x[4]) for x in doc_results])

This "feels" inefficient because it concatenates two lists, which shouldn't be necessary. But that doesn't mean it really is inefficient for your data, so it's worth including in your tests.
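One possible way to run such a test is with timeit. This is only a sketch: the generated data and the function names are hypothetical, and the timings are only meaningful on a dataset that resembles your real one.

```python
import itertools
import timeit

# Hypothetical data: 1000 rows per list, ids in column 5 and column 4.
text_results = [[0, 0, 0, 0, 0, i] for i in range(1000)]
doc_results = [[0, 0, 0, 0, i] for i in range(500, 1500)]

def two_steps():
    ids = set(str(r[5]) for r in text_results)
    ids.update(str(r[4]) for r in doc_results)
    return ids

def chained():
    return set(itertools.chain((str(r[5]) for r in text_results),
                               (str(r[4]) for r in doc_results)))

def concatenated():
    return set([str(r[5]) for r in text_results] +
               [str(r[4]) for r in doc_results])

# All three must agree before their timings are worth comparing.
assert two_steps() == chained() == concatenated()

for fn in (two_steps, chained, concatenated):
    print(fn.__name__, timeit.timeit(fn, number=200))
```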

Upvotes: 4

matsjoyce

Reputation: 5844

Do this:

ids = {str(i) for text, doc in zip(text_results, doc_results) for i in (text[5], doc[4])}

This assumes the results look something like:

text_results = [['t11', 't12', 't13', 't14', 't15', 't16'], ['t21', 't22', 't23', 't24', 't25', 't26']]
doc_results = [['d11', 'd12', 'd13', 'd14', 'd15', 'd16'], ['d21', 'd22', 'd23', 'd24', 'd25', 'd26']]

And you want:

ids = {'d15', 't26', 't16', 'd25'}
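One caveat worth checking against your data: zip stops at the shorter input, so the comprehension above only sees every row when both lists have the same length. A small sketch with made-up rows of unequal length, compared against an itertools.chain version:

```python
import itertools

# Hypothetical data: three text rows but only one doc row.
text_results = [[0, 0, 0, 0, 0, "t1"],
                [0, 0, 0, 0, 0, "t2"],
                [0, 0, 0, 0, 0, "t3"]]
doc_results = [[0, 0, 0, 0, "d1"]]

# zip pairs rows up and stops after the first pair here.
zipped = {str(i) for text, doc in zip(text_results, doc_results)
          for i in (text[5], doc[4])}

# chain walks each list to its end, independently of the other.
chained = set(itertools.chain((str(t[5]) for t in text_results),
                              (str(d[4]) for d in doc_results)))

print(sorted(zipped))   # ['d1', 't1']
print(sorted(chained))  # ['d1', 't1', 't2', 't3']
```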

Upvotes: 0

Irshad Bhat

Reputation: 8709

I guess this is a more pythonic way:

map(str,set([i[5] for i in text_results]+[i[4] for i in doc_results]))

Demo:

>>> text_results = [[1,2,3,4,5,6,7,8,9],[1,2,3,4,56,6],[4,5,6,1,2,6,22],[1,2,3,4,5,7,8,9]]
>>> doc_results = [[1,2,3,4,5,9,7,8,9],[1,2,3,4,56,3],[4,5,6,1,2,7,22],[1,2,3,4,5,7,7,9]]
>>> map(str,set([i[5] for i in text_results]+[i[4] for i in doc_results]))
['56', '2', '5', '6', '7']

Upvotes: 0

101

Reputation: 8999

This (wrapped) one-liner should work:

ids = set([str(tr[5]) for tr in text_results] +
          [str(dr[4]) for dr in doc_results])

Upvotes: 0

Reut Sharabani

Reputation: 31339

This one-liner should work in Python 2, where map returns a list:

ids = set(map(lambda x: str(x[4]), doc_results) + map(lambda x: str(x[5]), text_results))
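On Python 3, map returns an iterator, so two map results cannot be joined with +. A minimal sketch of an equivalent that takes the union of two sets instead (the sample rows are made up for illustration):

```python
# Hypothetical sample data in the shape described in the question.
text_results = [[0, 1, 2, 3, 4, 10], [0, 1, 2, 3, 4, 11]]
doc_results = [[0, 1, 2, 3, 11], [0, 1, 2, 3, 12]]

# Build one set per list, then merge them with the union operator.
ids = (set(map(lambda x: str(x[4]), doc_results))
       | set(map(lambda x: str(x[5]), text_results)))

print(sorted(ids))  # ['10', '11', '12']
```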

Upvotes: 0
