Reputation: 14977
I have a dictionary containing lists. For example,
{1: [[sender11, receiver11, text11, address11]],
2: [[sender21, receiver21, text21, address21], [sender22, receiver22, text22, address22]],
3: [[sender31, receiver31, text31, address31], [sender32, receiver32, text32, address32], [sender33, receiver33, text33, address33]],
4: [[sender41, receiver41, text41, address41], [sender42, receiver42, text42, address42], [sender43, receiver43, text43, address43], [sender44, receiver44, text44, address44]]}
What I want to do is, for dictionary elements that contain a list with 2 or more elements (i.e. dict[2], dict[3] and dict[4] in this example), I do a comparison of the sender, receiver, text for each list value. For each group of list values with the same sender, receiver, text, I'll do something.
So for example, in dict[3], if sender31, receiver31, text31 are the same values as sender32, receiver32, text32 and sender33, receiver33, text33, then I'll do something with all 3 list values.
Say in dict[4], if sender41, receiver41, text41 are the same values as sender42, receiver42, text42, while sender43, receiver43, text43 are the same values as sender44, receiver44, text44, but different from sender41, receiver41, text41, then I'll work on these 2 groups separately.
I wrote a Python script that pretty much brute force compares the values of sender21, receiver21, text21 and sender22, receiver22, text22, i.e.
if sender21 == sender22 and receiver21 == receiver22 and text21 == text22:
    # Do something
This doesn't generalize, as it only works for 2 list values, and I don't know how to implement it so that it works for any number of list values greater than 1.
Upvotes: 2
Views: 91
Reputation: 59148
I think a defaultdict is the obvious way to go here:
from collections import defaultdict

def collate(seq):
    groups = defaultdict(list)
    for subseq in seq:
        groups[tuple(subseq[:3])].append(subseq[3])
    return groups
Depending on your actual data, you might replace tuple(subseq[:3]) in the function above with e.g. (subseq[1], subseq[4], subseq[5]), or the appended subseq[3] with subseq itself ... that'll depend on what you're doing with the data.
The key has to be a tuple rather than a list, though, because dictionary keys must be hashable, and a mutable list is not.
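To illustrate the two points above, here is a minimal sketch (written for Python 3) of a variant that appends the whole sublist instead of only the address, and that shows what happens if a list is used as the key. The name collate_rows and the sample values are made up for illustration and are not part of the original function.

```python
from collections import defaultdict


def collate_rows(seq):
    """Hypothetical variant: group whole sublists by their first three fields."""
    groups = defaultdict(list)
    for subseq in seq:
        # tuple(...) makes the key hashable; the bare list slice would not be.
        groups[tuple(subseq[:3])].append(subseq)
    return groups


# Illustrative data only.
rows = [
    ['S1', 'R1', 'T1', 'A3'],
    ['S2', 'R2', 'T2', 'A4'],
    ['S1', 'R1', 'T1', 'A5'],
]
grouped = collate_rows(rows)

# Using the list slice directly as a dictionary key raises TypeError.
bad = {}
try:
    bad[rows[0][:3]] = 'x'
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)
```

Keeping the whole sublist is handy when the later processing needs the sender, receiver and text alongside the address, at the cost of storing the key fields twice.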
Example:
>>> data = [
...     ['S1', 'R1', 'T1', 'A3'],
...     ['S2', 'R2', 'T2', 'A4'],
...     ['S1', 'R1', 'T1', 'A5'],
...     ['S2', 'R2', 'T2', 'A6']
... ]
>>> collate(data)
defaultdict(<type 'list'>, {
    ('S2', 'R2', 'T2'): ['A4', 'A6'],
    ('S1', 'R1', 'T1'): ['A3', 'A5']
})
You can work with this just as you would any other dictionary, e.g.
>>> for (sender, receiver, text), addresses in collate(data).items():
...     print sender, receiver, text
...     print '|'.join(addresses)
...     print
...
S2 R2 T2
A4|A6

S1 R1 T1
A3|A5
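To tie this back to the dictionary in the question, one possible sketch (Python 3 syntax, unlike the Python 2 session above) runs collate on every value that holds 2 or more sublists and passes each resulting group to a handler. The names messages, process_groups and handle, and the sample values, are assumptions for illustration; handle stands in for whatever "do something" means in your case.

```python
from collections import defaultdict


def collate(seq):
    """Group addresses by (sender, receiver, text), as in the function above."""
    groups = defaultdict(list)
    for subseq in seq:
        groups[tuple(subseq[:3])].append(subseq[3])
    return groups


def process_groups(messages, handle):
    """Hypothetical driver: call handle() once per group, skipping short entries."""
    for key, sublists in messages.items():
        if len(sublists) < 2:
            continue  # only entries with 2 or more sublists are compared
        for group_key, addresses in collate(sublists).items():
            handle(key, group_key, addresses)


# Illustrative data shaped like the dictionary in the question.
messages = {
    1: [['S1', 'R1', 'T1', 'A1']],
    2: [['S1', 'R1', 'T1', 'A2'], ['S1', 'R1', 'T1', 'A3']],
    4: [['S1', 'R1', 'T1', 'A4'], ['S1', 'R1', 'T1', 'A5'],
        ['S2', 'R2', 'T2', 'A6'], ['S2', 'R2', 'T2', 'A7']],
}

results = []
process_groups(messages, lambda key, group_key, addresses:
               results.append((key, group_key, addresses)))
```

This works for any number of sublists, since each one is placed into its group in a single pass rather than compared pairwise. Note that a group containing only one sublist is still passed to handle; check len(addresses) inside the handler if only repeated messages matter.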
Upvotes: 1