Alberto

Reputation: 711

Python merge lists by common element

I'm trying to merge two lists that share a common element (in this case, the id). I have something like this:

list1=[(id1,host1),(id2,host2),(id1,host5),(id3,host4),(id4,host6),(id5,host8)]

list2=[(id1,IP1),(id2,IP2),(id3,IP3),(id4,IP4),(id5,IP5)]

Each host is unique, but an id in list1 can be repeated, as you can see. I want output that uses the id, which is common to both lists, to relate each IP to its hosts:

Something like:

IP1(host1,host5), IP2(host2), IP3(host4), IP4(host6), IP5(host8)

As you can see, IP1 has two hosts associated with it.

Is there a fast way to do it?

Thank you

Upvotes: 0

Views: 2970

Answers (6)

dstromberg

Reputation: 7167

Maybe something like this?

#!/usr/local/cpython-3.3/bin/python

import pprint
import collections

class Host_data:
    def __init__(self, ip_address, hostnames):
        self.ip_address = ip_address
        self.hostnames = hostnames

    def __str__(self):
        return '{}({})'.format(self.ip_address, ','.join(self.hostnames))

    __repr__ = __str__

    # The python 2.x way
    def __cmp__(self, other):
        if self.ip_address < other.ip_address:
            return -1
        elif self.ip_address > other.ip_address:
            return 1
        else:
            if self.hostnames < other.hostnames:
                return -1
            elif self.hostnames > other.hostnames:
                return 1
            else:
                return 0

    # The python 3.x way
    def __lt__(self, other):
        return self.__cmp__(other) < 0


def main():
    list1=[('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]

    list2=[('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]

    keys1 = set(tuple_[0] for tuple_ in list1)
    keys2 = set(tuple_[0] for tuple_ in list2)
    keys = keys1 | keys2

    dict1 = collections.defaultdict(list)
    dict2 = {}

    for tuple_ in list1:
        id_str = tuple_[0]
        hostname = tuple_[1]
        dict1[id_str].append(hostname)

    for tuple_ in list2:
        id_str = tuple_[0]
        ip_address = tuple_[1]
        dict2[id_str] = ip_address

    result_dict = {}
    for key in keys:
        hostnames = []
        ip_address = ''
        if key in dict1:
            hostnames = dict1[key]
        if key in dict2:
            ip_address = dict2[key]
        host_data = Host_data(ip_address, hostnames)
        result_dict[key] = host_data

    pprint.pprint(result_dict)
    print('actual output:')
    values = list(result_dict.values())
    values.sort()
    print(', '.join(str(value) for value in values))

    print('desired output:')
    print('IP1(host1,host5), IP2(host2), IP3(host4), IP4(host6), IP5(host8)')


main()

Upvotes: 1

emitle

Reputation: 343

from collections import defaultdict
list1 = [("id1","host1"),("id2","host2"),("id1","host5"),("id3","host4"),("id4","host6"),("id5","host8")]
list2 = [("id1","IP1"),("id2","IP2"),("id3","IP3"),("id4","IP4"),("id5","IP5")]
host = defaultdict(list)
IP4id = {}
for k, v in list2:
    IP4id[v] = {"id" : k, "host" : []}

for k, v in list1:
    host[k].append(v)

for item in IP4id:
    IP4id[item]["host"] = host[IP4id[item]["id"]]
print(IP4id)

Upvotes: 1

Sravan K Ghantasala

Reputation: 1318

Code:

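list1 = [('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]
# Convert the tuples to lists so they can be extended in place.
# A list comprehension is used instead of map() so this also works on Python 3.
list1 = [list(t) for t in list1]
list2 = [('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]
list2 = [list(t) for t in list2]

# Append the matching IP(s) to every [id, host] entry.
for item in list1:
    item += [x[1] for x in list2 if x[0]==item[0]]

# Keep any list2 entries whose id never appears in list1.
list1 += [x for x in list2 if not any(i for i in list1 if x[0]==i[0])]

print(list1)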

Output:

[['id1', 'host1', 'IP1'], ['id2', 'host2', 'IP2'], ['id1', 'host5', 'IP1'], ['id3', 'host4', 'IP3'], ['id4', 'host6', 'IP4'], ['id5', 'host8', 'IP5']]  

Hope this helps :)

Upvotes: 1

intuited

Reputation: 24034

You'll want to go through each of the two lists of tuples and add their contents to a new defaultdict with elements of type list.

This will have the effect of creating a dictionary with contents like {id1: [host1, host5], id2: [host2], ...}.

You can then go through and map the id values to their corresponding IP values.

Note that in order for this to work, the id values have to be hashable. Strings, numbers, and other basic types are hashable.

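If the id values are instances of a class you've defined, they are hashable by default (by identity). To have equal ids from the two lists match, define __eq__ and __hash__ on that class; inheriting from the collections.abc.Hashable abstract base class alone does not supply a __hash__ implementation.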
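
As a rough illustration of the hashability point, here is a minimal sketch. HostId is a hypothetical class (it is not from the question), and hosts_by_id and hosts_by_ip are names made up for this example. It defines __eq__ and __hash__ so that equal ids created separately for each list end up under the same dictionary key.

```python
from collections import defaultdict


class HostId:
    """Hypothetical id type, used only to illustrate hashability."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return isinstance(other, HostId) and self.number == other.number

    def __hash__(self):
        # Equal objects must hash equally so they share a dict key.
        return hash(self.number)


list1 = [(HostId(1), 'host1'), (HostId(2), 'host2'), (HostId(1), 'host5')]
list2 = [(HostId(1), 'IP1'), (HostId(2), 'IP2')]

# Group hosts under their id.
hosts_by_id = defaultdict(list)
for id_, host in list1:
    hosts_by_id[id_].append(host)

# Map each IP to the hosts that share its id.
# .get avoids inserting empty entries for ids that have no hosts.
hosts_by_ip = {ip: hosts_by_id.get(id_, []) for id_, ip in list2}
```

Without the two special methods, HostId(1) in list1 and HostId(1) in list2 would be different keys, and every IP would map to an empty list.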

Upvotes: 0

linbo

Reputation: 2431

  1. use collections.defaultdict to map id -> hosts
  2. then map id -> ip

>>> from collections import defaultdict
>>> d = defaultdict(set)
>>> d['id'].add('host1')
>>> d['id'].add('host2')
>>> d['id'].add('host1')
>>> d
defaultdict(<type 'set'>, {'id': set(['host2', 'host1'])})
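
The two steps could be combined roughly as follows, with the first mapping going from id to hosts. This is only a sketch using the question's sample data; the names hosts_by_id, ip_by_id and merged are made up for the example, and it assumes each id has at most one IP.

```python
from collections import defaultdict

list1 = [('id1', 'host1'), ('id2', 'host2'), ('id1', 'host5'),
         ('id3', 'host4'), ('id4', 'host6'), ('id5', 'host8')]
list2 = [('id1', 'IP1'), ('id2', 'IP2'), ('id3', 'IP3'),
         ('id4', 'IP4'), ('id5', 'IP5')]

# Step 1: id -> set of hosts (a set silently drops repeated hosts).
hosts_by_id = defaultdict(set)
for id_, host in list1:
    hosts_by_id[id_].add(host)

# Step 2: id -> ip.
ip_by_id = dict(list2)

# Join the two mappings on id; sets are unordered, so sort for stable output.
merged = {ip_by_id[id_]: sorted(hosts)
          for id_, hosts in hosts_by_id.items() if id_ in ip_by_id}
```

A set is handy if the same (id, host) pair may appear more than once; if the original order of the hosts matters, a list is the safer choice.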

Upvotes: 1

John La Rooy

Reputation: 304117

>>> from collections import defaultdict
>>> list1 = [('id1','host1'),('id2','host2'),('id1','host5'),('id3','host4'),('id4','host6'),('id5','host8')]
>>> list2 = [('id1','IP1'),('id2','IP2'),('id3','IP3'),('id4','IP4'),('id5','IP5')]
>>> d1 = defaultdict(list)
>>> for k,v in list1:
...     d1[k].append(v)
... 

You can print the items like this

>>> for k, s in list2:
...     print s, d1[k]
... 
IP1 ['host1', 'host5']
IP2 ['host2']
IP3 ['host4']
IP4 ['host6']
IP5 ['host8']

You can use a list comprehension to put the results into a list

>>> res = [(s, d1[k]) for k, s in list2]
>>> res
[('IP1', ['host1', 'host5']), ('IP2', ['host2']), ('IP3', ['host4']), ('IP4', ['host6']), ('IP5', ['host8'])]
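
If you want the exact string shown in the question, one possible way is to join the pairs in res. This is a small sketch; format_hosts is a hypothetical helper name, and res is copied from the result above so the snippet runs on its own.

```python
res = [('IP1', ['host1', 'host5']), ('IP2', ['host2']), ('IP3', ['host4']),
       ('IP4', ['host6']), ('IP5', ['host8'])]


def format_hosts(pairs):
    """Render (ip, hosts) pairs as 'IP(host,host), IP(host)'."""
    return ', '.join('{}({})'.format(ip, ','.join(hosts)) for ip, hosts in pairs)


output = format_hosts(res)
print(output)
```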

Upvotes: 4
