AlexW

Reputation: 2587

Python - itertools groupby but only include groups in new list. then filter list?

I have two lists of dictionaries, with sample data as per the below:

list 1:

list_1 = [
    {
        "route": "10.10.4.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5"
    },
    {
        "route": "10.10.5.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5"
    },
    {
        "route": "10.10.8.0",
        "mask": "255.255.255.0",
        "next_hop": "172.16.66.34"
    },
    {
        "route": "10.10.58.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5"
    },
    {
        "route": "172.18.12.4",
        "mask": "255.255.255.252",
        "next_hop": "172.18.1.5"
    }
]

list 2:

list_2 = [
    {
        "route": "10.10.4.0",
        "site": "Edinburgh"
    },
    {
        "route": "10.10.8.0",
        "site": "Manchester"
    },
    {
        "route": "10.10.5.0",
        "site": "London"
    },
]

I'm joining these lists with itertools as per the below:

import itertools

temp_merged_data = sorted(itertools.chain(list_1, list_2), key=lambda x:x['route'])
route_data = []
for k,v in itertools.groupby(temp_merged_data, key=lambda x:x['route']):
    d = {}
    for dct in v:
        d.update(dct)
    route_data.append(d) 

This returns the list below. However, I don't want any routes in there that don't have a site. How would I achieve this? And once I have the final list of dictionaries/JSON, how can I filter it efficiently, for example if I want to know the next hop for London only?

Thanks

[
    {
        "route": "10.10.4.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5",
        "site": "Edinburgh"
    },
    {
        "route": "10.10.5.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5",
        "site": "London"
    },
    {
        "route": "10.10.58.0",
        "mask": "255.255.255.0",
        "next_hop": "172.18.1.5"
    },
    {
        "route": "10.10.8.0",
        "mask": "255.255.255.0",
        "next_hop": "172.16.66.34",
        "site": "Manchester"
    },
    {
        "route": "172.18.12.4",
        "mask": "255.255.255.252",
        "next_hop": "172.18.1.5"
    }
]

Upvotes: 0

Views: 324

Answers (6)

VPfB

Reputation: 17332

Given the structure of those lists (routing info and routing sites), I see no need for merging and grouping. A plain dict lookup from route to site is enough:

routes_to_sites = {rs['route']: rs['site'] for rs in list_2}
route_data = []
for ri in list_1:
    site = routes_to_sites.get(ri['route'])
    if site is not None:
        route_data.append({**ri, 'site': site})
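
For the second part of the question (the next hop for one site), the same dictionary idea can be applied in the other direction. This is only a minimal sketch, assuming each site has exactly one route; the names `join_routes` and `by_site` are made up for illustration, and the sample data is a trimmed copy of the lists in the question:

```python
def join_routes(routes, sites):
    """Attach 'site' to each route that has one; drop routes without a site."""
    routes_to_sites = {s["route"]: s["site"] for s in sites}
    return [
        {**r, "site": routes_to_sites[r["route"]]}
        for r in routes
        if r["route"] in routes_to_sites
    ]

# Trimmed sample data from the question.
list_1 = [
    {"route": "10.10.5.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
    {"route": "10.10.8.0", "mask": "255.255.255.0", "next_hop": "172.16.66.34"},
    {"route": "10.10.58.0", "mask": "255.255.255.0", "next_hop": "172.18.1.5"},
]
list_2 = [
    {"route": "10.10.8.0", "site": "Manchester"},
    {"route": "10.10.5.0", "site": "London"},
]

route_data = join_routes(list_1, list_2)

# Index the joined rows by site so each lookup is a single dict access
# (assumes one route per site; a later duplicate would overwrite an earlier one).
by_site = {row["site"]: row for row in route_data}
print(by_site["London"]["next_hop"])
```

Building both dictionaries is one pass over each list, so no sorting is needed.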

Upvotes: 0

Sunitha

Reputation: 12015

>>> from itertools import groupby, chain
>>> from pprint import pprint
>>> temp_merged_data = sorted(chain(list_1, list_2), key=lambda x: x['route'])
>>> route_data = [dict(chain(*map(dict.items, v))) for k, v in groupby(temp_merged_data, key=lambda x: x['route'])]
>>> route_data = [d for d in route_data if 'site' in d]
>>> pprint(route_data)
[{'mask': '255.255.255.0',
  'next_hop': '172.18.1.5',
  'route': '10.10.4.0',
  'site': 'Edinburgh'},
 {'mask': '255.255.255.0',
  'next_hop': '172.18.1.5',
  'route': '10.10.5.0',
  'site': 'London'},
 {'mask': '255.255.255.0',
  'next_hop': '172.16.66.34',
  'route': '10.10.8.0',
  'site': 'Manchester'}]

Now, if you convert the route data into a dict keyed by site, it is easier to access the parameters for each site:

>>> route_dict = {d['site']:d for d in route_data}
>>> route_dict['London']['next_hop']
'172.18.1.5'
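
One caveat with keying by site: if a site ever has more than one route, later entries overwrite earlier ones. A possible way around that, sketched here under the assumption that duplicates can occur, is to collect a list per site with `collections.defaultdict`. The second London route below is invented purely for illustration, and `routes_by_site` is a hypothetical name:

```python
from collections import defaultdict

# Hypothetical data: the 10.10.6.0 entry is made up to show a site with two routes.
route_data = [
    {"route": "10.10.4.0", "next_hop": "172.18.1.5", "site": "Edinburgh"},
    {"route": "10.10.5.0", "next_hop": "172.18.1.5", "site": "London"},
    {"route": "10.10.6.0", "next_hop": "172.16.66.34", "site": "London"},
]

# Group every route under its site instead of keeping only the last one.
routes_by_site = defaultdict(list)
for d in route_data:
    routes_by_site[d["site"]].append(d)

london_hops = [d["next_hop"] for d in routes_by_site["London"]]
print(london_hops)
```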

Upvotes: 0

Ashish Acharya

Reputation: 3399

Here's a solution in pandas:

In [18]: df1=pd.DataFrame(list_1)

In [19]: df2=pd.DataFrame(list_2)    

In [22]: df1.merge(df2, on='route', how='left')
Out[22]: 
              mask      next_hop        route        site
0    255.255.255.0    172.18.1.5    10.10.4.0   Edinburgh
1    255.255.255.0    172.18.1.5    10.10.5.0      London
2    255.255.255.0  172.16.66.34    10.10.8.0  Manchester
3    255.255.255.0    172.18.1.5   10.10.58.0         NaN
4  255.255.255.252    172.18.1.5  172.18.12.4         NaN

Filter out routes without a site like this:

In [29]: merged=df1.merge(df2, on='route', how='left')

In [30]: df=merged[~merged.site.isna()]

In [31]: df
Out[31]:
            mask      next_hop      route        site
0  255.255.255.0    172.18.1.5  10.10.4.0   Edinburgh
1  255.255.255.0    172.18.1.5  10.10.5.0      London
2  255.255.255.0  172.16.66.34  10.10.8.0  Manchester

Filter only for Edinburgh:

df[df['site']=='Edinburgh']

To get it in your format:

df.to_dict('records')

Output:

[{'mask': '255.255.255.0',
  'next_hop': '172.18.1.5',
  'route': '10.10.4.0',
  'site': 'Edinburgh'},
 {'mask': '255.255.255.0',
  'next_hop': '172.18.1.5',
  'route': '10.10.5.0',
  'site': 'London'},
 {'mask': '255.255.255.0',
  'next_hop': '172.16.66.34',
  'route': '10.10.8.0',
  'site': 'Manchester'}]

Upvotes: 2

Graipher

Reputation: 7186

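Use actual data analysis tools, like pandas. By default, merge does an inner join on the columns the two frames share (here, route), so routes without a site are dropped automatically: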

import pandas as pd

df1 = pd.DataFrame(list_1)
df2 = pd.DataFrame(list_2)

print(df1.merge(df2))
#             mask      next_hop      route        site
# 0  255.255.255.0    172.18.1.5  10.10.4.0   Edinburgh
# 1  255.255.255.0    172.18.1.5  10.10.5.0      London
# 2  255.255.255.0  172.16.66.34  10.10.8.0  Manchester

Upvotes: 0

Ajax1234

Reputation: 71461

You can filter your results:

d = [{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.58.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}, {'route': '172.18.12.4', 'mask': '255.255.255.252', 'next_hop': '172.18.1.5'}]
new_d = [i for i in d if i.get('site')]

Output:

[{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}]
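
Note that `i.get('site')` tests truthiness, so it also drops entries whose site key is present but empty, whereas `'site' in i` only checks that the key exists. Both give the same result on the data in the question. A small sketch of the difference, using a made-up entry with an empty site:

```python
rows = [
    {"route": "10.10.4.0", "site": "Edinburgh"},
    {"route": "10.10.58.0"},             # no 'site' key at all
    {"route": "10.10.9.0", "site": ""},  # hypothetical: key present, value empty
]

truthy = [r for r in rows if r.get("site")]  # keeps rows with a non-empty site
has_key = [r for r in rows if "site" in r]   # keeps any row that has the key

print(len(truthy), len(has_key))
```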

Upvotes: 0

Rakesh

Reputation: 82785

import itertools

temp_merged_data = sorted(itertools.chain(list_1, list_2), key=lambda x: x['route'])
route_data = []
for k, v in itertools.groupby(temp_merged_data, key=lambda x: x['route']):
    d = {}
    for dct in v:
        d.update(dct)   # merge every dict that shares this route
    if "site" in d:     # keep the merged route only if it has a site
        route_data.append(d)
print(route_data)

Output:

[{'route': '10.10.4.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'Edinburgh'}, {'route': '10.10.5.0', 'mask': '255.255.255.0', 'next_hop': '172.18.1.5', 'site': 'London'}, {'route': '10.10.8.0', 'mask': '255.255.255.0', 'next_hop': '172.16.66.34', 'site': 'Manchester'}]

Upvotes: 0
