pythoonatic

Reputation: 11

Delete common dictionaries in a list based on a value

How would I delete all related dictionaries in a list of dictionaries when one of them has a particular character in one of its values?

data = [
       { 'x' : 'a', 'y' : '1' },
       { 'x' : 'a', 'y' : '1/1' },
       { 'x' : 'a', 'y' : '2' },
       { 'x' : 'b', 'y' : '1' },
       { 'x' : 'b', 'y' : '1' },
       { 'x' : 'b', 'y' : '1' },
    ]

For example, how would I delete every dictionary with x = 'a' because one of the x = 'a' entries has a / in its y value? Based on the example data above, this is the result I want:

cleaneddata = [
       { 'x' : 'b', 'y' : '1' },
       { 'x' : 'b', 'y' : '1' },
       { 'x' : 'b', 'y' : '1' },
    ]

I have a CSV with a dump of many network devices (imported via DictReader). Unfortunately, each device name is repeated in "x". The modules are in "y". Ultimately I'm trying to reconstruct the devices by grouping the rows and listing all the modules for each device. The CSV data contains devices I don't care about. These can be identified by certain characteristics in "y", so my thought is to identify them and clean them out of the data up front. I'd be open to a better method of cleaning the data.

Upvotes: 1

Views: 52

Answers (2)

georg

Reputation: 214959

Basically

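cleaneddata = list(data)
for p in data:
    if '/' in p['y']:
        # drop every row that shares this row's x value;
        # no break, so several offending x groups are all removed
        cleaneddata = [q for q in cleaneddata if q['x'] != p['x']]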

Based on your update, I'd suggest building a dict that maps each device to its list of modules from the CSV:

from collections import defaultdict

devices = defaultdict(list)
for row in data:
    devices[row['x']].append(row['y'])

And then remove devices you're not interested in:

clean_devices = {}
for dev, modules in devices.items():
    if all('/' not in m for m in modules):
        clean_devices[dev] = modules
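
Put together with the CSV import, the two steps above might look roughly like this. It is only a sketch: the sample CSV text and the helper name load_clean_devices are made up for illustration, and the column names "x" and "y" are taken from the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content standing in for the real device dump;
# the column names "x" (device) and "y" (module) follow the question.
SAMPLE_CSV = """x,y
a,1
a,1/1
a,2
b,1
b,1
b,1
"""

def load_clean_devices(csv_file, bad_char="/"):
    """Group modules by device, then drop devices with any module containing bad_char."""
    devices = defaultdict(list)
    for row in csv.DictReader(csv_file):
        devices[row["x"]].append(row["y"])
    # keep a device only if none of its modules contains bad_char
    return {
        dev: modules
        for dev, modules in devices.items()
        if all(bad_char not in m for m in modules)
    }

clean_devices = load_clean_devices(io.StringIO(SAMPLE_CSV))
print(clean_devices)  # {'b': ['1', '1', '1']}
```

With a real file you would pass an open file object instead of the io.StringIO wrapper.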

If your rows have more data than just modules, you can consider a dict of lists of dicts (eek):

devices = defaultdict(list)
for row in data:
    devices[row['x']].append(row)

and then:

clean_devices = {}
for dev, rows in devices.items():
    # rows is the list of row dicts for this device; the module is in 'y'
    if all('/' not in row['y'] for row in rows):
        clean_devices[dev] = rows
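
If you still need a flat list like cleaneddata at the end, the grouped rows can be flattened again. A minimal sketch using the sample data from the question; using itertools.chain for the flattening is my own choice here, a nested comprehension would do as well:

```python
from collections import defaultdict
from itertools import chain

data = [
    {'x': 'a', 'y': '1'},
    {'x': 'a', 'y': '1/1'},
    {'x': 'a', 'y': '2'},
    {'x': 'b', 'y': '1'},
    {'x': 'b', 'y': '1'},
    {'x': 'b', 'y': '1'},
]

# group the full row dicts by device name
devices = defaultdict(list)
for row in data:
    devices[row['x']].append(row)

# keep only devices where no row has a '/' in its module field
clean_devices = {
    dev: rows
    for dev, rows in devices.items()
    if all('/' not in r['y'] for r in rows)
}

# flatten the remaining groups back into one list of rows
cleaneddata = list(chain.from_iterable(clean_devices.values()))
print(cleaneddata)
```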

Upvotes: 2

Joran Beasley

Reputation: 113968

Something like this, maybe?

# x values that have at least one y that is not purely digits
x_excludes = {d["x"] for d in data if not d["y"].isdigit()}
new_list = [d for d in data if d["x"] not in x_excludes]
print(new_list)

This first creates a set of bad x values based on some condition (in this case, whether there are any non-digit characters in y).

Then it filters out every row whose x is in that set of bad x values.
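
Since the question mentions several "characteristics" in y, the condition could be pulled out into a function so it is easy to swap. A rough sketch; drop_groups and is_bad are hypothetical names, not anything from a library:

```python
def drop_groups(rows, is_bad, key="x"):
    """Remove every row whose key value is shared with at least one bad row."""
    # first pass: collect the key values of all offending rows
    bad_keys = {row[key] for row in rows if is_bad(row)}
    # second pass: keep only rows outside those groups
    return [row for row in rows if row[key] not in bad_keys]

data = [
    {"x": "a", "y": "1"},
    {"x": "a", "y": "1/1"},
    {"x": "a", "y": "2"},
    {"x": "b", "y": "1"},
    {"x": "b", "y": "1"},
    {"x": "b", "y": "1"},
]

# same rule as in the question: a "/" anywhere in y marks the whole device as unwanted
new_list = drop_groups(data, lambda row: "/" in row["y"])
print(new_list)
```

Both passes are linear in the number of rows, and the set makes the membership check cheap even for a large dump.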

Upvotes: 0
