bob.sacamento

Reputation: 6661

Filtering a dictionary in Python ... again

Yes, there are a lot of questions on this site about filtering a Python dictionary, but nothing I have seen quite gets at what I am trying to do. I have a dictionary that contains a list of times and a list of data values. Something like

data_and_time = {"time":['2:30','2:45','3:25','5:15','7:21','8:22'],
                 "data":[    5.,    7.,    2.,    3.,    8.,   10.]}

I want to filter this so that, for instance, I only have data values greater than or equal to 5. The result being:

data_and_time_5 = {"time":['2:30','2:45','7:21','8:22'],
                   "data":[    5.,    7.,    8.,   10.]}

I can think of a few ways to do this, all very ugly and taking many lines of code. I would like an elegant, readable way to do it. Is there such a way with Python dictionaries? (BTW, the times being expressed as strings is completely incidental, just a compact way for me to express my problem here.) Thanks.

Upvotes: 1

Views: 176

Answers (3)

Gill Bates

Reputation: 15217

If you need to preserve your data structure:

import itertools

data_and_time = {"time": ['2:30', '2:45', '3:25', '5:15', '7:21', '8:22'],
                 "data": [5., 7., 2., 3., 8., 10.]}

# Build a mask like [True, True, False, ...]. It is a list rather than a
# map() result so that it can be reused for every column in Python 3.
index = [x >= 5 for x in data_and_time['data']]
# Then apply the mask to each "column" of data_and_time
data_and_time = {k: list(itertools.compress(v, index))
                 for k, v in data_and_time.items()}

Results:

{'data': [5.0, 7.0, 8.0, 10.0],
 'time': ['2:30', '2:45', '7:21', '8:22']}
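
The same mask idea could be wrapped in a small helper so that the key column and the condition become parameters. This is only a sketch for Python 3; the name filter_columns and its arguments are made up here for illustration and are not part of any library.

```python
from itertools import compress


def filter_columns(table, key, predicate):
    """Keep only the positions where predicate(table[key][i]) is true.

    Hypothetical helper: assumes every list in `table` has the same length.
    """
    # Build the mask as a list so it can be reused for every column.
    mask = [predicate(value) for value in table[key]]
    return {name: list(compress(column, mask))
            for name, column in table.items()}


data_and_time = {"time": ['2:30', '2:45', '3:25', '5:15', '7:21', '8:22'],
                 "data": [5., 7., 2., 3., 8., 10.]}

data_and_time_5 = filter_columns(data_and_time, "data", lambda x: x >= 5)
```

The original dictionary is left untouched, so several thresholds can be tried against the same data.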

Upvotes: 0

Eudis Duran

Reputation: 782

I would go with Blender's approach. However, if you'd like to stick to your current data structure, you can use dict/list comprehensions:

data_and_time = { k: [i for i in v if i >= 5] for k, v in data_and_time.items() }

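Of course, you'd have to modify the i >= 5 part to handle the time strings, since comparing a string with a number does not filter them meaningfully (and raises a TypeError in Python 3). I left that out because you mentioned the string format was only there to simplify your example.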
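
For what it's worth, one way to keep the two lists aligned is to decide on the values in "data" and carry each time along with its value. A rough sketch, assuming Python 3 and that both lists have the same length; the names kept and data_and_time_5 are just illustrative.

```python
data_and_time = {"time": ['2:30', '2:45', '3:25', '5:15', '7:21', '8:22'],
                 "data": [5., 7., 2., 3., 8., 10.]}

# Pair each time with its data value and keep the pairs whose value is >= 5.
kept = [(t, d)
        for t, d in zip(data_and_time["time"], data_and_time["data"])
        if d >= 5]

# Split the surviving pairs back into two parallel lists.
data_and_time_5 = {"time": [t for t, _ in kept],
                   "data": [d for _, d in kept]}
```

Because the condition only ever looks at the numeric values, the format of the times does not matter here.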

Hope that helps.

Upvotes: 0

Blender

Reputation: 298562

I would start by storing the data in a nicer, JSON-like format:

data = [dict(zip(data_and_time, val)) for val in zip(*data_and_time.values())]

It looks like this:

>>> data
[{'data': 5.0, 'time': '2:30'},
 {'data': 7.0, 'time': '2:45'},
 {'data': 2.0, 'time': '3:25'},
 {'data': 3.0, 'time': '5:15'},
 {'data': 8.0, 'time': '7:21'},
 {'data': 10.0, 'time': '8:22'}]

Now, you can filter the object much more easily:

>>> [item for item in data if item['data'] >= 5.0]
[{'data': 5.0, 'time': '2:30'},
 {'data': 7.0, 'time': '2:45'},
 {'data': 8.0, 'time': '7:21'},
 {'data': 10.0, 'time': '8:22'}]
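
If the original layout with one list per key is needed afterwards, the filtered records can be turned back into it. A minimal sketch, assuming Python 3 and that every record has the same keys; filtered, keys and columns are names chosen here for illustration.

```python
data_and_time = {"time": ['2:30', '2:45', '3:25', '5:15', '7:21', '8:22'],
                 "data": [5., 7., 2., 3., 8., 10.]}

# One record per position, as in the conversion shown earlier in this answer.
data = [dict(zip(data_and_time, val)) for val in zip(*data_and_time.values())]

# Keep only the records whose data value is at least 5.
filtered = [item for item in data if item['data'] >= 5.0]

# Take the keys from the unfiltered records so they survive an empty result,
# then collect each key's values across the surviving records.
keys = list(data[0]) if data else []
columns = {key: [item[key] for item in filtered] for key in keys}
```

Here columns has the same shape as the data_and_time_5 dictionary in the question.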

Upvotes: 5
