Reputation: 282865
I've got a dict that has a whole bunch of entries. I'm only interested in a select few of them. Is there an easy way to prune all the other ones out?
Upvotes: 812
Views: 722000
Reputation: 133
Selecting keys according to a comparison operator (e.g. key greater than 3):
new_dic = { k: old_dic[k] for k in old_dic.keys() if k > 3 }
Upvotes: 0
Reputation: 914
With glom:
from glom import glom
target = {'a': 1, 'b': 2, 'c': 3}
spec = {'a': 'a', 'b': 'b'}
glom(target, spec)
# {'a': 1, 'b': 2}
Rename keys
spec = {'My A': 'a', 'My B': 'b'}
glom(target, spec)
# {'My A': 1, 'My B': 2}
Advanced
target = {
    'system': {
        'planets': [
            {'name': 'earth', 'moons': 1},
            {'name': 'jupiter', 'moons': 69}
        ]
    },
    'telescopes': ['Proton-1', 'Proton-2']
}
spec = {
    'names': ('system.planets', ['name']),
    'moons': ('system.planets', ['moons']),
    'telescopes': 'telescopes'
}
glom(target, spec)
# {'names': ['earth', 'jupiter'],
# 'moons': [1, 69],
# 'telescopes': ['Proton-1', 'Proton-2']}
Upvotes: 0
Reputation:
Constructing a new dict:
dict_you_want = {key: old_dict[key] for key in your_keys}
Uses dictionary comprehension.
If you use a version which lacks them (i.e. Python 2.6 and earlier), make it dict((key, old_dict[key]) for ...). It's the same, though uglier.
Note that this, unlike jnnnnn's version, has stable performance (it depends only on the number of your_keys) for old_dicts of any size, both in terms of speed and memory. Since this is a generator expression, it processes one item at a time, and it doesn't look through all items of old_dict.
Removing everything in-place:
unwanted = set(your_dict) - set(your_keys)
for unwanted_key in unwanted: del your_dict[unwanted_key]
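A minimal runnable sketch of both approaches side by side. The variable names follow the answer; the sample values are made up for illustration.

```python
# Sample data; the names mirror the answer, the values are illustrative.
old_dict = {"a": 1, "b": 2, "c": 3, "d": 4}
your_keys = ["a", "c"]

# Build a new dict: one lookup per wanted key, old_dict is left untouched.
dict_you_want = {key: old_dict[key] for key in your_keys}
print(dict_you_want)  # {'a': 1, 'c': 3}

# Prune in place: compute the unwanted keys first, then delete them.
your_dict = dict(old_dict)  # copy so the sample data above stays intact
unwanted = set(your_dict) - set(your_keys)
for unwanted_key in unwanted:
    del your_dict[unwanted_key]
print(your_dict)  # {'a': 1, 'c': 3}
```

The first form raises KeyError if a wanted key is missing from old_dict; the in-place form simply ignores wanted keys that are not present.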
Upvotes: 1014
Reputation: 12018
You can use Python's built-in filter function and rebuild a dict from the items, though it's not as neat or performant as some of the other methods here:
my_dict = {i: str(i) for i in range(10)}
# I only want specific keys
want_keys = [6, 7, 8]
new_dict = dict(filter(lambda x: x[0] in want_keys, my_dict.items()))
# Or use logic: I want greater than 6
new_dict_2 = dict(filter(lambda x: x[0] > 6, my_dict.items()))
You can get unnecessarily fancy with partial functions and operators if you wish too:
from functools import partial
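from operator import lt, contains
# contains(want_keys, k) is the same as `k in want_keys`
condition = partial(contains, want_keys)
# lt(6, k) is the same as `6 < k`, i.e. keys greater than 6
condition = partial(lt, 6)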
# use one of the conditions
dict(filter(lambda x: condition(x[0]), my_dict.items()))
Upvotes: 1
Reputation: 3194
The accepted answer throws a KeyError if one of the filter keys is not present in the given dict.
To get a copy of the given dict, containing only some keys from the allowed keys, an approach is to check that the key was indeed present on the given dict in the dict comprehension:
filtered_dict = { k: old_dict[k] for k in allowed_keys if k in old_dict }
This does not impact performance, as the lookup against the dictionary has constant runtime complexity.
Alternatively, you could use old_dict.get(k, some_default) to populate missing items.
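A short sketch contrasting the two options, with made-up sample data. The name filled_dict and the choice of None as the default are assumptions for illustration.

```python
old_dict = {"host": "example", "port": 8080}
allowed_keys = ["host", "timeout"]

# Option 1: skip allowed keys that are missing from old_dict.
filtered_dict = {k: old_dict[k] for k in allowed_keys if k in old_dict}
print(filtered_dict)  # {'host': 'example'}

# Option 2: keep every allowed key and fill the gaps with a default value.
some_default = None
filled_dict = {k: old_dict.get(k, some_default) for k in allowed_keys}
print(filled_dict)  # {'host': 'example', 'timeout': None}
```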
Upvotes: 3
Reputation: 897
Just a simple one-line function with a filter to allow only for existing keys.
data = {'give': 'what', 'not': '___', 'me': 'I', 'no': '___', 'these': 'needed'}
keys = ['give', 'me', 'these', 'not_present']
n = { k: data[k] for k in filter(lambda k: k in data, keys) }
print(n)
print(list(n.keys()))
print(list(n.values()))
output:
{'give': 'what', 'me': 'I', 'these': 'needed'}
['give', 'me', 'these']
['what', 'I', 'needed']
Upvotes: 2
Reputation: 3361
If you know the negation set (aka not keys) in advance:
v = {'a': 'foo', 'b': 'bar', 'command': 'fizz', 'host': 'buzz' }
args = {k: v[k] for k in v if k not in ["a", "b"]}
args # {'command': 'fizz', 'host': 'buzz'}
Upvotes: 4
Reputation: 34026
Going by the title of the question, one would expect the dictionary to be filtered in place. A couple of answers suggest methods for doing that, but it's still not obvious which is the one obvious way, so I added some timings:
import random
import timeit
import collections
repeat = 3
numbers = 10000
setup = ''
def timer(statement, msg='', _setup=None):
    print(msg, min(
        timeit.Timer(statement, setup=_setup or setup).repeat(
            repeat, numbers)))
timer('pass', 'Empty statement')
dsize = 1000
d = dict.fromkeys(range(dsize))
keep_keys = set(random.sample(range(dsize), 500))
drop_keys = set(random.sample(range(dsize), 500))
def _time_filter_dict():
    """filter a dict"""
    global setup
    setup = r"""from __main__ import dsize, collections, drop_keys, \
keep_keys, random"""
    timer('d = dict.fromkeys(range(dsize));'
          'collections.deque((d.pop(k) for k in drop_keys), maxlen=0)',
          "pop inplace - exhaust iterator")
    timer('d = dict.fromkeys(range(dsize));'
          'drop_keys = [k for k in d if k not in keep_keys];'
          'collections.deque('
          '(d.pop(k) for k in list(d) if k not in keep_keys), maxlen=0)',
          "pop inplace - exhaust iterator (drop_keys)")
    timer('d = dict.fromkeys(range(dsize));'
          'list(d.pop(k) for k in drop_keys)',
          "pop inplace - create list")
    timer('d = dict.fromkeys(range(dsize));'
          'drop_keys = [k for k in d if k not in keep_keys];'
          'list(d.pop(k) for k in drop_keys)',
          "pop inplace - create list (drop_keys)")
    timer('d = dict.fromkeys(range(dsize))\n'
          'for k in drop_keys: del d[k]', "del inplace")
    timer('d = dict.fromkeys(range(dsize));'
          'drop_keys = [k for k in d if k not in keep_keys]\n'
          'for k in drop_keys: del d[k]', "del inplace (drop_keys)")
    timer("""d = dict.fromkeys(range(dsize))
{k:v for k,v in d.items() if k in keep_keys}""", "copy dict comprehension")
    timer("""keep_keys=random.sample(range(dsize), 5)
d = dict.fromkeys(range(dsize))
{k:v for k,v in d.items() if k in keep_keys}""",
          "copy dict comprehension - small keep_keys")
if __name__ == '__main__':
    _time_filter_dict()
results:
Empty statement 8.375600000000427e-05
pop inplace - exhaust iterator 1.046749841
pop inplace - exhaust iterator (drop_keys) 1.830537424
pop inplace - create list 1.1531293939999987
pop inplace - create list (drop_keys) 1.4512304149999995
del inplace 0.8008298079999996
del inplace (drop_keys) 1.1573763689999979
copy dict comprehension 1.1982901489999982
copy dict comprehension - small keep_keys 1.4407784069999998
So del seems to be the winner if we want to update in place. The cost of the dict comprehension depends on the size of the dict being created, of course, and when deleting half the keys it is already too slow, so avoid creating a new dict if you can filter in place.
Edited to address a comment by @mpen: I calculated the drop keys from keep_keys (given that we do not have drop keys). I assumed keep_keys/drop_keys are sets for this iteration, otherwise it would take ages. With these assumptions del is still faster, but to be sure, the moral is: if you have a (set, list, tuple) of drop keys, go for del.
Upvotes: 4
Reputation: 1704
We can also achieve this with a slightly more elegant dict comprehension:
my_dict = {"a":1,"b":2,"c":3,"d":4}
filtdict = {k: v for k, v in my_dict.items() if k.startswith('a')}
print(filtdict)
Upvotes: 7
Reputation: 4623
This is my approach; it supports nested fields, like a Mongo query.
How to use:
>>> obj = { "a":1, "b":{"c":2,"d":3}}
>>> only(obj,["a","b.c"])
{'a': 1, 'b': {'c': 2}}
only function:
def only(object,keys):
    obj = {}
    for path in keys:
        paths = path.split(".")
        rec=''
        origin = object
        target = obj
        for key in paths:
            rec += key
            if key in target:
                target = target[key]
                origin = origin[key]
                rec += '.'
                continue
            if key in origin:
                if rec == path:
                    target[key] = origin[key]
                else:
                    target[key] = {}
                target = target[key]
                origin = origin[key]
                rec += '.'
            else:
                target[key] = None
                break
    return obj
Upvotes: 2
Reputation: 191
This seems to me the easiest way:
d1 = {'a':1, 'b':2, 'c':3}
d2 = {k:v for k,v in d1.items() if k in ['a','c']}
I like doing this to unpack the values too:
a, c = {k:v for k,v in d1.items() if k in ['a','c']}.values()
Upvotes: 15
Reputation: 1704
We can do this simply with a lambda function like this:
>>> dict_filter = lambda x, y: dict([ (i,x[i]) for i in x if i in set(y) ])
>>> large_dict = {"a":1,"b":2,"c":3,"d":4}
>>> new_dict_keys = ("c","d")
>>> small_dict=dict_filter(large_dict, new_dict_keys)
>>> print(small_dict)
{'c': 3, 'd': 4}
Upvotes: 3
Reputation: 1971
You could use python-benedict, it's a dict subclass.
Installation: pip install python-benedict
from benedict import benedict
dict_you_want = benedict(your_dict).subset(keys=['firstname', 'lastname', 'email'])
It's open-source on GitHub: https://github.com/fabiocaccamo/python-benedict
Disclaimer: I'm the author of this library.
Upvotes: 3
Reputation: 7303
Here is another simple method using del in a one-liner:
for key in e_keys: del your_dict[key]
e_keys is the list of the keys to be excluded. It will update your dict rather than giving you a new one.
If you want a new output dict, then make a copy of the dict before deleting:
new_dict = your_dict.copy() #Making copy of dict
for key in e_keys: del new_dict[key]
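Note that del raises KeyError when a key in e_keys is not in the dict. A sketch of a tolerant variant using pop() with a default; the sample data is made up for illustration.

```python
your_dict = {"a": 1, "b": 2, "c": 3}
e_keys = ["b", "not_present"]

new_dict = your_dict.copy()
for key in e_keys:
    # pop() with a default does nothing when the key is absent,
    # whereas del would raise KeyError for "not_present".
    new_dict.pop(key, None)

print(new_dict)   # {'a': 1, 'c': 3}
print(your_dict)  # {'a': 1, 'b': 2, 'c': 3}
```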
Upvotes: 2
Reputation: 1061
If we want to make a new dictionary with selected keys removed, we can make use of a dictionary comprehension.
For example:
d = {
'a' : 1,
'b' : 2,
'c' : 3
}
x = {key:d[key] for key in d.keys() - {'c', 'e'}} # Python 3
y = {key:d[key] for key in set(d.keys()) - {'c', 'e'}} # Python 2.*
# x is {'a': 1, 'b': 2}
# y is {'a': 1, 'b': 2}
Upvotes: 7
Reputation: 577
This one liner lambda should work:
dictfilt = lambda x, y: dict([ (i,x[i]) for i in x if i in set(y) ])
Here's an example:
my_dict = {"a":1,"b":2,"c":3,"d":4}
wanted_keys = ("c","d")
# run it
In [10]: dictfilt(my_dict, wanted_keys)
Out[10]: {'c': 3, 'd': 4}
It's a basic list comprehension iterating over your dict keys (i in x) and outputs a list of tuple (key,value) pairs if the key lives in your desired key list (y). A dict() wraps the whole thing to output as a dict object.
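One thing worth noting: as written, set(y) is rebuilt for every key in x. Below is a sketch of the same idea as a regular function that builds the set once. The function body is a rewrite for illustration, not the answer's original code.

```python
def dictfilt(x, y):
    """Return a new dict with only the keys of x that also appear in y."""
    wanted = set(y)  # build the lookup set once, not once per key
    return {k: v for k, v in x.items() if k in wanted}


my_dict = {"a": 1, "b": 2, "c": 3, "d": 4}
wanted_keys = ("c", "d")
print(dictfilt(my_dict, wanted_keys))  # {'c': 3, 'd': 4}
```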
Upvotes: 29
Reputation: 1029
Another option:
content = dict(k1='foo', k2='nope', k3='bar')
selection = ['k1', 'k3']
filtered = filter(lambda i: i[0] in selection, content.items())
But you get a list (Python 2) or an iterator (Python 3) returned by filter(), not a dict.
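To get a dict back, the result of filter() can be passed to dict(), since it yields (key, value) tuples. A small sketch using the same sample data:

```python
content = dict(k1="foo", k2="nope", k3="bar")
selection = ["k1", "k3"]

# filter() yields (key, value) tuples; dict() rebuilds a mapping from them.
filtered = dict(filter(lambda item: item[0] in selection, content.items()))
print(filtered)  # {'k1': 'foo', 'k3': 'bar'}
```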
Upvotes: 8
Reputation: 13642
Short form:
[s.pop(k) for k in list(s.keys()) if k not in keep]
As most of the answers suggest, to stay concise we have to create a duplicate object, be it a list or a dict. This one creates a throw-away list but deletes the keys in the original dict.
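If the throw-away list is a concern, a plain loop does the same in-place pruning without collecting the popped values. A sketch with made-up sample data:

```python
s = {"a": 1, "b": 2, "c": 3, "d": 4}
keep = {"a", "d"}

# Iterate over a snapshot of the keys, because deleting from a dict while
# iterating over it directly raises RuntimeError.
for k in list(s):
    if k not in keep:
        del s[k]

print(s)  # {'a': 1, 'd': 4}
```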
Upvotes: 3
Reputation: 529
Code 1:
dict = { key: key * 10 for key in range(0, 100) }
d1 = {}
for key, value in dict.items():
    if key % 2 == 0:
        d1[key] = value
Code 2:
dict = { key: key * 10 for key in range(0, 100) }
d2 = {key: value for key, value in dict.items() if key % 2 == 0}
Code 3:
dict = { key: key * 10 for key in range(0, 100) }
d3 = { key: dict[key] for key in dict.keys() if key % 2 == 0}
The performance of each piece of code was measured with timeit using number=1000, and collected 1000 times per piece of code.
For Python 3.6, the performance of the three ways of filtering dict keys is almost the same. For Python 2.7, code 3 is slightly faster.
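A sketch of how such a measurement could be set up with timeit. This is not the author's exact harness: the repeat count is scaled down to 5 and the dict is named data (instead of dict) so the builtin is not shadowed; both are assumptions for illustration.

```python
import timeit

# Shared setup: the same 100-entry dict used in the three snippets.
setup = "data = {key: key * 10 for key in range(0, 100)}"

code_1 = """\
d1 = {}
for key, value in data.items():
    if key % 2 == 0:
        d1[key] = value
"""
code_2 = "d2 = {key: value for key, value in data.items() if key % 2 == 0}"
code_3 = "d3 = {key: data[key] for key in data.keys() if key % 2 == 0}"

results = {}
for label, stmt in [("Code 1", code_1), ("Code 2", code_2), ("Code 3", code_3)]:
    # Take the best of several runs to reduce noise from other processes.
    results[label] = min(timeit.repeat(stmt, setup=setup, number=1000, repeat=5))
    print(label, results[label])
```

The absolute numbers depend on the machine and Python version, so only the relative ordering is meaningful.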
Upvotes: 27
Reputation: 3055
You can do that with the project function from my funcy library:
from funcy import project
small_dict = project(big_dict, keys)
Also take a look at select_keys.
Upvotes: 34
Reputation: 4302
This function will do the trick:
def include_keys(dictionary, keys):
    """Filters a dict by only including certain keys."""
    key_set = set(keys) & set(dictionary.keys())
    return {key: dictionary[key] for key in key_set}
Just like delnan's version, this one uses dictionary comprehension and has stable performance for large dictionaries (dependent only on the number of keys you permit, and not the total number of keys in the dictionary).
And just like MyGGan's version, this one allows your list of keys to include keys that may not exist in the dictionary.
And as a bonus, here's the inverse, where you can create a dictionary by excluding certain keys in the original:
def exclude_keys(dictionary, keys):
    """Filters a dict by excluding certain keys."""
    key_set = set(dictionary.keys()) - set(keys)
    return {key: dictionary[key] for key in key_set}
Note that unlike delnan's version, the operation is not done in place, so the performance is related to the number of keys in the dictionary. However, the advantage of this is that the function will not modify the dictionary provided.
Edit: Added a separate function for excluding certain keys from a dict.
Upvotes: 12
Reputation: 2917
Slightly more elegant dict comprehension:
foodict = {k: v for k, v in mydict.items() if k.startswith('foo')}
Upvotes: 276
Reputation: 1806
Based on the accepted answer by delnan.
What if one of your wanted keys isn't in the old_dict? The delnan solution will throw a KeyError exception that you can catch. If that's not what you need, maybe you want to:
only include keys that exist both in the old_dict and in your set of wanted_keys.
old_dict = {'name':"Foobar", 'baz':42}
wanted_keys = ['name', 'age']
new_dict = {k: old_dict[k] for k in set(wanted_keys) & set(old_dict.keys())}
>>> new_dict
{'name': 'Foobar'}
have a default value for keys that are not set in old_dict.
default = None
new_dict = {k: old_dict[k] if k in old_dict else default for k in wanted_keys}
>>> new_dict
{'age': None, 'name': 'Foobar'}
Upvotes: 12
Reputation: 5330
Given your original dictionary orig and the set of entries that you're interested in, keys:
filtered = dict(zip(keys, [orig[k] for k in keys]))
which isn't as nice as delnan's answer, but should work in every Python version of interest. It is, however, fragile: it requires each element of keys to exist in your original dictionary.
Upvotes: 18
Reputation: 4328
Here's an example in python 2.6:
>>> a = {1:1, 2:2, 3:3}
>>> dict((key,value) for key, value in a.iteritems() if key == 1)
{1: 1}
The filtering part is the if statement.
This method is slower than delnan's answer if you only want to select a few of very many keys.
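dict.iteritems() was removed in Python 3. A sketch of the equivalent there, using items() and a dict comprehension with the same sample data:

```python
a = {1: 1, 2: 2, 3: 3}

# items() returns a view in Python 3, so no intermediate list is built.
filtered = {key: value for key, value in a.items() if key == 1}
print(filtered)  # {1: 1}
```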
Upvotes: 76