Reputation: 55
I want to find the most efficient way to iterate over the values stored under each key of a dictionary in Python.
I have a file with this structure:
17 key1
18 key1
45 key2
78 key2
87 key2
900 key3
92 key4
So I need to use the second column as the key (with no repetition) and link to each key all the values (first column) that correspond to it:
'key1':['17','18']
'key2':['45','78','87']
'key3':['900']
'key4':['92']
Up to now I read it without using a dictionary:
for line in file:
    value, key = line.strip().split(None, 1)
And then, inside the loop, I can put it into the dictionary with
    diction.setdefault(key, []).append(value)
so after that I have the dictionary I need.
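As a minimal sketch, the two steps above can be combined into one helper. The name `build_dict` and the sample lines are hypothetical, and `collections.defaultdict` is swapped in for `setdefault`. The helper accepts any iterable of lines, so an open file object should work as well.

```python
from collections import defaultdict

def build_dict(lines):
    """Group first-column values under their second-column key."""
    result = defaultdict(list)
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        value, key = parts
        result[key].append(value)
    return dict(result)

# Hypothetical sample data mirroring the file shown above
sample = ["17 key1", "18 key1", "45 key2", "78 key2", "87 key2", "900 key3", "92 key4"]
diction = build_dict(sample)
```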
But after that I have to re-read the file to pick up changes. Changes can occur in keys (a whole key/value pair is added or removed) or only in values (a value is added to or removed from an existing key). How can I check whether a change occurred by iterating over the keys and their values?
UPD***: for keys the check is more or less clear:
if key in diction:
But how do I iterate over the values inside a key? I need to find the difference and then add or remove the value (or the whole pair, if it was the last value of the key) in the dictionary.
I suppose it can be done with iteritems() / itervalues() or something similar, but I'm not familiar with them.
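To make the question concrete, here is a minimal sketch of what iterating over the values inside each key could look like, assuming both the old dictionary and the freshly re-read one are available. The names `old`, `new` and `changes` and the sample data are hypothetical. `items()` is used because it works in both Python 2 and Python 3, whereas `iteritems()` exists only in Python 2.

```python
# Hypothetical sample data: the dictionary before and after re-reading the file
old = {'key1': ['17', '18'], 'key2': ['45', '78', '87']}
new = {'key1': ['17'], 'key2': ['45', '78', '87', '99']}

changes = []
for key, old_values in old.items():
    # A key missing from `new` is treated as having no values left
    new_values = new.get(key, [])
    for value in old_values:
        if value not in new_values:
            changes.append(('removed', key, value))
    for value in new_values:
        if value not in old_values:
            changes.append(('added', key, value))
```

This only walks the keys of `old`, so keys that exist only in `new` would still need a separate check.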
Thank you for help.
UPD***
Thank you @Joël. In the end I used 3 checks. The first two are whether any keys were added or removed:
set_old_dict = set(old_dict.keys())
set_new_dict = set(new_dict.keys())
intersect = set_new_dict.intersection(set_old_dict)

def added():
    # keys present only in the new dictionary
    return set_new_dict - intersect

def removed():
    # keys present only in the old dictionary
    return set_old_dict - intersect
And then, if neither of these cases applies or they have already been processed, I use your function:
def comp(old_dict, new_dict):
    for key, old_val in old_dict.items():
        new_val = new_dict[key]
        print 'evolutions for', key
        print 'new content:', [x for x in new_val if x not in old_val]
        print 'removed content:', [x for x in old_val if x not in new_val]
Upvotes: 1
Views: 4358
Reputation: 2822
My advice is that, if you have to re-read the input file anyway, you may as well re-create your dictionary, though that depends on the time needed to build it. As you suggest, it may be quicker to analyze the differences in the file and update the dictionary accordingly.
You can have a look at the difflib module and analyze the differences it reports. Based on this, removed entries can be deleted from the dictionary and additions added as necessary.
Sadly, I bet you'll have a hard time with its output: this is meant to be human-readable, not machine-readable, so there may be a better answer.
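That said, for line-oriented input like this one, the two-character markers that `difflib.ndiff` puts in front of each line can be read programmatically. The following is only a sketch under that assumption; `old_lines` and `new_lines` are hypothetical sample data standing in for the two versions of the file.

```python
import difflib

# Hypothetical sample data: the file content before and after the change
old_lines = ["17 key1", "18 key1", "45 key2"]
new_lines = ["17 key1", "45 key2", "46 key2"]

added, removed = [], []
for entry in difflib.ndiff(old_lines, new_lines):
    tag, text = entry[:2], entry[2:]
    if tag == '+ ':
        added.append(text)      # line only in the new version
    elif tag == '- ':
        removed.append(text)    # line only in the old version
    # '  ' marks unchanged lines and '? ' marks hint lines; both are ignored
```

Each collected line can then be split into value and key the same way as when the dictionary was first built.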
EDIT: if you want to keep track of the changes between two versions of the file, as written in your comment, you can compare the dictionaries. For the keys, you already have what is needed.
Now, for updated values: if you are sure that your values will always be lists of strings, then you can do quite the same thing as for comparing the dict keys:
>>> def comp(old_dict, new_dict):
...     for key, old_val in old_dict.items():
...         new_val = new_dict[key]  # warning: to be used on keys in both dict
...         print 'evolutions for', key
...         print 'new content:', [x for x in new_val if x not in old_val]
...         print 'removed content:', [x for x in old_val if x not in new_val]
# now testing on a simple example
>>> o = {'key1': ['a', 'b', 'c']}
>>> n = {'key1': ['b', 'c', 'd']}
>>> comp(o, n)
evolutions for key1
new content: ['d']
removed content: ['a']
Warning: this function works only if new_dict contains all keys of old_dict; otherwise the lookup that creates new_val will fail. You can easily work around this by adding the comparison of keys to the function:
keys in old_dict that are not in new_dict are removed entries;
keys in new_dict that are not in old_dict are additions.
Please publish your result in your question, so that others may benefit from it.
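A minimal sketch of how the key comparison and the value comparison could be folded into one function. The name `diff_dicts` and its return shape are assumptions made for illustration, and it returns the differences instead of printing them so the caller can apply them to the dictionary.

```python
def diff_dicts(old_dict, new_dict):
    """Return (added_keys, removed_keys, changed).

    `changed` maps each key present in both dictionaries to a
    (new_content, removed_content) pair, only when the values differ.
    """
    old_keys, new_keys = set(old_dict), set(new_dict)
    added_keys = new_keys - old_keys
    removed_keys = old_keys - new_keys
    changed = {}
    # Only keys present in both dictionaries are compared value by value
    for key in old_keys & new_keys:
        old_val, new_val = old_dict[key], new_dict[key]
        new_content = [x for x in new_val if x not in old_val]
        removed_content = [x for x in old_val if x not in new_val]
        if new_content or removed_content:
            changed[key] = (new_content, removed_content)
    return added_keys, removed_keys, changed

# Hypothetical sample data extending the example above
o = {'key1': ['a', 'b', 'c'], 'key2': ['x']}
n = {'key1': ['b', 'c', 'd'], 'key3': ['y']}
added_keys, removed_keys, changed = diff_dicts(o, n)
```

Because the loop runs only over the intersection of the key sets, the lookup in new_dict can no longer fail on a missing key.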
Upvotes: 1