Jazz.Min

Reputation: 55

Iterate over a dictionary's multiple values by key in Python

I am looking for the best way to iterate over the values stored under each key of a Python dictionary.

I have a file with this structure:

17 key1

18 key1

45 key2

78 key2

87 key2

900 key3

92 key4

So I need to use the second column as the key (without repetition) and attach to each key all the values (first column) that correspond to it:

'key1':['17','18']

'key2':['45','78','87']

'key3':['900']

'key4':['92']

Up to now I parse the file without using a dictionary:

for line in file:
    value, key = line.strip().split(None, 1)

And then I can put it into the dictionary with

 diction.setdefault(key, []).append(value)

so after that I have a nice dictionary as I needed.
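
Putting the two fragments above together, a minimal sketch could look like this. It is written in Python 3 syntax; the helper name `build_dict` is made up for illustration, and it assumes that blank lines in the file can simply be skipped.

```python
def build_dict(lines):
    """Group first-column values under their second-column key."""
    diction = {}
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no data
        value, key = line.strip().split(None, 1)
        diction.setdefault(key, []).append(value)
    return diction


sample = ["17 key1", "18 key1", "45 key2", "78 key2", "87 key2", "900 key3", "92 key4"]
diction = build_dict(sample)
```

With a real file, the same function can be fed the open file object, since iterating over it yields lines.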

But afterwards I have to re-read the file to pick up changes. Changes can occur in keys (whole pairs added or removed) or only in values (added or removed). How can I detect a change by iterating over the keys and their values?

UPD***: checking for keys is more or less clear:

if key in diction:

But how do I iterate over the values stored under a key? I need to find the difference and then add or remove that value from the dictionary (or the whole pair, if it was the key's last value).

I suppose it can be done with iteritems() or itervalues() or something similar, but I am not familiar with those.
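
For reference, a minimal sketch of walking every value under every key. In Python 2 the lazy variants are `iteritems()` and `itervalues()`; in Python 3, plain `items()` and `values()` play that role, and that is what the sketch uses. The helper name `iter_pairs` is hypothetical.

```python
def iter_pairs(diction):
    """Yield (key, value) for every value stored in each key's list."""
    for key, values in diction.items():
        for value in values:
            yield key, value


diction = {'key1': ['17', '18'], 'key3': ['900']}
for key, value in iter_pairs(diction):
    print(key, value)
```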

Thank you for help.

UPD***

Thank you @Joël. In the end I used three checks. The first two find keys that were added or removed:

set_old_dict = set(new_old.keys())
set_new_dict = set(new_dict.keys())
intersect = set_new_dict.intersection(set_old_dict)

def added(self):
    return set_new_dict - intersect

def removed(self):
    return set_old_dict - intersect

Then, once those cases are either absent or already handled, I use your function:

def comp(old_dict, new_dict):
    for key, old_val in old_dict.items():
        new_val = new_dict[key]
        print 'evolutions for', key
        print 'new content:', [x for x in new_val if x not in old_val]
        print 'removed content:', [x for x in old_val if x not in new_val]

Upvotes: 1

Views: 4358

Answers (1)

Joël

Reputation: 2822

My advice: if you have to re-read the input file anyway, you may as well re-create the dictionary, though that depends on how long the dictionary takes to build. As you suggest, it may be quicker to analyze the differences in the file and update the dictionary accordingly.

You can have a look at the difflib module to compute the differences. Based on those, removed lines can be deleted from the dictionary and added lines inserted as necessary.

Sadly, I bet you'll have a hard time with its output: this is meant to be human-readable, not machine-readable, so there may be a better answer.
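
That said, here is a rough sketch of the difflib route, in case it helps. `difflib.ndiff` prefixes every output line with a two-character marker (`'- '` for lines only in the old version, `'+ '` for lines only in the new one, `'  '` for unchanged lines, `'? '` for hint lines), so the markers can be filtered directly. The function name `apply_line_diff` is made up, the code is Python 3, and it assumes each data line has the `value key` layout from the question.

```python
import difflib


def apply_line_diff(diction, old_lines, new_lines):
    """Update diction in place from the line diff of two file versions."""
    for entry in difflib.ndiff(old_lines, new_lines):
        marker, text = entry[:2], entry[2:]
        if marker not in ('- ', '+ ') or not text.strip():
            continue  # skip unchanged lines, '? ' hints and blank lines
        value, key = text.strip().split(None, 1)
        if marker == '- ':
            values = diction.get(key, [])
            if value in values:
                values.remove(value)
            if not values:
                diction.pop(key, None)  # last value gone: drop the key
        else:
            diction.setdefault(key, []).append(value)
    return diction


old_lines = ['17 key1', '18 key1', '45 key2', '900 key3']
new_lines = ['17 key1', '45 key2', '46 key2', '92 key4']
diction = {'key1': ['17', '18'], 'key2': ['45'], 'key3': ['900']}
apply_line_diff(diction, old_lines, new_lines)
```

This only pays off if keeping the previous file contents around is cheaper than rebuilding the dictionary from scratch.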


EDIT: if you want to keep track of the changes between two versions of the file, as written in your comment, you can compare the dictionaries. For the keys, you already have what is needed.

Now, for updated values: if you are sure that your values will always be lists of strings, then you can do much the same thing as for comparing the dict keys:

>>> def comp(old_dict, new_dict):
...     for key, old_val in old_dict.items():
...         new_val = new_dict[key]  # warning: to be used on keys in both dict
...         print 'evolutions for', key
...         print 'new content:', [x for x in new_val if x not in old_val]
...         print 'removed content:', [x for x in old_val if x not in new_val]

# now testing on a simple example
>>> o = {'key1': ['a', 'b', 'c']}
>>> n = {'key1': ['b', 'c', 'd']}
>>> comp(o, n)
evolutions for key1
new content: ['d']
removed content: ['a']

Warning: this function works only if new_dict contains all the keys of old_dict; otherwise the lookup that creates new_val will fail with a KeyError. You can easily work around this by adding the key comparisons to the function:

  • keys in old_dict that are not in new_dict are removed entries;
  • keys in new_dict and not in old_dict are additions.
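
The two points above could be folded into the comparison roughly as follows. This is only a sketch in Python 3: the name `diff_dicts` and the shape of the returned dictionary are my own choices for illustration, and it still assumes the values are lists of strings.

```python
def diff_dicts(old_dict, new_dict):
    """Report added keys, removed keys and per-key value changes."""
    old_keys, new_keys = set(old_dict), set(new_dict)
    changes = {
        'added_keys': sorted(new_keys - old_keys),
        'removed_keys': sorted(old_keys - new_keys),
        'changed': {},
    }
    # Only keys present in both dicts are safe to look up on each side.
    for key in sorted(old_keys & new_keys):
        old_val, new_val = old_dict[key], new_dict[key]
        added = [x for x in new_val if x not in old_val]
        removed = [x for x in old_val if x not in new_val]
        if added or removed:
            changes['changed'][key] = {'added': added, 'removed': removed}
    return changes


o = {'key1': ['a', 'b', 'c'], 'key2': ['x']}
n = {'key1': ['b', 'c', 'd'], 'key3': ['y']}
result = diff_dicts(o, n)
```

Returning the changes instead of printing them lets the caller decide how to update the existing dictionary.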

Please publish your result in your answer, so that others may benefit from it.

Upvotes: 1
