alinaz

Reputation: 149

Counter for occurrences of element of one dictionary in another dictionary in python

I have a list of nested dictionaries that all have another list of dictionaries inside that I am interested in. That is, if I have:

list_of_dicts[0]['data_i_need']

it contains the following list of dictionaries:

[
  {
    'key1': ['item1', 'item2', 'item3'],
    'details': 'details'
  },
  {
    'key2': ['item2', 'item4'],
    'details': 'other details'
  }
]

I have another large dictionary (d2) that has the following structure:

{
  'item1': {
    'details': ['detail1', 'detail2',],
    'classes': ['class2'],
  },
  'item2': {
    'details': ['detail1'],
    'classes': ['class1', 'class2'],
  },
}

I would like to add another dictionary to every dictionary in list_of_dicts that would be the following:

{'class1': 2, 'class2': 3}

That is, for every item in list_of_dicts[0]['data_i_need'] that belongs to a class in d2, I need to count one occurrence for that class. I want to do this for every dictionary in list_of_dicts.

I have tried many things, among which is something like the below, but I feel stuck now.

from collections import Counter

for l in list_of_dicts:
    for d in l['data_i_need']:
        Counter(d2[d]["classes"])

Upvotes: 0

Views: 140

Answers (1)

Ivan Vučica

Reputation: 9679

You should always define the full inputs and full outputs for code you want to implement. This question doesn't specify the exact input and the exact output you want; it only explains them vaguely, which makes the question hard to process as posed. Provide semantic value and provide exact values: "classes" means nothing without context, "item1" and "key1" mean nothing, and "details" is also meaning-free as-is. Specifics make the question easier to read and let readers help beyond just "how do I implement this puzzle": you want good answers and solutions, not answers to a puzzle.

If I understand correctly, you want to iterate over all the dictionaries in the first list and update each one based on data from the second dictionary. Let's separate that out by passing both to a new function. This works because Python passes the dict object itself, not a copy, so changes the function makes to it are visible to the caller:

for d in list_of_dicts:
  add_class_counts(d, itemInfo)

(I'm renaming d2 into itemInfo because it has at least a bit more semantic information.)
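To make the point about passing dicts concrete, here is a minimal sketch. The helper names add_marker and rebind are made up for illustration and are not part of the solution; they only show that mutating the passed dict is visible to the caller, while rebinding the local name is not.

```python
def add_marker(d):
    # Mutates the dict object the caller passed in; nothing is copied.
    d['class_counts'] = {}

def rebind(d):
    # Rebinding the local name does NOT affect the caller's dict.
    d = {'replaced': True}

list_of_dicts = [{'data_i_need': []}, {'data_i_need': []}]

for d in list_of_dicts:
    add_marker(d)

rebind(list_of_dicts[0])

# Every dict in the list now carries 'class_counts'; none was replaced.
print(list_of_dicts)
```

This is why the loop above can call a function for its side effect and ignore the return value.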

I'm assuming d is going to be a single dictionary consisting of:

{
  'data_i_need': [
    {
      'key1': ['item1', 'item2', 'item3'],
      'details': 'some irrelevant details',
    },
    {
       'key2': [...],
       # more?
    }
  ]
}

I am confused that you have key1 and key2. Shall we assume they just hold items? It would be cleaner to stop using key1 and key2 as keys and store each of them as a value under a fixed key name, e.g.:

{
  'name': 'key1',
  'items': ['item1', 'item2'],
  'details': 'some irrelevant details'
}

The problem is, if the input is not structured like this, how do you know which keys are these magical names like key1 and key2? You need an allowlist of the known constant keys, such as details, which you ignore; then you guess that the single remaining key is the key1-style name:

def name_for_datadict(d):
  # note: untested code.
  # keys that are known to be constant and never hold the item list
  allow_list = ['details']
  unknown_keys = [k for k in d if k not in allow_list]
  if len(unknown_keys) == 1:
    return unknown_keys[0]
  # if it's 0 or greater than 1, we couldn't guess, give up
  raise ValueError('invalid data: ' + str(d))

This is ugly and will (intentionally) break if you have more than one 'non-constant' key.
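As a rough sketch of how that guess could be used to reshape an entry into the name/items form suggested earlier: normalize_entry and KNOWN_KEYS are hypothetical names, and the assumption that 'details' is the only constant key comes from the sample data, not from anything the question guarantees.

```python
KNOWN_KEYS = {'details'}  # assumption: the only constant key in each entry

def normalize_entry(entry):
    """Turn {'key1': [...], 'details': ...} into {'name', 'items', 'details'}."""
    unknown = [k for k in entry if k not in KNOWN_KEYS]
    if len(unknown) != 1:
        # Zero or several candidate keys: refuse to guess.
        raise ValueError('cannot guess the items key in: %r' % (entry,))
    name = unknown[0]
    return {
        'name': name,
        'items': entry[name],
        'details': entry.get('details'),
    }

entry = {'key1': ['item1', 'item2', 'item3'], 'details': 'details'}
normalized = normalize_entry(entry)
print(normalized)
```

Doing this once up front means the rest of the code can rely on d['items'] instead of guessing at every step.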

Let's now try to understand what add_class_counts() should do. It seems you want to count how many items in the list belong to each class.

Let's see; add_class_counts() needs to go through every item in the passed dict:

def add_class_counts(d, itemInfo):
  # expected input:
  # d = {
  #   'name': 'key1',
  #   'items': ['item1', 'item2'],
  # }

  class_counts = {}
  for itm in d['items']:
    class_list = classes_for_item(itm, itemInfo)
    # TODO: count it somehow

How do we know which classes an item is part of? It's obvious, really:

def classes_for_item(itmName, itemInfo):
  # for 'item2' this should return ['class1', 'class2']
  return itemInfo[itmName]['classes']
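One thing to watch: the sample data mentions item3 and item4, which do not appear in the part of d2 that was shown, so a plain lookup would raise KeyError for them. A tolerant variant might look like the sketch below. The name classes_for_item_or_empty is made up, and treating unknown items as having no classes is an assumption about what you want.

```python
def classes_for_item_or_empty(item_name, item_info):
    # Hypothetical variant: unknown items contribute no classes
    # instead of raising KeyError.
    return item_info.get(item_name, {}).get('classes', [])

item_info = {
    'item1': {'details': ['detail1', 'detail2'], 'classes': ['class2']},
    'item2': {'details': ['detail1'], 'classes': ['class1', 'class2']},
}

print(classes_for_item_or_empty('item2', item_info))  # ['class1', 'class2']
print(classes_for_item_or_empty('item3', item_info))  # []
```

If a missing item should be treated as a data error instead, keep the strict lookup and let the KeyError surface.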

How do we count this?

def add_class_counts(d, itemInfo):
  class_counts = {}
  for itm in d['items']:
    class_list = classes_for_item(itm, itemInfo)
    for class_name in class_list:
      try:
        class_counts[class_name] += 1
      except KeyError:
        # first time we see this class, so start its count at 1
        class_counts[class_name] = 1

  # we've finished counting.
  # update the existing dictionary with the counts.
  d['class_counts'] = class_counts

Note: the code is untested, and it's doing weird things (maybe you want to return class_counts and then update d?). But it might give you a basic idea.
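Since the question already reaches for Counter, here is one possible end-to-end sketch using collections.Counter on the question's original data shape. It rests on several assumptions: every key other than 'details' holds a list of item names, items missing from the lookup dict are skipped, and the counts are taken over all entries in 'data_i_need' together. The names count_classes and KNOWN_KEYS are made up for illustration. Unlike name_for_datadict above, it does not give up when an entry has more than one unknown key.

```python
from collections import Counter

KNOWN_KEYS = {'details'}  # assumption: keys that never hold item lists

def count_classes(record, item_info):
    """Return a Counter of classes over every item in record['data_i_need']."""
    counts = Counter()
    for entry in record['data_i_need']:
        for key, items in entry.items():
            if key in KNOWN_KEYS:
                continue
            for item_name in items:
                # Items missing from item_info are skipped (assumption).
                counts.update(item_info.get(item_name, {}).get('classes', []))
    return counts

item_info = {
    'item1': {'details': ['detail1', 'detail2'], 'classes': ['class2']},
    'item2': {'details': ['detail1'], 'classes': ['class1', 'class2']},
}

list_of_dicts = [
    {
        'data_i_need': [
            {'key1': ['item1', 'item2', 'item3'], 'details': 'details'},
            {'key2': ['item2', 'item4'], 'details': 'other details'},
        ]
    }
]

for record in list_of_dicts:
    record['class_counts'] = dict(count_classes(record, item_info))

print(list_of_dicts[0]['class_counts'])  # {'class2': 3, 'class1': 2}
```

With the sample data this yields the same counts as the {'class1': 2, 'class2': 3} asked for in the question, which suggests (but does not prove) that this is the intended counting rule.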

Of course, if it's not implementing what you expected, you will really want to write a much more concrete, semantically-rich explanation of what the inputs are and what you want to receive as an output. Anyone that's genuinely trying to help will want to understand why you're trying to do what you're trying to do.

Upvotes: 1
