dani_anyman

Reputation: 133

print multidimensional dictionary with repeating key

I am new to Python and currently using Python 2. I built a multidimensional dictionary that looks like this:

targets = {'house': {'N': {'red': {'A': 1}, 'garden': {'N': 6}}},
           'great': {'A': {'very': {'Adv': 12}, 'so': {'Adv': 5}, 'a': {'Det': 3}}}}
etc.

Basically there are always 4 nested dictionaries, but the 'third' dictionary ({red: {}, horse: {}}, etc.) can hold an arbitrary number of items, so the number of entries varies.

Now I would like to write the dictionary to a file, preferably a CSV file. The output file should list all entries tab-separated, with each line starting with the outermost key. For example:

house    N    red      A    1
house    N    garden   N    6
great    A    very     Adv  12
great    A    so       Adv  5
great    A    a        Det  3

I know there are a lot of posts about printing multidimensional dictionaries, but I have not yet found one where the outermost key is printed on every iteration. I tried to adapt code snippets from other questions about multidimensional dictionaries, but that has not worked so far.

I just managed to write the dictionary into a normal .txt-file in the dictionary format with this for loop:

for target in targets_dict:
    results.write(str(target) + str(targets_dict[str(target)]) + '\n')

or write it to a csv-file using csvwriter (I know there is also DictWriter, I just could not get it to work properly):

w = csv.writer(results, delimiter = '\t')
for target in targets_dict.iteritems():
    w.writerow(target)

Obviously, this is pretty basic and the iteration does not enter the inner dictionaries.

Trying a modified version of a solution posted to a related problem (recursively traverse multidimensional dictionary, dimension unknown) always results in an 'expected a character buffer object' error.

for k,v in sorted(targets_dict.items(),key=lambda x: x[0]):
    if isinstance(v, dict):
        results.write(" ") + ("%s %s") % (k, v)

Every suggestion or hint is appreciated to help me understand the logic behind all this, so that I am able to figure it out.

Upvotes: 2

Views: 852

Answers (3)

Eugene

Reputation: 1639

Here is a simple solution. The idea is to loop through the dict, collect each path of keys plus the final value into a list, and write that list as one row of the TSV file. This works only because you know the nesting depth (4, which seems OK). The code below is not optimized for speed and doesn't check that keys exist anywhere, but hopefully you get the idea.

import csv
targets = {'house': {'N': {'red': {'A':1}, 'garden': {'N': 6}}},
           'great': {'A': {'very': {'Adv':12}, 'so': {'Adv': 5}, 'a': {'Det': 3}}}}
# newline='' lets the csv module control line endings itself (Python 3)
with open('targets.tsv', 'w', newline='') as tsvfile:
    writer = csv.writer(tsvfile, delimiter='\t')
    for t in targets:
        for u in targets[t]:
            for v in targets[t][u]:
                for w in targets[t][u][v]:
                    # One row per innermost entry: four keys, then the value
                    #print([t, u, v, w, targets[t][u][v][w]])
                    writer.writerow([t, u, v, w, targets[t][u][v][w]])

With the commented-out print enabled, this prints:

['house', 'N', 'red', 'A', 1]
['house', 'N', 'garden', 'N', 6]
['great', 'A', 'very', 'Adv', 12]
['great', 'A', 'so', 'Adv', 5]
['great', 'A', 'a', 'Det', 3]

And also creates the tsv file:

house   N   red A   1
house   N   garden  N   6
great   A   very    Adv 12
great   A   so  Adv 5
great   A   a   Det 3

EDIT: updated code according to comment in OP (the keys in the outermost dictionary are unique and should be treated as keys to targets).

Upvotes: 1

niemmi

Reputation: 17263

Recursion is indeed the solution to the problem. You can define a generator function that recursively traverses the dictionary while building up a path of the keys encountered. When you reach a non-dict item, just yield whatever has been added to the path and write that to the CSV file:

import csv

targets = {
    'house': {'N': {'red': {'A':1}, 'garden': {'N': 6}}},
    'great': {'A': {'very': {'Adv':12}, 'so': {'Adv': 5}, 'a': {'Det': 3}}}
}

def get_rows(o, path=None):
    if path is None:
        path = []

    # Base case, add object to path and yield it
    if not isinstance(o, dict):
        path.append(o)
        yield path
        path.pop()
        return

    for k, v in o.items():
        path.append(k)
        yield from get_rows(v, path)
        path.pop()

with open('result.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    for row in get_rows(targets):
        writer.writerow(row)

Output:

great   A   a   Det 3
great   A   so  Adv 5
great   A   very    Adv 12
house   N   red A   1
house   N   garden  N   6

Note that your output may be in a different order, since dicts are unordered (before Python 3.7). The solution above works with nested dictionaries of any depth. If you're using Python 2, the code needs to be tweaked a bit, since Python 2 doesn't have yield from.
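
For reference, the Python 2 tweak could look roughly like the sketch below. It replaces yield from with an explicit loop that re-yields each row. As a further change that is not part of the code above, it yields a copy of the path rather than the shared list, so the rows can also be collected with list(). The variable name rows is only for illustration.

```python
def get_rows(o, path=None):
    """Yield one list per leaf: the keys leading to it, then the leaf value."""
    if path is None:
        path = []

    # Base case: a non-dict value ends the current path.
    if not isinstance(o, dict):
        # Yield a new list so the caller can keep the row safely.
        yield path + [o]
        return

    for k, v in o.items():
        path.append(k)
        # No "yield from" in Python 2, so re-yield each row explicitly.
        for row in get_rows(v, path):
            yield row
        path.pop()


targets = {
    'house': {'N': {'red': {'A': 1}, 'garden': {'N': 6}}},
    'great': {'A': {'very': {'Adv': 12}, 'so': {'Adv': 5}, 'a': {'Det': 3}}},
}

# Collect all rows; each one is [key1, key2, key3, key4, value].
rows = list(get_rows(targets))
```

The rows can then be passed to csv.writer exactly as before. In Python 2 the output file would be opened with 'wb' instead of 'w' with newline='', since open() has no newline argument there.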

Upvotes: 2

MoRe

Reputation: 1538

Since the depth is fixed, it is simple to just nest for loops over all the dicts (Python 2):

import csv

targets = {'house': {'N': {'red': {'A':1}, 'garden': {'N': 6}}}, 'great': {'A': {'very': {'Adv':12}, 'so': {'Adv': 5}, 'a': {'Det': 3}}}}

with open('file.csv', 'wb') as csvfile:
  csvwriter = csv.writer(csvfile, delimiter='\t')
  for k,v in targets.iteritems():
    for k2,v2 in v.iteritems():
      for k3,v3 in v2.iteritems():
        for k4,v4 in v3.iteritems():
          csvwriter.writerow([str(k), str(k2), str(k3), str(k4), str(v4)])
          #print(str(k) + "\t" + str(k2) + "\t" + str(k3) + "\t" + str(k4) + "\t" + str(v4))

This outputs exactly what you want.

Upvotes: 1
