oliverbj

Reputation: 6052

Python3 - Nested dict to JSON

I am trying to convert multiple .txt files to "table-like" data (with columns and rows). Each .txt file should be treated as a new column.

Consider the following contents of the .txt files:

File1.txt

Hi there
How are you doing?

What is your name?

File2.txt

Hi
Great!

Oliver, what's yours?

I have created a simple method that accepts the file and an integer (the file number, supplied by another method):

def txtFileToJson(text_file, column):

    data = defaultdict(list)

    i = int(1)
    with open(text_file) as f:

        data[column].append(column)

        for line in f:
            i = i + 1
            for line in re.split(r'[\n\r]+', line):
                data[column] = line

    with open("output.txt", 'a+') as f:
        f.write(json.dumps(data))

The method above runs twice, once for each file, and appends the data each time.

This is the output.txt file after I have run my script:

{"1": "What is your name?"}{"2": "Oliver, what's yours?"}

As you can see, I can only get it to create a new key for each file I have, holding a single line from that file. The output I am after looks like this:

[{
   "1": [{
       "1": "Hi there",
       "2": "How are you doing?",
       "3": "\n"
       "4": "What is your name?"
   },
   "2": [{
       "1": "Hi"
       "2": "Great!",
       "3": "\n",
       "4": "Oliver, what's yours?"
   },
}]

Update:

OK, so I played around a bit and got a bit closer:

myDict = {str(column): []}
i = int(1)
with open(text_file) as f:
    for line in f:
        # data[column].append(column)
        match = re.split(r'[\n\r]+', line)
        if match:
            myDict[str(column)].append({str(i): line})
            i = i + 1

with open(out_file, 'a+') as f:
    f.write(json.dumps(myDict[str(column)]))

That gives me the output below:

[{"1": "Hi there\n"}, {"2": "How are you doing?\n"}, {"3": "\n"}, {"4": "What is your name?"}]
[{"1": "Hi\n"}, {"2": "Great!\n"}, {"3": "\n"}, {"4": "Oliver, what's yours?"}]

But as you can see, now I have multiple JSON root elements.
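
To illustrate why this is a problem: appending one json.dumps result per file leaves several JSON documents back to back, and a standard parser rejects that. A minimal sketch, using shortened versions of the two output lines above as literal strings (the variable names are only for illustration):

```python
import json

# The two arrays written by successive calls, as they end up in the file.
first = '[{"1": "Hi there\\n"}, {"2": "How are you doing?\\n"}]'
second = '[{"1": "Hi\\n"}, {"2": "Great!\\n"}]'
appended = first + "\n" + second

try:
    json.loads(appended)
    parsed_ok = True
except json.JSONDecodeError:
    # The parser finishes the first array and then finds extra data.
    parsed_ok = False

# Collecting both arrays under one root object gives a single valid document.
combined = json.dumps({"1": json.loads(first), "2": json.loads(second)})
root = json.loads(combined)
```

Here parsed_ok ends up False, while the combined string loads without error.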

Solution

Thanks to jonyfries, I did this:

data = defaultdict(list)

for path in images.values():
    column = column + 1

    data[str(column)] = txtFileToJson(path, column)

saveJsonFile(path, data)

And then added a new method to save the final combined list:

def saveJsonFile(text_file, data):

    basename = os.path.splitext(os.path.basename(text_file))
    dir_name = os.path.dirname(text_file) + "/"
    text_file = dir_name + basename[0] + "1.txt"

    out_file = dir_name + 'table_data.txt'

    with open(out_file, 'a+') as f:
        f.write(json.dumps(data))
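
One caveat with the method above: mode 'a+' appends, so running the script a second time would leave two JSON documents in table_data.txt again. A hedged variant that overwrites instead and builds the path with os.path.join (the name save_json_file and the returned path are assumptions for illustration; the table_data.txt filename is taken from the code above):

```python
import json
import os

def save_json_file(text_file, data):
    """Write data as one JSON document next to text_file; return the output path."""
    out_file = os.path.join(os.path.dirname(text_file), "table_data.txt")
    # 'w' truncates any previous run's output, so the file keeps a single root.
    with open(out_file, "w") as f:
        json.dump(data, f)
    return out_file
```

os.path.join also handles the case where text_file has no directory part, in which case the output lands in the current working directory.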

Upvotes: 1

Views: 569

Answers (3)

jonyfries

Reputation: 854

You're creating a new dictionary within the function itself, so each time you pass in a text file it creates a new dictionary.

The easiest solution seems to be to return the dictionary the function creates and add it to an existing dictionary.

import re
from collections import defaultdict

def txtFileToJson(text_file, column):
    myDict = {str(column): []}
    i = 1
    with open(text_file) as f:
        for line in f:
            match = re.split(r'[\n\r]+', line)
            if match:
                myDict[str(column)].append({str(i): line})
                i = i + 1

    # Return the result instead of writing it here (out_file is not defined
    # in this function); the caller combines the files and writes once.
    return myDict

data = defaultdict(list)

data["1"] = txtFileToJson(text_file, column)
data["2"] = txtFileToJson(other_text_file, other_column)
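
A minimal sketch of that idea end to end: the reading function only returns its result, and the caller writes the combined dictionary once. The names read_column, combine and save are hypothetical, and stripping the line ending from each line is an assumption about the desired output.

```python
import json

def read_column(text_file):
    """Return one file's lines as {"1": first line, "2": second line, ...}."""
    with open(text_file) as f:
        return {str(i): line.rstrip("\r\n") for i, line in enumerate(f, start=1)}

def combine(paths):
    """Map each file's 1-based position to its column dictionary."""
    return {str(n): read_column(p) for n, p in enumerate(paths, start=1)}

def save(paths, out_file):
    # Mode 'w' so the file holds exactly one JSON root element.
    with open(out_file, "w") as f:
        json.dump(combine(paths), f)
```

Calling save(['File1.txt', 'File2.txt'], 'output.txt') would then produce a single object with one key per file.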

Upvotes: 1

Booboo

Reputation: 44108

First, if I understand correctly, you are trying to get a dictionary of dictionaries as output. Note, though, that what I take to be your desired output encloses the whole thing within a list. Furthermore, it has unbalanced opening and closing list brackets within the dictionaries. I will ignore both the unbalanced brackets and the enclosing list.

I think you need something like:

#!python3

import json

def processTxtFile(text_file, n, data):
    d = {}
    with open(text_file) as f:
        # Number the lines from 1 and strip the line ending, so each line of
        # the file yields exactly one entry (splitting on newlines would add
        # an extra empty string per line).
        for i, line in enumerate(f, start=1):
            d[str(i)] = line.rstrip('\r\n')
    data[str(n)] = d


data = dict()
processTxtFile('File1.txt', 1, data)
processTxtFile('File2.txt', 2, data)
with open("output.txt", 'wt') as f:
    f.write(json.dumps(data))

If you really need the nested dictionaries to be enclosed within a list, then replace

data[str(n)] = d

with:

data[str(n)] = [d]
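
For comparison, a small sketch of the two shapes, built from in-memory lines rather than from a file (the sample values come from File1.txt in the question; the variable names are illustrative):

```python
import json

lines = ["Hi there", "How are you doing?", "", "What is your name?"]
d = {str(i): line for i, line in enumerate(lines, start=1)}

as_dict = json.dumps({"1": d})    # nested dictionary
as_list = json.dumps({"1": [d]})  # same dictionary wrapped in a one-element list
```

The only difference in the serialized text is the pair of square brackets around the inner object.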

Upvotes: 0

igrinis

Reputation: 13626

import json

def read(text_file):
    data, i = {}, 0
    with open(text_file) as f:
        for line in f:
            i = i + 1
            data['row_%d'%i] = line.rstrip('\n')
    return data

out_file = 'output.txt'  # assumed output path; the question writes to output.txt
res = {}
for i, fname in enumerate([r'File1.txt', r'File2.txt']):
    res[i] = read(fname)
with open(out_file, 'w') as f:
    json.dump(res, f)

Upvotes: 0
