Reputation: 6052
I am trying to convert multiple .txt files to "table-like" data (with columns and rows). Each .txt file should be considered a new column.
Consider the content of the .txt files below:
File1.txt:
    Hi there
    How are you doing?

    What is your name?
File2.txt:
    Hi
    Great!

    Oliver, what's yours?
I have created a simple method that accepts the file and an integer (the file number from another method):
def txtFileToJson(text_file, column):
    data = defaultdict(list)
    i = int(1)
    with open(text_file) as f:
        data[column].append(column)
        for line in f:
            i = i + 1
            for line in re.split(r'[\n\r]+', line):
                data[column] = line
    with open("output.txt", 'a+') as f:
        f.write(json.dumps(data))
So the above method runs twice (once for each file) and appends the data.
This is the output.txt file after I have run my script:
{"1": "What is your name?"}{"2": "Oliver, what's yours?"}
As you can see, I can only get it to create a new key for each file I have, and then add a single entire line to it. My desired output is something like this:
[{
    "1": [{
        "1": "Hi there",
        "2": "How are you doing?",
        "3": "\n"
        "4": "What is your name?"
    },
    "2": [{
        "1": "Hi"
        "2": "Great!",
        "3": "\n",
        "4": "Oliver, what's yours?"
    },
}]
OK, so I played around a bit and got a bit closer:
myDict = {str(column): []}
i = int(1)
with open(text_file) as f:
    for line in f:
        # data[column].append(column)
        match = re.split(r'[\n\r]+', line)
        if match:
            myDict[str(column)].append({str(i): line})
            i = i + 1
with open(out_file, 'a+') as f:
    f.write(json.dumps(myDict[str(column)]))
That gives me the output below:
[{"1": "Hi there\n"}, {"2": "How are you doing?\n"}, {"3": "\n"}, {"4": "What is your name?"}]
[{"1": "Hi\n"}, {"2": "Great!\n"}, {"3": "\n"}, {"4": "Oliver, what's yours?"}]
But as you can see, now I have multiple JSON root elements.
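The multiple roots appear to come from appending one json.dumps result per call to the same file. Here is a minimal sketch of that effect, using in-memory strings instead of the output file; the two sample lists are shortened stand-ins for the output above, and the variable names are only illustrative:

```python
import json

col1 = [{"1": "Hi there\n"}, {"2": "How are you doing?\n"}]
col2 = [{"1": "Hi\n"}, {"2": "Great!\n"}]

# Appending one dump per file (as the 'a+' writes do) concatenates two documents.
appended = json.dumps(col1) + json.dumps(col2)
try:
    json.loads(appended)
    appended_is_valid = True
except json.JSONDecodeError:
    # The parser stops after the first list and reports the rest as extra data.
    appended_is_valid = False

# Collecting everything in one object and dumping once gives a single root.
combined = json.dumps({"1": col1, "2": col2})
combined_is_valid = isinstance(json.loads(combined), dict)

print(appended_is_valid, combined_is_valid)
```

So the fix probably has to happen where the data is collected, not where it is written.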
Thanks to jonyfries, I did this:
data = defaultdict(list)
for path in images.values():
    column = column + 1
    data[str(column)] = txtFileToJson(path, column)
saveJsonFile(path, data)
And then added a new method to save the final combined list:
def saveJsonFile(text_file, data):
    basename = os.path.splitext(os.path.basename(text_file))
    dir_name = os.path.dirname(text_file) + "/"
    text_file = dir_name + basename[0] + "1.txt"
    out_file = dir_name + 'table_data.txt'
    with open(out_file, 'a+') as f:
        f.write(json.dumps(data))
Upvotes: 1
Views: 569
Reputation: 854
You're creating a new dictionary within the function itself, so each time you pass a text file in, it creates a new dictionary.
The easiest solution seems to be to return the dictionary created and add it to an existing dictionary.
def txtFileToJson(text_file, column):
    myDict = {str(column): []}
    i = int(1)
    with open(text_file) as f:
        for line in f:
            # data[column].append(column)
            match = re.split(r'[\n\r]+', line)
            if match:
                myDict[str(column)].append({str(i): line})
                i = i + 1
    with open(out_file, 'a+') as f:
        f.write(json.dumps(myDict[str(column)]))
    return myDict

data = defaultdict(list)
data["1"] = txtFileToJson(text_file, column)
data["2"] = txtFileToJson(other_text_file, other_column)
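As a rough sketch of the same idea taken one step further, the function could return only the list of rows and leave all file writing to the caller, so the output is written once with a single root. The helper names txt_file_to_rows and combine are hypothetical, and the temporary files merely stand in for File1.txt and File2.txt:

```python
import json
import os
import tempfile


def txt_file_to_rows(text_file):
    """Return a list of {row_number: line} dicts for one file (hypothetical helper)."""
    rows = []
    with open(text_file) as f:
        for i, line in enumerate(f, start=1):
            rows.append({str(i): line})
    return rows


def combine(paths):
    """Build one dictionary with one key per file, numbered from 1."""
    return {str(col): txt_file_to_rows(p) for col, p in enumerate(paths, start=1)}


# Demo with throwaway files standing in for File1.txt and File2.txt.
with tempfile.TemporaryDirectory() as tmp:
    samples = {"File1.txt": "Hi there\nHow are you doing?\n",
               "File2.txt": "Hi\nGreat!\n"}
    paths = []
    for name, text in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    data = combine(paths)
    out_file = os.path.join(tmp, "table_data.txt")
    # Mode "w" and a single dump: the file holds exactly one JSON root.
    with open(out_file, "w") as f:
        json.dump(data, f)
    with open(out_file) as f:
        reloaded = json.load(f)
```

Because nothing is written inside the per-file function, running it for more files does not append extra roots to the output.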
Upvotes: 1
Reputation: 44108
First, if I understand correctly, you are trying to get a dictionary of dictionaries as output. Let me observe that what I understand to be your desired output seems to enclose the whole thing within a list. Furthermore, you have unbalanced open and closed list brackets within the dictionaries, which I will ignore, as I will the enclosing list.
I think you need something like:
#!python3
import json
import re

def processTxtFile(text_file, n, data):
    d = {}
    with open(text_file) as f:
        i = 0
        for line in f:
            for line in re.split(r'[\n\r]+', line):
                i = i + 1
                d[str(i)] = line
    data[str(n)] = d

data = dict()
processTxtFile('File1.txt', 1, data)
processTxtFile('File2.txt', 2, data)
with open("output.txt", 'wt') as f:
    f.write(json.dumps(data))
If you really need the nested dictionaries to be enclosed within a list, then replace
data[str(n)] = d
with:
data[str(n)] = [d]
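One detail of the inner loop may be worth checking before relying on the row numbers: a line read from a file still carries its trailing newline, so re.split returns an extra empty string after it and the counter advances more than once per line. A small sketch of that behaviour follows, with str.rstrip shown as one possible alternative (a suggestion, not part of the code above):

```python
import re

line = "Hi there\n"

# Splitting on the newline leaves an empty string after the separator.
parts = re.split(r'[\n\r]+', line)
print(parts)      # ['Hi there', '']

# Stripping the line ending keeps exactly one value per line of the file.
stripped = line.rstrip('\r\n')
print(stripped)   # Hi there
```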
Upvotes: 0
Reputation: 13626
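import json

def read(text_file):
    # Map 'row_1', 'row_2', ... to each line of the file, without its newline.
    data, i = {}, 0
    with open(text_file) as f:
        for line in f:
            i = i + 1
            data['row_%d'%i] = line.rstrip('\n')
    return data

res = {}
for i, fname in enumerate([r'File1.txt', r'File2.txt']):
    res[i] = read(fname)
# out_file is the output path, assumed to be defined elsewhere.
with open(out_file, 'w') as f:
    json.dump(res, f)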
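For reference, here is a self-contained sketch of what this approach produces for the sample files. It assumes the files have a blank third line (as the output in the question suggests), writes them to a temporary directory, and passes start=1 to enumerate so the column keys match the 1-based numbering in the question; that last change is my assumption, not part of the code above:

```python
import json
import os
import tempfile


def read(text_file):
    # Same idea as read() above: one 'row_N' key per line, newline stripped.
    data = {}
    with open(text_file) as f:
        for i, line in enumerate(f, start=1):
            data['row_%d' % i] = line.rstrip('\n')
    return data


with tempfile.TemporaryDirectory() as tmp:
    contents = ["Hi there\nHow are you doing?\n\nWhat is your name?",
                "Hi\nGreat!\n\nOliver, what's yours?"]
    res = {}
    # start=1 makes the column keys 1 and 2 instead of 0 and 1.
    for col, text in enumerate(contents, start=1):
        path = os.path.join(tmp, 'File%d.txt' % col)
        with open(path, 'w') as f:
            f.write(text)
        res[col] = read(path)

# json.dumps turns the integer column keys into the strings "1" and "2".
text_out = json.dumps(res)
print(text_out)
```

The blank line shows up as an empty string under row_3, since rstrip removes the newline that was its only content.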
Upvotes: 0