Reputation: 3
I have this data (note: please treat it as a plain text file, not as a JSON file):
{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}
I want to take all the values inside the first tstp only and stop when reaching the other tstp.
What I am trying to do is create one file for each tstp, with nb, state, freebk and freebs as the columns of that file.
expected output:
first tstp file:
nb state freebk freebs
901 open 6 14
903 open 2 18
second tstp file:
nb state freebk freebs
901 open 6 14
The output above shows one file per tstp. I want to create a separate file for each tstp in my data, so for the data provided two files will be created (because it contains only two tstp entries).
Remark: please don't treat this data as a JSON file; treat it as a normal text file.
Upvotes: 0
Views: 87
Reputation: 483
The approach below also works when there is whitespace (spaces or line breaks) between the opening brace and "tstp".
I used a regex to reliably find the start of each JSON object and turn the content into valid data. (It also works if the data in your file is not neatly laid out.)
import re
import ast
# Reading Content from Text File
with open("text.txt", "r") as file:
    data = file.read()
# Transforming Data into Json for better value collection
regex = r'{[\s]*"tstp"'
replaced_content = ',{"tstp"'
# replacing starting of every {json} dictionary with ,{json}
data = re.sub(regex, replaced_content, data)
data = "[" + data.strip()[1:] + "]" # removing First unnecessary comma (,)
data = ast.literal_eval(data) # converting string to list of Json
# Preparing data for File
headings_data = "nb state freebk freebs"
for count, entry in enumerate(data, start=1):
    # Replace this line with row = "" if you don't want the tstp value in the file.
    row = "File - {0}\n\n".format(entry["tstp"])
    row += headings_data
    for item in entry["ststates"]:
        row += "\n{0} {1} {2} {3}".format(
            item["nb"], item["state"], item["freebk"], item["freebs"])
    # Write a separate file for each tstp
    filename = "file-{0}.txt".format(count)
    with open(filename, "w") as file:
        file.write(row)
Output:
File 1
File - 1383173780727
nb state freebk freebs
901 open 6 14
903 open 2 18
File 2
File - 1383173852184
nb state freebk freebs
901 open 6 14
Note: We cannot rely on replacing "}{" in every situation. In your data, the closing and opening braces may be on different lines.
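If each record is syntactically valid JSON (as the sample is), one way to avoid depending on how the objects are separated is to let the standard library find the boundaries. The sketch below swaps in json.JSONDecoder.raw_decode in place of the regex plus ast.literal_eval approach above: it parses one object starting at a given offset and reports where that object ended. The helper name iter_records and the sample string are only illustrative.

```python
import json

def iter_records(text):
    """Yield each top-level JSON object found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the parsed object and the index just past its end
        record, pos = decoder.raw_decode(text, pos)
        yield record

# Sample from the question, with a line break and extra spaces added between objects
sample = (
    '{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},'
    '{"nb":903,"state":"open","freebk":2,"freebs":18}]}\n'
    '  { "tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}'
)
records = list(iter_records(sample))
print([r["tstp"] for r in records])
```

Because the decoder tracks the object boundaries itself, this does not care whether the braces touch, sit on different lines, or whether "}{" happens to appear inside a string value.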
Upvotes: 1
Reputation: 169407
Well, it looks like }{ is a nice separator for the entries, so let's (ab)use that fact. Better formatting of the output is left as an exercise to the reader.
import ast
# (0) could be read with f.read()
data = """{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},{"nb":903,"state":"open","freebk":2,"freebs":18}]}{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}"""
# (1) split data by `}{`
entries = data.replace("}{", "}\n{").splitlines()
# (2) read each entry (since we were told it's not JSON,
# don't use JSON but ast.literal_eval, but the effect is the same)
entries = [ast.literal_eval(ent) for ent in entries]
# (3) print out some ststates!
for ent in entries:
    print("nb\tstate\tfreebk\tfreebs")
    for ststate in ent.get("ststates", []):
        print("{nb}\t{state}\t{freebk}\t{freebs}".format_map(ststate))
    print("---")
The output is
nb state freebk freebs
901 open 6 14
903 open 2 18
---
nb state freebk freebs
901 open 6 14
---
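To get from the printed tables to the one-file-per-tstp layout the question asks for, a minimal sketch could write each entry to its own file. The function name write_tstp_files, the tstp-<value>.txt naming scheme, the space-separated columns and the temporary output directory are all assumptions; the question does not specify any of them.

```python
import ast
import os
import tempfile

COLUMNS = ("nb", "state", "freebk", "freebs")

def write_tstp_files(data, out_dir):
    """Write one space-separated table per tstp entry; return the file paths."""
    # Same (ab)use of "}{" as a separator as in the code above
    entries = [ast.literal_eval(chunk)
               for chunk in data.replace("}{", "}\n{").splitlines()]
    paths = []
    for ent in entries:
        lines = [" ".join(COLUMNS)]
        for ststate in ent.get("ststates", []):
            lines.append(" ".join(str(ststate[col]) for col in COLUMNS))
        # Hypothetical naming scheme: one file named after the tstp value
        path = os.path.join(out_dir, "tstp-{0}.txt".format(ent["tstp"]))
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
        paths.append(path)
    return paths

data = (
    '{"tstp":1383173780727,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14},'
    '{"nb":903,"state":"open","freebk":2,"freebs":18}]}'
    '{"tstp":1383173852184,"ststates":[{"nb":901,"state":"open","freebk":6,"freebs":14}]}'
)
out_dir = tempfile.mkdtemp()  # assumption: write into a throwaway directory
paths = write_tstp_files(data, out_dir)
for path in paths:
    print(os.path.basename(path))
```

Naming the files after the tstp value keeps them distinguishable without a separate index; it shares the limitation of the code above that the entries must be separated by a literal }{ with nothing in between.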
Upvotes: 1