Reputation: 3
I have a list of JSON objects stored as a text file, one JSON object per line (total size is 30 GB). I'm trying to extract elements from those objects and store them in a new list. Here is my code to do that:
print("Extracting fingerprints...")
start = time.time()
for jsonObj in open('ctl_records_sample.jsonlines'):
    temp_dict = {}
    temp_dict = json.loads(jsonObj)
    finger = temp_dict['data']['leaf_cert']['fingerprint']
    with open("fingerprints.txt", "w") as f:
        f.write(finger+"\n")
    finger = ""
end = time.time()
print("Fingerprint extraction finished in" + str(end-start) +"s")
Basically, I'm trying to go line-by-line of the original file and write that line's "fingerprint" to the new text file. However, after letting the code run for several seconds, I open up fingerprints.txt and see that only one fingerprint has been written to the file. Any idea what could be happening?
Upvotes: 0
Views: 47
Reputation: 2041
You're opening the file on every loop iteration, in write mode, because of the "w" argument passed to the open function. The file is therefore truncated and rewritten from the beginning each time, which is why only one fingerprint remains.
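A minimal sketch of that effect, using a hypothetical temporary file and made-up values rather than the real data: each open(..., "w") truncates the file, so only the last write survives.

```python
import os
import tempfile

# Hypothetical demo path and values, not taken from the question.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

for value in ["aaa", "bbb", "ccc"]:
    # Re-opening in "w" mode truncates the file on every iteration.
    with open(path, "w") as f:
        f.write(value + "\n")

with open(path) as f:
    remaining = f.read()

print(repr(remaining))  # 'ccc\n'
```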
You can solve it with either of two approaches:
- Open the output file once, before the for loop, and everything will work, since it will be writing sequentially to the same file (using the same descriptor and pointer into the file).
- Replace the "w" with an "a".
Upvotes: 1
Reputation: 1962
Your code here is the issue:
with open("fingerprints.txt", "w") as f:
    f.write(finger+"\n")
The "w" part will truncate the file each time it's opened.
You either want to open the file once and keep it open throughout your loop, or open it with "a" to append. No existence check is needed for the second option, since "a" also creates the file if it is missing.
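The first option could look roughly like the sketch below. The function name extract_fingerprints is hypothetical, and the sketch assumes every input line is a valid JSON object containing data.leaf_cert.fingerprint, as in the question.

```python
import json


def extract_fingerprints(src_path, dst_path):
    """Write each record's leaf-certificate fingerprint to dst_path, one per line."""
    count = 0
    # Both files are opened once; "w" truncates dst_path only here, before the loop.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:  # iterates line by line, so the 30 GB input is never loaded whole
            record = json.loads(line)
            dst.write(record["data"]["leaf_cert"]["fingerprint"] + "\n")
            count += 1
    return count
```

With the file names from the question, the call would be extract_fingerprints("ctl_records_sample.jsonlines", "fingerprints.txt"). Keeping one handle open also avoids the cost of reopening the output file for every record.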
Upvotes: 1
Reputation: 4680
When calling open() with the "w" mode, all the file contents will be deleted. From the Python documentation for the open() function:
'w': open for writing, truncating the file first
I think you are looking to use the "a" mode, which appends new contents to the end of the file:
'a': open for writing, appending to the end of the file if it exists
with open("fingerprints.txt", "a", newline="\n") as f:
    f.write(finger + "\n")
(Keep the + "\n" in the f.write() call. The newline="\n" argument to open() only stops "\n" from being translated to the platform's line separator on write; it does not add a newline for you.)
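One thing to keep in mind with "a" is that content from earlier runs stays in the file. A small sketch of that behaviour, with a hypothetical helper append_line, a temporary file, and made-up values:

```python
import os
import tempfile


def append_line(path, text):
    """Append text plus a newline; "a" creates the file if it is missing."""
    with open(path, "a") as f:
        f.write(text + "\n")


# Hypothetical demo path and values.
path = os.path.join(tempfile.mkdtemp(), "fingerprints_demo.txt")

for run in range(2):  # simulate running the script twice
    for finger in ["AA", "BB"]:
        append_line(path, finger)

with open(path) as f:
    lines = f.read().splitlines()

print(lines)  # ['AA', 'BB', 'AA', 'BB']
```

If duplicates from a previous run are not wanted, deleting or truncating the output file once before the loop avoids them.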
Upvotes: 0