Reputation: 75
EDIT: Forgot to mention I am using Python 2.7
I have a large JSON file structured like this:
[{
"headline": "Algérie Télécom prolonge son dispositif spécial Covid-19",
"url_src": "https://www.algerie360.com/algerie-telecom-prolonge-son-dispositif-special-covid-19/",
"img_src": "https://www.algerie360.com/wp-content/uploads/2020/04/DIA-Iddom-Algérie-télécom-320x200.jpg",
"news_src": "Algérie 360",
"catPT": "Ciência e Tecnologia",
"catFR": "Science et Technologie",
"catEN": "Science and Technology",
"lang": "French",
"epoch": 1591293345.817
},
{
"headline": "Internet haut débit à Alger : Lancement de la généralisation du » fibre to home »",
"url_src": "https://www.algerie360.com/20200510-internet-haut-debit-a-alger-lancement-de-la-generalisation-du-fibre-to-home/",
"img_src": "https://www.algerie360.com/wp-content/uploads/2020/05/unnamed-320x200.jpg",
"news_src": "Algérie 360",
"catPT": "Ciência e Tecnologia",
"catFR": "Science et Technologie",
"catEN": "Science and Technology",
"lang": "French",
"epoch": 1591283345.817
},
...
I've been trying to write a .py script that opens my JSON file, removes all objects whose "epoch" value is less than 1591293345.817, and overwrites the current file.
Is this possible at all?
I've tried the following, but my Python knowledge is sketchy at best:
import time
import os
import json
import jsonlines
json_lines = []
with open('./json/news_done.json', 'r') as open_file:
    for line in open_file.readlines():
        j = json.loads(line)
        now = time.time()
        print(j['epoch'])
        lastWeek = now - 3600
        if not j['{epoch}'] > lastWeek:
            json_lines.append(line)
with open('./json/news_done.json', 'w') as open_file:
    open_file.writelines('\n'.join(json_lines))
Upvotes: 1
Views: 96
Reputation: 21
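Have you tried the pandas library? You can easily filter rows by a column's value with it.
I got this code snippet to work with your example data:
import pandas as pd
# Load the JSON array into a DataFrame (one row per object)
dataset = pd.read_json('example.json')
# Keep only the rows whose epoch is at or after the cutoff
new_dataset = dataset[dataset['epoch'] >= 1591293345.817]
# to_json returns a JSON string, so write it as-is;
# passing it through json.dump would encode it a second time
final_data = new_dataset.to_json(orient='records')
with open('example.json', 'w') as f:
    f.write(final_data)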
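One detail that is easy to trip over when writing the result back: a string that already contains JSON should be written to the file directly. Passing it to json.dump or json.dumps encodes it a second time, so the file ends up holding one quoted string instead of a list. A minimal sketch of the difference, using a made-up one-item list (the variable names are just for illustration):

```python
import json

records = [{"headline": "example", "epoch": 1591293345.817}]

already_json = json.dumps(records)         # text of a JSON array
double_encoded = json.dumps(already_json)  # a JSON string wrapping that text

# Reading each one back shows the difference
once = json.loads(already_json)     # a list of dicts again
twice = json.loads(double_encoded)  # still just text, needs a second loads
```

If a reader of the file gets text back where a list was expected, this double encoding is a likely cause.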
Upvotes: 2
Reputation: 320
It looks like you're only removing the "epoch" tag, but if I've understood correctly, you want to discard the whole element.
You can load the entire file as JSON instead of parsing each line individually:
import json, time
with open('./json/news_done.json', 'r') as open_file:
    yourFileRead = open_file.read()
yourJson = json.loads(yourFileRead)
cutoff = time.time() - 3600  # one hour ago; use a fixed epoch value here if you prefer
filteredList = []
for j in yourJson:  # j is one element of the list, not just one line
    # keep the element only if it is at least as recent as the cutoff
    if j['epoch'] >= cutoff:
        filteredList.append(j)
with open('./json/news_done.json', 'w') as open_file:
    open_file.write(json.dumps(filteredList))
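If you want the fixed threshold from the question rather than a window relative to the current time, the same idea can be wrapped in two small helpers. This is only a sketch: keep_recent and prune_file are names I made up, and it assumes the file holds a single JSON array whose objects all have a numeric "epoch" key. It uses only the standard json module, so it should behave the same on Python 2.7.

```python
import json


def keep_recent(items, cutoff):
    """Return the items whose 'epoch' value is at or after cutoff."""
    return [item for item in items if item['epoch'] >= cutoff]


def prune_file(path, cutoff):
    """Rewrite the JSON array stored at path, dropping items older than cutoff.

    Returns the number of items that were removed.
    """
    with open(path, 'r') as f:
        items = json.load(f)  # parse the whole file as one JSON document
    kept = keep_recent(items, cutoff)
    with open(path, 'w') as f:
        json.dump(kept, f)  # kept is a list, so a single dump is correct
    return len(items) - len(kept)
```

Calling prune_file('./json/news_done.json', 1591293345.817) would then overwrite the file in place. Since the file is reopened in write mode, it may be worth keeping a backup copy until you trust the result.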
Upvotes: 1