Reputation: 1482
I am writing a Django command to delete data older than x days from my app.
The filtering is done with the following queryset:
qs = Data.objects.filter(date_created__lte=timezone.now()-timedelta(days=days_del))
Here days_del is an integer and date_created is a DateTimeField.
Either printing this queryset or calling .delete() on it results in a JSONDecodeError followed by a ValidationError. I don't really understand why this happens, or how I can prevent Django from trying to decode the JSON fields in this case.
Note that I am using the jsonfield PyPI package, and the Data model has a JSONField.
It is possible that some rows contain stale or corrupted data and are causing the problem (see the traceback). Is there a way to ignore the validation and proceed with the deletion anyway?
Traceback (most recent call last):
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/jsonfield/fields.py", line 83, in pre_init
return json.loads(value, **self.load_kwargs)
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 464 (char 463)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
utility.execute()
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
self.execute(*args, **cmd_options)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/webapps/myproj/server/mirrors/management/commands/data_cleanup.py", line 39, in handle
qs.delete()
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/query.py", line 616, in delete
collector.collect(del_query)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/deletion.py", line 191, in collect
reverse_dependency=reverse_dependency)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/deletion.py", line 89, in add
if not objs:
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/query.py", line 254, in __bool__
self._fetch_all()
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/query.py", line 1118, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/query.py", line 63, in __iter__
obj = model_cls.from_db(db, init_list, row[model_fields_start:model_fields_end])
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/base.py", line 583, in from_db
new = cls(*values)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/base.py", line 502, in __init__
_setattr(self, field.attname, val)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/jsonfield/subclassing.py", line 43, in __set__
obj.__dict__[self.field.name] = self.field.pre_init(value, obj)
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/jsonfield/fields.py", line 85, in pre_init
raise ValidationError(_("Enter valid JSON"))
django.core.exceptions.ValidationError: ['Enter valid JSON']
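For reference, the first exception in this traceback is what the standard library's json.loads raises when a string value is cut off before its closing quote. A minimal reproduction, using a made-up truncated payload (the real stored value is not shown here):

```python
import json

# Made-up payload, cut off in the middle of a string value, similar to
# what a truncated database column might contain.
truncated = '{"name": "abc'

try:
    json.loads(truncated)
except json.JSONDecodeError as exc:
    # exc.msg holds the short description, exc.pos the character offset.
    error_message = exc.msg
    print(f"{exc.msg} (char {exc.pos})")
```

The jsonfield package catches this error in pre_init and re-raises it as the ValidationError shown at the bottom of the traceback.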
I am deleting a lot of data at once, so maybe there's a better way to handle this. In any case, fixing the old stale data is not an option here.
Thanks
Here is the command file:
from django.core.management.base import BaseCommand, CommandError
from oauth2_provider.models import Application
from django.utils import timezone
import pytz
from datetime import timedelta
from confluence_core.models import Data


class Command(BaseCommand):
    help = 'Delete data older than given days'

    def add_arguments(self, parser):
        parser.add_argument("-d", "--days", type=int, dest='days', required=True, help="Days limit")
        parser.add_argument("-c", "--confirm", action='store_true', dest='confirm', default=False, required=False, help="Confirm before deletion")

    def handle(self, *args, **options):
        days_del = options['days']
        do_delete = False
        qs = Data.objects.filter(date_created__lte=timezone.now()-timedelta(days=days_del))
        if qs.count() > 0:
            if options['confirm']:
                print(f"{qs.count():,} data entries will be deleted.")
                ret = input("Confirm ? (y/n)\n")
                if ret in ['y', 'Y', 'yes']:
                    do_delete = True
            else:
                do_delete = True
            if do_delete is True:
                print(f"Deleting {qs.count():,} data entries...")
                qs.delete()
            else:
                print("Not deleting anything.")
        else:
            print("No data to delete.")
        print("Done.")
Upvotes: 1
Views: 514
Reputation: 77912
Given that the ORM forces the queryset to be evaluated when deleting, cf.:
File "/root/virtualenvs/myproj-prod/lib/python3.6/site-packages/django/db/models/deletion.py", line 89, in add
if not objs:
the only workaround I can think of (that does not require forking or monkeypatching anything) would be to first update the whole queryset so the JSON field is set to something valid, i.e.:
qs.update(name_of_the_field={})
qs.delete()
But this won't prevent other issues caused by inconsistent data, so the real solution is obviously to clean up your whole dataset.
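The update-then-delete workaround can be wrapped in a small helper. This is only a sketch: the function name purge_with_invalid_json and its json_field_name parameter are hypothetical, and it assumes the queryset's filter does not depend on the JSON field itself (as with the date_created filter in the question), so the same rows still match after the update.

```python
def purge_with_invalid_json(qs, json_field_name):
    """Overwrite a JSON column with a valid value, then delete the rows.

    qs is expected to be a Django QuerySet; json_field_name is the name
    of the JSON field on the model (hypothetical parameter for this sketch).
    """
    # update() is executed as a single SQL UPDATE and does not build model
    # instances, so the stored (possibly broken) JSON is never decoded.
    qs.update(**{json_field_name: {}})
    # delete() may still load the rows to collect cascades, but each row
    # now holds valid JSON, so the field can decode it.
    return qs.delete()
```

In the command from the question, this would replace the qs.delete() call, with the real field name passed as the second argument. Running it inside transaction.atomic() should roll the update back if the delete fails.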
Upvotes: 2