Zack Plauché

Reputation: 4230

Django: Process FileField in model clean method, NOT after save

I have a model:

class MyModel(models.Model):
    csv = models.FileField(upload_to='csvs')
    rows = models.IntegerField(default=0)

I'm trying to read a CSV file to see how many rows it has and save that count to the rows attribute. I'm using pandas to read the CSV and get its number of rows. I want to do this BEFORE the model saves.

I thought it would work like this:

import pandas as pd
from django.db import models

class MyModel(models.Model):
    csv = models.FileField(upload_to='csvs')
    rows = models.IntegerField(default=0)

    def clean(self):
        self.rows = len(pd.read_csv(self.csv).index)
 
    def save(self, *args, **kwargs):
        self.full_clean()
        return super().save(*args, **kwargs)

However, this returns the following error (which seems to be from pandas):

EmptyDataError: No columns to parse from file

The weird part is it works if I put it AFTER an initial save, like this:

import pandas as pd
from django.db import models

class MyModel(models.Model):
    csv = models.FileField(upload_to='csvs')
    rows = models.IntegerField(default=0)
         
    def save(self, *args, **kwargs):
        self.full_clean()
        super().save(*args, **kwargs)
        self.rows = len(pd.read_csv(self.csv).index)
        return super().save(*args, **kwargs)

That seems really weird and inefficient (saving twice I mean).

How can I process the csv like I would after a save BEFORE a save?

And what's the difference?

Upvotes: 1

Views: 239

Answers (1)

Abdul Aziz Barkat

Reputation: 21802

Most likely the file has already been read somewhere before your clean method is called, so the file pointer sits at the end of the file and read_csv finds nothing to parse. One fix is to call open on the file before passing it to read_csv. As described in the documentation, this either opens or reopens the file, which ensures that the file pointer is at the start:

def clean(self):
    # Open (or reopen) the file so reading starts from the beginning
    self.csv.open(mode='r')
    self.rows = len(pd.read_csv(self.csv).index)
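To illustrate why the pointer position matters, here is a minimal sketch that reproduces the effect without Django or pandas. It assumes an in-memory io.StringIO as a stand-in for the uploaded file and uses the standard csv module in place of read_csv; count_rows is a hypothetical helper, not part of either library.

```python
import csv
import io


def count_rows(fileobj):
    """Count data rows (header excluded) in a text-mode CSV file object."""
    # Rewind first: an earlier read may have left the pointer at the end.
    fileobj.seek(0)
    reader = csv.reader(fileobj)
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for _ in reader)


# Stand-in for the uploaded file (assumption: small in-memory CSV).
upload = io.StringIO("name,age\nalice,30\nbob,25\n")

# Simulate something consuming the stream before clean() runs.
upload.read()

# Reading again without rewinding returns nothing, which is the
# situation a parser sees when it reports that there is no data.
leftover = upload.read()

# Rewinding inside count_rows makes the count independent of earlier reads.
rows = count_rows(upload)
```

Here leftover is an empty string, while count_rows still sees both data rows because it rewinds before reading. Calling open on the field's file, as in the answer above, serves the same purpose of putting the pointer back at the start.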

Upvotes: 1
