Sahil Suri

Reputation: 33

Uploading two different CSV files in Django with different encodings

In my Django admin I have a button for uploading a CSV file. I have two files, one encoded in UTF-8 and one in ASCII/cp1252. If I write

 data = pd.read_csv(value.file, encoding = "ASCI", engine='python')

one CSV file uploads correctly, but the other is uploaded with special characters scattered through the text. I don't want those special characters to be uploaded. And if I write

 data = pd.read_csv(value.file, encoding = "UTF-8", engine='python')

the file that previously showed special characters uploads without error, but the other one is not uploaded at all. Can anyone tell me how to fix this? Below is my forms.py:

from django import forms
import pandas as pd


class CsvUpload(forms.Form):
    csv_file = forms.FileField()

    def clean_csv_file(self):
        # Probably worth doing this check first anyway
        value = self.cleaned_data['csv_file']
        if not value.name.endswith('.csv'):
            raise forms.ValidationError('Invalid file type')

        try:
            data = pd.read_csv(value.file, encoding="UTF-8", engine='python')
            # Normalise the headers, then restore the expected ID column name
            data.columns = data.columns.str.strip().str.lower()
            data = data.rename(columns={'test case id': 'Test Case ID'})
        except Exception as e:
            print('Error while parsing CSV file=> %s' % e)
            raise forms.ValidationError('Failed to parse the CSV file')
        if 'summary' not in data or 'Test Case ID' not in data:
            raise forms.ValidationError(
                'CSV file must have "summary" column and "Test Case ID" column')
        return data

CSV 1.

  Test Case ID,Summary

TCMT-10,Verify that Process CSV sub module is displayed under “Process CSV” module on Dashboard of Client’s user.
TCMT-11,Verify that only View to “Duplicate test  cases” under “Test_Suite_Optimizer” module on Dashboard of Client’s user.
TCMT-12,Verify that Process CSV sub module is displayed under “Process CSV” module on Dashboard of Client’s user.
TCMT-13,Verify that toggle view is displayed on “Duplicate test cases” under “Test_Suite_Optimizer” module on Dashboard of Client’s user.
TCMT-14,Toggle view-? “Duplicate test cases” under “Test_Suite_Optimizer” module on Dashboard of Client’s user

CSV-2.

Test Case ID,summary
TC-16610,“verify that user is able to update 'active' attribute  'false ' on adding “new category records” using 'v3/definition/categories' PUT API on specifying the 'active' attribute 'true'”
TC-16609,“verify that user is able to update 'active' attribute  'true ' on adding “new category records” using 'v3/definition/categories' PUT API on specifying the 'active' attribute 'false'”

Also, in CSV-2 I am adding the inverted commas in OpenOffice. I want this file to be uploaded as well.

Upvotes: 0

Views: 127

Answers (1)

Moosa Saadat

Reputation: 1177

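If you need to read files whose encoding is not known in advance, you can pick the encoding at run time. Python cannot report the encoding a file was written in (the encoding attribute of an opened file is only the one Python used to open it), so the code below reads the uploaded bytes once, tries to decode them as UTF-8, and falls back to cp1252 if that fails. It then builds the DataFrame from the decoded text:

import io

# Read the upload once, then try UTF-8 before falling back to cp1252
raw = value.read()
try:
    text = raw.decode("utf-8")
except UnicodeDecodeError:
    text = raw.decode("cp1252")
data = pd.read_csv(io.StringIO(text), engine='python')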
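
To see why one file loads with stray characters while the other fails outright, it can help to look at how the curly apostrophe in "Client’s" is stored under each encoding. The following is only a minimal sketch using the standard library; the sample string is taken from CSV 1 in the question, and the variable names are illustrative:

```python
# The curly apostrophe from "Client’s" in CSV 1
sample = "Client’s user"

utf8_bytes = sample.encode("utf-8")      # ’ becomes the three bytes E2 80 99
cp1252_bytes = sample.encode("cp1252")   # ’ becomes the single byte 92

# Reading UTF-8 bytes as cp1252 raises no error but yields stray characters
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Reading cp1252 bytes as UTF-8 fails, because a lone 0x92 is not valid UTF-8
try:
    cp1252_bytes.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
print(utf8_failed)
```

This matches the two symptoms described in the question: the wrong single-byte codec silently produces extra characters, while strict UTF-8 decoding rejects the cp1252 file. That asymmetry is the reason for trying UTF-8 first and only then falling back to cp1252.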

Upvotes: 1
