(python) How do I split a csv on \n if some fields contain strings with \n inside?

I'm having trouble splitting a csv because some of the fields have a "\n" inside them

I'm using:

file_data = csv_file.read().decode("utf-8")
csv_data = file_data.split("\n")

but the fields look something like this:

'string 1','string 2',
'string
 3'
'string 4',

I would like csv_data[0] to hold strings 1 and 2, csv_data[1] to hold string 3, and csv_data[2] to hold string 4.

With my current approach I get csv_data[0] correctly, but string 3 is split across two indexes because it has a \n inside its text.

---------------[edit]---------------

I solved it by not using split and instead iterating through csv_data (answer posted below).

Upvotes: 0

Views: 857

Answers (3)

I solved it by iterating through csv_data instead of splitting on commas, as follows:

        csv_file = request.FILES["csv_upload"]

        if not csv_file.name.endswith('.csv'):
            messages.warning(request, "The file is not a csv!")
            return HttpResponseRedirect(request.path_info)

        file_data = csv_file.read().decode("utf-8")
        csv_data = file_data.split("\r\n")

        fields = []
        fieldsTemp = []

        # collecting the fields of the csv
        text = ''
        inQuotes = False
        for x in csv_data:
            for char in x:
                # a double quote toggles whether we are inside a quoted string
                if char == '"':
                    inQuotes = not inQuotes

                if char == ',' and not inQuotes:
                    # separator comma outside quotes: the field is complete
                    fieldsTemp.append(text)
                    text = ''
                else:
                    # regular character, or a comma that belongs to a quoted string
                    text = text + char

            # the last field of a line has no trailing comma, so add it here
            if text:
                fieldsTemp.append(text)
                text = ''
            fields.append(fieldsTemp)
            fieldsTemp = []

As it turned out, I could split by \r\n, which solved part of the problem for my specific csv. Later, though, I couldn't split by commas for the same reason: commas appear inside strings. So instead I used that loop to check whether I'm inside quotes and to build my fields manually.

Upvotes: 0

willeM_ Van Onsem

Reputation: 476813

Use a library. Python has the csv module (see the Python docs) to parse csv files. I strongly advise using a parser, since the CSV file format is more complicated than it looks: for example, there is syntax to include quotes and new lines in the content of a string.
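To illustrate that quoting syntax, here is a minimal sketch. The sample string is made up for the example (it is not the asker's data): its second record contains a comma, a doubled quote and a line break, all inside one quoted field. Splitting on "\n" cuts that field apart, while csv.reader keeps it whole.

```python
import csv
import io

# Hypothetical sample: the second record has a quoted field containing
# a comma, a doubled quote ("") and a line break.
sample = 'id,comment\r\n1,"said ""hi"", then\nleft"\r\n'

# Naive approach: every "\n" becomes a cut, even the one inside the quotes.
naive = sample.split("\n")

# newline='' leaves line endings untouched so the csv module can handle them.
rows = list(csv.reader(io.StringIO(sample, newline='')))

print(len(naive))  # number of pieces produced by the naive split
print(rows)        # records as parsed by the csv module
```

The doubled quote is how a quote character is written inside a quoted field, and the parser turns it back into a single quote.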

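You can parse the csv content and, for example, produce a list of tuples (one per row) with:

import csv

# newline='' lets the csv module handle line breaks inside quoted fields itself
with open('mycsv.csv', newline='') as mycsv:
    csvreader = csv.reader(mycsv)
    data = [tuple(row) for row in csvreader]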
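In the question the data comes from an upload that is read as bytes rather than from a file on disk. A possible way to apply the same parser in that situation is to wrap the decoded text in io.StringIO. This is only a sketch: parse_csv_bytes is a hypothetical helper name, and the raw bytes below are a stand-in for csv_file.read(), assuming double-quoted fields and \r\n between records.

```python
import csv
import io


def parse_csv_bytes(raw_bytes, encoding="utf-8"):
    """Decode raw CSV bytes and return the records as a list of lists."""
    text = raw_bytes.decode(encoding)
    # newline='' keeps line breaks inside quoted fields exactly as they are
    return list(csv.reader(io.StringIO(text, newline='')))


# Stand-in for csv_file.read() from the question (assumed layout)
raw = b'"string 1","string 2"\r\n"string\n 3"\r\n"string 4"\r\n'
csv_data = parse_csv_bytes(raw)

print(csv_data[0])  # fields of the first record
print(csv_data[1])  # the field with the embedded line break stays in one piece
print(csv_data[2])
```

If the file really uses single quotes, as in the sample shown in the question, csv.reader accepts quotechar="'" to change the quote character.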

Upvotes: 1

Nabil

Reputation: 1278

You should use the csv library instead of trying to parse it yourself.

Here is a link that can help you.

Upvotes: 2
