Reputation: 21
I'm having trouble splitting a CSV file because some of the fields have a "\n" inside them.
I'm using:
file_data = csv_file.read().decode("utf-8")
csv_data = file_data.split("\n")
But the fields look something like this:
'string 1','string 2',
'string
3'
'string 4',
I would like csv_data[0] to be strings 1 and 2, csv_data[1] to be string 3, and csv_data[2] to be string 4.
With my current approach I get csv_data[0] correctly, but string 3 is split across two indexes because it has a "\n" inside its text.
---------------[edit]---------------
I solved it by not using split for the fields and instead iterating through csv_data (answer posted below).
Upvotes: 0
Views: 857
Reputation: 21
I solved it by not using split for the fields and instead iterating through csv_data as follows:
csv_file = request.FILES["csv_upload"]
if not csv_file.name.endswith('.csv'):
    messages.warning(request, "The file is not a CSV!")
    return HttpResponseRedirect(request.path_info)
file_data = csv_file.read().decode("utf-8")
csv_data = file_data.split("\r\n")
fields = []
fieldsTemp = []
# collecting the fields of the csv
text = ''
firstQuote = False
secondQuote = False
for x in csv_data:
    for char in x:
        # dropping the separator commas
        if char != ',':
            text = text + char
        # handling strings that contain a comma
        if char == '\"':
            if firstQuote:
                secondQuote = True
            firstQuote = True
            if secondQuote:
                firstQuote = False
                secondQuote = False
        # adding the field
        if not firstQuote:
            if char == ',':
                fieldsTemp.append(text)
                text = ''
    fields.append(fieldsTemp)
    fieldsTemp = []
As it turned out, I could split by "\r\n" and that solved part of the problem for my specific CSV, but later I couldn't split by commas for the same reason: commas appear inside strings. So instead I used that loop to check whether I'm inside quotes and build the fields manually.
Upvotes: 0
Reputation: 476813
Use a library. Python has the csv module [Python-doc] to parse CSV files. I strongly advise using a parser, since the CSV file format is more complicated than it looks: for example, there is syntax for including quotes and newlines as part of a string's content.
You can parse the CSV content and, for example, produce a list of tuples with:
import csv
# newline='' lets the csv module handle line breaks inside quoted fields itself
with open('mycsv.csv', newline='') as mycsv:
    csvreader = csv.reader(mycsv)
    data = [tuple(row) for row in csvreader]
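Since the question reads an uploaded file into a decoded string rather than opening a path, here is a minimal sketch of the same idea applied to text already in memory. The helper name parse_csv_text and the sample string are made up for illustration; it assumes the fields are quoted with double quotes, and the quotechar argument can be set to "'" if the file uses single quotes as in the question's example.

```python
import csv
import io


def parse_csv_text(file_data, quotechar='"'):
    """Parse already-decoded CSV text into a list of rows (lists of strings).

    parse_csv_text is a hypothetical helper, not part of the csv module.
    """
    # newline="" stops StringIO from translating line endings, so line
    # breaks inside quoted fields reach the csv parser unchanged.
    buffer = io.StringIO(file_data, newline="")
    reader = csv.reader(buffer, quotechar=quotechar)
    return [row for row in reader]


# Illustrative data: the second record contains a newline inside quotes.
sample = '"string 1","string 2"\r\n"string\n3"\r\n"string 4"\r\n'
rows = parse_csv_text(sample)
print(rows)
```

In a view like the one in the question, the decoded file_data string would be passed to this helper instead of calling split on it.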
Upvotes: 1