Athylus

Reputation: 191

Python removing all \r\n within quotes in CSV file

I have a CSV file with some data in it. I want to replace all newlines inside double quotes with some character, while newlines outside the quotes should stay. What is the best way to achieve this?

import sys

def main(inputfile):
    # Read the whole file; newline="" keeps \r\n exactly as stored
    with open(inputfile, "r", newline="") as file_object:
        data = file_object.read()

    # Every second piece after splitting on '"' is the text inside quotes
    strings = data.split('"')[1::2]

    with open("output.csv", "w", newline="") as f:
        for string in strings:
            # str.replace returns a new string, so the result must be reassigned
            string = string.replace("\r", "")
            string = string.replace("\n", "")
            print(string)
            f.write(string)


if __name__ == "__main__":
    main(sys.argv[1])

This does not quite work: the double quotes are lost, and so are the commas.

Expected input:

"dssdlkfjsdfj   \r\n ashdiowuqhduwqh \r\n",
 "3"

Expected output:

"dssdlkfjsdfj    ashdiowuqhduwqh",
 "3"
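
For reference, the quotes and commas disappear because data.split('"')[1::2] keeps only the pieces inside quotes and throws away everything between them. A minimal sketch that keeps the same split idea but rejoins all the pieces might look like this. The helper name strip_newlines_in_quotes is made up for illustration, and the sketch assumes the quotes in the file are balanced:

```python
def strip_newlines_in_quotes(data, replacement=""):
    # After splitting on '"', text outside quotes sits at even indexes
    # and text inside quotes sits at odd indexes.
    pieces = data.split('"')
    for i in range(1, len(pieces), 2):
        pieces[i] = (
            pieces[i]
            .replace("\r\n", replacement)
            .replace("\r", replacement)
            .replace("\n", replacement)
        )
    # Joining with '"' puts the quotes back, along with the commas
    # and line breaks that live between the quoted fields.
    return '"'.join(pieces)


sample = '"dssdlkfjsdfj   \r\n ashdiowuqhduwqh \r\n",\r\n "3"\r\n'
print(repr(strip_newlines_in_quotes(sample)))
```

Doubled quotes ("") inside a field should not confuse this, since they split into an empty piece and the odd/even pattern is preserved.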

Upvotes: 0

Views: 2480

Answers (2)

Mark Tolonen

Reputation: 177961

A real sample would help, but given in.csv:

"multi
line
data","more data"
"more multi
line data","other data"

The following will replace newlines in quotes:

import csv

with open('in.csv',newline='') as fin:
    with open('out.csv','w',newline='') as fout:
        r = csv.reader(fin)
        w = csv.writer(fout)
        for row in r:
            row = [col.replace('\r\n','**') for col in row]
            w.writerow(row)

out.csv:

multi**line**data,more data
more multi**line data,other data
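
If the output should keep its quotes, as in the question's expected output, and the input may contain bare \n as well as \r\n, a variant along these lines may help. csv.QUOTE_ALL is part of the standard csv module; the function name clean_csv and the in-memory demo data are only placeholders for illustration:

```python
import csv
import io


def clean_csv(fin, fout, replacement=""):
    reader = csv.reader(fin)
    # QUOTE_ALL writes every field in double quotes
    writer = csv.writer(fout, quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow([
            col.replace("\r\n", replacement)
               .replace("\r", replacement)
               .replace("\n", replacement)
            for col in row
        ])


# In-memory demo; with real files, open both with newline='' as above.
src = io.StringIO('"multi\r\nline\r\ndata","more data"\r\n', newline="")
dst = io.StringIO(newline="")
clean_csv(src, dst, "**")
print(repr(dst.getvalue()))
```

Passing an empty replacement removes the newlines entirely instead of marking them.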

Upvotes: 1

Athylus

Reputation: 191

The problem was solved in a very simple way. Create an output file and read the input file one character at a time. Write each character to the output file, but toggle a replace mode (using the ~ operator) whenever a " appears. While in replace mode, replace every \r and \n with '' (nothing).
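
A rough sketch of that character-by-character approach, not necessarily the exact code that was used. The names drop_newlines_in_quotes and clean_file are hypothetical, and the sketch toggles with not instead of ~: in Python, ~True is -2, which is still truthy, so ~ only works as a toggle if the flag starts out as False or 0.

```python
def drop_newlines_in_quotes(text):
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            # Each quote flips between "inside" and "outside" a quoted field
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch in "\r\n":
            # Inside quotes: drop carriage returns and line feeds
            continue
        else:
            out.append(ch)
    return "".join(out)


def clean_file(src_path, dst_path):
    # newline="" stops Python from translating \r\n on read or write
    with open(src_path, newline="") as fin, open(dst_path, "w", newline="") as fout:
        fout.write(drop_newlines_in_quotes(fin.read()))


print(repr(drop_newlines_in_quotes('"a\r\nb",\r\n"3"\r\n')))
```

This keeps the quotes, the commas, and the line breaks between records, and only touches what sits between a pair of quotes.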

Upvotes: 0
