morasta

Reputation: 148

Strip all trailing occurrences of a character

I have a CSV that needs cleaning up. It has records along the lines of

1, a[]b[][][], c[]d[][][]
2, a[]b[]c[]d[]e, a[]b[]c[]d[]e

Any occurrences of trailing [] within a field need to be removed from rows which are then written to a new file. For example, in this case line 1 would become 1, a[]b, c[]d and line 2 would be unchanged.

Here's as far as I have gotten:

import csv

new_rows = []

with open('input.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:    
        new_row = [i.split(',').rstrip('[]') for i in row] #won't work since this is a list, not a string
        new_rows.append(new_row) 

with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(new_rows)

I know rstrip only works on strings and not lists. I'm a bit of a novice with Python and can't quite figure it out, can anyone lend a hand?

Upvotes: 1

Views: 135

Answers (2)

chepner

Reputation: 531948

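import csv

# Open both files together so each cleaned row is written as soon as it is read.
with open('input.csv', 'rb') as inf:
    with open('output.csv', 'wb') as outf:
        reader = csv.reader(inf)
        writer = csv.writer(outf)
        for row in reader:
            # rstrip('[]') strips any trailing '[' or ']' characters from the field.
            new_row = [field.rstrip('[]') for field in row]
            writer.writerow(new_row)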
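
One caveat worth knowing: rstrip('[]') treats its argument as a set of characters, not as a substring, so it would also remove a lone trailing '[' or ']'. If that matters for your data, here is a minimal sketch that removes only whole trailing "[]" pairs using a regular expression. The helper name strip_trailing_pairs is hypothetical and only for illustration.

```python
import re

# Matches one or more whole "[]" pairs anchored at the very end of the string.
_TRAILING_PAIRS = re.compile(r'(?:\[\])+\Z')

def strip_trailing_pairs(field):
    """Hypothetical helper: remove only complete trailing "[]" pairs."""
    return _TRAILING_PAIRS.sub('', field)

print(repr(strip_trailing_pairs(' a[]b[][][]')))  # ' a[]b'
print(repr(strip_trailing_pairs('a[]b]')))        # 'a[]b]' (lone ']' is kept)
print(repr('a[]b]'.rstrip('[]')))                 # 'a[]b'  (rstrip removes it)
```

For the sample rows in the question both approaches give the same result; they differ only when a field ends in an unmatched bracket.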

Before Python 2.7, you could combine the context managers into a single with statement using contextlib.nested:

import contextlib
with contextlib.nested(open('input.csv', 'rb'), open('output.csv', 'wb')) as (inf, outf):
    reader = ...

In Python 2.7, the with statement itself was modified to support this:

with open('input.csv', 'rb') as inf, open('output.csv', 'wb') as outf:

although implicit line continuation is not available; to split across two lines, you must use a line-continuation character:

with open('input.csv', 'rb') as inf, \
     open('output.csv', 'wb') as outf:

Upvotes: 2

Brian

Reputation: 3131

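You're almost there; the only problem is the errant split(','). csv.reader already splits each row into a list of string fields, so you can call rstrip on each field directly.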

import csv

new_rows = []

with open('input.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:    
        new_row = [x.rstrip('[]') for x in row]
        new_rows.append(new_row) 

with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(new_rows)

On input of:

1, a[]b[][][], c[]d[][][]
2, a[]b[]c[]d[]e, a[]b[]c[]d[]e

I get the following:

1, a[]b, c[]d
2, a[]b[]c[]d[]e, a[]b[]c[]d[]e
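
The code above is written for Python 2, where the csv module expects files opened in binary mode. For anyone on Python 3, here is a rough equivalent, assuming the same file layout: the csv module there expects text-mode files opened with newline=''. The function name clean_csv is hypothetical, not something from the question.

```python
import csv

def clean_csv(in_path, out_path):
    """Hypothetical helper: copy a CSV, stripping trailing '[' / ']' from each field."""
    # Python 3: open in text mode with newline='' so csv handles line endings itself.
    with open(in_path, newline='') as inf, \
         open(out_path, 'w', newline='') as outf:
        reader = csv.reader(inf)
        writer = csv.writer(outf)
        for row in reader:
            writer.writerow([field.rstrip('[]') for field in row])

if __name__ == '__main__':
    clean_csv('input.csv', 'output.csv')
```

Writing each row as it is read also avoids holding the whole file in memory, which may matter for large inputs.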

Upvotes: 2
