Reputation: 3

Python loops through CSV, but writes header row twice

I have CSV files with an unwanted first character in every header cell except the first column. The while loop strips the first character from the headers and writes the new header row to a new file (it exits by counter). The else statement then writes the rest of the rows to the new file. The problem is that the else statement starts with the header row and writes it a second time. Is there a way to have else begin on the next line without breaking the for iterator? The actual files are 21 columns by 400,000+ rows. The unwanted character is a single space, but I used * in the example below to make it easier to see. Thanks for any help!

file.csv =

a,*b,*c,*d

1,2,3,4

import csv

reader = csv.reader(open('file.csv', 'rb'))

writer = csv.writer(open('file2.csv','wb'))

count = 0

for row in reader:
    while (count <= 0):
        row[1]=row[1][1:]
        row[2]=row[2][1:]
        row[3]=row[3][1:]
        writer.writerow([row[0], row[1], row[2], row[3]])
        count = count + 1
    else:
        writer.writerow([row[0], row[1], row[2], row[3]])

Upvotes: 0

Answers (4)

elyase

Reputation: 40993

If you only want to change the header and copy the remaining lines without change:

with open('file.csv', 'r') as src, open('file2.csv', 'w') as dst:
    dst.write(next(src).replace(" ", ""))     # delete whitespaces from header
    dst.writelines(line for line in src)

If you want to do additional transformations you can do something like this or this question.
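A variant of the same line-based copy that trims each header field on its own, so spaces inside a column name survive. This is only a sketch: the helper names are made up for illustration, and it assumes no header name contains a quoted comma.

```python
def clean_header_line(line):
    """Strip surrounding whitespace from each comma-separated header name."""
    # Assumption: no header name contains a quoted comma.
    names = line.rstrip("\r\n").split(",")
    return ",".join(name.strip() for name in names) + "\n"


def copy_with_clean_header(src_path, dst_path):
    """Write a cleaned header, then stream the remaining lines unchanged."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        first = next(src, None)  # None when the source file is empty
        if first is not None:
            dst.write(clean_header_line(first))
        dst.writelines(src)      # the file object continues after the header
```

Calling copy_with_clean_header('file.csv', 'file2.csv') never holds more than one line in memory, which matters with 400,000+ rows.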

Upvotes: 1

djas

Reputation: 1023

If you have 21 columns, you don't want to write out row[0], ..., row[20] by hand. You also want to close your files after opening them. .next() gets your header, and strip() lets you flexibly remove unwanted leading and trailing characters.

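import csv

# Python 2 code: the csv module there expects binary file modes ('rb'/'wb').
file = 'file1.csv'
newfile = open('file2.csv','wb')
writer = csv.writer(newfile)

with open(file, 'rb') as f:
  reader = csv.reader(f)
  header = reader.next()

  # strip leading and trailing spaces from every header cell
  newheader = []
  for c in header:
    newheader.append(c.strip(' '))
  writer.writerow(newheader)  # write the cleaned header once, after the loop

  for r in reader:
    writer.writerow(r)

newfile.close()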
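On Python 3 the same idea looks a little different: the csv module works on text-mode files opened with newline='', and the header comes from the built-in next(). A rough sketch, where the function name strip_header is just a placeholder:

```python
import csv


def strip_header(src_path, dst_path):
    """Copy a CSV file, stripping whitespace around each header cell."""
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader, None)  # None if the file is empty
        if header is None:
            return

        writer.writerow([name.strip() for name in header])
        writer.writerows(reader)     # remaining rows pass through unchanged
```

Both files are opened in one with statement, so they are closed even if a row fails to parse.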

Upvotes: 0

jrs

Reputation: 616

Hmm... It seems like your logic might be a bit backward. A bit cleaner, I think, to check if you're on the first row first. Also, a slightly more idiomatic way to remove spaces is to use string's lstrip method with no arguments to remove leading whitespace.

Why not use enumerate and check if your row is the header?

import csv

reader = csv.reader(open('file.csv', 'rb'))

writer = csv.writer(open('file2.csv','wb'))

for i, row in enumerate(reader):
    if i == 0:            
        writer.writerow([row[0], 
                         row[1].lstrip(), 
                         row[2].lstrip(), 
                         row[3].lstrip()])
    else:
        writer.writerow([row[0], row[1], row[2], row[3]])
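As for why the original loop writes the header twice: the else clause of a while loop runs whenever the condition becomes false without a break, so on the first row the while body runs once and then the else runs as well. A small trace shows this. The trace function and the row labels are made up for the demonstration; no files are involved.

```python
def trace(rows):
    """Replay the control flow of the question's loop, recording each write."""
    written = []
    count = 0
    for row in rows:
        while count <= 0:
            written.append(("while", row))
            count += 1
        else:
            # Runs every time the while condition is false, including right
            # after the body above has just handled the header row.
            written.append(("else", row))
    return written


print(trace(["header", "row1"]))
# [('while', 'header'), ('else', 'header'), ('else', 'row1')]
```

A plain if/else, as in the enumerate version above, picks exactly one branch per row.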

Upvotes: 0

JSutton

Reputation: 71

If all you want to do is remove spaces, you can use:

string.replace(" ", "")
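Note that replace removes every space, not only the leading one, so it also changes header names that contain spaces. A quick comparison on a made-up header row:

```python
header = ["id", " first name", " last name"]

collapsed = [h.replace(" ", "") for h in header]  # removes every space
trimmed = [h.lstrip() for h in header]            # removes leading whitespace only

print(collapsed)  # ['id', 'firstname', 'lastname']
print(trimmed)    # ['id', 'first name', 'last name']
```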

Upvotes: 0
