satyaki
satyaki

Reputation: 641

Writing a print statement into a csv file

I trying to write the print statement of the following python program in a csv file.

import optparse
import datetime
import csv
file_name='sample.txt'
with open(file_name,'rb') as f:               
    reader = csv.reader(f,delimiter=",")                                              
    #headers = reader.next()
    for row in reader:

        row[0] = row[0].zfill(6) 
        row[2] = row[2].zfill(6)
        row[3] = row[3].zfill(6)
        row[4] = row[4].zfill(6)
        row[1] = row[1][5:7] + "-" + row[1][8:10] + "-" + row[1][:4]
        p = row[0],row[1],row[2],row[3],row[4]
        print p
with open('names.csv', 'wb') as ofile:
    writer = csv.writer(ofile)
    writer.writerow(p)

The output is following:-

C:\Python27\office>python python_poc.py
('User_ID', '--Date', '0Num_1', '0Num_2', 'Com_ID')
('000101', '04-13-2015', '000012', '000021', '001011')
('000102', '04-03-2014', '000001', '000007', '001002')
('000103', '06-05-2013', '000003', '000004', '000034')
('000104', '12-31-2012', '000004', '000009', '001023')
('000105', '09-09-2011', '000009', '000005', '000104')

But my output csv file shows something like:-- enter image description here

What is it that I am doing wrong ?

Upvotes: 2

Views: 7063

Answers (4)

TigerhawkT3
TigerhawkT3

Reputation: 49330

You don't want to write a print statement to a CSV file. The issue is that you keep overwriting p after every row, so only the last one is saved. That's not too hard to fix:

p = []
for row in reader:
    ...
    p.append(row[:5])

with open('names.csv', 'wb') as ofile:
    writer = csv.writer(ofile)
    for row in p:
        writer.writerow(row)

Or to clean up the header row:

with open('names.csv', 'wb') as ofile:
    writer = csv.writer(ofile)
    writer.writerow([row[0], row[1].lstrip('-'), row[2].lstrip('0'), row[3].lstrip('0'), row[4]])
    for row in p[1:]:
        writer.writerow(row)

Or, better yet (but without header row cleanup):

with open(file_name,'rb') as f, open('names.csv', 'wb') as ofile:
    writer = csv.writer(ofile)
    ...
    for row in reader:
        ...
        writer.writerow(row[:5])

As you can see, cleaning up the header programatically can get messy (and I don't even know if the cleanup implementation is correct - what if you get a header cell of __Date or -+-Date instead of --Date?). Unless you're trying to generate this file automatically as part of a scheduled task or something, it makes a lot more sense to just open the file in Notepad/Excel/etc. and manually clean up the header row.

Upvotes: 4

Pynchia
Pynchia

Reputation: 11606

given the problem has been correctly identified by @TigerhawkT3 (whose solution I find excellent), let me suggest an alternative, which follows your original train of thought:

import optparse
import datetime
import csv
file_name='sample.txt'
with open(file_name,'rb') as f, open('names.csv', 'wb') as ofile:               
    reader = csv.reader(f, delimiter=",")
    writer = csv.writer(ofile)
    #headers = reader.next()
    for row in reader:
        row[0] = row[0].zfill(6) 
        row[2] = row[2].zfill(6)
        row[3] = row[3].zfill(6)
        row[4] = row[4].zfill(6)
        row[1] = row[1][5:7] + "-" + row[1][8:10] + "-" + row[1][:4]
        p = row[0],row[1],row[2],row[3],row[4]
        print p
        writer.writerow(p)

The main difference is that the output file is written as the input file is being read. The performance may be slightly slower, given the buffering, but it may come in handy for high volumes of data.

Upvotes: 0

user1801810
user1801810

Reputation: 614

Your spreadsheet application is formatting the data. Open the CSV in a text editor and you'll see the data as expected.

===UPDATE===
The problem I saw was ill-formatted data thus my comment above - which is correct. Supposing you're inquiring about the missing records then your logic is close and @TigerhawkT3 has the answer to that.

Upvotes: 0

knowledgehaver
knowledgehaver

Reputation: 1

I'm assuming that your question has to do with how the data are being displayed, and not that only one line has printed out. If that assumption is wrong, then please clarify what exactly what is wrong with how the CSV is displaying.

It looks like the problem is the program you are using to view the .csv file. If you want to see what the raw .csv looks like, you should open it in a text editor like gedit or Notepad.

When you open a .csv file in a spreadsheet program, it will interpret and display the .csv output by your program, but it will format each cell according to it's own default settings. In addition, most spreadsheet programs allow you to choose how to interpret delimiters in the CSV file, which will alter whether or not all your data is imported/displayed properly.

Upvotes: 0

Related Questions