jdoe

Reputation: 654

Empty CSV file when writing lots of data

I am working on a data scraping project in Python 3 and am trying to write the scraped data to a CSV file. My current approach is this:

import csv

outputFile = csv.writer(open('myFilepath', 'w'))
outputFile.writerow(['header1', 'header2'...])
for each in data:
     scrapedData = scrap(each)
     outputFile.writerow([scrapedData.get('header1', 'header 1 NA'), ...])

Once this script is finished, however, the CSV file is blank. If I just run:

import csv

outputFile = csv.writer(open('myFilepath', 'w'))
outputFile.writerow(['header1', 'header2'...])

a CSV file is produced containing the headers:

header1,header2,..

If I scrape just one item in data, for example:

outputFile.writerow(['header1', 'header2'...])
scrapedData = scrap(data[0])
outputFile.writerow([scrapedData.get('header1', 'header 1 NA'), ...])

a CSV file will be created including both the headers and the data for data[0]:

header1,header2,..
header1 data for data[0], header2 data for data[0]

Why is this the case?

Upvotes: 1

Views: 1168

Answers (2)

anon

Reputation: 1258

When you open a file with w, it erases the file's previous contents.

From the docs:

w: open for writing, truncating the file first

So if you open the file with w again after writing the scraped data, you get a blank file, write the header to it, and therefore see only the header. Try replacing w with a. The new call to open the file would look like this:

outputFile = csv.writer(open('myFilepath', 'a'))
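The difference between the two modes can be checked with a small experiment. This is only a sketch: the scratch file name is made up for the demonstration, and it uses plain file writes rather than csv.writer to keep the effect of the mode visible.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.csv")

# Create the file with a header line.
with open(path, "w") as f:
    f.write("header1,header2\n")

# Mode 'a' keeps the existing content and writes at the end.
with open(path, "a") as f:
    f.write("a,b\n")
with open(path) as f:
    after_append = f.read()

# Mode 'w' truncates the file as soon as it is opened.
with open(path, "w") as f:
    f.write("c,d\n")
with open(path) as f:
    after_truncate = f.read()

print(repr(after_append))
print(repr(after_truncate))
```

After the append, the header line is still there. After reopening with w, only the last line written remains.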

You can find more information about the modes for opening a file here

Ref: How do you append to a file?

Edit after DYZ's comment:

You should also close the file once you are done appending. I would suggest using the file like this:

with open('path/to/file', 'a') as file:
    outputFile = csv.writer(file)
    # Do your work with the file

This way you don't have to remember to close it. Once the code exits the with block, the file is closed.
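Putting the with block together with the loop from the question might look roughly like the sketch below. The scrape function, the item names, and the temporary path are hypothetical stand-ins, not the asker's real code. The newline="" argument is what the csv module documentation recommends for files passed to csv.writer.

```python
import csv
import os
import tempfile


def scrape(item):
    # Hypothetical stand-in for the question's scraping function.
    return {"header1": f"{item}-1"}


def write_rows(path, data):
    headers = ["header1", "header2"]
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        for item in data:
            scraped = scrape(item)
            # Fall back to a placeholder when a field is missing.
            writer.writerow([scraped.get(h, f"{h} NA") for h in headers])
    # Leaving the with block closes the file, which flushes buffered rows to disk.


# Hypothetical output location for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "scraped_demo.csv")
write_rows(path, ["x", "y"])

with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Because the file is opened with a, running write_rows again on the same path appends a second header row along with the new data.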

Upvotes: 2

PMende

Reputation: 5460

I would use Pandas for this:

import pandas as pd
headers = ['header1', 'header2', ...]
scraped_df = pd.DataFrame(data, columns=headers)
scraped_df.to_csv('filepath.csv')

Here I'm assuming your data object is a list of lists.
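As a concrete sketch of that assumption, with made-up rows and a temporary output path: passing index=False to to_csv leaves out the row-index column that pandas writes by default, so the output has only the header columns.

```python
import os
import tempfile

import pandas as pd

headers = ["header1", "header2"]
# Made-up rows standing in for the scraped data (a list of lists).
data = [["a", "b"], ["c", "d"]]

scraped_df = pd.DataFrame(data, columns=headers)

# Hypothetical output location for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "filepath.csv")
# index=False omits the row-index column that to_csv writes by default.
scraped_df.to_csv(path, index=False)

with open(path) as f:
    written = f.read()
print(written)
```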

Upvotes: 0
