Reputation:
I'm trying to convert this 3.1 GB text file from https://snap.stanford.edu/data/ into a CSV file. All the data is structured like:

product/productId: ...
review/userId: ...
review/profileName: ...
review/helpfulness: ...
review/score: ...
review/time: ...
review/summary: ...
review/text: ...

with a blank line between records, which makes it a pretty large text file with several million lines. I have tried to write a Python script to convert it, but for some reason it won't read the lines in my for-each loop.
Here is the code:
import csv

def trycast(x):
    try:
        return float(x)
    except:
        try:
            return int(x)
        except:
            return x
cols = ['product_productId', 'review_userId', 'review_profileName', 'review_helpfulness', 'review_score', 'review_time', 'review_summary', 'review_text']
f = open("movies.txt", "wb")
w = csv.writer(f)
w.writerow(cols)

doc = {}
with open('movies.txt') as infile:
    for line in infile:
        line = line.strip()
        if line == "":
            w.writerow([doc.get(col) for col in cols])
            doc = {}
        else:
            idx = line.find(":")
            key, value = line[:idx], line[idx+1:]
            key = key.strip().replace("/", "_").lower()
            value = value.strip()
            doc[key] = trycast(value)
f.close()
I'm not sure if it is because the file is too large; a regular notepad program can't even open it.
Thanks in advance! :-)
Upvotes: 0
Views: 460
Reputation: 353
In the line f = open("movies.txt", "wb")
you're opening the file for writing, which immediately truncates it and deletes all its content. Later on, you try to read from that same, now-empty file. It will probably work fine once you change the output filename. (I am not going to download 3.1 GB to test it. ;) )
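As a minimal sketch of the fix, assuming the output goes to a separate file named movies.csv (the name is arbitrary) and Python 3, where the csv module wants text mode with newline=''. I've also reordered trycast to try int before float, since float() accepts every string that int() does, so the original int branch could never be reached:

import csv

def trycast(x):
    # Try int first, then float; otherwise keep the raw string.
    for cast in (int, float):
        try:
            return cast(x)
        except ValueError:
            pass
    return x

cols = ['product_productId', 'review_userId', 'review_profileName',
        'review_helpfulness', 'review_score', 'review_time',
        'review_summary', 'review_text']

# Write to a *different* file so the input is not truncated.
with open('movies.csv', 'w', newline='') as outfile, \
     open('movies.txt') as infile:
    w = csv.writer(outfile)
    w.writerow(cols)
    doc = {}
    for line in infile:
        line = line.strip()
        if line == "":  # a blank line ends one record
            w.writerow([doc.get(col) for col in cols])
            doc = {}
        else:
            idx = line.find(":")
            key, value = line[:idx], line[idx + 1:]
            key = key.strip().replace("/", "_").lower()
            doc[key] = trycast(value.strip())
    if doc:  # flush the last record in case the file doesn't end with a blank line
        w.writerow([doc.get(col) for col in cols])

Since this reads one line at a time, the 3.1 GB size itself isn't a problem; memory use stays small no matter how long the file is.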
Upvotes: 2