hyeri

Reputation: 693

Reading large CSV file

I'm trying to read a large CSV file in Python; it has some 700 attributes and 101533 rows. I tried reading the file with pandas.read_csv, but it ran into memory issues, so I tried this solution instead:

import numpy as np

with open("data.csv", "rb") as f:  # Python 3: open() instead of file()
    title = f.readline()  # skip the title line, if your data has one
    data = np.loadtxt(f, delimiter=",")  # assumes fields are separated by ","
    print(np.sum(data, axis=0))  # sum along axis 0 to get the sum of every column

but it doesn't work for the large data set either, although it works fine for a small one. How can I read this file in Python?
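For what it's worth, pandas can also read the file in pieces via the chunksize parameter, so the whole table never has to fit in memory at once; a rough sketch (the chunk size of 10000 is just a guess to tune):

```python
import pandas as pd

def column_sums(path, chunksize=10_000):
    """Sum every numeric column of a CSV by streaming it in chunks."""
    total = None
    for chunk in pd.read_csv(path, chunksize=chunksize):
        s = chunk.sum(numeric_only=True)  # per-column sums for this chunk
        total = s if total is None else total.add(s, fill_value=0)
    return total
```

Only one chunk of rows is held in memory at a time, which is usually enough to get past a MemoryError on a file of this size.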


Upvotes: 2

Views: 2104

Answers (1)

Kasravnd

Reputation: 107287

You can use the csv module to load your file and zip() (itertools.izip() in Python 2) to get a lazy iterator over columns, then grab the first column with next().

Note that csv.reader() returns a reader object, which is a one-shot iterator: it produces rows on demand rather than loading the whole file at once (be aware, though, that unpacking it with *reader does pull all rows into memory to build the columns):

import csv

with open("data.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header line
    first_column = next(zip(*reader))  # transpose rows into columns
    print(sum(float(x) for x in first_column))  # fields are strings, so convert
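If you only need one column's total, you can avoid transposing entirely and stream the file row by row, which keeps just a single row in memory at a time; a minimal sketch (it assumes the first row is a header and the target column holds numeric text):

```python
import csv

def sum_column(path, index=0):
    """Sum one column of a CSV, streaming one row at a time."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line
        return sum(float(row[index]) for row in reader)
```

Because the generator expression consumes the reader lazily, this works for files far larger than available RAM.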

Upvotes: 1
