GeoffreyB

Reputation: 526

How to make this process work faster?

I'm trying to fill a list of arrays from a large CSV file (around 250,000 lines), but it's taking ages. I'm sure there is a way to make the process faster, but I don't know how!

Here is the code:

import csv
import numpy as np

energy = []
ondeIG = []
time = []
envelope = []

with open('file.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:        
        time = np.hstack([time, row['Time']])
        energy = np.hstack([energy, row['Energy']])
        ondeIG = np.hstack([ondeIG, row['OndeIG']])
        envelope = np.hstack([envelope, row['envelope']])

Thank you!

Upvotes: 1

Views: 87

Answers (2)

AChampion

Reputation: 30288

np.hstack() constructs a new ndarray each time, which is expensive. You can update the lists in place with append:

with open('file.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:        
        time.append(row['Time'])
        energy.append(row['Energy'])
        ondeIG.append(row['OndeIG'])
        envelope.append(row['envelope'])
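If ndarrays are needed afterwards, each list can be converted once, after the loop, instead of rebuilding an array per row. A minimal sketch (the two-row sample file here is a hypothetical stand-in for the real 250,000-line `file.csv`, with the column names from the question):

```python
import csv
import numpy as np

# Tiny sample file standing in for the real 250,000-line CSV
with open('file.csv', 'w') as f:
    f.write('Time,Energy,OndeIG,envelope\n')
    f.write('0.0,1.5,0.2,0.9\n')
    f.write('0.1,1.7,0.3,0.8\n')

time, energy, ondeIG, envelope = [], [], [], []

with open('file.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # list.append is O(1) amortized; no array is rebuilt here
        time.append(row['Time'])
        energy.append(row['Energy'])
        ondeIG.append(row['OndeIG'])
        envelope.append(row['envelope'])

# One conversion per column at the end, not one per row
time = np.array(time, dtype=float)
energy = np.array(energy, dtype=float)
ondeIG = np.array(ondeIG, dtype=float)
envelope = np.array(envelope, dtype=float)
```

This turns 250,000 array reallocations into four, which is where the speedup comes from.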

Upvotes: 2

P. Camilleri

Reputation: 13218

To import data from CSV files, have a look at pandas, and more specifically at pandas.read_csv().

Here you are losing a tremendous amount of time because you rebuild an array (four arrays, even) at each iteration.
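A minimal sketch of the pandas approach (the two-row sample file is a hypothetical stand-in for the real `file.csv`, reusing the column names from the question):

```python
import pandas as pd

# Tiny sample file standing in for the real CSV
with open('file.csv', 'w') as f:
    f.write('Time,Energy,OndeIG,envelope\n')
    f.write('0.0,1.5,0.2,0.9\n')
    f.write('0.1,1.7,0.3,0.8\n')

# read_csv parses the whole file in optimized C code,
# avoiding the per-row Python loop entirely
df = pd.read_csv('file.csv')

time = df['Time'].to_numpy()
energy = df['Energy'].to_numpy()
ondeIG = df['OndeIG'].to_numpy()
envelope = df['envelope'].to_numpy()
```

Each column comes back as a NumPy array in one step, with numeric dtypes inferred automatically.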

Upvotes: 0
