alexvassel

Reputation: 10740

CSV manipulations with Python

For my purposes I have to know the number of lines in the CSV file before actually working with the rows. I have googled and found that the documentation says I should create an iterator (csv.reader) twice: first for counting and then for working with the rows. Is this the only way, or is there some trick to avoid it?

Thanks for your answers.

Upvotes: 1

Views: 85

Answers (2)

Artsiom Rudzenka

Reputation: 29131

If your file is not very big, then you can try:

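import csv

filename = 'data.csv'  # placeholder path; replace with your CSV file

def proceed(size):
    with open(filename) as f:
        data = list(csv.reader(f))  # reads the whole file into memory
    if len(data) > size:
        return  # too many rows: skip processing
    for line in data:
        pass  # do action


weights = {'user1': 4, 'user2': 5}
for k, v in weights.items():
    proceed(v)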

Or, as suggested by @georgesl, if you have a very big file:

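import csv

def proceed(size):
    # filename: path to the CSV file (defined elsewhere)
    with open(filename) as f:
        # First pass: count rows without keeping them in memory
        if sum(1 for row in csv.reader(f)) > size:
            return
    with open(filename) as f:
        # Second pass: reopen the file and process each row
        for line in csv.reader(f):
            pass  # do action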
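
If you would rather not open the file twice, one option is to rewind the same file object between the two passes. This is only a sketch, assuming Python 3; the name count_then_process and its return value are made up for illustration:

```python
import csv

def count_then_process(path, size):
    """Return all rows if the file has at most `size` rows, else None."""
    with open(path, newline="") as f:
        # First pass: count rows without storing them
        row_count = sum(1 for _ in csv.reader(f))
        if row_count > size:
            return None
        f.seek(0)  # rewind so a fresh reader starts at the first row
        # Second pass: the real per-row work would go here
        return [row for row in csv.reader(f)]
```

The file is still read twice, but it is opened once and closed reliably by the with block.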

Upvotes: 1

aquavitae

Reputation: 19164

I don't know of a way to do this without reading the file, but depending on where your bottlenecks are, you could just process N lines and discard the results if there are more, for example:

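def process_up_to(reader, N):  # wrapper name is illustrative
    processed_data = []  # assumes process() returns a list
    count = 0
    for line in reader:
        count += 1
        if count > N:  # Over the limit so stop processing
            break
        else:
            processed_data += process(line)
    else:
        # This block only runs if the loop completed naturally, i.e. count <= N
        return processed_data
    return None  # Limit exceeded: discard the partial results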

If process(line) is expensive, then your best bet may be to use two loops as described in your question.
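
A possible middle ground, when the limit N is small enough to hold in memory, is to read at most N + 1 rows first and only start the expensive processing once you know the file is within the limit. This is a rough sketch assuming Python 3; read_if_small is a hypothetical helper name:

```python
import csv
from itertools import islice

def read_if_small(path, limit):
    """Return the rows if there are at most `limit` of them, else None."""
    with open(path, newline="") as f:
        # Pull at most limit + 1 rows; one extra row proves the file is too long
        rows = list(islice(csv.reader(f), limit + 1))
    if len(rows) > limit:
        return None
    return rows
```

You would then call process(line) on each returned row. A file that is far over the limit is never read past row N + 1.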

Upvotes: 1
