RedFox

Reputation: 1376

How to split CSV file data into batches?

I have a CSV file whose number of lines is a multiple of 16.

After reading it, I want to iterate over the data and inspect each batch of 16 rows.

For example, the following file has 6 lines, which is a multiple of 2:
1 2 4
4 5 6
4 5 7
3 4 7
6 7 1
3 1 8

Then I want to divide these lines into 3 tables:

1 2 4
4 5 6

4 5 7
3 4 7

6 7 1
3 1 8

and iterate over each of these individual tables.

Thanks a lot

Upvotes: 0

Views: 1395

Answers (2)

ThePyGuy

Reputation: 18466

If you don't need the entire data set at once and only need a specific number of rows at a time, consider reading the CSV file in chunks rather than loading it all into memory. Passing chunksize to pd.read_csv returns an iterator that yields DataFrames of up to that many rows. Something like this will work:

import pandas as pd

fileName = 'sample.csv'
batchSize = 16
# Each iteration yields a DataFrame with at most batchSize rows
for df in pd.read_csv(fileName, chunksize=batchSize):
    ...  # process the chunk here
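
To try this without a file on disk, here is a minimal sketch that runs the same idea on the 6-line example from the question. The in-memory `sample` buffer, the space separator, `header=None`, and the `chunks` list are assumptions made for illustration; adjust them to match the real file.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the CSV file (space-separated, no header row).
sample = io.StringIO("1 2 4\n4 5 6\n4 5 7\n3 4 7\n6 7 1\n3 1 8\n")
batch_size = 2  # the file described in the question would use 16

chunks = []
for chunk in pd.read_csv(sample, sep=" ", header=None, chunksize=batch_size):
    # Each chunk is an ordinary DataFrame holding at most batch_size rows.
    chunks.append(chunk)
    print(chunk)
    print("-" * 10)
```

Note that each chunk keeps the row index of the original file (0-1, 2-3, 4-5), so call reset_index(drop=True) on a chunk if you need it to start from 0.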

Upvotes: 2

Cameron Riddell

Reputation: 13437

There are a lot of ways you can do this. One way is to use numpy to create the group labels and then use groupby to perform the iteration.

print(df)
   a  b  c
0  1  2  4
1  4  5  6
2  4  5  7
3  3  4  7
4  6  7  1
5  3  1  8

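import numpy as np

# Label each row with its batch number (0, 0, 1, 1, 2, 2), then group on those labels
groups = np.arange(len(df)) // 2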
for idx, subset in df.groupby(groups):
    print(subset)
    print("-" * 10)

# prints:
   a  b  c
0  1  2  4
1  4  5  6
----------
   a  b  c
2  4  5  7
3  3  4  7
----------
   a  b  c
4  6  7  1
5  3  1  8
----------
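
For a version that can be run on its own, here is a rough sketch of the same groupby approach, starting from the raw text. The `sample` buffer, the `a b c` header, the `batch_size` name, and the `tables` list are assumptions added for illustration and are not part of the original data.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-in for the file, with an assumed "a b c" header.
sample = io.StringIO("a b c\n1 2 4\n4 5 6\n4 5 7\n3 4 7\n6 7 1\n3 1 8\n")
df = pd.read_csv(sample, sep=" ")

batch_size = 2  # use 16 for the file described in the question
# Integer division of the row positions gives one label per batch.
groups = np.arange(len(df)) // batch_size

# Collect each batch as its own table, renumbering its rows from 0.
tables = [subset.reset_index(drop=True) for _, subset in df.groupby(groups)]

for table in tables:
    print(table)
    print("-" * 10)
```

Unlike chunked reading, this loads the whole file first, so it suits data that fits comfortably in memory.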


Upvotes: 3
