Reputation: 59
I have a CSV file with hundreds of rows, and I would like to select and export every 3 rows to a new CSV file with the new output CSV file being named after the first row of the selection.
For example, given the following CSV file:
1980 10 12
1 2 3 4 5 6 7
4 6 8 1 0 8 6
1981 10 12
2 4 9 7 5 4 1
8 9 3 8 3 7 3
I would like to select the first 3 rows and export them to a new CSV file named "1980 10 12" after the first row, then select the next 3 rows and export them to a new CSV file named "1981 10 12" after the first row of that group, and so on. I would like to do this using Python.
Upvotes: 3
Views: 6654
Reputation:
Using slight iterator trickery:
with open('in.csv', 'r') as infh:
    # zip() pulls three lines at a time from the same file iterator
    for block in zip(*[infh]*3):
        filename = block[0].strip() + '.csv'
        with open(filename, 'w') as outfh:
            outfh.writelines(block)
On Python 2.x you would use itertools.izip. The docs actually mention izip(*[iter(s)]*n) as an idiom for clustering a data series.
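To see why the idiom groups lines, here is a small sketch that runs it on an in-memory list instead of a file. The helper name grouped is hypothetical, not something from the answer above. The same iterator object is repeated three times, so each tuple that zip() builds takes three consecutive items from it. One thing to be aware of: zip() stops at the shortest input, so a trailing group with fewer than three lines is silently dropped.

```python
def grouped(iterable, n):
    """Yield consecutive n-item tuples from iterable (hypothetical helper)."""
    it = iter(iterable)      # one shared iterator
    return zip(*[it] * n)    # zip takes n items from that same iterator per tuple


lines = [
    "1980 10 12\n", "1 2 3\n", "4 6 8\n",
    "1981 10 12\n", "2 4 9\n", "8 9 3\n",
    "leftover\n",            # incomplete final group
]

blocks = list(grouped(lines, 3))
# blocks holds the two complete groups; the leftover line is discarded.
```

If incomplete trailing groups matter for your data, itertools.zip_longest can be used in place of zip() to pad them instead of dropping them.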
Upvotes: 0
Reputation: 180481
import csv

with open("in.csv") as f:
    reader = csv.reader(f)
    chunks = []
    for ind, row in enumerate(reader, 1):
        chunks.append(row)
        if ind % 3 == 0:  # if we have three new rows, create a file using the first row as the name
            with open("{}.csv".format(chunks[0][0].strip()), "w") as f1:
                wr = csv.writer(f1)
                wr.writerows(chunks)  # write all rows
            chunks = []  # reset chunks to an empty list
Upvotes: 2
Reputation: 1123450
Using the csv module, plus itertools.islice() to select 3 rows each time:
import csv
import os.path
from itertools import islice

# inputfilename and directory are placeholders; set them to your own paths
with open(inputfilename, 'rb') as infh:
    reader = csv.reader(infh)
    for row in reader:
        # name the output file after the first row of each group
        filename = row[0].replace(' ', '_') + '.csv'
        filename = os.path.join(directory, filename)
        with open(filename, 'wb') as outfh:
            writer = csv.writer(outfh)
            writer.writerow(row)
            writer.writerows(islice(reader, 2))
The writer.writerows(islice(reader, 2)) line takes the next 2 rows from the reader and copies them across to the writer CSV, after the current row (with the date) has been written to the output file first.
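The interplay between the for loop and islice() can be sketched on a plain iterator, with no files involved. The variable names here are made up for illustration. Because both consume from the same iterator, the loop only ever sees every third row, and islice() picks up the two rows in between.

```python
from itertools import islice

# Stand-in for a csv.reader: an iterator over already-parsed rows.
rows = iter([
    ["1980 10 12"], ["1 2 3"], ["4 6 8"],
    ["1981 10 12"], ["2 4 9"], ["8 9 3"],
])

groups = []
for header in rows:                 # the loop takes one row ...
    body = list(islice(rows, 2))    # ... and islice takes the next two
    groups.append((header, body))

# groups now pairs each date row with the two rows that followed it.
```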
You may need to adjust the delimiter argument for the csv.reader() and csv.writer() objects; the default is a comma, but you didn't specify the exact format, so you may need to set it to a tab ('\t') instead.
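As a rough illustration of the delimiter argument, assuming the data rows are tab-separated (the question does not say which separator is used), the reader and writer would be set up like this. The sample string and in-memory buffers are made up so the snippet runs without any files.

```python
import csv
import io

# Assumed sample: a date row followed by one tab-separated data row.
sample = "1980 10 12\n1\t2\t3\n"

reader = csv.reader(io.StringIO(sample), delimiter="\t")
rows = list(reader)   # the date row stays a single field; the data row is split on tabs

out = io.StringIO()
writer = csv.writer(out, delimiter="\t")
writer.writerows(rows)   # written back out with tabs between fields
```

With a tab delimiter the date row still arrives as one field, so row[0] remains usable as the file name.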
If you are using Python 3, open the files in 'r' and 'w' text mode, and set newline='' for both: open(inputfilename, 'r', newline='') and open(filename, 'w', newline='').
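Putting the Python 3 advice together, a minimal sketch might look like the following. The function name split_csv, its rows_per_file parameter, and the blank-line check are assumptions added for illustration; they are not part of the code above.

```python
import csv
import os
from itertools import islice


def split_csv(inputfilename, directory, rows_per_file=3):
    """Write each group of rows to its own CSV, named after the group's first row.

    Hypothetical helper; returns the list of files it wrote.
    """
    written = []
    with open(inputfilename, "r", newline="") as infh:
        reader = csv.reader(infh)
        for row in reader:
            if not row:          # assumption: skip blank lines rather than fail on row[0]
                continue
            filename = os.path.join(directory, row[0].replace(" ", "_") + ".csv")
            with open(filename, "w", newline="") as outfh:
                writer = csv.writer(outfh)
                writer.writerow(row)
                # copy the remaining rows of this group from the shared reader
                writer.writerows(islice(reader, rows_per_file - 1))
            written.append(filename)
    return written
```

Calling split_csv("in.csv", ".") on the example file from the question would be expected to produce 1980_10_12.csv and 1981_10_12.csv in the current directory.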
Upvotes: 3