Reputation: 362
I have a CSV file, data.csv, and I have split it into two files, test.csv and train.csv, using the following code:
import csv
import random
with open('datafile.csv', 'r') as data:
    with open('test.csv', 'w') as test:
        with open('train.csv', 'w') as train:
            test_writer = csv.writer(test)
            train_writer = csv.writer(train)
            for line in csv.reader(data):
                if random.random() > 0.85:
                    test_writer.writerow(line)
                else:
                    train_writer.writerow(line)
The code runs without errors, but the header row ends up in test.csv and not in train.csv.
Is there a way to update the code so that the header appears in both files?
Upvotes: 0
Views: 623
Reputation: 52947
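The code in the original version of the question would have worked, but after the edit the problem is clear: csv.reader yields the header as just another row, so it is written at random to one of the two csv.writers (in your run it happened to land in test.csv).
You need to read the header first, as you did in your first example, and write it to both files before looping over the remaining rows: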
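import csv
import random
# newline='' is what the csv module docs recommend for files passed to
# csv.reader / csv.writer; it avoids blank lines between rows on Windows.
with open('datafile.csv', 'r', newline='') as data, \
        open('test.csv', 'w', newline='') as test, \
        open('train.csv', 'w', newline='') as train:
    test_writer = csv.writer(test)
    train_writer = csv.writer(train)
    reader = csv.reader(data)
    # Consume the header row once and write it to both outputs.
    header = next(reader)
    test_writer.writerow(header)
    train_writer.writerow(header)
    # Only the data rows remain; send roughly 15% of them to test.csv.
    for row in reader:
        if random.random() > 0.85:
            test_writer.writerow(row)
        else:
            train_writer.writerow(row)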
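If you want the split to be repeatable, the same idea can be wrapped in a function. This is only a sketch: the function name split_csv and its parameters (test_fraction, seed) are made up for illustration, and it draws `rng.random() < test_fraction` instead of `random.random() > 0.85`, which gives the same expected 15% test share.

```python
import csv
import random


def split_csv(source, test_path, train_path, test_fraction=0.15, seed=None):
    """Randomly split a CSV into test/train files, copying the header to both.

    Hypothetical helper, not part of the csv module.
    Returns (number of test rows, number of train rows), header excluded.
    """
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    n_test = n_train = 0
    with open(source, 'r', newline='') as data, \
            open(test_path, 'w', newline='') as test, \
            open(train_path, 'w', newline='') as train:
        reader = csv.reader(data)
        test_writer = csv.writer(test)
        train_writer = csv.writer(train)

        # next() with a default avoids StopIteration on an empty input file.
        header = next(reader, None)
        if header is None:
            return n_test, n_train
        test_writer.writerow(header)
        train_writer.writerow(header)

        for row in reader:
            if rng.random() < test_fraction:
                test_writer.writerow(row)
                n_test += 1
            else:
                train_writer.writerow(row)
                n_train += 1
    return n_test, n_train
```

Calling `split_csv('datafile.csv', 'test.csv', 'train.csv', seed=42)` would then produce the same two files on every run, each starting with the header row.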
Upvotes: 2