Reputation: 437
I have two CSV files whose first column is a timestamp. I ultimately want to get the difference between the two times for each row.
import csv
import datetime
with open('file1.csv', 'rb')as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader, None)
    for column in filereader:
        date = column[0]
        parsed_date = datetime.strptime(date, '%H:%M:%S')
with open('file2.csv', 'rb')as csvfile:
    filereader2 = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader2, None)
    for column2 in filereader:
        date2 = column2[0]
        parsed_date = datetime.strptime(date, '%H:%M:%S')
        time_delta = (parsed_date - parsed_date2)
As it is now, my code only uses the first instance of parsed_date since I've taken it out of the loop. How do I get all the values? I've tried reading the second csv file inside of the for loop but then my program freezes (I think because it's looping endlessly).
Upvotes: 0
Views: 51
Reputation: 36
I'd recommend reading both CSV files and saving the applicable data. Then zip the two together and compute the difference for each tuple from the zip.
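A minimal sketch of that idea, assuming Python 3 and the same layout as in the question (two header rows, time in the first column as %H:%M:%S). The helper name read_times and the in-memory sample data are made up for illustration; with real files you would pass open('file1.csv', newline='') instead of the StringIO objects.

```python
import csv
import io
from datetime import datetime

def read_times(handle, header_rows=2, fmt='%H:%M:%S'):
    """Return the first column of a CSV as datetime objects, skipping header rows."""
    reader = csv.reader(handle, delimiter=',')
    for _ in range(header_rows):
        next(reader, None)
    # `if row` skips completely empty lines
    return [datetime.strptime(row[0], fmt) for row in reader if row]

# In-memory stand-ins for file1.csv and file2.csv (made-up sample data).
file1 = io.StringIO("title\nunits\n10:00:05,a\n10:00:10,b\n")
file2 = io.StringIO("title\nunits\n10:00:00,a\n10:00:07,b\n")

times1 = read_times(file1)
times2 = read_times(file2)

# Pair row i of the first file with row i of the second and subtract.
deltas = [t1 - t2 for t1, t2 in zip(times1, times2)]
for delta in deltas:
    print(delta)
```

Each element of deltas is a datetime.timedelta, so delta.total_seconds() gives the gap as a number if that is easier to work with.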
Upvotes: 1
Reputation: 24
A few notes on your code:
import csv
import datetime
with open('file1.csv', 'rb')as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader, None)
    for column in filereader:
        date = column[0]
        parsed_date = datetime.strptime(date, '%H:%M:%S')  # this variable is overwritten on each loop iteration
with open('file2.csv', 'rb')as csvfile:
    filereader2 = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader2, None)
    for column2 in filereader:
        date2 = column2[0]
        parsed_date = datetime.strptime(date, '%H:%M:%S')
        time_delta = (parsed_date - parsed_date2)  # parsed_date2 doesn't exist because it was never created; time_delta is overwritten on each iteration
This means that you're constantly losing the data you need to work with. To solve your problem, read both files first and work with the collected data afterwards:
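import csv
from datetime import datetime  # strptime is a method of the datetime class, not the module
first_file_dates = []
second_file_dates = []
# 'rb' is the Python 2 mode for csv; on Python 3 use open('file1.csv', newline='')
with open('file1.csv', 'rb') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for headers in range(2):  # skip the two header rows
        next(filereader, None)
    for column in filereader:
        first_file_dates.append(datetime.strptime(column[0], '%H:%M:%S'))
with open('file2.csv', 'rb') as csvfile:
    filereader2 = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader2, None)
    for column in filereader2:  # iterate over the second reader, not the first
        second_file_dates.append(datetime.strptime(column[0], '%H:%M:%S'))
# pair the rows up and print the difference for each pair
for k, v in zip(first_file_dates, second_file_dates):
    print(k - v)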
Please note that zip truncates its result to the length of the shorter list, so extra rows in the longer file are silently ignored.
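If the two files might have different row counts, one option (not part of the code above, just a sketch with made-up sample times, assuming Python 3) is itertools.zip_longest, which pads the shorter list with None so the mismatch becomes visible instead of being dropped:

```python
from datetime import datetime
from itertools import zip_longest

fmt = '%H:%M:%S'
# Made-up sample data: the first list has one more entry than the second.
first = [datetime.strptime(s, fmt) for s in ('10:00:05', '10:00:10', '10:00:20')]
second = [datetime.strptime(s, fmt) for s in ('10:00:00', '10:00:07')]

# zip stops at the shorter list, so the third entry of `first` is dropped.
paired = list(zip(first, second))

# zip_longest keeps every row and fills the missing side with None.
padded = list(zip_longest(first, second))

for a, b in padded:
    if a is None or b is None:
        print('unmatched row:', a if b is None else b)
    else:
        print(a - b)
```

Whether an unmatched row should be reported, skipped, or treated as an error depends on what the files represent.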
Upvotes: 1
Reputation: 440
This can be done by reading each of the two CSVs into a dataframe, merging them on the index, and creating a third column for the delta.
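A rough sketch of that approach, assuming pandas is installed and the files look like the ones in the question (two header rows, time in the first column). The helper name load_times, the column names and the sample data are made up for illustration; with real files, pass the file path to load_times instead of a StringIO object.

```python
import io
import pandas as pd

# In-memory stand-ins for file1.csv and file2.csv (made-up sample data).
FILE1 = "title\nunits\n10:00:05,a\n10:00:10,b\n"
FILE2 = "title\nunits\n10:00:00,a\n10:00:07,b\n"

def load_times(source):
    # skiprows=2 drops the two header lines; header=None labels the columns 0, 1, ...
    frame = pd.read_csv(source, skiprows=2, header=None)
    # Parse the first column and return it as a one-column dataframe named "time".
    return pd.to_datetime(frame[0], format="%H:%M:%S").rename("time").to_frame()

left = load_times(io.StringIO(FILE1))
right = load_times(io.StringIO(FILE2))

# Merge row i with row i (the default integer index), then add the delta column.
merged = left.merge(right, left_index=True, right_index=True, suffixes=("_1", "_2"))
merged["delta"] = merged["time_1"] - merged["time_2"]
print(merged["delta"])
```

The merge is an inner join by default, so, like zip, it only keeps index values present in both dataframes.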
Upvotes: 0