klex52s

Reputation: 437

Problem Looping when Comparing Datetime Objects Between Two CSV Files

I have two csv files, where the first column is a time stamp. I ultimately want to get the difference between the two times for each row.

import csv
import datetime

with open('file1.csv', 'rb')as csvfile:            
    filereader = csv.reader(csvfile, delimiter=',')                             
    for headers in range(2):                                                    
        next(filereader, None)                                                  
    for column in filereader:                                                   
        date = column[0]                                                        
        parsed_date = datetime.strptime(date, '%H:%M:%S')

with open('file2.csv', 'rb')as csvfile:            
    filereader2 = csv.reader(csvfile, delimiter=',')                             
    for headers in range(2):                                                    
        next(filereader2, None)                                                  
    for column2 in filereader:                                                   
        date2 = column2[0]                                                        
        parsed_date = datetime.strptime(date, '%H:%M:%S')
        time_delta = (parsed_date - parsed_date2)

As it is now, my code only uses the first instance of parsed_date since I've taken it out of the loop. How do I get all the values? I've tried reading the second csv file inside of the for loop but then my program freezes (I think because it's looping endlessly).

Upvotes: 0

Views: 51

Answers (3)

Marinus Bokslag

Reputation: 36

I'd recommend reading both CSV files first and saving the applicable data from each in a list. Then zip the two lists together and compute the difference for each tuple from the zip.
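A minimal sketch of that approach, assuming Python 3, two header rows, and an `%H:%M:%S` time stamp in the first column as in the question. The helper name `read_times` is hypothetical, and the in-memory `io.StringIO` samples are made-up data standing in for the real files:

```python
import csv
import io
from datetime import datetime


def read_times(csvfile, header_rows=2, fmt='%H:%M:%S'):
    """Return the first column of each data row parsed as a datetime."""
    reader = csv.reader(csvfile, delimiter=',')
    for _ in range(header_rows):
        next(reader, None)  # skip the header rows
    # Skip completely empty rows so row[0] always exists.
    return [datetime.strptime(row[0], fmt) for row in reader if row]


# Made-up sample data standing in for file1.csv and file2.csv.
file1 = io.StringIO("h1\nh2\n10:00:05,a\n10:01:00,b\n")
file2 = io.StringIO("h1\nh2\n10:00:00,x\n10:00:30,y\n")

times1 = read_times(file1)
times2 = read_times(file2)

# Pair the rows up and subtract; each result is a datetime.timedelta.
deltas = [t1 - t2 for t1, t2 in zip(times1, times2)]
for delta in deltas:
    print(delta)
```

With real files, the same helper could be called as `with open('file1.csv', newline='') as f: times1 = read_times(f)`.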

Upvotes: 1

Andrei Lifianets

Reputation: 24

A few notes on your code:

import csv
import datetime

with open('file1.csv', 'rb')as csvfile:            
    filereader = csv.reader(csvfile, delimiter=',')                             
    for headers in range(2):                                                    
        next(filereader, None)                                                  
    for column in filereader:                                                   
        date = column[0]                                                        
        parsed_date = datetime.strptime(date, '%H:%M:%S')  # this variable is overwritten on each loop iteration

with open('file2.csv', 'rb')as csvfile:            
    filereader2 = csv.reader(csvfile, delimiter=',')                             
    for headers in range(2):                                                    
        next(filereader2, None)                                                  
    for column2 in filereader:                                                   
        date2 = column2[0]                                                        
        parsed_date = datetime.strptime(date, '%H:%M:%S')
        time_delta = (parsed_date - parsed_date2)  # parsed_date2 doesn't exist because it was never created; time_delta is overwritten on each iteration

This means that you're constantly losing the data you need to work with. To solve the problem, read both files first and work with the collected data afterwards:

import csv
from datetime import datetime

first_file_dates = []
second_file_dates = []

# Python 3 text mode; on Python 2 use open('file1.csv', 'rb') instead
with open('file1.csv', 'r', newline='') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader, None)
    for column in filereader:
        first_file_dates.append(datetime.strptime(column[0], '%H:%M:%S'))


with open('file2.csv', 'r', newline='') as csvfile:
    filereader2 = csv.reader(csvfile, delimiter=',')
    for headers in range(2):
        next(filereader2, None)
    for column in filereader2:  # iterate the second reader, not the first
        second_file_dates.append(datetime.strptime(column[0], '%H:%M:%S'))

for k,v in zip(first_file_dates, second_file_dates):
    print(k-v)

Please note that zip stops at the end of the shorter list, so any extra rows in the longer file are silently dropped.
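To make the truncation concrete, here is a small sketch with made-up time stamps. If silently dropping rows is not acceptable, `itertools.zip_longest` (Python 3; an alternative, not part of the code above) pads the shorter list with `None` so the mismatch can be detected:

```python
from datetime import datetime
from itertools import zip_longest

fmt = '%H:%M:%S'
# Made-up sample values: the first list has one more entry than the second.
first = [datetime.strptime(s, fmt) for s in ('10:00:05', '10:01:00', '10:02:00')]
second = [datetime.strptime(s, fmt) for s in ('10:00:00', '10:00:30')]

# zip stops at the shorter list, so the third entry of `first` is ignored.
truncated = [a - b for a, b in zip(first, second)]
print(len(truncated))

# zip_longest keeps every row and fills the missing side with None.
padded = [
    (a - b) if a is not None and b is not None else None
    for a, b in zip_longest(first, second)
]
print(padded[-1])
```

Comparing `len(first)` and `len(second)` before zipping is another simple way to catch files with a different number of rows.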

Upvotes: 1

emendez

Reputation: 440

This can also be done by reading each of the two CSV files into a dataframe (for example with pandas), merging the two on their index, and creating a third column that holds the delta.
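A rough sketch of that idea, assuming pandas is the dataframe library meant here. The sample data, the single header row, and the column names `time` and `value` are made up for illustration:

```python
import io

import pandas as pd

# Made-up sample data standing in for file1.csv and file2.csv;
# a single header row is assumed here for simplicity.
csv1 = io.StringIO("time,value\n10:00:05,a\n10:01:00,b\n")
csv2 = io.StringIO("time,value\n10:00:00,x\n10:00:30,y\n")

df1 = pd.read_csv(csv1)
df2 = pd.read_csv(csv2)

# Merge row by row on the default integer index; overlapping column
# names get the suffixes _1 and _2.
merged = df1.merge(df2, left_index=True, right_index=True, suffixes=('_1', '_2'))

# Parse the time stamps and store the difference as a new column.
fmt = '%H:%M:%S'
merged['delta'] = (
    pd.to_datetime(merged['time_1'], format=fmt)
    - pd.to_datetime(merged['time_2'], format=fmt)
)
print(merged[['time_1', 'time_2', 'delta']])
```

For files with extra header lines, as in the question, the `skiprows` argument of `pd.read_csv` can be used to skip them. Note that an index merge keeps only rows present in both frames by default, similar to the truncation that `zip` performs.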

Upvotes: 0
