Paul Nicoara

Reputation: 107

export to csv in python

I have the following case: I need to get the time of each feature from a CSV file and compare it with the times of pictures taken by someone, then find up to two matches per feature. I want to assign to each feature the first two pictures I find within a 2-minute interval of the feature's time. I managed to create two dictionaries with the details: feature_hours contains the id and time of each feature, and photo_hours contains the photo_path and time of each photo. sorted_features and sorted_photo are two lists holding the sorted contents of the two dictionaries. The problem is that in the output CSV file only 84 rows are filled in and some are blank, while the feature CSV file has 199 features. I think I increment j too often, but I need a clear look from a pro because I cannot figure it out. Here is the code:


j=1
sheet1.write(0,71,"id")
sheet1.write(0,72,"feature_time")
sheet1.write(0,73,"Picture1")
sheet1.write(0,74,"Picture_time")
sheet1.write(0,75,"Picture2")
sheet1.write(0,76,"Picture_time")
def write_first_picture():

    sheet1.write(j,71,feature_time[0])
    sheet1.write(j,72,feature_time[1])
    sheet1.write(j,73,photo_time[0])
    sheet1.write(j,74,photo_time[1])

def write_second_picture():

    sheet1.write(j-1,75,photo_time[0])
    sheet1.write(j-1,76,photo_time[1])

def write_pictures():

    if i==1:

        write_first_picture()
    elif i==2:
        write_second_picture()

for feature_time in sorted_features:
    i=0
    for photo_time in sorted_photo:
        if i<2:
            if feature_time[1][0]==photo_time[1][0]:
                if feature_time[1][1]==photo_time[1][1]:
                    if feature_time[1][2]<photo_time[1][2]:
                        i=i+1
                        write_pictures()
                        j=j+1
                    elif int(feature_time[1][1])+1==photo_time[1][1]:
                        i=i+1
                        write_pictures()
                        j=j+1
                    elif int(feature_time[1][1])+2==photo_time[1][1]:
                        i=i+1
                        write_pictures()
                        j=j+1
                    elif int(feature_time[1][0])+1==photo_time[1][0]:
                        if feature_time[1][1]>=58:
                            if photo_time[1][1]<=02:
                                i = i+1
                                write_pictures() 
                                j=j+1

Edit: Here are examples of the two lists.

Features list: [('-70', ('10', '27', '03')), ('-73', ('10', '29', '50'))]

Photo list: [('20160801_125133-1151969393.jpg', ('12', '52', '04')), ('20160801_125211342753906.jpg', ('12', '52', '16'))]

Upvotes: 0

Views: 176

Answers (1)

A Small Shell Script

Reputation: 620

Python has a csv module to help load these files. You could also sort the results so that the checks are more efficient or can short-circuit. I cannot really tell what the i and j variables are meant to represent, but I am pretty sure you can do something like the following:

import csv

def hmstoseconds(hhmmss):
    # 60 * 60 seconds in an hour, 60 seconds in a min, 1 second in a second
    # int() lets this also accept string fields such as ('10', '27', '03')
    return sum(int(x) * y for x, y in zip(hhmmss, (60*60, 60, 1)))

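# features model looks like tuple(ID, (HH, MM, SS))
# assumption: each CSV row is laid out as ID,HH,MM,SS
with open("features.csv") as f:
    reader = csv.reader(f)
    features = [(row[0], tuple(int(v) for v in row[1:4])) for row in reader]

# photos model looks like tuple(filename, (HH, MM, SS))
# assumption: each CSV row is laid out as filename,HH,MM,SS
with open("photos.csv") as f:
    reader = csv.reader(f)
    photos = [(row[0], tuple(int(v) for v in row[1:4])) for row in reader]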

for feature in features:
    for photo in photos:
        # convert HH, MM, SS to seconds and find within 2 min (60s * 2)
        # .. todo:: instead of nested for loops, we could use filter()
        if abs(hmstoseconds(feature[1]) - hmstoseconds(photo[1])) <= 60 * 2:
            # the photo was taken within 2 min of the feature
            # placeholder: write the photo to the output here
            print(feature[0], photo[0])
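
The sorting and short-circuiting idea mentioned at the top of this answer could look roughly like the sketch below. It is only an illustration under a few assumptions: the data has the shape shown in the question's edit, the 2-minute window applies in both directions, and the names hms_to_seconds, photos_near and the sample file names are made up for this example. With the photo times sorted, bisect finds the matching slice without scanning every photo for every feature.

```python
from bisect import bisect_left, bisect_right

def hms_to_seconds(hhmmss):
    # accepts ("10", "27", "03") style tuples; int() handles the strings
    return sum(int(x) * y for x, y in zip(hhmmss, (60 * 60, 60, 1)))

def photos_near(feature_time, photo_seconds, photos, window=120):
    # photo_seconds must be sorted ascending and aligned index-for-index with photos
    target = hms_to_seconds(feature_time)
    lo = bisect_left(photo_seconds, target - window)
    hi = bisect_right(photo_seconds, target + window)
    return photos[lo:hi]

# hypothetical sample data in the shape shown in the question
photos = sorted(
    [
        ("b.jpg", ("12", "52", "16")),
        ("a.jpg", ("12", "52", "04")),
        ("c.jpg", ("13", "10", "00")),
    ],
    key=lambda p: hms_to_seconds(p[1]),
)
photo_seconds = [hms_to_seconds(p[1]) for p in photos]

# photos within 2 minutes of a feature recorded at 12:51:00
matches = photos_near(("12", "51", "00"), photo_seconds, photos)
```

Taking matches[:2] would then give the first two photos for that feature.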

In order to make this more maintainable/readable, you could also use namedtuples to better represent the data models:

import csv
from collections import namedtuple

# model definitions to help with readability/maintenance
# if the order of the indices changes or we add more fields, we just need to 
# change them directly here instead of tracking the indexes everywhere
Feature = namedtuple("feature", "id, date")
Photo = namedtuple("photo", "file, date")

def hmstoseconds(hhmmss):
    # 60 * 60 seconds in an hour, 60 seconds in a min, 1 second in a second
    # int() lets this also accept string fields such as ('10', '27', '03')
    return sum(int(x) * y for x, y in zip(hhmmss, (60*60, 60, 1)))

def within_two_min(date1, date2):
    # convert HH, MM, SS to seconds for both dates
    # return whether the absolute difference between them is within 2 min (60s * 2)
    return abs(hmstoseconds(date1) - hmstoseconds(date2)) <= 60 * 2 

if __name__ == '__main__':
    # the main guard means this code only runs when the file is
    # executed directly, not when it is imported as a module
    # assumption: each CSV row is laid out as ID,HH,MM,SS (features)
    # or filename,HH,MM,SS (photos)
    with open("features.csv") as f:
        reader = csv.reader(f)
        features = [Feature(row[0], tuple(int(v) for v in row[1:4])) for row in reader]

    with open("photos.csv") as f:
        reader = csv.reader(f)
        photos = [Photo(row[0], tuple(int(v) for v in row[1:4])) for row in reader]

    for feature in features:
        for photo in photos:
            # .. todo:: instead of nested for loops, we could use filter()
            if within_two_min(feature.date, photo.date):
                # placeholder: write the photo to the output here
                print(feature.id, photo.file)

Hopefully this gets you moving in the right direction. I don't fully understand what you were trying to do with i and j and the first/second "write_picture" stuff, but I hope this gives you a better understanding of scope and variable access in Python.
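
For the "write a photo" step, here is one possible sketch of building exactly one output row per feature with at most two photos, so that no shared row counter like j is needed and features without a match still get a row. It rests on assumptions: the lists have the shape from the question's edit, the window is 2 minutes in either direction, and match_rows, the sample file names and the second feature's time are hypothetical. io.StringIO stands in for a real output file.

```python
import csv
import io

def hms_to_seconds(hhmmss):
    # accepts ("10", "27", "03") style tuples; int() handles the strings
    return sum(int(x) * y for x, y in zip(hhmmss, (60 * 60, 60, 1)))

def match_rows(features, photos, window=120, limit=2):
    rows = []
    for feature_id, feature_time in features:
        row = [feature_id, ":".join(feature_time)]
        found = 0
        for filename, photo_time in photos:
            if found == limit:
                break  # this feature already has its photos
            diff = abs(hms_to_seconds(feature_time) - hms_to_seconds(photo_time))
            if diff <= window:
                row += [filename, ":".join(photo_time)]
                found += 1
        # pad with blanks so every row has the same number of columns
        row += [""] * (2 * (limit - found))
        rows.append(row)
    return rows

# hypothetical sample data in the shape shown in the question
features = [("-70", ("10", "27", "03")), ("-73", ("12", "51", "30"))]
photos = [
    ("a.jpg", ("12", "52", "04")),
    ("b.jpg", ("12", "52", "16")),
    ("c.jpg", ("12", "53", "00")),
]

out = io.StringIO()  # stand-in for open("matches.csv", "w", newline="")
writer = csv.writer(out)
writer.writerow(["id", "feature_time", "Picture1", "Picture_time", "Picture2", "Picture_time"])
rows = match_rows(features, photos)
writer.writerows(rows)
```

Because each row is assembled completely before it is written, the number of output rows always equals the number of features.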

Upvotes: 1
