HBS

Reputation: 55

Selecting rows in a CSV file and writing them to another CSV file

I have a CSV file with two columns (headers: value, image). The value column contains values in ascending order (0, 25, 30, ...), and the image column contains paths to images (e.g. X.jpg). The file has 81 lines including the header row (that is, 80 values and 80 images).

[Screenshot: a couple of columns in a spreadsheet]

I want to divide this list four ways. The idea is to produce pairs of images with a different spread between the paired rows in each group.

For the first group, I took the image from each pair of adjacent rows (2+3, 4+5, ...) and wrote them to a new CSV file, with each image in its own column. Here's the code:

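import csv

# Python 3: open CSV files in text mode with newline='' (Python 2 used 'wb' for writing)
with open('random_sorted.csv', newline='') as f, \
        open('first_group.csv', 'w', newline='') as test_file:
    csv_f = csv.reader(f)
    csv_writer = csv.writer(test_file)
    csv_writer.writerow(["image1", "image2"])
    prev = ""
    for i, row in enumerate(csv_f):
        if i % 2 == 0 and i != 0:
            # even-numbered row: pair its image with the previous row's image
            csv_writer.writerow([prev, row[1]])
        else:
            # header row (i == 0) or odd-numbered row: remember its image
            prev = row[1]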

The output is a two-column CSV file (image1, image2) with one pair of images per row.

I want to keep the same concept for the remaining three groups (write the paired images into a new CSV file with two columns), but increase the spread: pair rows that are 5 apart (2+7, etc.), 7 apart (2+9, etc.), and 9 apart. I would love some direction on how to do this. I was lucky with the first group (I had just learned about the remainder operator in the Codecademy course), but I can't think of ideas for the other groups.

Upvotes: 0

Views: 1972

Answers (1)

yvespeirsman

Reputation: 3099

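First, collect all the rows of the CSV file in a list:

import csv

with open('random_sorted.csv', newline='') as csvfile:
    # the question's reader uses the default comma delimiter;
    # pass delimiter=';' only if your file is semicolon-separated
    csv_reader = csv.reader(csvfile)
    headers = next(csv_reader)  # skip the header row
    rows = [row for row in csv_reader]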

Then set your required step size (5, 7 or 9) and identify the rows on the basis of their index in the list of rows:

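# Python 3: text mode with newline='' (on Python 2, use 'wb' instead)
# change the output file name for each group so earlier results are not overwritten
with open('first_group.csv', 'w', newline='') as test_file:
    csv_writer = csv.writer(test_file)
    csv_writer.writerow(["image1", "image2"])

    step_size = 7 # set step size here
    seen = set() # here we remember images we've already seen
    for x in range(len(rows) - step_size):
        img1 = rows[x][1]
        img2 = rows[x+step_size][1]
        # pair two images only if neither has been used in an earlier pair
        if not (img1 in seen or img2 in seen):
            csv_writer.writerow([img1, img2])
            seen.add(img1)
            seen.add(img2)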
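
To produce all three remaining groups in one run, the same idea can be wrapped in a function. The following is only a sketch for Python 3: the names pair_images and write_groups and the output file names group_step_5.csv, group_step_7.csv and group_step_9.csv are made up for illustration, and it assumes a comma-separated input file with the image path in the second column.

```python
import csv


def pair_images(images, step_size):
    """Pair images[x] with images[x + step_size], using each image at most once."""
    seen = set()  # images that already belong to a pair
    pairs = []
    for x in range(len(images) - step_size):
        img1, img2 = images[x], images[x + step_size]
        if img1 not in seen and img2 not in seen:
            pairs.append((img1, img2))
            seen.add(img1)
            seen.add(img2)
    return pairs


def write_groups(in_path, step_sizes=(5, 7, 9)):
    """Write one two-column CSV per step size (output names are hypothetical)."""
    with open(in_path, newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader)  # skip the header row
        images = [row[1] for row in reader]  # assumes image path is column 2
    for step in step_sizes:
        out_path = 'group_step_%d.csv' % step
        with open(out_path, 'w', newline='') as out_file:
            writer = csv.writer(out_file)
            writer.writerow(["image1", "image2"])
            writer.writerows(pair_images(images, step))
```

Calling write_groups('random_sorted.csv') would then write the three files. Note that with this pairing rule some images near the end of the list can stay unpaired: for six images a to f and a step size of 2, the pairs are (a, c) and (b, d), and e and f are left over.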

Upvotes: 1
