Wiama Daya
Wiama Daya

Reputation: 45

identical rows into columns csv python

i have the following csv file which i want to convert into the output, basically it reads the the first column value as key and then append the next value into column

input.csv

a.jpg,a,b
a.jpg,c,d
b.jpg,e,f
b.jpg,g,h
c.jpg,i,j
c.jpg,k,l
c.jpg,m,n
c.jpg,o,p
c.jpg,q,r
d.jpg,s,t
d.jpg,u,v
e.jpg,w,x
e.jpg,y,z

output.csv

a.jpg a,b c,d
b.jpg e,f g,h
c.jpg i,j k,l m,n o,p q,r
d.jpg s,t u,v
e.jpg w,x y,z

i tried the following, but still can't figure out how to unpack the value to output csv based on desired format above

import csv

with open('input.csv', 'r', newline='') as input_csv, open('output.csv', 'w', newline='') as output_csv:

filename = {}

for row in csv.reader(input_csv):
    if row[0] not in filename:
        filename[row[0]]=[[row[1],row[2]]]
    else:
        filename[row[0]].append([row[1],row[2]])

for item in filename.items():
    print(item)

the output

('a.jpg', [['a', 'b'], ['c', 'd']])
('b.jpg', [['e', 'f'], ['g', 'h']])
('c.jpg', [['i', 'j'], ['k', 'l'], ['m', 'n'], ['o', 'p'], ['q', 'r']])
('d.jpg', [['s', 't'], ['u', 'v']])
('e.jpg', [['w', 'x'], ['y', 'z']])

Upvotes: 1

Views: 160

Answers (3)

Errol
Errol

Reputation: 610

Check if this will work for you.

dic = {}

for row in csv.reader(input_csv):
    key, *values = row.split(",")
    if key not in dic:
        dic[key] = []
    for value in values:
        dic[key].append(value)

for key, value in dic.items():
    value = ", ".join(value)
    print(f"{key} {value}")

output:

a.jpg a, b, c, d
b.jpg e, f, g, h
c.jpg i, j, k, l, m, n, o, p, q, r
d.jpg s, t, u, v
e.jpg w, x, y, z

Upvotes: 0

Artur
Artur

Reputation: 688

Here is the code without pandas:

import csv

with open('input.csv', 'r', newline='') as input_csv, open('output.csv', 'w', newline='') as output_csv:

    filename = {}

    for row in csv.reader(input_csv):
        if row[0] in filename:
            my_list = [row[1],row[2]]
            filename[row[0]].append(my_list)
        elif row[0] not in filename:
            filename[row[0]]=[[row[1],row[2]]]

    for item in filename:
        output_csv.write('{} {}\n'.format(item,
            ' '.join([','.join(l) for l in filename[item]])))

Basically I've changed only last two lines.

Firstly we need to iterate through the dictionary and then write lines for each entry. The last line concatenates the key with the value. The value itself is a joined list items, made in two steps.

Upvotes: 0

Himanshu
Himanshu

Reputation: 676

You can use pandas for this

import pandas as pd

Read input CSV file as:

df = pd.read_csv('input.csv',header=None)

Group the data by first column:

dff =df.groupby(0).apply(lambda x: list(x[1])+list(x[2]))

Write each group values to an output CSV file

for i in dff.iteritems():
    with open('output.csv','a') as out:
        out.write(','.join([i[0]]+i[1])+'\n')

Upvotes: 1

Related Questions