JMG

Reputation: 11

Trying to get CSV data into source-target format for a bibliometrics graph (Gephi)

I have a CSV file in which each row lists the research groups (from 1 to n) that worked on a particular publication. I'm trying to turn it into a CSV that contains every pair of groups that have collaborated.

The CSV I have looks like this (each row corresponds to one publication):

group1;group2;group3
group1;group8
group8;group2;group1

I need to convert it into Gephi edges format, which uses a csv source-target format:

group1;group2
group1;group3
group2;group3
group1;group8
group8;group2
group8;group1
group2;group1

(I don't need all permutations, just combinations, since it's an undirected graph.)

I first did it with just one of the rows and got the general idea of how to do it:

from itertools import combinations

b = "group1;group2;group3"

b_split = b.split(";")

print(list(combinations(b_split,2)))

Result: [('group1', 'group2'), ('group1', 'group3'), ('group2', 'group3')]

But when I try to process the whole CSV file, it doesn't work as expected:

import csv
from itertools import combinations

with open('grups.csv','rb') as origin_file:
    reader = csv.reader(origin_file, delimiter=";")
    a = list(reader)

for row in a:
    c = list(combinations(row,2))

with open('output.csv','wb') as result_file:
    for each in c:
        wr = csv.writer(result_file)
        wr.writerow(each)

But the output file only contains the combinations for the last row.

Upvotes: 1

Views: 134

Answers (1)

JMG

Reputation: 11

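Got it working with this. The problem was that c was overwritten on every pass through the loop, so by the time the file was written it only held the combinations of the last row. Writing inside the loop over the rows fixes it: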

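import csv
from itertools import combinations

# Python 2: the csv module expects files opened in binary mode.
with open('grups.csv','rb') as origin_file:
    reader = csv.reader(origin_file, delimiter=";")
    a = list(reader)

with open('output_grups.csv','wb') as result_file:
    # Create the writer once; 'excel' is already the default dialect.
    wr = csv.writer(result_file, delimiter=';')
    for row in a:
        # Write every pair of groups for this row while the file is still open.
        for each in combinations(row, 2):
            wr.writerow(each)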
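
For anyone on Python 3, here is a rough sketch of the same idea. It assumes the same semicolon-delimited input, and the names rows_to_edges and convert are made up for this example. The main difference is that the files are opened in text mode with newline="" instead of 'rb'/'wb'. Skipping empty cells is an extra assumption of mine, in case some rows have trailing delimiters.

```python
import csv
from itertools import combinations


def rows_to_edges(rows):
    """Yield every unordered pair of groups from each publication row."""
    for row in rows:
        # Assumption: empty cells (e.g. from trailing delimiters) are not groups.
        groups = [cell.strip() for cell in row if cell.strip()]
        for pair in combinations(groups, 2):
            yield pair


def convert(src_path, dst_path, delimiter=";"):
    """Read publication rows from src_path and write one edge per line."""
    # In Python 3 the csv module wants text-mode files opened with newline="".
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter)
        writer.writerows(rows_to_edges(reader))
```

Calling convert('grups.csv', 'output_grups.csv') should produce the same file as the code above. Rows with a single group yield no pairs, so they are simply skipped.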

Upvotes: 0
