Reputation: 735
What I am trying to achieve is a csv file that looks like this:
,h.r9,h.r8,h.r5,h.r4,h.r7,h.r6,h.r1,h.r3,h.r2,h.r10
chris,0,1,0,0,0,0,0,1,0,0
bob,0,0,1,1,1,1,0,0,0,1
dick,1,0,1,0,0,0,0,0,1,1
ab,1,2,0,0,0,0,0,0,0,0
emma,0,0,1,0,1,0,1,0,1,0
Somewhat unrelated: does anyone know what the actual name for a matrix like this is? That would at least make googling this issue easier.
The data is originally in a dictionary like this:
dict = {}
dict['ab'] = ["h.r1","h.r2"]
dict['bob'] = ["h.r3","h.r4","h.r5","h.r6","h.r10"]
dict['chris'] = ["h.r2","h.r8"]
dict['dick'] = ["h.r1","h.r10","h.r9","h.r3"]
dict['emma'] = ["h.r3","h.r5","h.r7","h.r9"]
So, if a person has the variable in the dictionary there should be a 1 in the CSV; otherwise there should be a 0.
The code below builds the csv file, but with all zeros. How would I implement the logic of ones and zeros? I can do this in PHP/HTML, but being very new to Python I'm not catching on... Also, is there a library that may already have this function built in? (numpy perhaps?)
import csv
dict = {}
dict['ab'] = ["h.r1","h.r2"]
dict['bob'] = ["h.r3","h.r4","h.r5","h.r6","h.r10"]
dict['chris'] = ["h.r2","h.r8"]
dict['dick'] = ["h.r1","h.r10","h.r9","h.r3"]
dict['emma'] = ["h.r3","h.r5","h.r7","h.r9"]
mergedlist = []
for k,v in dict.items():
    mergedlist = mergedlist + v
mergedlist = list(set(mergedlist))
listlen = len(mergedlist)
mergedlist[0:0] = " "
thekeys = []
for k in dict.keys():
    thekeys.append(k)
thecsv = csv.writer(open("your.csv", 'w', newline=''))
thecsv.writerow(mergedlist)
for col in thekeys:
    thecsv.writerow([col] + ['0']*listlen) #does the logic go in here somehow??
Upvotes: 1
Views: 321
Reputation: 54380
First, you don't want to name your data dict, as that shadows the built-in dict type; change it to something like dict1:
#first generate the data body:
>>> ar=[[dict1[jtem].count(item) for item in ['h.r%s' %i for i in range(1, 11)]] for jtem in dict1] #are your values only in h.r1 to h.r10?
>>> ar
[[0, 1, 0, 0, 0, 0, 0, 1, 0, 0], \
[1, 0, 1, 0, 0, 0, 0, 0, 1, 1], \
[0, 0, 1, 1, 1, 1, 0, 0, 0, 1], \
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0], \
[0, 0, 1, 0, 1, 0, 1, 0, 1, 0]]
#Then insert the names:
>>> junk=[item.insert(0, jtem) for item, jtem in zip(ar, dict1.keys())]
>>> ar
[['chris', 0, 1, 0, 0, 0, 0, 0, 1, 0, 0], \
['dick', 1, 0, 1, 0, 0, 0, 0, 0, 1, 1], \
['bob', 0, 0, 1, 1, 1, 1, 0, 0, 0, 1], \
['ab', 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], \
['emma', 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]]
#Then insert the title:
>>> ar.insert(0, ['',]+['h.r%s' %i for i in range(1, 11)])
>>> ar
[['', 'h.r1', 'h.r2', 'h.r3', 'h.r4', 'h.r5', 'h.r6', 'h.r7', 'h.r8', 'h.r9', 'h.r10'], \
['chris', 0, 1, 0, 0, 0, 0, 0, 1, 0, 0], \
['dick', 1, 0, 1, 0, 0, 0, 0, 0, 1, 1], \
['bob', 0, 0, 1, 1, 1, 1, 0, 0, 0, 1], \
['ab', 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], \
['emma', 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]]
Then write it to a CSV file using your favorite method.
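For that last step, one possible sketch uses the standard csv module. The rows are hard-coded here to match the output shown above, write_rows is a hypothetical helper name, and io.StringIO stands in for a real file so the result is easy to inspect:

```python
import csv
import io

# Header row plus one row per person, as built in the steps above.
ar = [
    [''] + ['h.r%s' % i for i in range(1, 11)],
    ['chris', 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    ['dick', 1, 0, 1, 0, 0, 0, 0, 0, 1, 1],
    ['bob', 0, 0, 1, 1, 1, 1, 0, 0, 0, 1],
    ['ab', 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    ['emma', 0, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]

def write_rows(fileobj, rows):
    # Hypothetical helper: writes every row to an already-open text file object.
    csv.writer(fileobj).writerows(rows)

# In-memory demo; a real file works the same way.
buf = io.StringIO()
write_rows(buf, ar)
csv_text = buf.getvalue()
print(csv_text)
```

To write to disk instead, open the file with open('your.csv', 'w', newline='') in a with block and pass that file object to write_rows; in Python 3 the newline='' argument avoids blank lines between rows on Windows.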
Upvotes: 1
Reputation: 599836
I have no idea what most of that code is doing. It can be done in just a couple of lines:
import csv

all_values = set()
for person_values in data.values():
    all_values = all_values.union(person_values)
all_values_list = list(all_values)
with open("your.csv", 'w', newline='') as f:
    thecsv = csv.writer(f)
    thecsv.writerow([''] + all_values_list)  # blank first cell sits above the name column
    for name, values in data.items():
        row = [1 if i in values else 0 for i in all_values_list]
        thecsv.writerow([name] + row)
Note I've renamed your dictionary to data, as it's not a good idea to overwrite built-in function/class names.
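Because a set has no defined order, the columns can come out in a different order on each run. If a stable order matters, a sketch along these lines may help. It assumes every label has the form 'h.r<number>'; column_key and build_rows are hypothetical helper names, not part of any library:

```python
data = {
    'ab': ['h.r1', 'h.r2'],
    'bob': ['h.r3', 'h.r4', 'h.r5', 'h.r6', 'h.r10'],
    'chris': ['h.r2', 'h.r8'],
    'dick': ['h.r1', 'h.r10', 'h.r9', 'h.r3'],
    'emma': ['h.r3', 'h.r5', 'h.r7', 'h.r9'],
}

def column_key(label):
    # Assumption: labels look like 'h.r<number>'; sort by the numeric suffix
    # so that 'h.r10' comes after 'h.r9' rather than after 'h.r1'.
    return int(label.rsplit('r', 1)[1])

def build_rows(data):
    # Collect every distinct label, then fix the column order.
    columns = sorted(set().union(*data.values()), key=column_key)
    rows = [[''] + columns]  # blank cell above the name column
    for name, values in data.items():
        present = set(values)  # set membership is cheaper than list membership
        rows.append([name] + [1 if c in present else 0 for c in columns])
    return rows

rows = build_rows(data)
for row in rows:
    print(row)
```

The resulting rows can be handed to csv.writer(...).writerows(rows) as before.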
Upvotes: 3
Reputation: 3913
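I would do it completely differently, but using your code I'd change the ['0']*listlen to:
['1' if x in dict[col] else '0' for x in mergedlist[1:]]
The test has to run over the column labels, not the names in thekeys; mergedlist[1:] skips the blank cell you inserted at the front of the header row.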
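For reference, here is a sketch of how a per-row membership test could slot into the question's loop. The test runs over the column labels, since those are what the ones and zeros line up with. As assumptions for illustration: people stands in for the question's dict (to avoid shadowing the built-in), the columns are sorted so the output is repeatable, and io.StringIO stands in for the output file:

```python
import csv
import io

people = {
    'ab': ['h.r1', 'h.r2'],
    'bob': ['h.r3', 'h.r4', 'h.r5', 'h.r6', 'h.r10'],
    'chris': ['h.r2', 'h.r8'],
    'dick': ['h.r1', 'h.r10', 'h.r9', 'h.r3'],
    'emma': ['h.r3', 'h.r5', 'h.r7', 'h.r9'],
}

# Every distinct label, in plain string order (so 'h.r10' sorts before 'h.r2').
columns = sorted({label for labels in people.values() for label in labels})

out = io.StringIO()  # stand-in for open("your.csv", 'w', newline='')
thecsv = csv.writer(out)
thecsv.writerow([''] + columns)
for col in people:
    # '1' where this person has the label, '0' otherwise.
    thecsv.writerow([col] + ['1' if x in people[col] else '0' for x in columns])

csv_text = out.getvalue()
print(csv_text)
```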
Upvotes: 1