Reputation: 83
I have a CSV file containing the values of commodities traded between countries, something like this:
Country Comm Value
GER 1 200
GER 2 300
GER 45 354
USA 2 100
USA 85 500
UK 2 240
UK 85 900
I have created a matrix from this data. In this matrix, rows are countries, columns are commodity codes, and each element is the value of trade. There are 97 commodities, and I've used the following code to create the matrix:
rfile = open('file path','r')
rfile.next()  # skip the header line
# Map each country to its list of (commodity code, value) pairs
dic_c1_products = {}
for i in rfile :
    lns = i.strip().split(',')
    c1 = lns[0]
    p = lns[1]
    value= lns[2]
    if not dic_c1_products.has_key(c1):
        dic_c1_products[c1] = [(p,value),]
    else:
        dic_c1_products[c1].append((p,value))
product_count = 97
c1_list = dic_c1_products.keys()
# One row per country, one column per commodity code (column 0 stays unused)
matrix_c1_products = [[0 for col in range(int(product_count)+1)] for row in range(len(c1_list))]
for c1 in dic_c1_products:
    for p, v in dic_c1_products[c1]:
        matrix_c1_products[c1_list.index(c1)][int(p)] = int(v)
print 'Matrix Done'
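The code above is Python 2 (`rfile.next()`, `dict.has_key` and the `print` statement do not exist in Python 3). For readers on Python 3, a minimal sketch of the same matrix construction might look like the following. It assumes comma-separated rows with a header line; the inline `sample` text and the name `build_matrix` are hypothetical and used only for illustration.

```python
import csv
import io

# Hypothetical inline sample standing in for the real CSV file.
sample = """Country,Comm,Value
GER,1,200
GER,2,300
GER,45,354
USA,2,100
USA,85,500
UK,2,240
UK,85,900
"""

def build_matrix(lines, product_count=97):
    """Return (countries, matrix); matrix[r][c] is the value country r trades in commodity c."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    countries = []  # row order follows first appearance in the file
    matrix = []
    for country, comm, value in reader:
        if country not in countries:
            countries.append(country)
            # Column 0 stays unused so that a commodity code is its own column index.
            matrix.append([0] * (product_count + 1))
        matrix[countries.index(country)][int(comm)] = int(value)
    return countries, matrix

countries, matrix = build_matrix(io.StringIO(sample))
print(countries)
print(matrix[0][45])
```

Keeping the country order in a plain list avoids relying on dictionary key order when looking up a row index.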
Now I want to calculate an index score for each pair of countries (the pair score is: total trade in common over total trade of each country). The created matrix has a form like this:
Countries Commodity1 Commodity2 Commodity45 Commodity85
GER 200 300 45 0
USA 0 100 0 500
UK 0 240 0 900
First I want to sum the values of the SAME commodities that both countries trade, and then divide this amount by the TOTAL trade of those two countries. For example, for GER-USA, both trade commodity number 2, so I want the sum of these common commodities (300+100) over the total trade of Germany and the United States: (First row: 200+300+354) + (Second row: 100+500).
In simple words, considering the matrix:
First, I want to calculate the total values for the GER and USA rows.
Second, I want to calculate the total value of the common commodities that are being traded.
Third, I want to divide the value from stage two by the value from stage one.
To do this, I have written the following code:
for i in range(len(matrix_c1_products)):
    for j in range(i, len(matrix_c1_products)):
        dividend=sum([matrix_c1_products[i]])+sum([matrix_c1_products[j]])
        for k in matrix_c1_products[i]:
            for l in matrix_c1_products[j]:
                # print k,l
                if int(k)==int(0):
                    pass
                if int(l)==int(0):
                    pass
                else:
                    commonone.append(k)
                    commontwo.append(l)
        divisor=sum(commonone)+sum(commontwo)
        shares=int(divisor/dividend)
        print shares, divisor, dividend
But there is a problem with the commonone list. I intend to drop the zeros from the two rows and add up the remaining values, but because of the nested loop the same number is appended to the list repeatedly and the results are not correct. Any help would be appreciated.
Upvotes: 1
Views: 785
Reputation: 107347
As a more Pythonic approach, you can first create a dictionary of your rows with the following dict comprehension:
chart_dict={i[0]:map(int,i[1:]) for i in spamreader}
{' USA': [0, 100, 0, 500], ' GER': [200, 300, 45, 0], ' UK': [0, 240, 0, 900]}
Then create your pairs with itertools.combinations:
capirs= list(combinations(next(z),2))
[(' GER', ' USA'), (' GER', ' UK'), (' USA', ' UK')]
And then calculate the sum of each commodity across all countries:
row_sums=[sum(map(int,i)) for i in z]
[200, 640, 45, 1400]
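In these snippets `z` is the transposed table: unpacking the rows into `izip` turns rows into columns, so the first item of `z` holds the country names and every later item holds one commodity across all countries. That is why `row_sums` contains per-commodity totals. A small Python 3 sketch of the three steps, with the rows written inline instead of being read from a file (an assumption for illustration; `zip` stands in for Python 2's `izip`, and the names `columns` and `commodity_sums` are hypothetical):

```python
from itertools import combinations

# Rows as csv.reader would yield them: country name, then one value per commodity.
rows = [
    ['GER', '200', '300', '45', '0'],
    ['USA', '0', '100', '0', '500'],
    ['UK', '0', '240', '0', '900'],
]

# Country name -> list of integer trade values.
chart_dict = {row[0]: list(map(int, row[1:])) for row in rows}

columns = zip(*rows)                  # transpose: rows become columns
names = next(columns)                 # first column holds the country names
pairs = list(combinations(names, 2))  # every unordered pair of countries
commodity_sums = [sum(map(int, col)) for col in columns]  # remaining columns

print(pairs)           # [('GER', 'USA'), ('GER', 'UK'), ('USA', 'UK')]
print(commodity_sums)  # [200, 640, 45, 1400]
```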
Finally, you can loop over your pairs and calculate your expected result:
import csv
from itertools import combinations,izip
commodities=['Commodity1' ,'Commodity2', 'Commodity45' ,'Commodity85']
with open('ex.csv', 'rb') as csvfile:
    spamreader = list(csv.reader(csvfile, delimiter=','))
    chart_dict={i[0]:map(int,i[1:]) for i in spamreader}
    z=izip(*spamreader)
    capirs= list(combinations(next(z),2))
    row_sums=[sum(map(int,i)) for i in z]
for i,j in capirs:
    for index,com in enumerate(commodities):
        print i,j,com,float(chart_dict[i][index]+chart_dict[j][index])/row_sums[index]
Result :
GER USA Commodity1 1.0
GER USA Commodity2 0.625
GER USA Commodity45 1.0
GER USA Commodity85 0.357142857143
GER UK Commodity1 1.0
GER UK Commodity2 0.84375
GER UK Commodity45 1.0
GER UK Commodity85 0.642857142857
USA UK Commodity1 0.0
USA UK Commodity2 0.53125
USA UK Commodity45 0.0
USA UK Commodity85 1.0
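The figures above are shares per commodity. If a single score per country pair is wanted, as the question describes (value of the commodities both countries trade, divided by the two countries' total trade), one possible Python 3 sketch follows. The function name `pair_score` and the inline data are assumptions for illustration, and "common" is taken to mean that both countries have a non-zero value for the commodity.

```python
from itertools import combinations

# Same matrix values as in the dictionary shown earlier in this answer.
chart_dict = {
    'GER': [200, 300, 45, 0],
    'USA': [0, 100, 0, 500],
    'UK': [0, 240, 0, 900],
}

def pair_score(row_a, row_b):
    """Common trade of two countries divided by their combined total trade."""
    # zip walks both rows column by column, so each commodity is compared
    # exactly once (no nested loop, hence no repeated values).
    common = sum(a + b for a, b in zip(row_a, row_b) if a and b)
    total = sum(row_a) + sum(row_b)
    return common / total if total else 0.0

for first, second in combinations(chart_dict, 2):
    score = pair_score(chart_dict[first], chart_dict[second])
    print(first, second, round(score, 4))
```

With these values GER-USA works out to 400 / 1145, since only Commodity2 is traded by both. Pairing the columns with `zip` is what avoids the repeated entries that the nested `for k` / `for l` loops in the question produce.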
Upvotes: 2