Reputation: 180
I am creating a program that calculates correlations between my customers' data. I want to print the correlation values to a CSV so I can further analyze the data.
I have successfully gotten my program to loop through all the customers (12 months of data each) while calculating their individual correlations for multiple arrangements. I can see this if I print to the dialog.
However, when I try to save using numpy.savetxt, I only get the final values I calculate.
I think I have placed my for loop in the wrong place. Where should it go? I have looked at other questions, but they did not shed much light on it.
EDIT: I have tried aligning the write with both the outer for loop and the inner for loop, as suggested; both yielded the same result.
for x_customer in range(0,len(overalldata),12):
    for x in range(0,13,1):
        cust_months = overalldata[0:x,1]
        cust_balancenormal = overalldata[0:x,16]
        cust_demo_one = overalldata[0:x,2]
        cust_demo_two = overalldata[0:x,3]
        num_acct_A = overalldata[0:x,4]
        num_acct_B = overalldata[0:x,5]
        #Correlation Calculations
        demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
        demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
        demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
        demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
        demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
        demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]
        result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b)]
    result_correlation_combined = emptylist.append([result_correlation])
    cust_delete_list = [0,(x_customer),1]
    overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)
numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation
Upvotes: 0
Views: 692
Reputation: 180
I took the advice of the above poster and corrected my code. I am now able to write to a file. However, I am having trouble with the number of iterations completed; I will post that as a separate question since it is unrelated. Here is the solution that I used.
for x_customer in range(0,len(overalldata),12):
    for x in range(0,13,1):
        cust_months = overalldata[0:x,1]
        cust_balancenormal = overalldata[0:x,16]
        cust_demo_one = overalldata[0:x,2]
        cust_demo_two = overalldata[0:x,3]
        num_acct_A = overalldata[0:x,4]
        num_acct_B = overalldata[0:x,5]
        out_mark_channel_one = overalldata[0:x,25]
        out_service_channel_two = overalldata[0:x,26]
        out_mark_channel_three = overalldata[0:x,27]
        out_mark_channel_four = overalldata[0:x,28]
        #Correlation Calculations
        #Demographic to Balance Correlations
        demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
        demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
        #Demographic to Account Number Correlations
        demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
        demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
        demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
        demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]
        #Marketing Response Channel One
        mark_one_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_one)[1, 0]
        mark_one_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_one)[1, 0]
        mark_one_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_one)[1, 0]
        #Marketing Response Channel Two
        mark_two_corr_acct_a = numpy.corrcoef(num_acct_A, out_service_channel_two)[1, 0]
        mark_two_corr_acct_b = numpy.corrcoef(num_acct_B, out_service_channel_two)[1, 0]
        mark_two_corr_balance = numpy.corrcoef(cust_balancenormal, out_service_channel_two)[1, 0]
        #Marketing Response Channel Three
        mark_three_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_three)[1, 0]
        mark_three_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_three)[1, 0]
        mark_three_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_three)[1, 0]
        #Marketing Response Channel Four
        mark_four_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_four)[1, 0]
        mark_four_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_four)[1, 0]
        mark_four_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_four)[1, 0]
        #Result Correlations For Exporting to CSV of all Correlations
        result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b),(mark_one_corr_acct_a),(mark_one_corr_acct_b),(mark_one_corr_balance),
                              (mark_two_corr_acct_a),(mark_two_corr_acct_b),(mark_two_corr_balance),(mark_three_corr_acct_a),(mark_three_corr_acct_b),(mark_three_corr_balance),(mark_four_corr_acct_a),(mark_four_corr_acct_b),
                              (mark_four_corr_balance)]
        result_correlation_nan_nuetralized = numpy.nan_to_num(result_correlation)
        c.writerow(result_correlation)
    result_correlation_combined = emptylist.append([result_correlation])
    cust_delete_list = [0,x_customer,1]
    overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)
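One detail worth checking in the listing above: as written, the output of numpy.nan_to_num is stored in result_correlation_nan_nuetralized, but c.writerow receives the original result_correlation, so any NaN values would still reach the file. The listing also does not show how c is created. The following is only a minimal sketch of that step, assuming c is a csv.writer on a file opened in Python 3 style, and using made-up placeholder values instead of real correlations:

```python
import csv
import numpy

# Made-up placeholder values standing in for one row of correlations.
result_correlation = [0.5, float('nan'), -0.25]

# nan_to_num returns a new array; the input list is left unchanged.
cleaned = numpy.nan_to_num(result_correlation)

# Assumption: c is a csv.writer; newline='' is the Python 3 idiom for csv files.
with open('correlationoutput.csv', 'w', newline='') as f:
    c = csv.writer(f)
    c.writerow(cleaned.tolist())  # write the cleaned row, not the original list
```

Passing the cleaned array (or its list form) to writerow is what makes the NaN replacement show up in the CSV.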
Upvotes: 0
Reputation: 231605
This portion of the code is just sloppy:
result_correlation = [(demo_one_corr_balance),...]
result_correlation_combined = emptylist.append([result_correlation])
cust_delete_list = [0,(x_customer),1]
overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)
numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation
You set result_correlation in the innermost loop, and then you use it in the final save and print. Obviously it will print the result of the last loop.
Meanwhile you append it to result_correlation_combined, outside of the x loop, near the end of the x_customer loop. But you don't do anything with the list.
And finally in the x_customer loop you play with overalldata, but I don't see any further use.
Forget about the savetxt for now, and get the data collection straight.
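To make "get the data collection straight" concrete, here is a rough sketch of one way to do it: build one row per customer, append it to a list, and call savetxt once after the loops. Note also that list.append returns None, so assigning its result to result_correlation_combined throws the list away; the rows only accumulate in the list being appended to. The sketch assumes each customer occupies 12 consecutive rows and reuses column indices 16, 2 and 3 from the question. The function name collect_correlations and the random test data are invented for illustration, and only two of the correlations are computed to keep it short.

```python
import numpy

def collect_correlations(overalldata, months_per_customer=12):
    """Return an array with one row of correlations per customer block."""
    rows = []
    for start in range(0, len(overalldata), months_per_customer):
        # Assumption: rows start .. start+11 all belong to one customer.
        block = overalldata[start:start + months_per_customer]
        balance = block[:, 16]
        demo_one = block[:, 2]
        demo_two = block[:, 3]
        # append mutates rows in place; do not assign its return value.
        rows.append([
            numpy.corrcoef(balance, demo_one)[1, 0],
            numpy.corrcoef(balance, demo_two)[1, 0],
        ])
    return numpy.array(rows)

# Made-up data: 3 customers x 12 months, 17 columns of random numbers.
rng = numpy.random.RandomState(0)
overalldata = rng.rand(36, 17)

result = collect_correlations(overalldata)

# Save once, after all rows have been collected.
numpy.savetxt('correlationoutput.csv', result, delimiter=',')
```

With the save outside both loops and fed from the accumulated rows, the file gets one line per customer instead of only the last set of values.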
Upvotes: 1