Joseph P Nardone

Reputation: 180

Numpy Savetxt Overwriting, Cannot Figure Out Where to Place Loop

I am writing a program that calculates correlations within my customers' data. I want to write the correlation values to a CSV file so I can analyze them further.

My program successfully loops through all the customers (12 months of data each) and calculates each customer's correlations for multiple arrangements. I can see this when I print the results.

However, when I try to save using numpy.savetxt, the file contains only the final values I calculate.

I think I have put my for loop in the wrong place. Where should it go? I have looked at other questions, but they didn't shed much light on this.

EDIT: As suggested, I tried aligning the write call with both the outer for loop and the inner for loop; both yielded the same result.

for x_customer in range(0,len(overalldata),12):

        for x in range(0,13,1):
                cust_months = overalldata[0:x,1]
                cust_balancenormal = overalldata[0:x,16]
                cust_demo_one = overalldata[0:x,2]
                cust_demo_two = overalldata[0:x,3]
                num_acct_A = overalldata[0:x,4]
                num_acct_B = overalldata[0:x,5]
                #Correlation Calculations
                demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
                demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
                demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
                demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
                demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
                demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]

                result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b)]

        result_correlation_combined = emptylist.append([result_correlation])
        cust_delete_list = [0,(x_customer),1]
        overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)

numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation

Upvotes: 0

Views: 692

Answers (2)

Joseph P Nardone

Reputation: 180

I took the advice of the other answer and corrected my code. I am now able to write to a file. However, I am having trouble with the number of iterations completed; I will post that as a separate question, since it is unrelated. Here is the solution that I used.

for x_customer in range(0,len(overalldata),12):

        for x in range(0,13,1):
                cust_months = overalldata[0:x,1]

                cust_balancenormal = overalldata[0:x,16]

                cust_demo_one = overalldata[0:x,2]
                cust_demo_two = overalldata[0:x,3]

                num_acct_A = overalldata[0:x,4]
                num_acct_B = overalldata[0:x,5]

                out_mark_channel_one = overalldata[0:x,25]
                out_service_channel_two = overalldata[0:x,26]
                out_mark_channel_three = overalldata[0:x,27]
                out_mark_channel_four = overalldata[0:x,28]


                #Correlation Calculations

                #Demographic to Balance Correlations
                demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
                demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]


                #Demographic to Account Number Correlations
                demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
                demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
                demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
                demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]

                #Marketing Response Channel One
                mark_one_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_one)[1, 0]
                mark_one_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_one)[1, 0]
                mark_one_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_one)[1, 0]

                #Marketing Response Channel Two
                mark_two_corr_acct_a = numpy.corrcoef(num_acct_A, out_service_channel_two)[1, 0]
                mark_two_corr_acct_b = numpy.corrcoef(num_acct_B, out_service_channel_two)[1, 0]
                mark_two_corr_balance = numpy.corrcoef(cust_balancenormal, out_service_channel_two)[1, 0]

                #Marketing Response Channel Three
                mark_three_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_three)[1, 0]
                mark_three_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_three)[1, 0]
                mark_three_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_three)[1, 0]

                #Marketing Response Channel Four
                mark_four_corr_acct_a = numpy.corrcoef(num_acct_A, out_mark_channel_four)[1, 0]
                mark_four_corr_acct_b = numpy.corrcoef(num_acct_B, out_mark_channel_four)[1, 0]
                mark_four_corr_balance = numpy.corrcoef(cust_balancenormal, out_mark_channel_four)[1, 0]


                #Result Correlations For Exporting to CSV of all Correlations
                result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b),(mark_one_corr_acct_a),(mark_one_corr_acct_b),(mark_one_corr_balance),
                                      (mark_two_corr_acct_a),(mark_two_corr_acct_b),(mark_two_corr_balance),(mark_three_corr_acct_a),(mark_three_corr_acct_b),(mark_three_corr_balance),(mark_four_corr_acct_a),(mark_four_corr_acct_b),
                                      (mark_four_corr_balance)]
                result_correlation_nan_nuetralized = numpy.nan_to_num(result_correlation)
                c.writerow(result_correlation)

        result_correlation_combined = emptylist.append([result_correlation])
        cust_delete_list = [0,x_customer,1]
        overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)
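
The code above calls c.writerow without showing how c was created. Presumably it is a csv.writer; that is an assumption, since the setup is not part of the post. A minimal sketch of that setup is below, with an in-memory buffer standing in for the output file and a made-up row of values. It also writes the NaN-cleaned array rather than the raw list, which may or may not be what was intended in the original.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for one row of correlations. NaN can show up,
# for example, when one of the inputs to corrcoef is constant.
result_correlation = [0.25, float('nan'), -0.5]

# In-memory buffer used here in place of a real file such as
# open('correlationoutput.csv', 'w', newline='').
buffer = io.StringIO()
c = csv.writer(buffer)

# nan_to_num returns a new array (NaN becomes 0.0 by default); it does not
# modify result_correlation, so the cleaned array is what gets written.
cleaned = np.nan_to_num(result_correlation)
c.writerow(cleaned.tolist())

written = buffer.getvalue()
```

Calling writerow inside the inner loop emits one CSV line per iteration, which is why this version no longer loses the earlier rows.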

Upvotes: 0

hpaulj

Reputation: 231605

This portion of the code is just sloppy:

                result_correlation = [(demo_one_corr_balance),...]

        result_correlation_combined = emptylist.append([result_correlation])
        cust_delete_list = [0,(x_customer),1]
        overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)

numpy.savetxt('correlationoutput.csv', numpy.column_stack(result_correlation), delimiter=',')
print result_correlation

You set result_correlation in the innermost loop, and then you use it in the final save and print. Obviously it will save and print only the result of the last iteration.

Meanwhile you append it to a list in the result_correlation_combined line, outside of the x loop, near the end of the x_customer loop. But you don't do anything with the list.
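
A related detail in that append line, shown as a small sketch with made-up values: list.append modifies the list in place and returns None, so the name on the left-hand side never holds the collected rows. The extra brackets around the row also add a level of nesting.

```python
emptylist = []
row = [0.1, 0.2]  # hypothetical correlation values

# Same shape as the line in the question: the return value of append
# is assigned, and the row is wrapped in an extra list.
result_correlation_combined = emptylist.append([row])

print(result_correlation_combined)  # the return value of append
print(emptylist)                    # the list that actually received the data
```

The accumulated data lives in emptylist, so that is the object to pass on to whatever does the saving.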

And finally in the x_customer loop you play with overalldata, but I don't see any further use.

Forget about the savetxt for now, and get the data collection straight.
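
One way to get the collection straight, as a sketch rather than a drop-in fix: build one row of correlations per customer, append each row to a list, and save once after the loop. The function name, the column indices (0, 1, 2), the random data, and the temp-directory output path are all assumptions made for illustration; the real data uses different columns.

```python
import os
import tempfile

import numpy as np


def correlations_per_customer(data, months=12):
    """Return one row of correlations per block of `months` rows."""
    rows = []
    for start in range(0, len(data), months):
        block = data[start:start + months]  # one customer's rows
        balance = block[:, 0]               # hypothetical column layout
        demo_one = block[:, 1]
        demo_two = block[:, 2]
        # Append inside the loop so every customer's row is kept.
        rows.append([
            np.corrcoef(balance, demo_one)[1, 0],
            np.corrcoef(balance, demo_two)[1, 0],
        ])
    return np.array(rows)


# Made-up data: 3 customers x 12 months, 3 columns.
rng = np.random.default_rng(0)
overalldata = rng.normal(size=(36, 3))

result = correlations_per_customer(overalldata)

# Save once, after all rows are collected. The path is illustrative.
out_path = os.path.join(tempfile.gettempdir(), 'correlationoutput.csv')
np.savetxt(out_path, result, delimiter=',')
```

Slicing by start index avoids deleting rows from overalldata while looping over it, which keeps the number of iterations predictable.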

Upvotes: 1
