Devil

Reputation: 61

How to retrieve a column from each DataFrame generated within a loop in Python?

I have the output below, which my code produces inside a while loop. Here I enter 3, so it creates 3 different DataFrames with different values.

Enter the Number to iter:3

             Paid        CDF     Ultimate    Reserves
OP1    3901463.0        NaN          NaN         NaN 
OP2    5339085.0   1.000000  5339085.000         0.0 
OP3    4909315.0   1.085767  5331516.090    422201.1 
OP4    4588268.0   1.136096  5212272.448    624004.4 
OP5    3873311.0   1.238680  4799032.329    925721.3 
OP6    3691712.0   1.350145  4983811.200   1292099.2 
OP7    3483130.0   1.602168  5579974.260   2096844.3
OP8    2864498.0   2.334476  6685738.332   3821240.3
OP9    1363294.0   4.237940  5777639.972   4414346.0
OP10    344014.0  15.204053  5230388.856   4886374.9
Total        NaN        NaN          NaN  18482831.5


           


             Paid        CDF     Ultimate    Reserves
OP1    3901463.0        NaN          NaN         NaN
OP2    5339085.0   1.000000  5339085.000         0.0
OP3    4909315.0   1.090000  5351153.350    441838.4
OP4    4588268.0   1.137559  5221448.984    633181.0
OP5    3873311.0   1.231368  4768045.841    894734.8
OP6    3691712.0   1.331933  4917360.384   1225648.4
OP7    3483130.0   1.563703  5447615.320   1964485.3
OP8    2864498.0   2.318600  6642770.862   3778272.9
OP9    1363294.0   4.234960  5773550.090   4410256.1
OP10    344014.0  16.958969  5834133.426   5490119.4
Total        NaN        NaN          NaN  18838536.3


            


             Paid        CDF     Ultimate    Reserves
OP1    3901463.0        NaN          NaN         NaN
OP2    5339085.0   1.000000  5339085.000         0.0
OP3    4909315.0   1.072698  5267694.995    358380.0
OP4    4588268.0   1.130229  5184742.840    596474.8
OP5    3873311.0   1.208164  4678959.688    805648.7
OP6    3691712.0   1.267187  4677399.104    985687.1
OP7    3483130.0   1.497767  5217728.740   1734598.7
OP8    2864498.0   2.229342  6384966.042   3520468.0
OP9    1363294.0   4.219405  5751737.386   4388443.4
OP10    344014.0  16.036065  5516608.504   5172594.5
Total        NaN        NaN          NaN  17562295.2

 

Using the Reserves column of each DataFrame above, I have to generate the DataFrame below, with one row per simulation (Simulation1, Simulation2, and so on), up to the number of iterations entered by the user.

             OP1  OP2       OP3       OP4       OP5        OP6        OP7        OP8        OP9       OP10       Total
Simulation1  NaN  0.0  422201.1  624004.4  925721.3  1292099.2  2096844.3  3821240.3  4414346.0  4886374.9  18482831.5
Simulation2  NaN  0.0  441838.4  633181.0  894734.8  1225648.4  1964485.3  3778272.9  4410256.1  5490119.4  18838536.3
Simulation3  NaN  0.0  358380.0  596474.8  805648.7   985687.1  1734598.7  3520468.0  4388443.4  5172594.5  17562295.2

I have the code below:

                itno=int(input("Enter the Number to iter:"))
                iter_count=0
                while (iter_count < itno):
                
                        randomdf = scaledDF.copy()
                        choices = randomdf.values[~pd.isnull(randomdf.values)] 
                        randomdf = randomdf.applymap(lambda x: np.random.choice(choices) if not pd.isnull(x) else x)     
                        #print(randomdf,"\n\n") 
                                

                        cumdf = CumulativePaidTriangledf.iloc[:, :-1][:-2].copy()
                        ldfdf = LDFTriangledf.iloc[:, :-1].copy()
                        ResampledDF = randomdf.copy()
                        for colname4, col4 in ResampledDF.iteritems():
                                ResampledDF[colname4] = (ResampledDF[colname4] * (Variencedf[colname4][-1]/(cumdf[colname4]**0.5)))+ldfdf[colname4][-1]
                        #print(ResampledDF,"\n\n")
                      

                #SUMPRODUCT:
                        sumPro = ResampledDF.copy()
                        #cumdf = cumdf.iloc[:, :-1]
                        for colname5,col5 in sumPro.iteritems():
                                sumPro[colname5] = (sumPro[colname5].round(2))*cumdf[colname5]
                        sumPro = sumPro.append(pd.Series(sumPro.sum(), name='SUMPRODUCT'))
                        #print(sumPro)

                #SUM(OFFSET):
                        sumOff = cumdf.apply(lambda x: x.iloc[:cumdf.index.get_loc(x.last_valid_index())].sum())
                        #print(sumOff)

                #Weighted avg:
                        Weighted_avg = sumPro.loc['SUMPRODUCT']/sumOff
                        #print(Weighted_avg)

                        ResampledDF = ResampledDF.append(pd.Series(Weighted_avg, name='Weighted Avg'))
                        #print(ResampledDF,"\n\n")

                        '''for colname6,col6 in ResampledDF.iteritems():
                        ResampledDF[colname6] = ResampledDF[colname6].replace({'0':np.nan, 0:np.nan})
                        print(ResampledDF)'''

                        ResampledDF.loc['Weighted Avg'] = ResampledDF.loc['Weighted Avg'].replace(0, 1)
                        
                        c = ResampledDF.iloc[-1][::-1].cumprod()[::-1]
                        ResampledDF = ResampledDF.append(pd.Series(c,name='CDF'))
                        #print("\n\n",ResampledDF)


                #Getting Calculation of ultimates:
                        s = CumulativePaidTriangledf.iloc[:, :][:-2].copy()
                        ultiCalc = pd.DataFrame()
                        ultiCalc['Paid']= s['Total'] 
                        ultiCalc['CDF'] = np.flip(ResampledDF.loc['CDF'].values)
                        ultiCalc['Ultimate'] = ultiCalc['Paid']*(ultiCalc['CDF']).round(3)
                        ultiCalc['Reserves'] = (ultiCalc['Ultimate'].round(1))-ultiCalc['Paid']
                        ultiCalc.loc['Total'] = pd.Series(ultiCalc['Reserves'].sum(), index = ['Reserves']).round(2)
                        print("\n\n",ultiCalc)
                        iter_count+=1
                     
                        

                #Getting Simulations:
                        simulationDf = pd.DataFrame(columns=['OP1','OP2','OP3','OP4','OP5','OP6','OP7','OP8','OP9','OP10','Total'])
                        simulationDf.loc['Simulation'] = ultiCalc['Reserves']                                              
                        print("\n\n",simulationDf) 

Current Output:

Simulation1            NaN
Simulation2            0.0
Simulation3       353470.7
Simulation4       559768.7
Simulation5       859875.0
Simulation6      1162889.3
Simulation7      1828643.2
Simulation8      3958736.2
Simulation9      4464787.9
Simulation10     5224196.6
Simulation11    18412367.6
Simulation12           NaN
Simulation13           0.0
Simulation14      402563.8
Simulation15      669887.1
Simulation16      883114.9
Simulation17     1185039.6
Simulation18     1859991.4
Simulation19     3511874.5
Simulation20     3875844.8
Simulation21     4481126.4
Simulation22    16869442.5

Upvotes: 1

Views: 107

Answers (1)

jezrael

Reputation: 862481

Use a list comprehension to loop over the list of DataFrames, select the Reserves column from each one, and combine the results with the DataFrame constructor. Finally, set the index if necessary:

dfs = [df1, df2, df3]

df = pd.DataFrame([x['Reserves'] for x in dfs]).reset_index(drop=True)
df.index = 'Simulation' + (df.index + 1).astype(str)
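
As a minimal, self-contained sketch of the same idea: the three small frames below and their values are made up for illustration and are not the data from the question. Each Reserves Series becomes one row, and its index labels become the columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for three simulated results (values are invented)
idx = ['OP1', 'OP2', 'OP3', 'Total']
df1 = pd.DataFrame({'Paid': [10.0, 20.0, 30.0, np.nan],
                    'Reserves': [np.nan, 0.0, 5.0, 5.0]}, index=idx)
df2 = pd.DataFrame({'Paid': [10.0, 20.0, 30.0, np.nan],
                    'Reserves': [np.nan, 0.0, 7.0, 7.0]}, index=idx)
df3 = pd.DataFrame({'Paid': [10.0, 20.0, 30.0, np.nan],
                    'Reserves': [np.nan, 0.0, 3.0, 3.0]}, index=idx)

dfs = [df1, df2, df3]

# A list of Series passed to the constructor yields one row per Series
result = pd.DataFrame([x['Reserves'] for x in dfs]).reset_index(drop=True)

# Replace the 0, 1, 2 index with Simulation1, Simulation2, Simulation3
result.index = 'Simulation' + (result.index + 1).astype(str)
print(result)
```

The call to reset_index(drop=True) is there because every row would otherwise be labelled Reserves, the name of the source Series.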

If the DataFrames are generated in a loop, collect the Reserves column in a list on each iteration, as in this pseudocode:

# create the list outside the loop
dfs = []
iter_count = 0
while iter_count < itno:

    randomdf = scaledDF.copy()
    choices = randomdf.values[~pd.isnull(randomdf.values)]
    randomdf = randomdf.applymap(lambda x: np.random.choice(choices) if not pd.isnull(x) else x)
    #print(randomdf,"\n\n")
    ....
    ....

    # Getting Calculation of ultimates:
    s = CumulativePaidTriangledf.iloc[:, :][:-2].copy()
    ultiCalc = pd.DataFrame()
    ultiCalc['Paid'] = s['Total']
    ultiCalc['CDF'] = np.flip(ResampledDF.loc['CDF'].values)
    ultiCalc['Ultimate'] = ultiCalc['Paid'] * (ultiCalc['CDF']).round(3)
    ultiCalc['Reserves'] = (ultiCalc['Ultimate'].round(1)) - ultiCalc['Paid']
    ultiCalc.loc['Total'] = pd.Series(ultiCalc['Reserves'].sum(), index=['Reserves']).round(2)
    print("\n\n", ultiCalc)
    iter_count += 1

    # append this iteration's Reserves column inside the loop
    dfs.append(ultiCalc['Reserves'])

Then, outside the loop, join the collected columns together:

df = pd.DataFrame(dfs).reset_index(drop=True)
df.index = 'Simulation' + (df.index + 1).astype(str)
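
For a version that can be run end to end, here is a rough sketch of the collect-then-combine pattern. It assumes a hypothetical simulate_once function in place of the real loop body, with invented numbers, and uses pd.concat with keys as one alternative way to label the rows. It is not the asker's actual calculation.

```python
import numpy as np
import pandas as pd

def simulate_once(rng):
    """Hypothetical stand-in for one pass of the loop; returns an ultiCalc-like frame."""
    periods = ['OP1', 'OP2', 'OP3']
    paid = pd.Series([100.0, 80.0, 50.0], index=periods)
    # Invented development factors; only the last two vary between runs
    cdf = pd.Series([1.0, 1.0 + rng.random(), 2.0 + rng.random()], index=periods)
    out = pd.DataFrame({'Paid': paid, 'CDF': cdf})
    out['Ultimate'] = out['Paid'] * out['CDF']
    out['Reserves'] = out['Ultimate'] - out['Paid']
    # Add a Total row that holds only the sum of the reserves
    out.loc['Total', 'Reserves'] = out['Reserves'].sum()
    return out

rng = np.random.default_rng(0)
n_iter = 3  # stands in for the user-supplied itno

# Collect one Reserves Series per iteration; build the frame only once, afterwards
reserves = []
for _ in range(n_iter):
    reserves.append(simulate_once(rng)['Reserves'])

labels = [f'Simulation{i}' for i in range(1, n_iter + 1)]
# keys become column labels under axis=1; transposing turns them into row labels
simulations = pd.concat(reserves, axis=1, keys=labels).T
print(simulations)
```

Building the combined frame after the loop, rather than recreating it on every iteration, is what gives one row per simulation instead of a single overwritten row.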
                 

Upvotes: 2
