khouzam

Reputation: 273

Perform row multiplication in data frame

I want to perform the following operation in pandas without converting my DataFrame to an array.

date      A      B     C     D     E    ...
date1     0,03  0,02  0,01   0,01 0,234
date2     0,03  0,02  0,01   0,01 0,234
date3     0,03  0,02  0,01   0,01 0,234
date4     0,03  0,02  0,01   0,01 0,234

The numbers are not all the same and have many decimal places. I want to create another DataFrame like the following:

date      value      
date1     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date2     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date3     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date4     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)

Some cells are null, and I want to skip those values. I would show what I have tried, but all I did was convert to an array and perform the operation there; that way I lose my data and can't skip the null values.

Upvotes: 2

Views: 56

Answers (1)

jezrael

Reputation: 862511

Move the date column into the index with DataFrame.set_index if necessary, then add 1 to each value and take the row-wise product with DataFrame.prod:

# if the values are strings with decimal commas, replace ',' with '.' first so they can be converted to floats
#df = df.replace(',','.', regex=True)
df1 = df.set_index('date').astype(float).add(1).prod(axis=1).reset_index(name='value')
print (df1)
    date     value
0  date1  1.322499
1  date2  1.322499
2  date3  1.322499
3  date4  1.322499
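
For anyone who wants to reproduce the steps above from scratch, here is a minimal self-contained sketch. The sample frame is hypothetical (two rows rebuilt from the question's table, stored as decimal-comma strings), and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the question: decimal-comma strings.
df = pd.DataFrame({
    "date": ["date1", "date2"],
    "A": ["0,03", "0,03"],
    "B": ["0,02", "0,02"],
    "C": ["0,01", "0,01"],
    "D": ["0,01", "0,01"],
    "E": ["0,234", "0,234"],
})

# Turn "0,03" into "0.03" so the strings can be cast to float.
df = df.replace(",", ".", regex=True)

# Index by date, cast to float, add 1 to every cell, multiply across each row.
df1 = (
    df.set_index("date")
      .astype(float)
      .add(1)
      .prod(axis=1)
      .reset_index(name="value")
)
print(df1)
```

Each row works out to 1.03 * 1.02 * 1.01 * 1.01 * 1.234, which is about 1.322499, and the dates stay attached to the result because they live in the index during the calculation.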

Test with a missing value (DataFrame.prod skips NaN by default, skipna=True):

print (df)
    date     A     B     C     D      E
0  date1  0,03  0,02  0,01  0,01    NaN
1  date2  0,03  0,02  0,01  0,01  0,234
2  date3  0,03  0,02  0,01  0,01  0,234
3  date4  0,03  0,02  0,01  0,01  0,234

df = df.replace(',','.', regex=True)

print (df.set_index('date').astype(float).add(1))
          A     B     C     D      E
date                                
date1  1.03  1.02  1.01  1.01    NaN
date2  1.03  1.02  1.01  1.01  1.234
date3  1.03  1.02  1.01  1.01  1.234
date4  1.03  1.02  1.01  1.01  1.234

df1 = df.set_index('date').astype(float).add(1).prod(axis=1).reset_index(name='value')
print (df1)
    date     value
0  date1  1.071717
1  date2  1.322499
2  date3  1.322499
3  date4  1.322499
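
One edge case that may matter here: if every value in a row is missing, prod returns 1.0 for that row by default, because the product of nothing is 1. The sketch below compares the default with skipna=False and min_count=1 on a small hypothetical numeric frame (the names rates, growth, skipped, strict and guarded are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data: one partly missing row and one fully missing row.
rates = pd.DataFrame(
    {"A": [0.03, 0.03, np.nan], "B": [0.02, np.nan, np.nan]},
    index=["date1", "date2", "date3"],
)
growth = rates.add(1)

skipped = growth.prod(axis=1)               # default: NaN cells are ignored
strict = growth.prod(axis=1, skipna=False)  # any NaN in a row makes the result NaN
guarded = growth.prod(axis=1, min_count=1)  # an all-NaN row gives NaN instead of 1.0

print(pd.DataFrame({"skipped": skipped, "strict": strict, "guarded": guarded}))
```

If a fully empty row should show up as missing rather than as 1.0, min_count=1 is probably the variant you want.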

Upvotes: 3
