Jonas Palačionis
Jonas Palačionis

Reputation: 4842

Calculating index values based on previous calculations using the same calculation

I have a df:

pd.DataFrame(index = ['A','B','C'],
             columns = ['1','2','3','4'], 
             data = [[100,60,40,60],
                     [200,10,50,80],
                     [50, np.nan, np.nan, np.nan]])
        1           2           3           4        
A       100         60          40          60
B       200         10          50          80
C       50          

I would like to calculate the remaining C index values, but each calculation is dependent on the previous value like so:

        1           2           3           4        
A       100         60          40          60
B       200         10          50          80
C       50          A2+B2-C1    A3+B3-C2    A4+B4-C3

I checked this answer and tried the following:

new = [df.loc['C'].values]

for i in range(1, len(df.index)):
    new.append(new[i-1]*df.loc['A'].values[i]+df.loc['B'].values[i]-df.loc['C'].values[i-1])
df.loc['C'] = new

But I get :

ValueError: cannot set a row with mismatched columns

Also, the question and answers are quite outdated, maybe there is a new solution for these recursive functions inside pandas dataframe?

Upvotes: 1

Views: 248

Answers (1)

azro
azro

Reputation: 54168

Key is : print your variables to ensure they contains what you think


  • First is that new = [df.loc['C'].values] builds a list with one item being an array, you just want one list

  • Then if the loop you're using new[i-1] *, which isn't present in the schema above

  • you use df.loc['C'].values[i-1] but you don't update it (you save in a list) so you can't expect it to work

    • directly update the DF and use - df.loc['C'].values[i-1]
    • keep the temporaty list and use - new[i - 1]
  • you don't want to append, but overwrite the values (or you'd have need to start new with only one value


With a separate list

new = df.loc['C'].to_list()

for i in range(1, len(df.columns)):
    new[i] = df.loc['A'].values[i] + df.loc['B'].values[i] - new[i - 1]

Without a separate list

for i in range(1, len(df.columns)):
    df.iloc[2, i] = df.iloc[1, i] + df.iloc[0, i] - df.iloc[2, i - 1]

Upvotes: 2

Related Questions