clu

Reputation: 117

Nested loop over columns in a dataframe python

I have the following dataframe

print(df1)

        Date    start          end    delta d1   x_s    y_s      z_s    x_f      y_f    z_f
0   09/01/2017  09/01/2017  06/02/2017  28  28  0.989   0.945   0.626   0.191   0.932   0.280
1   10/01/2017  09/01/2017  06/02/2017  27  28  0.989   0.945   0.626   0.191   0.932   0.280
2   11/01/2017  09/01/2017  06/02/2017  26  28  0.989   0.945   0.626   0.191   0.932   0.280
3   12/01/2017  09/01/2017  06/02/2017  25  28  0.989   0.945   0.626   0.191   0.932   0.280
4   13/01/2017  09/01/2017  06/02/2017  24  28  0.989   0.945   0.626   0.191   0.932   0.280
5   14/01/2017  09/01/2017  06/02/2017  23  28  0.989   0.945   0.626   0.191   0.932   0.280
6   15/01/2017  09/01/2017  06/02/2017  22  28  0.989   0.945   0.626   0.191   0.932   0.280
7   16/01/2017  09/01/2017  06/02/2017  21  28  0.989   0.945   0.626   0.191   0.932   0.280
8   17/01/2017  09/01/2017  06/02/2017  20  28  0.989   0.945   0.626   0.191   0.932   0.280
9   18/01/2017  09/01/2017  06/02/2017  19  28  0.989   0.945   0.626   0.191   0.932   0.280

where df1['delta'] = df1['end'] - df1['Date'] and df1['d1'] = df1['end'] - df1['start']. I would like to create 3 new columns that show the interpolated values between the pairs (x_s, x_f), (y_s, y_f), (z_s, z_f).

I have tried the following code

def mapper (name):
     return name+'_i'

ss = list(df1[['x_s', 'y_s', 'z_s']])
fs = list(df1[['x_f', 'y_f', 'z_f' ]])
df2 = pd.DataFrame

for s in ss :
    for f in fs:
         df2[s] = df1[s] + (((df1[f] - df1[s])/df1['d1'])*df1['delta'])

df_conc = pd.concat((df1, df2_new), axis=1)

however when I try to run the nested loops I get the following error:

TypeError: 'type' object does not support item assignment

I wonder what I am doing wrong. I would greatly appreciate any hint or suggestion. Thanks a lot in advance!

second attempt:

ss = ('x', 'y', 'z') 

for s in ss: 
   df1[mapper(s)] = pd.Series((df1[s+'_s'] + ((df1[s+'_f'] - df1[s+'_s'])/(df1['d1']))*df1['delta']), name=mapper(s), index=df1.index)  

but I still do not get 3 new columns, one for each of the pairs (x_s, x_f), (y_s, y_f), (z_s, z_f).

Please let me know if you spot what I am doing wrong, thanks a lot in advance!

Upvotes: 2

Views: 2147

Answers (3)

clu

Reputation: 117

def mapper(name):
    # Build the output column name, e.g. 'x' -> 'x_i'
    return name + '_i'

ss = ('x', 'y', 'z')

for s in ss:
    # Pair each '<s>_s' column with its matching '<s>_f' column
    df1[mapper(s)] = df1[s + '_s'] + ((df1[s + '_f'] - df1[s + '_s']) / df1['d1']) * df1['delta']

Upvotes: 0

cardamom

Reputation: 7421

This should fix it:

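# Pair each start column with its matching finish column; a nested loop
# would overwrite every result with the one computed from the last f
for s, f in zip(ss, fs):
    df1[mapper(s)] = pd.Series(df1[s] + (((df1[f] - df1[s])/df1['d1'])*df1['delta']), name=mapper(s), index=df1.index)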

I think that does what you want. Drop the last concat line: it refers to df2_new, which is never defined. Passing index=df1.index makes sure the new column lines up with the rows of df1.
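As far as I can tell, the TypeError in the question comes from df2 = pd.DataFrame, which binds the class itself rather than an empty instance. Here is a small sketch that reproduces the message. The column name and values are only placeholders:

```python
import pandas as pd

df2 = pd.DataFrame              # no parentheses: this is the class, not a frame
try:
    df2["x_s"] = [1, 2, 3]      # item assignment on a class object fails
except TypeError as exc:
    message = str(exc)          # same message as in the question

df2 = pd.DataFrame()            # with parentheses: an actual empty frame
df2["x_s"] = [1, 2, 3]          # now the assignment works
```

Writing the new columns straight into df1, as in the loop above, avoids the need for df2 altogether.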

You might also need to check the .dtypes of your columns and, where needed, convert them with pd.to_datetime.

I ran the following:

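# The dates in the question are day-first (dd/mm/yyyy), so say so explicitly
df1.end = pd.to_datetime(df1.end, dayfirst=True)
df1.start = pd.to_datetime(df1.start, dayfirst=True)
df1.Date = pd.to_datetime(df1.Date, dayfirst=True)

# Turn the timedelta columns into plain numbers (seconds)
df1.delta = df1.delta / pd.offsets.Second(1)
df1.d1 = df1.d1 / pd.offsets.Second(1)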
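If delta and d1 still need to be built from the date columns, something along these lines might work. It is only a sketch on a two-row stand-in for df1. It assumes the dates are strings in dd/mm/yyyy form, as the sample output suggests, and that whole days are the unit you want:

```python
import pandas as pd

# Two-row stand-in for df1 (dates copied from the question)
df1 = pd.DataFrame({
    "Date":  ["09/01/2017", "10/01/2017"],
    "start": ["09/01/2017", "09/01/2017"],
    "end":   ["06/02/2017", "06/02/2017"],
})

# Parse with an explicit day-first format so 09/01 is 9 January
for col in ["Date", "start", "end"]:
    df1[col] = pd.to_datetime(df1[col], format="%d/%m/%Y")

# Subtracting datetimes gives timedeltas; .dt.days turns them into integers
df1["delta"] = (df1["end"] - df1["Date"]).dt.days
df1["d1"] = (df1["end"] - df1["start"]).dt.days
```

With integer day counts, the division in the interpolation formula is ordinary numeric division and no further unit conversion is needed.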

Upvotes: 1

Dan

Reputation: 45752

I don't think you should be looping. Just let numpy do this all for you in a vectorized manner.

ss = df1[['x_s', 'y_s', 'z_s']].values
fs = df1[['x_f', 'y_f', 'z_f']].values
# Same formula as in the question: start + (finish - start) / d1 * delta
ss2 = ss + ((fs - ss)/df1[['d1']].values)*df1[['delta']].values

Note: I'm sure you can get rid of some of the .values above, but this should illustrate the principle.
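To get the result back into the dataframe as three new columns, one option is sketched below. The small frame is a made-up stand-in for df1 with numeric delta and d1, and the names x_i, y_i, z_i are just the ones the question's mapper would produce:

```python
import pandas as pd

# Made-up stand-in for df1; delta and d1 are already plain numbers
df1 = pd.DataFrame({
    "delta": [28, 27, 14],
    "d1":    [28, 28, 28],
    "x_s": [0.989] * 3, "y_s": [0.945] * 3, "z_s": [0.626] * 3,
    "x_f": [0.191] * 3, "y_f": [0.932] * 3, "z_f": [0.280] * 3,
})

starts = df1[["x_s", "y_s", "z_s"]].to_numpy()      # shape (n, 3)
finishes = df1[["x_f", "y_f", "z_f"]].to_numpy()    # shape (n, 3)

# Column vector of shape (n, 1) so it broadcasts across the three pairs
weight = (df1["delta"] / df1["d1"]).to_numpy()[:, None]

# Formula from the question: start + (finish - start) / d1 * delta
interp = starts + (finishes - starts) * weight

# Attach the result as new columns, aligned on df1's index
new_cols = pd.DataFrame(interp, index=df1.index, columns=["x_i", "y_i", "z_i"])
df1 = df1.join(new_cols)
```

Because the column lists are written in the same order, each x_s, y_s, z_s column is paired with its own x_f, y_f, z_f column, which is what the nested loop was not doing.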

Upvotes: 1
