Reputation: 207
I have a dataframe in which I have to perform some operations. I got it everything all right like this:
ID Value Date Date_diff_cumsum Val Weight
1 0.000000 2017-02-13 20:54:00 0.0 0.000000 nan
1 0.029598 2017-02-13 21:02:00 8.0 0.029598 nan
1 0.273000 2017-02-13 22:33:00 99.0 0.273000 nan
1 0.153000 2017-02-13 23:24:00 150.0 0.15300 nan
I have another dataset in which I have the weights, like this:
ID Value
1 78.0
2 75.0
3 83.0
4 60.0
And I would like to fill my original dataframe's weigth columns with the repetition of thr weight of each ID, like:
ID Value Date Date_diff_cumsum Val Weight
1 0.000000 2017-02-13 20:54:00 0.0 0.000000 78.0
1 0.029598 2017-02-13 21:02:00 8.0 0.029598 78.0
1 0.273000 2017-02-13 22:33:00 99.0 0.273000 78.0
1 0.153000 2017-02-13 23:24:00 150.0 0.15300 78.0
... ... ... ... ... ...
4 .... ..... .... .... ... 60.0
4 .... ..... .... .... ... 60.0
That's because I need to calculate with this formula:
That's my code:
df = df[['ID','Value', 'Date']]
df = df.sort_values(by=['Date'])
df['Date_diff_cumsum'] = df.groupby('ID').Date.diff().dt.seconds / 60.0
df['Date_diff_cumsum'] =
df.groupby('ID').Date_diff_cumsum.cumsum().fillna(0)
df['TempVal'] = df.groupby('ID')['Value'].transform(lambda x:(x-
x.iloc[0]*1000))
How can I do this operation of adding the repetition of the weigth from the second dataframe to the first one? Is there a more efficient way? Because I need to calculate the final result with the same way, but with 3 other dataframes with different names but similar values, for each ID, like:
score = df1[(Val*1000)/(weight*Date_diff_cumsum)]+
df2(Val*1000)/(weight*Date_diff_cumsum)]+...
Thank you very much
edit: now it is working, but whenever I try to find the final dataframe:
score = df1.TempVal + df2.TempVal + df3.TempVal
I get an empty dataframe full with nans. Do you know why? I need to print all the tempVal for each ID and to plot them
Upvotes: 4
Views: 6415
Reputation: 38415
You can use map to map values from df2 to Weight. Since you have already calculated date_diff_cumsum by grouping over ID, you can calculate tempval directly from df1,
df1['Weight'] = df1['ID'].map(df2.set_index('ID')['Value'])
df1['TempVal'] = df1['Value']*1000/(df1['Weight'] * df1['Date_diff_cumsum'])
ID Value Date Date_diff_cumsum Val Weight TempVal
0 1 0.000000 2017-02-13 20:54:00 0.0 0.000000 78.0 NaN
1 1 0.029598 2017-02-13 21:02:00 8.0 0.029598 78.0 0.047433
2 1 0.273000 2017-02-13 22:33:00 99.0 0.273000 78.0 0.035354
3 1 0.153000 2017-02-13 23:24:00 150.0 0.153000 78.0 0.013077
Upvotes: 5
Reputation: 4660
Just map the weights with:
df["Weight"] = df["ID"].map(weights["Value"])
Where weights
is your other dataset (and where you also need to set ID as the index of that dataset).
Upvotes: 0