Reputation: 536
Assume I have two dataframes, df1 and df2, described as follows. See code below that creates each of these dfs.
df1
df2
My Objective
Any help would be greatly appreciated. Thank you!
Code
# Import dependencies
import pandas as pd
import numpy as np
# Create df1
df1 = pd.DataFrame(np.array([['2016-05-03', 1651,2653,3655,4658,5655],
['2017-05-29', 1751,2752,3754,4755, 5759],
['2018-08-22', 1889, 2882,3887, 4884, 5882],
['2019-06-28', 1966, 2965, 3966, 4960, 5963],
['2018-11-15', 1811, 2811, 3811, 4811, 5811],
['2019-12-31', 1912, 2912, 3912, 4912, 5912],
['2016-07-05', 1672, 2678, 3679, 4672, 5674],
['2017-05-15', 1755, 2750, 3759, 4752, 5755],
['2018-06-10', 1860, 2864, 3866, 4866, 5867],
['2019-01-28', 1918, 2910, 3914, 4911, 5918],
['2018-11-30', 1812, 2812, 3812, 4812, 5812],
['2019-01-03', 1915, 2917, 3916, 4916, 5917],]),
columns=['Date', 'Person1', 'Person2', 'Person3', 'Person4',
'Person5',])
# Format df1['Date'] col as datetime
df1['Date'] = pd.to_datetime(df1['Date'])
# Sort df1 by 'Date'
df1 = df1.sort_values(['Date'],ascending=[True]).reset_index(drop=True)
# Create 'df2', which contains measurement data on specific dates.
df2 = pd.DataFrame(np.array([['2017-05-15', 'Person4', '', ''], ['2019-01-28 ', 'Person1', '', ''],
['2018-11-15', 'Person1', '', ''], ['2018-08-22', 'Person3', '', ''],
['2017-05-15', 'Person5', '', ''], ['2016-05-03', 'Person2', '', ''],]),
columns=['Date', 'Person', 'Value_Today', 'Value_2_records_later'])
df2['Date'] = pd.to_datetime(df2['Date'])
# Display dfs
display(df1)
display(df2)
### I DON'T KNOW WHAT CODE I NEED TO SOLVE MY ISSUE ###
# To capture the row that is two rows below, I think I would use the '.shift(-2)' function?
Upvotes: 1
Views: 314
Reputation: 2541
First, copy the values once for Value_2_records_later
,
step1 = df1.set_index('Date')
persons = step1.columns.tolist()
c1 = [('Value_Today', p) for p in persons]
c2 = [('Value_2_records_later', p) for p in persons]
step1.columns = pd.MultiIndex.from_tuples(c1, names=('','Person'))
step1[c2] = step1[c1].shift(-2)
Then stack
to move columns to rows
step1.stack()
Upvotes: 0
Reputation: 71687
MultiIndex.map
:df1
to Date
s1
. The index of this series will be the combination of date and name of the person. Similarly create another series s2
.df2
to Date
and Person
columnsdf2
using the values from s1
and s2
and assign the corresponding results to Value_Today
and Value_2_records_later
s1 = df1.set_index('Date').stack()
s2 = df1.set_index('Date').shift(-2).stack()
ix = df2.set_index(['Date', 'Person']).index
df2['Value_Today'] = ix.map(s1)
df2['Value_2_records_later'] = ix.map(s2)
Result
print(df2)
Date Person Value_Today Value_2_records_later
0 2017-05-15 Person4 4752 4866
1 2019-01-28 Person1 1918 1912
2 2018-11-15 Person1 1811 1915
3 2018-08-22 Person3 3887 3812
4 2017-05-15 Person5 5755 5867
5 2016-05-03 Person2 2653 2750
Upvotes: 1