Extracting data from one DataFrame to another

Question

I have a DataFrame data:

	Date	1Y	2Y	3Y	4Y	5Y	6Y	7Y	8Y	9Y	10Y	C0	C1	C2
0	2006-01-06	1.6630	1.8605	2.0481	2.2259	2.3937	2.5517	2.6998	2.8380	2.9664	3.0848	7.392170	2.451846	-0.100585
1	2006-01-09	1.6531	1.8483	2.0339	2.2099	2.3762	2.5329	2.6800	2.8174	2.9452	3.0634	7.339621	2.433888	-0.098956
2	2006-01-10	1.6436	1.8375	2.0220	2.1969	2.3624	2.5184	2.6649	2.8020	2.9295	3.0476	7.298496	2.421375	-0.096909
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
1707	2012-11-05	2.9475	3.0809	3.0869	3.1859	3.2060	3.4211	3.4938	3.5200	3.5517	3.5872	9.900536	2.556375	-0.542565
1708	2012-11-06	2.9438	3.0764	3.0857	3.1748	3.2000	3.4305	3.5040	3.5223	3.5459	3.5771	9.894910	2.557064	-0.545739

Now I have another DataFrame divis which contains following values:

	Starting Date	Ending Date	Period
0	2006-01-06	2006-07-28	203 days
1	2006-07-28	2006-10-20	84 days
2	2006-10-20	2007-01-19	91 days
3	2007-01-19	2007-07-13	175 days
4	2007-07-13	2007-11-23	133 days

What I need to do now is to find in data the values of C0, C1 and C2 on the starting and ending dates listed in divis, and append these data to divis. For example, look at the first row of divis. The Starting Date is 2006-01-06, so I move to data, find the row of 2006-01-06, which is the first row, get the value C0 on that row(7.392170) and append it on the first row of divis. Again for the ending date, I get

	Starting Date	Ending Date	Period	Starting C0	Ending C0
0	2006-01-06	2006-07-28	203 days	7.392170	8.437521

How to carry out this data-fetching process automatically with Python? Thank you in advance!

paradocslover · Accepted Answer

All you need to do is this:

#Set the date column as the index in table 1. This would allow you to search values in O(1)
df1_indexed = df1.set_index("Date")

def date_getter(x):
    try:
        #search for the values 
        return pd.Series([df1_indexed.loc[x['Starting Date'],'C0'], df1_indexed.loc[x['Ending Date'],'C0']])
    except: #In case the dates weren't found.
        return pd.Series([np.nan,np.nan])


df2[['Starting C0','Ending C0']] = df2[['Starting Date', 'Ending Date']].apply(date_getter,axis=1)

Extracting data from one DataFrame to another

Answers (2)

Related Questions