atlantic0cean
atlantic0cean

Reputation: 155

Extracting data from one DataFrame to another

I have a DataFrame data:

Date 1Y 2Y 3Y 4Y 5Y 6Y 7Y 8Y 9Y 10Y C0 C1 C2
0 2006-01-06 1.6630 1.8605 2.0481 2.2259 2.3937 2.5517 2.6998 2.8380 2.9664 3.0848 7.392170 2.451846 -0.100585
1 2006-01-09 1.6531 1.8483 2.0339 2.2099 2.3762 2.5329 2.6800 2.8174 2.9452 3.0634 7.339621 2.433888 -0.098956
2 2006-01-10 1.6436 1.8375 2.0220 2.1969 2.3624 2.5184 2.6649 2.8020 2.9295 3.0476 7.298496 2.421375 -0.096909
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1707 2012-11-05 2.9475 3.0809 3.0869 3.1859 3.2060 3.4211 3.4938 3.5200 3.5517 3.5872 9.900536 2.556375 -0.542565
1708 2012-11-06 2.9438 3.0764 3.0857 3.1748 3.2000 3.4305 3.5040 3.5223 3.5459 3.5771 9.894910 2.557064 -0.545739

Now I have another DataFrame divis which contains following values:

Starting Date Ending Date Period
0 2006-01-06 2006-07-28 203 days
1 2006-07-28 2006-10-20 84 days
2 2006-10-20 2007-01-19 91 days
3 2007-01-19 2007-07-13 175 days
4 2007-07-13 2007-11-23 133 days

What I need to do now is to find in data the values of C0, C1 and C2 on the starting and ending dates listed in divis, and append these data to divis. For example, look at the first row of divis. The Starting Date is 2006-01-06, so I move to data, find the row of 2006-01-06, which is the first row, get the value C0 on that row(7.392170) and append it on the first row of divis. Again for the ending date, I get

Starting Date Ending Date Period Starting C0 Ending C0
0 2006-01-06 2006-07-28 203 days 7.392170 8.437521

How to carry out this data-fetching process automatically with Python? Thank you in advance!

Upvotes: 0

Views: 936

Answers (2)

paradocslover
paradocslover

Reputation: 3304

All you need to do is this:

#Set the date column as the index in table 1. This would allow you to search values in O(1)
df1_indexed = df1.set_index("Date")

def date_getter(x):
    try:
        #search for the values 
        return pd.Series([df1_indexed.loc[x['Starting Date'],'C0'], df1_indexed.loc[x['Ending Date'],'C0']])
    except: #In case the dates weren't found.
        return pd.Series([np.nan,np.nan])


df2[['Starting C0','Ending C0']] = df2[['Starting Date', 'Ending Date']].apply(date_getter,axis=1)

Upvotes: 1

Nancy_Tayal
Nancy_Tayal

Reputation: 92

# here df represents the 'data' and df2 represents divis
for i in df2.index :
    x=(df[df['Date']==df2.loc[i,"Starting Date"]]['C0'].values[0])    
    y=(df[df['Date']==df2.loc[i,"Ending Date"]]['C0'].values[0])
    df2.loc[i,'Starting C0'] = x
    df2.loc[i,'Ending C0'] = y

Note : This code will work fine until all the dates in divis are present in data, otherwise an exception would be raised. In that case use try except to fill with null or default values or do let me know I'll provide the alternative code.

Upvotes: 1

Related Questions