Reputation: 1667
I have two dataframes which represent stock prices over time and stock related information over time (e.g. fundamental data on the company).
Both dataframes contain monthly data, however they are over different time spans. One is 5 years, the other is 10 years. Also, both do not have the same number of stocks, there is only an 80% overlap.
Below is an example of the dataframes:
days1 = pd.date_range(start='1/1/1980', end='7/1/1980', freq='M')
df1 = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'),index=days1)
days2 = pd.date_range(start='1/1/1980', end='5/1/1980', freq='M')
df2 = pd.DataFrame(np.random.randn(4, 6), columns=list('ABCDEF'),index=days2)
My goal is to reduce both dataframes to the inner joint. That is, so both cover the same time period and contain the same stocks. My index contains the dates, and the column names are the stock names.
I have tried multiple variations of merge()
etc, but those recreate a merged dataframe, I want to keep both dataframes. I have also tried isin()
but I am struggling with accessing the index of each dataframe. For instance:
df3=df1[df1.isin(df2)].dropna()
Does someone have any suggestions?
Upvotes: 1
Views: 94
Reputation: 86
for the column intersection:
column_intersection = df1.columns & df2.columns
for the row intersection:
row_intersection = df1.index & df2.index
then just subset each dataframe:
df1 = df1.loc[row_intersection, column_intersection]
df2 = df2.loc[row_intersection, column_intersection]
Upvotes: 1