Niccola Tartaglia
Niccola Tartaglia

Reputation: 1667

Filter elements from 2 pandas dataframes

I have two dataframes which represent stock prices over time and stock related information over time (e.g. fundamental data on the company).

Both dataframes contain monthly data, however they are over different time spans. One is 5 years, the other is 10 years. Also, both do not have the same number of stocks, there is only an 80% overlap.

Below is an example of the dataframes:

days1 = pd.date_range(start='1/1/1980', end='7/1/1980', freq='M')

df1   = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'),index=days1)

days2 = pd.date_range(start='1/1/1980', end='5/1/1980', freq='M')

df2   = pd.DataFrame(np.random.randn(4, 6), columns=list('ABCDEF'),index=days2)

My goal is to reduce both dataframes to the inner joint. That is, so both cover the same time period and contain the same stocks. My index contains the dates, and the column names are the stock names.

I have tried multiple variations of merge() etc, but those recreate a merged dataframe, I want to keep both dataframes. I have also tried isin() but I am struggling with accessing the index of each dataframe. For instance:

df3=df1[df1.isin(df2)].dropna()

Does someone have any suggestions?

Upvotes: 1

Views: 94

Answers (1)

Rachel Kogan
Rachel Kogan

Reputation: 86

for the column intersection:

column_intersection = df1.columns & df2.columns

for the row intersection:

row_intersection = df1.index & df2.index

then just subset each dataframe:

df1 = df1.loc[row_intersection, column_intersection]
df2 = df2.loc[row_intersection, column_intersection]

Upvotes: 1

Related Questions