Reputation: 294218
Consider the two dataframes df1 and df2:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.zeros((6, 6)), list('abcdef'), list('abcdef'), dtype=int)
df1.iloc[2:4, 2:4] = np.array([[1, 2], [3, 4]])
df1
df2 = pd.DataFrame(np.array([[1, 2], [3, 4]]), list('CD'), list('CD'), dtype=int)
df2
It's clear that df2 is in df1. How do I test for this in general?
Upvotes: 2
Views: 108
Reputation: 294218
def isin2d(small_df, large_df):
    di, dj = small_df.shape
    mi, mj = large_df.shape
    # Slide a window of small_df's shape over every position in large_df
    for i in range(mi - di + 1):
        for j in range(mj - dj + 1):
            if (small_df.values == large_df.values[i:i + di, j:j + dj]).all():
                return True
    return False
isin2d(df2, df1)
True
Upvotes: 1
Reputation: 221504
Assuming the dataframes contain only 0s and 1s, you can use 2D convolution and check whether any element of the convolved output equals the number of elements in df2. Note that this count can only be reached when df2 is all ones, so the check is limited to that case:
from scipy.signal import convolve2d
out = (convolve2d(df1,df2)==df2.size).any()
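If df2 can contain zeros as well as ones, one possible variant is to cross-correlate with a -1/+1 version of the template and compare against the number of ones in it. This is only a sketch, not part of the answer above: the name contains_binary is made up for illustration, and it assumes both inputs hold nothing but 0s and 1s and that the template fits inside the larger array.

```python
import numpy as np
from scipy.signal import correlate2d

def contains_binary(large, small):
    """Check whether the 0/1 array `small` occurs as a block in the 0/1 array `large`."""
    large = np.asarray(large)
    small = np.asarray(small)
    # Map the template from {0, 1} to {-1, +1}: a 1 in the window adds +1
    # under a template 1 and subtracts 1 under a template 0.
    kernel = 2 * small - 1
    # Cross-correlation does not flip the kernel; 'valid' keeps full overlaps only.
    scores = correlate2d(large, kernel, mode='valid')
    # The score equals the number of ones in the template only on an exact match.
    return bool((scores == small.sum()).any())
```

With dataframes you would pass df1.values and df2.values.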
For the generic case, view_as_windows from the skimage module gives a compact solution:
from skimage.util import view_as_windows as viewW
out = ((viewW(df1.values, df2.shape) == df2.values).all(axis=(2,3))).any()
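If you would rather not depend on skimage, the same windowed comparison can probably be written with NumPy alone. The sketch below assumes NumPy 1.20 or later (for sliding_window_view) and that the smaller array fits inside the larger one; contains_block is a hypothetical name, not something from either library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def contains_block(large, small):
    """Check whether the 2D array `small` occurs as a contiguous block in `large`."""
    large = np.asarray(large)
    small = np.asarray(small)
    # Shape (rows - p + 1, cols - q + 1, p, q): one p x q window per position.
    windows = sliding_window_view(large, small.shape)
    # Compare every window with the template, then reduce over the window axes.
    return bool((windows == small).all(axis=(2, 3)).any())
```

As with the skimage version, only the values are compared, so the differing labels of df1 and df2 do not matter when you pass df1.values and df2.values.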
This is basically a template-matching problem. It has been discussed before, with really efficient solutions, in this post: How can I check if one two-dimensional NumPy array contains a specific pattern of values inside it? That post also shows how to get the indices of all places in df1 where df2 could be located.
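To illustrate the index-finding idea (this is a rough sketch of my own, not the code from the linked post), the window comparison can be kept as a boolean grid and passed to np.argwhere. The name find_block is hypothetical, and NumPy 1.20 or later is assumed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_block(large, small):
    """Return the (row, col) positions of the top-left corner of every match."""
    large = np.asarray(large)
    small = np.asarray(small)
    # One window of small's shape per candidate top-left position.
    windows = sliding_window_view(large, small.shape)
    # hits[i, j] is True when the window starting at (i, j) equals small.
    hits = (windows == small).all(axis=(2, 3))
    return np.argwhere(hits)

# Same values as df1 and df2 in the question.
large = np.zeros((6, 6), dtype=int)
large[2:4, 2:4] = [[1, 2], [3, 4]]
small = np.array([[1, 2], [3, 4]])
print(find_block(large, small))  # [[2 2]]
```

The positions are integer offsets, so with dataframes you could map them back to labels through df1.index and df1.columns.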
Upvotes: 3