Reputation: 349
I'm struggling with a DataFrame-related problem. I have two DataFrames, df and dff, defined as follows:
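import numpy as np
import pandas as pd

# Data to be filtered: first row holds column names, first column holds row labels
data = np.array([['', 'col1', 'col2'],
                 ['row1', 1, 2],
                 ['row2', 3, 4]])
df = pd.DataFrame(data=data[1:, 1:].astype(int), index=data[1:, 0], columns=data[0, 1:])
# Filter table: each row is an allowed (col1, col2) combination
filters = np.array([['', 'col1', 'col2'],
                    ['row1', 1, 1],
                    ['row2', 1, 2],
                    ['row3', 3, 2]])
dff = pd.DataFrame(data=filters[1:, 1:].astype(int), index=filters[1:, 0], columns=filters[0, 1:])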
I want to select the rows of df whose col2 value belongs to the list of col2 values found in dff for the matching col1 value. For example, for col1 equal to 1 that list is [1, 2], and for col1 equal to 3 it is [2].
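To make the lookup concrete, here is a minimal sketch of the per-col1 lists described above. It rebuilds dff directly from a dict rather than from the NumPy array (an assumption made for brevity), and the name `allowed` is hypothetical.

```python
import pandas as pd

# Filter table from the question, rebuilt from a dict for brevity (assumption).
dff = pd.DataFrame({'col1': [1, 1, 3], 'col2': [1, 2, 2]},
                   index=['row1', 'row2', 'row3'])

# Group the filter rows by col1 and collect the col2 values for each key.
# `allowed` is a hypothetical name used only for this illustration.
allowed = dff.groupby('col1')['col2'].apply(list).to_dict()
print(allowed)  # {1: [1, 2], 3: [2]}
```

With these lists, row1 of df (col1 = 1, col2 = 2) qualifies because 2 is in [1, 2], while row2 (col1 = 3, col2 = 4) does not because 4 is not in [2].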
My best attempt so far is:
df1 = df[df['col2'].isin(dff[dff['col1']==df['col1']]['col2'])]
But that raises:
ValueError: Can only compare identically-labeled Series objects
Any help would be appreciated. Thanks so much.
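One possible way around the error, offered as a sketch rather than a definitive answer: the comparison dff['col1']==df['col1'] fails because the two Series carry different index labels (row1 to row3 versus row1 to row2), and pandas only compares Series element-wise when their labels are identical. Testing each (col1, col2) pair of df for membership in the pairs of dff avoids that comparison altogether. The frames are rebuilt from dicts for brevity, and `keys` and `allowed_pairs` are hypothetical names.

```python
import pandas as pd

# Same data as in the question, rebuilt from dicts for brevity (assumption).
df = pd.DataFrame({'col1': [1, 3], 'col2': [2, 4]}, index=['row1', 'row2'])
dff = pd.DataFrame({'col1': [1, 1, 3], 'col2': [1, 2, 2]},
                   index=['row1', 'row2', 'row3'])

# Treat each row as a (col1, col2) pair; names here are illustrative only.
keys = pd.MultiIndex.from_frame(df[['col1', 'col2']])
allowed_pairs = pd.MultiIndex.from_frame(dff[['col1', 'col2']])

# Boolean mask: True where df's pair also appears in dff.
# Indexing with the mask keeps df's original row labels.
df1 = df[keys.isin(allowed_pairs)]
print(df1)
```

For the sample data this should keep only row1, since (1, 2) appears in dff and (3, 4) does not. An inner merge on both columns would select the same rows, but it discards df's index unless the index is reset first and restored afterwards.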
Upvotes: 1
Views: 1168