Reputation: 2886
Create new pandas dataframe column showing a boolean of either 1 (intersection) or 0 (no intersection) of row values in two different columns: row_mods and col_mods. Another column is added to show what those overlap(s) is (are). As in the example below, intersect takes boolean values, and common shows the intersecting value(s).
The rendered pandas dataframe is what I have, the drawn portion is what I'm looking for:
import numpy as np
import pandas as pd

# data
n = np.nan
congruent = pd.DataFrame.from_dict(
    {'row': ['x','a','b','c','d','e','y'],
     'x': [ n, 5, 5, 5, 5, 5, 5],
     'a': [ 5, n, -.8,-.6,-.3, .8, .01],
     'b': [ 5,-.8, n, .5, .7,-.9, .01],
     'c': [ 5,-.6, .5, n, .3, .1, .01],
     'd': [ 5,-.3, .7, .3, n, .2, .01],
     'e': [ 5, .8,-.9, .1, .2, n, .01],
     'y': [ 5, .01, .01, .01, .01, .01, n],
    }).set_index('row')
congruent.columns.names = ['col']
memberships = {'a':['vowel'], 'b':['consonant'], 'c':['consonant'], 'd':['consonant'], 'e':['vowel'], 'y':['consonant', 'vowel'], '*':['wildcard']}
# format stacked df
cs = congruent.stack().to_frame()
cs.columns = ['score']
cs.reset_index(inplace=True)
cs.columns = ['row', 'col', 'score']
# keep only rows whose row and col labels are both keys of the memberships dict
cs['elim'] = (cs['row'].isin(memberships.keys())) & (cs['col'].isin(memberships.keys()))
# .copy() so the assignments below act on an independent frame, not a slice of cs
cs_2 = cs[cs['elim']].copy()
# map row and col entries to membership dict values
cs_2['row_mods'] = cs_2['row'].map(memberships)
cs_2['col_mods'] = cs_2['col'].map(memberships)
How can I perform an intersection across two values in a row across two different columns?
Upvotes: 0
Views: 3933
Reputation: 2438
Try this, mate:
Step 1, define the function:
def check_row(row_mods, col_mods):
    common = []
    intersect = 0
    # collect every value of col_mods that also appears in row_mods
    for x in col_mods:
        if x in row_mods:
            intersect = 1
            common.append(x)
    # no overlap: mark the row with NaN
    if intersect == 0:
        common.append(np.nan)
    return (intersect, common)
Step 2, apply the function:
cs_2['intersect'] = 0
cs_2['common'] = None  # object column, so each cell can hold a list
for index in cs_2.index:
    (intersect, common) = check_row(cs_2.loc[index,'row_mods'], cs_2.loc[index,'col_mods'])
    cs_2.loc[index,'intersect'] = intersect
    # .at sets a single cell, so the list is stored as one value
    cs_2.at[index,'common'] = common
Hope it helps! If it does, upvote/accept the answer :)
Upvotes: 1
Reputation: 77857
Since you're apparently comfortable with the pandas operations, I'll supply just the Python intersection logic:
common = list(set(row_mods).intersection(set(col_mods)))
intersect = len(common) > 0
Briefly, you turn each list of mods into a set, use Python's built-in set intersection method, and turn the result back into a list. Note that intersect comes out as True/False here; wrap it in int() if you need 1/0.
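As a rough sketch of how that logic might be applied per row, here is a small example. The frame df is a hypothetical three-row stand-in for cs_2 (only the row_mods and col_mods column names come from the question), and it assumes an empty list, rather than NaN, is acceptable for "no overlap":

```python
import pandas as pd

# Hypothetical miniature of cs_2: each cell holds a list of memberships.
df = pd.DataFrame({
    'row_mods': [['vowel'], ['consonant'], ['consonant', 'vowel']],
    'col_mods': [['consonant'], ['consonant'], ['vowel']],
})

# Set intersection per row; sorted() makes the list order deterministic.
df['common'] = pd.Series(
    [sorted(set(r) & set(c)) for r, c in zip(df['row_mods'], df['col_mods'])],
    index=df.index,
)

# 1 if the two lists share at least one value, else 0 (empty list is falsy).
df['intersect'] = df['common'].map(bool).astype(int)

print(df)
```

Building the column from a list comprehension avoids assigning lists cell by cell, which is where row-wise loops tend to get awkward.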
Does that solve your problem?
Upvotes: 2