Advanced condition lookup in pandas (numpy)

Question

given: a list of elements 'ls' and a big DataFrame 'df'; every element of 'ls' appears somewhere in 'df'.

ls = ['a0','a1','a2','b0','b2','c0',...,'c_k']
df = [['a0','b0','c0'],
      ['a0','b0','c1'],
      ['a0','b0','c2'],
      ...
      ['a_i','b_j','c_k']]

goal: I want to collect the set of rows of 'df' that contain the most elements of 'ls'; for example, ['a0','b0','c0'] would be the best one. In some cases, though, the best row contains only 2 elements of 'ls'.

tried: I tried enumerating every combination of 3 or 2 elements of 'ls' and looking each one up, but that was too expensive and would probably return None, since some rows contain only 2 of the elements. I also tried using a dictionary to count, but that didn't work either.

I've been puzzling over this problem all day; any help would be greatly appreciated.
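
For reference, the goal above can probably be expressed without enumerating combinations at all. The following is only a minimal sketch: the sample values in 'ls' and 'df' and the column names 'a', 'b', 'c' are made up to stand in for the elided data, and it assumes 'df' is a pandas DataFrame in which each row holds distinct values. It counts, per row, how many cells are members of 'ls', then keeps every row that ties for the highest count.

```python
import pandas as pd

# Hypothetical sample data standing in for the elided 'ls' and 'df' above.
ls = ['a0', 'a1', 'a2', 'b0', 'b2', 'c0']
df = pd.DataFrame(
    [['a0', 'b0', 'c0'],
     ['a0', 'b0', 'c1'],
     ['a0', 'b1', 'c2'],
     ['a3', 'b1', 'c1']],
    columns=['a', 'b', 'c'],  # assumed column names
)

# Boolean mask of cells whose value is in 'ls', summed across each row
# to give the number of matching cells per row.
hits = df.isin(ls).sum(axis=1)

# Keep every row that ties for the highest count, whatever that count is.
best_rows = df[hits == hits.max()]
print(best_rows)
```

Because the filter compares against `hits.max()` rather than a fixed number, it should still return rows when the best available match is only 2 elements. Note that this counts matching cells, so a row that repeated the same value in two columns would be counted twice.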
