How to extract rows with the given sequence of values for 2 subsequent columns?

Question

How can I find the rows in which the values AAA, BBB appear in two adjacent columns among COL_1, COL_2, COL_3, ..., COL_X (where X is some number, e.g. 200, so listing every column pair by hand is not a good solution)?

For example, given the dataframe df below, rows 1 and 4 should be the output:

df=
ID    COL_1   COL_2   COL_3   COL_4
1     AAA     BBB     CCC     DDD
2     DDD     AAA     CCC     BBB
3     BBB     AAA     DDD     CCC
4     CCC     AAA     BBB     DDD

Zero · Accepted Answer

You could join the relevant columns of each row into a single string and then search it for the pattern AAA,BBB.

In [152]: df.filter(regex='COL_').apply(lambda x: 'AAA,BBB' in ','.join(x), axis=1)
Out[152]:
0     True
1    False
2    False
3     True
dtype: bool

If you have numeric values, convert them to strings using map:

In [166]: df.apply(lambda x: 'AAA,BBB' in ','.join(map(str, x)), axis=1)
Out[166]:
0     True
1    False
2    False
3     True
dtype: bool

In [175]: df[df.apply(lambda x: 'AAA,BBB' in ','.join(map(str, x)), axis=1)]
Out[175]:
   ID COL_1 COL_2 COL_3 COL_4
0   1   AAA   BBB   CCC   DDD
3   4   CCC   AAA   BBB   DDD
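
The string join above matches substrings, so a row holding, say, XAAA next to BBBX would also be reported. If that matters for your data, one possible alternative is to compare each column with its right-hand neighbour directly. The following is only a sketch, not part of the original answer: it rebuilds the sample frame from the question, and the names cols, left, right, mask, and result are illustrative assumptions.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "COL_1": ["AAA", "DDD", "BBB", "CCC"],
    "COL_2": ["BBB", "AAA", "AAA", "AAA"],
    "COL_3": ["CCC", "CCC", "DDD", "BBB"],
    "COL_4": ["DDD", "BBB", "CCC", "DDD"],
})

# Keep only the COL_* columns, in their existing left-to-right order.
cols = df.filter(regex=r"^COL_")

# Pair every column with its right-hand neighbour by slicing the values:
# left holds COL_1..COL_(X-1), right holds COL_2..COL_X.
left = cols.iloc[:, :-1].to_numpy()
right = cols.iloc[:, 1:].to_numpy()

# A row matches if any adjacent pair is exactly (AAA, BBB).
mask = ((left == "AAA") & (right == "BBB")).any(axis=1)

result = df[mask]
print(result)
```

This compares whole cell values rather than text, so it does not depend on the separator and avoids a Python-level function call per row, which may help when there are many rows.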
