pdubois

Reputation: 7790

Regex match with element in list iteratively

I have a list that looks like this:

mylist = [
    'Th2 2w, total RNA (linc-sh36)',
    'SP CD8, total RNA (replicate 1)',
    'DN 2, total RNA (replicate 2)']

What I want to do is keep only the entries in that list that contain an item from another list:

ctlist = ['DN 1', 'DN 2', 'DN 3', 'DN 4', \
          'DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']

So the final output should be:

 SP CD8, total RNA (replicate 1)
 DN 2, total RNA (replicate 2)

I tried this, but it produces no result:

import re
for mem in mylist:
    for ct in ctlist:
        regex = re.compile(ct)
        match = regex.match(mem)
        if match:
            print mem

What's the right way to do it?

Upvotes: 0

Views: 139

Answers (4)

Vivek Sable

Reputation: 10213

A comma (,) is missing in your mylist value. It should look like this:

mylist = [
    'Th2 2w, total RNA (linc-sh36)',
    'SP CD8, total RNA (replicate 1)',
    'DN 2, total RNA (replicate 2)']

We can compile the regular expression once at the start of the code and then reuse it inside the loop.

Code:

mylist = [
    'Th2 2w, total RNA (linc-sh36)',
    'SP CD8, total RNA (replicate 1)',
    'DN 2, total RNA (replicate 2)']

ctlist = ['DN 1', 'DN 2', 'DN 3', 'DN 4', \
          'DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']

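import re
# One alternation of all items; search() finds an item anywhere in the string,
# whereas match() would only find it at the very start.
regex = re.compile("|".join(ctlist))
print [mem for mem in mylist if regex.search(mem)]

Output:

python test.py 
['SP CD8, total RNA (replicate 1)', 'DN 2, total RNA (replicate 2)']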
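One caveat with joining the items into a single pattern: if an item ever contained a regex metacharacter (for example a hypothetical label such as 'CD4+'), it would be interpreted as regex syntax rather than literal text. A minimal sketch of a safer variant, assuming Python 3; the helper names build_pattern and filter_entries are made up for illustration, and search is used so that an item may appear anywhere in the entry:

```python
import re

mylist = [
    'Th2 2w, total RNA (linc-sh36)',
    'SP CD8, total RNA (replicate 1)',
    'DN 2, total RNA (replicate 2)',
]
ctlist = ['DN 1', 'DN 2', 'DN 3', 'DN 4',
          'DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']


def build_pattern(terms):
    """Compile one alternation that matches any of the literal terms."""
    # re.escape makes characters such as '+' or '(' match literally.
    return re.compile("|".join(re.escape(term) for term in terms))


def filter_entries(entries, terms):
    """Keep entries that contain at least one term anywhere in the string."""
    pattern = build_pattern(terms)
    return [entry for entry in entries if pattern.search(entry)]


kept = filter_entries(mylist, ctlist)
print(kept)
```

With the data from the question this keeps the 'SP CD8' and 'DN 2' entries. Note that an empty list of terms would produce an empty pattern, which matches every string, so that case may need a separate check.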

Upvotes: 1

vks

Reputation: 67968

mylist = ['Th2 2w, total RNA (linc-sh36)','SP CD8, total RNA (replicate 1)','DN 2, total RNA (replicate 2)']
ctlist = ['DN 1', 'DN 2', 'DN 3', 'DN 4','DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']
print [ x for x in mylist if [y for y in ctlist if y in x ]]

Upvotes: 1

Hackaholic

Reputation: 19733

You don't need regex here:

>>> mylist
['Th2 2w, total RNA (linc-sh36)', 'SP CD8, total RNA (replicate 1)', 'DN 2, total RNA (replicate 2)']
>>> ctlist
['DN 1', 'DN 2', 'DN 3', 'DN 4', 'DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']
>>> [ x for x in mylist for y in ctlist if y in x]
['SP CD8, total RNA (replicate 1)', 'DN 2, total RNA (replicate 2)']
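One thing to watch with the double-loop comprehension above: it emits an entry once per matching item, so an entry containing two items from ctlist would appear twice. The sketch below illustrates this with a made-up entry (not from the question) and shows an any()-based filter that yields each entry at most once. It assumes Python 3.

```python
ctlist = ['DN 1', 'DN 2', 'DN 3', 'DN 4',
          'DP 1', 'DP 2', 'tTreg', 'CD8', 'CD4', 'iTreg']

# Hypothetical entry that contains two ctlist items ('DN 2' and 'CD8').
entries = ['DN 2 CD8, total RNA (replicate 3)']

# Double loop: one output element per (entry, item) hit, so the entry repeats.
repeated = [x for x in entries for y in ctlist if y in x]

# any() stops at the first hit and keeps each entry at most once.
unique = [x for x in entries if any(y in x for y in ctlist)]

print(len(repeated))  # 2
print(len(unique))    # 1
```

For the three entries in the question both forms give the same result, since no entry contains more than one item.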

Upvotes: 1

John Zwinck

Reputation: 249123

The main problem is you forgot the commas in mylist. So your data is not what you think it is. Try adding some print statements and you can easily discover problems like this in your loops.

The second problem is that you need regex.search instead of regex.match: match only succeeds if the pattern is found at the very start of mem, whereas you want to find it anywhere in the string. However, you don't need regexes at all for what you're doing:

for mem in mylist:
    for ct in ctlist:
        if ct in mem:
            print mem
            break
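To illustrate the match versus search point, here is a small sketch on one entry from the question (assuming Python 3; the variable names at_start and anywhere are only for illustration):

```python
import re

mem = 'SP CD8, total RNA (replicate 1)'
pattern = re.compile('CD8')

# match() only tries at position 0, where the string starts with 'SP'.
at_start = pattern.match(mem)

# search() scans the whole string and finds 'CD8' after 'SP '.
anywhere = pattern.search(mem)

print(at_start)          # None
print(anywhere.start())  # 3
```

This is why the original loop printed nothing for the 'SP CD8' entry even though 'CD8' is in ctlist.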

Upvotes: 2
