doingmybest

Reputation: 314

Can't extract a specific pattern from a list using Python and re

I have a bunch of lists containing these types of values:

('CDI - AB - MO - id3 - Mobile', '2018-01-12', '48,67')

I'm only interested in the value that starts with "id", plus the number(s) following it.

For example, I'd like to extract "id3" from this list

('CDI - AB - MO - id3 - Mobile', '2018-01-12', '48,67') 

or "id33" from

('CDI - AC - MO - id33 - Mobile', '2018-01-12', '48,67')

I'm trying to do this with the re library.

Here is my code:

matchObj = re.search( r'id*', my_list, re.M|re.I)

Do you have any suggestions that could help return the desired value?

Thanks

Upvotes: 1

Views: 38

Answers (1)

heemayl

Reputation: 42017

You can use the Regex pattern:

\bid\d+\b
  • \b matches the empty string at a word boundary (zero-width)

  • id matches id literally

  • \d+ matches one or more digits
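
As a rough illustration of why each piece matters, the sketch below compares the pattern from the question with this one on the sample string from the question. The variable names (s, loose, strict) and the extra test string 'valid42' are only for illustration. Note also that re.search expects a string, so passing the whole tuple, as in the question, raises a TypeError.

```python
import re

s = 'CDI - AB - MO - id3 - Mobile'

# 'id*' means "an i followed by zero or more d characters".
# With re.I it matches the first i/I it finds, here the 'I' in 'CDI'.
loose = re.search(r'id*', s, re.I).group()
print(loose)   # I

# '\bid\d+\b' requires a whole word made of 'id' plus one or more digits.
strict = re.search(r'\bid\d+\b', s).group()
print(strict)  # id3

# The word boundaries stop the pattern from matching inside a longer word.
print(re.search(r'\bid\d+\b', 'valid42'))  # None
```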


Example:

In [112]: import re

In [113]: t1 = ('CDI - AB - MO - id3 - Mobile', '2018-01-12', '48,67')

In [114]: t2 = ('CDI - AC - MO - id33 - Mobile', '2018-01-12', '48,67')

In [115]: pat = re.compile(r'\bid\d+\b')

In [119]: [pat.search(i).group() for i in t1 if pat.search(i)]
Out[119]: ['id3']

In [120]: [pat.search(i).group() for i in t2 if pat.search(i)]
Out[120]: ['id33']
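
If you only need the first match per tuple, one possible way to wrap this up is a small helper that searches each string once instead of twice. This is a minimal sketch; the names ID_PATTERN and extract_id are hypothetical and not part of the re module.

```python
import re

# Compiled once and reused for every tuple (name chosen for illustration).
ID_PATTERN = re.compile(r'\bid\d+\b')

def extract_id(row):
    """Return the first 'id<digits>' token found in the strings of row, or None."""
    for item in row:
        match = ID_PATTERN.search(item)
        if match:
            return match.group()
    return None

t1 = ('CDI - AB - MO - id3 - Mobile', '2018-01-12', '48,67')
t2 = ('CDI - AC - MO - id33 - Mobile', '2018-01-12', '48,67')

print(extract_id(t1))  # id3
print(extract_id(t2))  # id33
```

Returning None when nothing matches lets the caller decide how to handle tuples without an id, rather than hitting an AttributeError on a failed search.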

Upvotes: 2
