user3116355

Reputation: 1197

Regex to find multiple matches

I have tried a few lookahead-based patterns to find a value inside a string, but I am stuck because I have to check multiple conditions.

I have a string like

string = ''' i was behind the bars for (5.75) years '''
string2 = ''' I travelled for 6 months in Switzerland and some years say 5.2 in England '''
re.search(r'(?=\byears\b)([\d]+\S+)', string, re.I)

This is what I tried in order to get the digits after "years"; \S+ is used to pick up values such as 5.33 or 5.44, as there will be a space after the digit combination.

I want a regex to match any digit combination like 5.75, 10.25 etc., even if it is enclosed in brackets or quotes. But I need the digits only. It can be before the word "years" or after it. What would be the best way to use regex in Python to check the multiple possibilities?

Upvotes: 0

Views: 229

Answers (1)

user557597


This might work.

Update

You're getting an 'invalid expression' error.
I don't see anything invalid, unless Python doesn't support inline modifiers in cluster groups, such as (?i: ... ).
You might try taking the case modifier out of the pattern and passing it in the options part of the regex function instead (for example re.I).

Try this then:

(?:\b(\d+(?:\.\d*)?|\.\d+)\b.*?(?:(?:\r?\n).*?){0,2}\byears?\b|\byears?\b.*?(?:(?:\r?\n).*?){0,2}\b(\d+(?:\.\d*)?|\.\d+)\b)  
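As a rough sketch of how the pattern above could be used from Python, assuming the case modifier is passed as `re.IGNORECASE` rather than inline: `YEARS_NUMBER` and `find_year_numbers` are hypothetical names, not part of the original answer, and the test strings are the ones from the question.

```python
import re

# Pattern copied from the answer, without the inline (?i: ...) modifier.
YEARS_NUMBER = re.compile(
    r'(?:\b(\d+(?:\.\d*)?|\.\d+)\b.*?(?:(?:\r?\n).*?){0,2}\byears?\b'
    r'|\byears?\b.*?(?:(?:\r?\n).*?){0,2}\b(\d+(?:\.\d*)?|\.\d+)\b)',
    re.IGNORECASE,
)

def find_year_numbers(text):
    """Hypothetical helper: return the number captured by each match."""
    # Group 1 is set when the number precedes "year(s)",
    # group 2 when the number follows it.
    return [m.group(1) or m.group(2) for m in YEARS_NUMBER.finditer(text)]

string = ''' i was behind the bars for (5.75) years '''
string2 = ''' I travelled for 6 months in Switzerland and some years say 5.2 in England '''

print(find_year_numbers(string))    # ['5.75']
print(find_year_numbers(string2))   # ['6']
print(find_year_numbers('some years say 5.2 in England'))  # ['5.2']
```

Note that for `string2` the first alternative wins: 6 is the first number on the line that is followed by "years", so it is reported instead of 5.2. If that is not what you want, limiting how much text may sit between the number and "years" is probably the part to experiment with.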

Original:

 #  (?i:\b(\d+(?:\.\d*)?|\.\d+)\b.*?(?:(?:\r?\n).*?){0,2}\byears?\b|\byears?\b.*?(?:(?:\r?\n).*?){0,2}\b(\d+(?:\.\d*)?|\.\d+)\b)

 (?i:
      \b 
      (                             # (1 start), Digits
           \d+ 
           (?: \. \d* )?
        |  \. \d+ 
      )                             # (1 end)
      \b 
      .*? 
      (?:                           # 0, 1 or 2 lines
           (?: \r? \n )
           .*? 
      ){0,2}
      \b years? \b                  # Followed by "year(s)"

   |                              # or --

      \b years? \b                  # "year(s)"
      .*?   
      (?:                           # 0, 1 or 2 lines
           (?: \r? \n )
           .*? 
      ){0,2}
      \b 
      (                             # (2 start), Followed by Digits
           \d+ 
           (?: \. \d* )?
        |  \. \d+ 
      )                             # (2 end)
      \b 
 )
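The expanded layout can also be compiled more or less as written by using Python's `re.VERBOSE` flag, which ignores whitespace and `#` comments inside the pattern. This is only a sketch: it assumes the case modifier is passed as a flag instead of the inline `(?i: ... )` form (Python's `re` accepts that form only from version 3.6 on), and `YEARS_VERBOSE` is a hypothetical name.

```python
import re

YEARS_VERBOSE = re.compile(r'''
    (?:
        \b
        (                       # (1) digits before "year(s)"
            \d+ (?: \. \d* )?
          | \. \d+
        )
        \b
        .*?
        (?: \r? \n .*? ){0,2}   # 0, 1 or 2 line breaks
        \b years? \b
      |
        \b years? \b
        .*?
        (?: \r? \n .*? ){0,2}   # 0, 1 or 2 line breaks
        \b
        (                       # (2) digits after "year(s)"
            \d+ (?: \. \d* )?
          | \. \d+
        )
        \b
    )
''', re.VERBOSE | re.IGNORECASE)

m = YEARS_VERBOSE.search(' i was behind the bars for (5.75) years ')
print(m.group(1))   # 5.75
```

Only one of the two capture groups is set per match, so check both `m.group(1)` and `m.group(2)`.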

Upvotes: 1
