Ella

Reputation: 105

Find years of experience in a string using regex in Python

How can I write a regex in Python that matches the following:

10+ years
10 years
1 year
10-15 years

So far, I have used this, but it's not providing results for all of them.

import re

# description holds the text to search
re_expression = r'(\d+).(years|year|Year|Years)'
exp_temp = re.search(re_expression, description)
experience_1 = ''
if exp_temp:
    experience_1 = exp_temp.groups()

Upvotes: 1

Views: 1293

Answers (3)

The fourth bird

Reputation: 163217

If you want to match your values and don't need the capturing groups, you might use:

\b(?:\d+-\d+ [yY]ears|[02-9] [Yy]ears|1 [Yy]ear|[1-9]\d+\+? [Yy]ears)\b


Explanation

  • \b Word boundary
  • (?: Non capturing group
    • \d+-\d+ [yY]ears Match format 10-15 years
    • | Or
    • [02-9] [Yy]ears Match format 0 or 2-9 years
    • | Or
    • 1 [Yy]ear Match format 1 year
    • | Or
    • [1-9]\d+\+? [Yy]ears Match two or more digits with an optional +, e.g. 10 years or 10+ years
  • ) Close non capturing group
  • \b Word boundary
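
A minimal sketch of how this pattern could be used in Python. The helper name find_experience and the sample strings are assumptions made for illustration, not part of the original answer.

```python
import re

# Pattern copied from the answer; the sample strings are illustrative assumptions.
pattern = re.compile(
    r"\b(?:\d+-\d+ [yY]ears|[02-9] [Yy]ears|1 [Yy]ear|[1-9]\d+\+? [Yy]ears)\b"
)

def find_experience(text):
    """Return the first matching substring, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(0) if match else None

samples = ["10+ years", "10 years", "1 year", "10-15 years", "5+ years"]
for sample in samples:
    print(repr(sample), "->", find_experience(sample))
```

Note that the optional + is only allowed in the two-or-more-digit alternative, so a string such as "5+ years" is not matched by this pattern as written.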


Upvotes: 2

Pedro Lobito

Reputation: 98881

([\d+-]+)\s+(years?)


import re

x = """
123 10+ years some text
some text 99 10 years ssss
text 1 year and more text
some text 10-15 years some text
"""

result = re.findall(r"([\d+-]+)\s+(years?)", x, re.IGNORECASE)
print(result)

[('10+', 'years'), ('10', 'years'), ('1', 'year'), ('10-15', 'years')]



Regex Explanation:

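  • ([\d+-]+) - Group 1: one or more characters, each a digit, + or -
  • \s+ - 1+ whitespaces
  • (years?) - Group 2: year or years (any case, because of re.IGNORECASE)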

Upvotes: 3

Wiktor Stribiżew

Reputation: 626738

You may use

r'(\d+(?:-\d+)?\+?)\s*(years?)'

Compile with the re.I flag to enable case-insensitive matching.

Details

  • (\d+(?:-\d+)?\+?) - Group 1:
    • \d+ - 1+ digits
    • (?:-\d+)? - an optional group matching - and then 1+ digits
    • \+? - an optional + char
  • \s* - 0+ whitespaces
  • (years?) - Group 2: year or years

Python demo:

import re
rx = re.compile(r"(\d+(?:-\d+)?\+?)\s*(years?)", re.I)
strs = ["10+ years", "10 years", "1 year", "10-15 years"] 
for description in strs:
    exp_temp = rx.search(description)
    if exp_temp:
        print(exp_temp.groups())

Output:

('10+', 'years')
('10', 'years')
('1', 'year')
('10-15', 'years')
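
For comparison, here is a small sketch of why the pattern from the question misses some inputs. Its single . matches exactly one character, so "10+ years" (two characters between the digits and the word) finds nothing, and "10-15 years" only yields the second number. The helper first_groups is a hypothetical name used for illustration.

```python
import re

# Pattern from the question: "." consumes exactly one character.
old_rx = re.compile(r"(\d+).(years|year|Year|Years)")
# Pattern suggested in this answer.
new_rx = re.compile(r"(\d+(?:-\d+)?\+?)\s*(years?)", re.I)

def first_groups(rx, text):
    """Return the groups of the first match, or None if there is no match."""
    m = rx.search(text)
    return m.groups() if m else None

samples = ["10+ years", "10 years", "1 year", "10-15 years"]
for s in samples:
    print(repr(s), "old:", first_groups(old_rx, s), "new:", first_groups(new_rx, s))
```

With the old pattern, "10+ years" gives None and "10-15 years" gives ('15', 'years'); the new pattern returns the full quantity in both cases.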

Upvotes: 4
