RSax

Reputation: 338

Regex - lazy match first pattern occurrence, but no subsequent matching patterns

I need to return the first percentage, and only the first percentage, from each row in a file.

  1. Each row may have one or two, but not more than two, percentages.
  2. There may or may not be other numbers in the line, such as a dollar amount.
  3. The percentage may appear anywhere in the line.

Ex:

Profits in California were down 10.00% to $100.00, a decrease from 22.6% the prior year.
Profits in New York increased by 0.9%.
Profits in Texas were up 1.58% an increase from last year's 0.58%.

I can write a regex to capture all occurrences:

[0-9]+\.[0-9]+[%]+?

https://regex101.com/r/owZaGE/1

The other SO questions I've perused only address this issue when the pattern is at the front of the line or is always preceded by a particular set of characters.

What am I missing?

Upvotes: 1

Views: 129

Answers (3)

ggorlen

Reputation: 57115

/^.*?((?:\d+\.)?\d+%)/gm

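This works with the multiline flag and needs no negative lookbehind (some engines don't support variable-width lookbehinds). The lazy `.*?` consumes as little of the line as possible before the first percentage, so your match will be in the first capture group.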
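
As a rough illustration, the same pattern can be tried in Python against the sample rows from the question. This is only a sketch: it assumes the rows are already held in one multi-line string, and the names `FIRST_PCT`, `text`, and `first_percentages` are made up for the example.

```python
import re

# Pattern from the answer above. re.MULTILINE makes ^ match at the start
# of every line, and the lazy .*? stops at the earliest percentage.
FIRST_PCT = re.compile(r'^.*?((?:\d+\.)?\d+%)', re.MULTILINE)

# Sample rows copied from the question.
text = """Profits in California were down 10.00% to $100.00, a decrease from 22.6% the prior year.
Profits in New York increased by 0.9%.
Profits in Texas were up 1.58% an increase from last year's 0.58%."""

# With exactly one capture group, findall returns the group contents,
# giving one entry per line that contains a percentage.
first_percentages = FIRST_PCT.findall(text)
print(first_percentages)  # ['10.00%', '0.9%', '1.58%']
```

Because `^` can only match at a line start, a second percentage on the same line is never picked up as a separate match.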

Upvotes: 2

S.B

Reputation: 16526

Mine is similar to yours, except that it also allows numbers like 30% (without a decimal point):

\d+(\.\d+)?%

I don't know what language you are using, but in Python you can get the first occurrence with re.search(), which stops at the first match.

Here is an example:

import re

pattern = r'\d+(\.\d+)?%'
string = 'Profits in California were down 10.00% to $100.00, a decrease from 22.6% the prior year.'

print(re.search(pattern, string).group())
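
Since the question asks for one value per row of a file, the snippet above could be extended along these lines. This is a sketch under assumptions: `io.StringIO` stands in for the real file, the helper name `first_percentage` is made up, and the "Oregon" row is a hypothetical line added to show what happens when a row has no percentage (`re.search()` returns `None`, so calling `.group()` directly would raise an error).

```python
import io
import re
from typing import Optional

# Same idea as the pattern above, with a non-capturing group.
PCT = re.compile(r'\d+(?:\.\d+)?%')

def first_percentage(line: str) -> Optional[str]:
    """Return the first percentage in the line, or None if there is none."""
    match = PCT.search(line)
    return match.group() if match else None

# Stand-in for an open file; the last row is hypothetical.
sample_file = io.StringIO(
    "Profits in California were down 10.00% to $100.00, a decrease from 22.6% the prior year.\n"
    "Profits in New York increased by 0.9%.\n"
    "Profits in Oregon were flat.\n"
)

results = [first_percentage(line) for line in sample_file]
print(results)  # ['10.00%', '0.9%', None]
```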

Upvotes: 1

RSax

Reputation: 338

I was able to solve it using a negative lookbehind:

(?<!%.*?)([0-9]+\.[0-9]+[%]+?)
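
One caveat, tying in with the remark in another answer about engine support: the lookbehind here is variable-width, and not every engine accepts that. The sketch below checks this for Python's standard `re` module, which rejects the pattern, and shows an anchored pattern that may serve as a substitute there. The `compiles` helper and the `anchored` pattern are assumptions made for illustration, not part of the original solution.

```python
import re

# The lookbehind pattern from this answer.
lookbehind = r'(?<!%.*?)([0-9]+\.[0-9]+[%]+?)'

def compiles(pattern: str) -> bool:
    """Return True if Python's stdlib re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(lookbehind))  # False: stdlib re needs fixed-width lookbehind

# Hypothetical substitute: skip only non-% characters from the line start,
# so the captured percentage is the first one on its line.
anchored = r'^[^%\n]*?([0-9]+\.[0-9]+%)'

text = """Profits in California were down 10.00% to $100.00, a decrease from 22.6% the prior year.
Profits in New York increased by 0.9%.
Profits in Texas were up 1.58% an increase from last year's 0.58%."""

matches = re.findall(anchored, text, re.MULTILINE)
print(matches)  # ['10.00%', '0.9%', '1.58%']
```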

Upvotes: 0
