Reputation: 781
I have a text with some lines (200+) in this format:
10684 - The jackpot ? discuss Lev 3 --- ? ---
10755 - Garbage Heap ? discuss Lev 5 --- ? ---
I want to retrieve the first number (10684 or 10755) only if the number after "Lev" is greater than 3.
I'm able to get the first number with this regex: ([0-9]+) -
but without the 'level' restrictions.
How can this be done?
Thanks in advance.
Upvotes: 0
Views: 87
Reputation: 985
In bash use this:
var=">3"
perl -lne 'print $1 if /(\d+) - .*Lev (\d+)/ && $2'"$var"
This approach lets you pass the condition in as a parameter.
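For comparison, the same idea (capture both numbers with the regex, then compare numerically outside it) can be sketched in Python. This is only an illustration, not part of the Perl one-liner: the helper name `ids_above_level` and its `threshold` parameter are hypothetical, and the sample lines are the two from the question.

```python
import re

# Capture the leading ID and the number that follows "Lev".
LINE_RE = re.compile(r"(\d+) - .*Lev (\d+)")

def ids_above_level(lines, threshold=3):
    """Return the leading IDs of lines whose level exceeds threshold (hypothetical helper)."""
    ids = []
    for line in lines:
        m = LINE_RE.search(line)
        # Compare as integers, so multi-digit levels work without regex tricks.
        if m and int(m.group(2)) > threshold:
            ids.append(m.group(1))
    return ids

sample = [
    "10684 - The jackpot ? discuss Lev 3 --- ? ---",
    "10755 - Garbage Heap ? discuss Lev 5 --- ? ---",
]
print(ids_above_level(sample))  # ['10755']
```

Because the comparison happens in code, the threshold can change without touching the pattern.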
Upvotes: 0
Reputation: 189477
A bit of Awk trickery:
awk -F '[?] +discuss +Lev' '$2+0 > 3 { split($1, a, " "); print a[1] }' file
Upvotes: 0
Reputation: 54984
A lookahead works well here because the match itself is then just the number:
/\d+(?=.*Lev (0*[4-9]|[1-9]\d))/
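A minimal sketch of how that lookahead behaves, assuming Python's `re` module (the question does not say which engine is in use). The helper name `first_id` is hypothetical, and the sample lines are the ones from the question.

```python
import re

# The lookahead checks the level without consuming it,
# so the overall match is only the leading number.
LOOKAHEAD_RE = re.compile(r"\d+(?=.*Lev (0*[4-9]|[1-9]\d))")

def first_id(line):
    """Return the leading number if the level is above 3, else None (hypothetical helper)."""
    m = LOOKAHEAD_RE.search(line)
    return m.group(0) if m else None

sample = [
    "10684 - The jackpot ? discuss Lev 3 --- ? ---",
    "10755 - Garbage Heap ? discuss Lev 5 --- ? ---",
]
for line in sample:
    print(first_id(line))  # None, then 10755
```

Note that `group(0)` is used here: no capture group is needed to isolate the number.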
Upvotes: 0
Reputation: 6315
(\d+) - .*?Lev (?:[4-9]|[1-9]\d+)
The first \d+ matches the leading number, as you have already done.
The next part, .*?, uses a lazy quantifier, which consumes as few characters as possible; the expression that follows guides it to the right place. (A lazy quantifier is often more efficient in a case like this, though that depends on the input.)
The non-capturing group, (?:[4-9]|[1-9]\d+), matches either a single-digit number greater than 3 or a multi-digit number without a leading zero.
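To make the behaviour of the pattern concrete, here is a small sketch assuming Python's `re` module. The helper name `leading_id` is hypothetical; the first two test lines are from the question and the other two are made up to exercise the multi-digit branch.

```python
import re

# Lazy .*? walks forward to "Lev"; the group accepts 4-9 or any
# number of two or more digits that does not start with zero.
PATTERN = re.compile(r"(\d+) - .*?Lev (?:[4-9]|[1-9]\d+)")

def leading_id(line):
    """Return the leading number when the level is above 3, else None (hypothetical helper)."""
    m = PATTERN.match(line)
    return m.group(1) if m else None

lines = [
    "10684 - The jackpot ? discuss Lev 3 --- ? ---",
    "10755 - Garbage Heap ? discuss Lev 5 --- ? ---",
    "10800 - Made-up title ? discuss Lev 12 --- ? ---",  # illustrative only
    "10801 - Made-up title ? discuss Lev 30 --- ? ---",  # illustrative only
]
for line in lines:
    print(leading_id(line))  # None, 10755, 10800, 10801
```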
Stack Overflow doesn't display my image properly, so use this link instead: http://regexr.com?36n5l
Upvotes: 3
Reputation: 30273
Regular expressions don't treat numbers as numbers (only as strings). You can do this, though:
([0-9]+) - .*Lev (?:[4-9][^0-9]|[1-9][0-9]+)
Basically, we use the alternation operator (|) to accept either a single digit greater than 3 (enforced by checking that the following character is not a digit) or a multi-digit number not beginning with a zero.
If the level number might be at the end of the line, though, you might have to do this:
([0-9]+) - .*Lev (?:[4-9](?:[^0-9]|$)|[1-9][0-9]+)
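The difference between the two patterns only shows up when nothing follows the level. A quick sketch, assuming Python's `re` module; the names `WITHOUT_EOL` and `WITH_EOL` are hypothetical, and the test line is the question's second sample truncated right after the level.

```python
import re

# First pattern: a single digit must be followed by a non-digit character.
WITHOUT_EOL = re.compile(r"([0-9]+) - .*Lev (?:[4-9][^0-9]|[1-9][0-9]+)")
# Second pattern: the single digit may also be followed by end of line.
WITH_EOL = re.compile(r"([0-9]+) - .*Lev (?:[4-9](?:[^0-9]|$)|[1-9][0-9]+)")

# Hypothetical input where the level is the last thing on the line.
line = "10755 - Garbage Heap ? discuss Lev 5"

print(WITHOUT_EOL.search(line))         # None: no character after the 5
print(WITH_EOL.search(line).group(1))   # 10755
```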
(I'm assuming whatever regex engine you're using can't handle lookaround assertions. In the future, try to always include what language you're using when you're asking a regex question.)
Ah, I just read your edit that the number is always less than 10. Well, that's much easier then:
([0-9]+) - .*Lev [4-9]
Upvotes: 1