user238021

Reputation: 1221

Match a number in a string with letters and numbers

I need to write a Perl regex that matches the number in a word containing both letters and digits, for example test123. The regex should match only the numeric part and capture it.

I am trying \S*(\d+)\S*, but it captures only the 3, not 123.

Upvotes: 0

Views: 234

Answers (6)

Axeman

Reputation: 29844

If at least one non-digit character were required before the number (as in your example), you could use the following non-greedy expressions:

/\w+?(\d+)/ or /\S+?(\d+)/ 

(The second one is more in tune with your \S* specification.)
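As a minimal sketch of how the non-greedy prefix behaves, assuming the sample string test123 from the question (the variable names are only illustrative), compare it with the greedy pattern from the question:

```perl
use strict;
use warnings;

my $word = 'test123';

# \w+? and \S+? take as few characters as possible,
# so \d+ gets to start at the first digit.
my ($lazy_w) = $word =~ /\w+?(\d+)/;
my ($lazy_s) = $word =~ /\S+?(\d+)/;

# The greedy pattern from the question leaves only the last digit for \d+.
my ($greedy) = $word =~ /\S*(\d+)\S*/;

print "$lazy_w $lazy_s $greedy\n";   # prints "123 123 3"
```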

Your expression matches any string that contains one or more digits, and that may be what you want. It would even match a string of digits surrounded by spaces (" 123 "), because \S* means zero or more non-space characters and can therefore match the empty string at the border between the last space and the first digit. The same is true of the border between the '3' and the following space.

Chances are you don't need any surrounding pattern, and capturing the first run of digits in the string is enough. When that's not the case, it's good to know how to specify the expected pattern.

Upvotes: 1

ikegami

Reputation: 385556

Quantified regex atoms are greedy by default: they will match as much as they can.

Initially, the first \S* matched "test123", but the regex engine had to backtrack to allow \d+ to match. The result is:

 +------------------- Matches "test12"
 |    +-------------- Matches "3"
 |    |    +--------- Matches ""
 |    |    |
---  ---  ---
\S* (\d+) \S*

All you need is:

my ($num) = "test123" =~ /(\d+)/;

It'll try to match at position 0, then position 1, and so on until it finds a digit; then it will match as many digits as it can.
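To see where each pattern ends up capturing, here is a small sketch that uses Perl's built-in @- array ($-[1] is the start offset of the first capture group). The variable names are illustrative, and the string is the one from the question:

```perl
use strict;
use warnings;

my $str = 'test123';

# The greedy \S* settles on "test12" after backtracking,
# so group 1 starts at offset 6.
$str =~ /\S*(\d+)\S*/ or die "no match";
my ($greedy_num, $greedy_start) = ($1, $-[1]);

# Without the \S*, the engine scans forward to the first digit at offset 4.
$str =~ /(\d+)/ or die "no match";
my ($plain_num, $plain_start) = ($1, $-[1]);

print "greedy: $greedy_num at $greedy_start\n";   # greedy: 3 at 6
print "plain:  $plain_num at $plain_start\n";     # plain:  123 at 4
```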

Upvotes: 9

Eugene Yarmash

Reputation: 149736

\S matches any non-whitespace character, including digits. You want \d+:

my ($number) = 'test123' =~ /(\d+)/;

Upvotes: 1

Alberto Moriconi

Reputation: 1655

"something122320" =~ /(\d+)/ will capture 122320 in $1 (and return it in list context); this is probably what you're trying to do ;)
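What the match operator returns depends on context, so here is a short sketch of both forms (variable names are illustrative):

```perl
use strict;
use warnings;

my $text = 'something122320';

# List context: the match returns the captured groups.
my ($captured) = $text =~ /(\d+)/;

# Scalar context: the match returns true or false; the digits are in $1.
my $matched = $text =~ /(\d+)/;
my $from_dollar_one = $1;

print "$captured $from_dollar_one\n";   # prints "122320 122320"
```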

Upvotes: 1

Toni

Reputation: 338

The * quantifiers in your regex are greedy, which is why they also "eat" the digits. As @Marc said, you don't need them.

perl -e '$_ = "qwe123qwe"; ($numbers) = /(\d+)/; print "$numbers\n";'

Upvotes: 1

Marc

Reputation: 11613

I think parentheses signify capture groups, which is exactly what you don't want. Remove them. You're looking for /\d+/ or /[0-9]+/

Upvotes: -1
