lolalola

Reputation: 3823

regex: find one-digit number

I need to find all the one-digit numbers in a text.

My code:

$string = 'text 4 78 text 558 [email protected] 5 text 78998 text';
$pattern = '/ [\d]{1} /';

(result: 4 and 5)

Everything works perfectly; I just wanted to ask whether it is correct to use spaces. Maybe there is some other way to pick out one-digit numbers?

Thanks

Upvotes: 24

Views: 103208

Answers (5)

Cary Swoveland

Reputation: 110675

If one-digit numbers can be preceded or followed by characters other than digits (e.g., "A1 Sauce" or "He lives in unit 9B"), use

(?<!\d)\d(?!\d)


The regular expression reads: match a digit (\d) that is neither preceded nor followed by a digit, (?<!\d) being a negative lookbehind and (?!\d) being a negative lookahead.
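A minimal sketch of how this could be checked in PHP, using the two example phrases from above plus a fragment of the question's string. The variable names ($samples, $found) are only illustrative.

```php
<?php
// Lookaround pattern from the answer: a digit with no digit on either side.
$pattern = '/(?<!\d)\d(?!\d)/';

// Sample inputs (illustrative): two phrases from the answer, one from the question.
$samples = ['A1 Sauce', 'He lives in unit 9B', 'text 4 78 text 558'];

$found = [];
foreach ($samples as $s) {
    preg_match_all($pattern, $s, $m);
    $found[$s] = $m[0]; // $m[0] holds the full matches
}

print_r($found);
```

The digits in "78" and "558" are skipped because each of them has another digit as a neighbour, while the "1" in "A1" and the "9" in "9B" are matched even though they touch letters.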

Upvotes: 3

user626607


Search around word boundaries:

\b\d\b

As the other answers explain, this matches any single digit that stands alone as a word, so punctuation around it is ignored: it will also match the single digits next to a "." in an IP address. To handle that, see the answers by F.J and Mike Brant.

Upvotes: 8

Andrew Clark

Reputation: 208435

First of all, [\d]{1} is equivalent to \d.

As for your question, it would be better to use a zero-width assertion such as a lookbehind/lookahead or a word boundary (\b). Otherwise you will not match consecutive single digits, because the space that should lead the second digit has already been consumed as the trailing space of the first digit (and overlapping matches are not found).
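To make the overlap problem concrete, here is a small sketch comparing the space-delimited pattern with a word-boundary one. The input string 'a 1 2 3 b' is made up for illustration.

```php
<?php
// Illustrative input with three consecutive single digits.
$string = 'a 1 2 3 b';

// Pattern with literal spaces: each match consumes the spaces around the digit.
preg_match_all('/ \d /', $string, $withSpaces);

// Zero-width alternative: \b asserts a boundary without consuming anything.
preg_match_all('/\b\d\b/', $string, $withBoundaries);

print_r($withSpaces[0]);     // " 1 " and " 3 " (the 2 is skipped)
print_r($withBoundaries[0]); // "1", "2" and "3"
```

The space after the 1 is part of the first match, so the engine resumes at the 2 with no leading space left to match.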

Here is how I would write this:

(?<!\S)\d(?!\S)

This means "match a digit only if there is not a non-whitespace character before it, and there is not a non-whitespace character after it".

I used the double negative like (?!\S) instead of (?=\s) so that you will also match single digits that are at the beginning or end of the string.
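A short sketch of that difference, assuming a made-up input that starts and ends with a single digit:

```php
<?php
// Illustrative input: single digits at the very start and the very end.
$string = '7 apples and 3';

// Positive lookarounds demand an actual whitespace character on each side.
preg_match_all('/(?<=\s)\d(?=\s)/', $string, $positive);

// Negative lookarounds only demand that no non-whitespace character is there,
// which is also true at the start and end of the string.
preg_match_all('/(?<!\S)\d(?!\S)/', $string, $negative);

print_r($positive[0]); // empty: neither digit has whitespace on both sides
print_r($negative[0]); // "7" and "3"
```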

I prefer this over \b\d\b for your example because it looks like you really only want to match when the digit is surrounded by spaces, and \b\d\b would match the 4 and the 5 in a string like 192.168.4.5
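For comparison, a quick sketch running both patterns against that IP address:

```php
<?php
$ip = '192.168.4.5';

// Word boundaries treat "." as a non-word character, so 4 and 5 qualify.
preg_match_all('/\b\d\b/', $ip, $boundary);

// The whitespace-based lookarounds reject digits that touch a ".".
preg_match_all('/(?<!\S)\d(?!\S)/', $ip, $whitespace);

print_r($boundary[0]);   // "4" and "5"
print_r($whitespace[0]); // empty
```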

To allow punctuation at the end, you could use the following:

(?<!\S)\d(?![^\s.,?!])

Add any additional punctuation characters that you want to allow after the digit to the character class (inside of the square brackets, but make sure it is after the ^).
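As a rough check of the punctuation-tolerant version, assuming a made-up sentence that mixes the allowed punctuation with cases that should be rejected:

```php
<?php
// Pattern from above: whitespace (or start) before, and after the digit
// anything except a character outside whitespace and . , ? !
$pattern = '/(?<!\S)\d(?![^\s.,?!])/';

// Illustrative input: 3, 5 and 7 are followed by allowed punctuation;
// "8x" and "42!" should not produce a match.
$string = 'Pick 3, then 5. Is it 7? Not 8x or 42!';

preg_match_all($pattern, $string, $m);
print_r($m[0]); // "3", "5" and "7"
```

Note that the punctuation itself is not part of the match, since the lookahead does not consume characters.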

Upvotes: 35

Mike Brant

Reputation: 71384

It really depends on where the numbers can appear and whether you care if they are adjacent to other characters (like . at the end of a sentence). At the very least, I would use word boundaries so that you can get numbers at the beginning and end of the input string:

$pattern = '/\b\d\b/';

But you might consider punctuation at the end like:

$pattern = '/\b\d(\b|\.|\?|\!)/';
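One thing worth checking before adopting the longer pattern: since ".", "?" and "!" are non-word characters, \b already succeeds between a digit and any of them, so the extra alternatives may never be reached. A small sketch on a made-up sentence suggests both patterns return the same full matches:

```php
<?php
// Illustrative input with single digits followed by punctuation, plus "42!".
$string = 'Is it 5? Yes, 4. Not 42!';

preg_match_all('/\b\d\b/', $string, $plain);
preg_match_all('/\b\d(\b|\.|\?|\!)/', $string, $withPunctuation);

print_r($plain[0]);           // "5" and "4"
print_r($withPunctuation[0]); // "5" and "4" as well; the punctuation is not captured
```

If the punctuation should be part of the match, the alternatives would have to come before \b in the group, but that goes beyond what the question asks for.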

Upvotes: 0

kjetilh

Reputation: 4976

Use word boundaries. Note that the quantifier {1} is redundant (a bare \d already matches exactly one digit), and so is the character class [], because it contains only a single item.

\b\d\b

Upvotes: 23
