Reputation: 5668
I'm using PHP to parse an e-mail and want to get the number after a specific string.
For example, I would want to get the number 033 from a string that looks like:
Account Number: 033
Account Information: Some text here
The content is actually HTML, so the input string is more accurately presented as:
<font face="Arial, Helvetica, sans-serif" color="#000099"><strong><font color="#660000">Account Number</font></strong><font color="#660000">: 033<br><strong>Account Name</strong>: More text here<br>
There is always the text Account Number: followed by the number and then a line break. I have:
preg_match_all('!\d+!', $str, $matches);
But that just gets all the numbers.
Upvotes: 1
Views: 13731
Reputation: 8191
If the number is always after Account Number: (including that space at the end), then just add that to your regex:
preg_match_all('/Account Number: (\d+)/',$str,$matches);
// The parentheses capture the digits and store them in $matches[1]
Results:
$matches Array:
(
[0] => Array
(
[0] => Account Number: 033
)
[1] => Array
(
[0] => 033
)
)
Note: If there is HTML present, it can be included in the regex as well, as long as you don't expect the HTML to change. Otherwise, I suggest using an HTML DOM parser to get the plain-text version of your string and running a regex on that.
With that said, the following is an example that includes the HTML in the regex and provides the same output as above:
// Notice the delimiter
preg_match_all('@<font face="Arial, Helvetica, sans-serif" color="#000099"><strong><font color="#660000">Account Number</font></strong><font color="#660000">: (\d+)@',$str,$matches);
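The DOM-parser route suggested in the note above could look roughly like the following. This is a minimal sketch, assuming PHP's bundled DOM extension is available and reusing the HTML fragment from the question; the variable names are illustrative only.

```php
<?php
// HTML fragment taken from the question.
$str = '<font face="Arial, Helvetica, sans-serif" color="#000099"><strong><font color="#660000">Account Number</font></strong><font color="#660000">: 033<br><strong>Account Name</strong>: More text here<br>';

// Collect libxml warnings internally instead of printing them;
// the old-style, unclosed <font> markup would otherwise be noisy.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($str);
libxml_clear_errors();

// textContent concatenates the text nodes and discards the tags.
$text = $dom->documentElement->textContent;

// The regex now only has to deal with plain text.
$accountNumber = preg_match('/Account Number:\s*(\d+)/', $text, $m) ? $m[1] : null;

var_dump($accountNumber);
```

Because a br element contributes no text, the number ends up directly in front of the next label. That is harmless here, since \d+ stops at the first non-digit character.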
Upvotes: 11
Reputation: 47874
@montes is appropriately calling strip_tags() to sanitize/simplify the input text before using regex to extract the targeted substring. However, the pattern could use some refinement, and assuming there is only one Account Number per email, you shouldn't be using preg_match_all(), but preg_match().
- There is no need for the i pattern modifier.
- There are no ^ or $ metacharacters in the pattern, so the m pattern modifier is useless.
- There are no . metacharacters in the pattern, so the s pattern modifier is useless.
- \K restarts the fullstring match. This is beneficial because it removes the necessity to use a capture group.
Code:
$html = '<font face="Arial, Helvetica, sans-serif" color="#000099"><strong><font
color="#660000">Account Number</font></strong><font color="#660000">: 033<br>
<strong>Account Name</strong>: More text here<br>';
echo preg_match('~Account Number:\s*\K\d+~', strip_tags($html), $match)
? $match[0]
: 'No Account Number Found';
Output:
033
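To make the effect of \K concrete, the sketch below runs the same input through a capture-group pattern and through the \K pattern. The input string and variable names are illustrative assumptions, not part of the original answer.

```php
<?php
$text = 'Account Number: 033';

// With a capture group, the full match sits at index 0 and the digits at index 1.
preg_match('~Account Number:\s*(\d+)~', $text, $withGroup);

// With \K, everything matched before \K is dropped from the full match,
// so index 0 holds only the digits and no capture group is needed.
preg_match('~Account Number:\s*\K\d+~', $text, $withK);

var_dump($withGroup); // two elements: the full match and the captured digits
var_dump($withK);     // one element: the digits only
```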
Upvotes: 0
Reputation: 606
Taking the HTML as the base:
$str = '<font face="Arial, Helvetica, sans-serif" color="#000099"><strong><font
color="#660000">Account Number</font></strong><font color="#660000">: 033<br>
<strong>Account Name</strong>: More text here<br>';
preg_match_all('!Account Number:\s+(\d+)!ims', strip_tags($str), $matches);
var_dump($matches);
and we get:
array(2) {
[0]=>
array(1) {
[0]=>
string(19) "Account Number: 033"
}
[1]=>
array(1) {
[0]=>
string(3) "033"
}
}
Upvotes: 1
Reputation: 15045
$str = 'Account Number: 033
Account Information: Some text here';
preg_match('/Account Number:\s*(\d+)/', $str, $matches);
echo $matches[1]; // 033
You don't need to use preg_match_all(). Also, you did not capture your match by placing it within parentheses.
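One thing the snippet above does not cover is an e-mail that lacks the label, in which case $matches[1] is never set. A minimal sketch of a guarded version follows; extractAccountNumber() is a hypothetical helper name introduced only for illustration.

```php
<?php
// Hypothetical helper: returns the digits as a string, or null when the label is missing.
function extractAccountNumber(string $text): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match('/Account Number:\s*(\d+)/', $text, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(extractAccountNumber("Account Number: 033\nAccount Information: Some text here"));
var_dump(extractAccountNumber('No label in this text'));
```

Returning the value as a string keeps the leading zero; casting it to an integer would turn 033 into 33.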
Upvotes: 3