Reputation: 5148
This is my original string:
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
First I tried to strip the HTML tags:
$license_str = strip_tags($license_str );
The output is:
Code#1: 11516331226428373002Code#2: 11512231686337183002
Then I ran preg_split to extract the two license codes:
$license_code = preg_split("@: @",$license_str,Null,PREG_SPLIT_NO_EMPTY);
But the output is wrong:
array(3) {
[0]=>
string(6) "Code#1"
[1]=>
string(26) "11516331226428373002Code#2"
[2]=>
string(20) "11512231686337183002"
}
It should return an array with exactly two values, one per license number.
Is there a better way to do this?
PS: The labels Code#1 and Code#2 are dynamic; there could be a #3 or any other number.
Upvotes: 1
Views: 224
Reputation: 1447
Parsing HTML with string manipulation isn't recommended, because HTML-compliant markup has many edge cases. It's better to use an HTML parser.
One approach is to use PHP's DOM extension, like this:
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
$license_codes = [];
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$dom = new DOMDocument();
$dom->loadHTML($license_str);
$domlist = (new DOMXPath($dom))->evaluate('//strong[contains(.,"Code#")]/following-sibling::text()');
foreach ($domlist as $domtext) {
$license_codes[] = $domtext->textContent;
}
/*
$license_codes = array (
0 => '11516331226428373002',
1 => '11512231686337183002',
);
*/
The code above extracts the text that follows each HTML <strong> tag containing the text "Code#".
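As a variation on the DOM approach, here is a minimal sketch that keeps each label together with its code. The function name extractLicenseCodes is hypothetical (it is not part of the answer above), and the sketch assumes each code is the first text node directly after its <strong> label.

```php
<?php
// Hypothetical helper: maps each "Code#N" label to the code that follows it.
function extractLicenseCodes(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $codes = [];
    foreach ($xpath->query('//strong[contains(., "Code#")]') as $label) {
        // Assumption: the code is the first text node after the label.
        $text = $xpath->query('following-sibling::text()[1]', $label)->item(0);
        if ($text !== null) {
            // "Code#1: " becomes the key "Code#1".
            $key = rtrim(trim($label->textContent), ':');
            $codes[$key] = trim($text->textContent);
        }
    }
    return $codes;
}

$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
print_r(extractLicenseCodes($license_str));
```

Restricting the XPath step to text()[1] means a label only picks up its own code, even if more text nodes follow later in the same parent.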
Upvotes: 0
Reputation: 9509
You can use a regex to replace Code#1, Code#2, ... with a #, and then split the string on that character.
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
$license_str = strip_tags($license_str );
$license_str = preg_replace('/Code#[0-9]+: /', '#', $license_str);
// Drop the leading "#" first; otherwise explode() returns an empty first element.
$license_code = explode("#", ltrim($license_str, "#"));
var_dump($license_code);
Alternatively, the following applies a single regex to the HTML to extract the codes. However, it depends on the codes always sitting between a </strong> and a <br> tag:
$matches = array();
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183003<br>";
// preg_match_all() returns the number of matches; the codes themselves land in $matches[1].
$match_count = preg_match_all('/<\/strong>(\d+)<br>/', $license_str, $matches);
$matches = $matches[1] ?? false;
var_dump($matches);
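If the surrounding tags cannot be relied on, a similar single-pattern approach can run on the stripped text instead. This is only a sketch, assuming every code consists solely of digits and every label has the form Code#N followed by a colon; the variable names $plain and $codes are illustrative.

```php
<?php
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
$plain = strip_tags($license_str);

// Group 1 is the label number, group 2 is the run of digits after the colon.
preg_match_all('/Code#(\d+):\s*(\d+)/', $plain, $matches, PREG_SET_ORDER);

$codes = [];
foreach ($matches as $m) {
    // Key each code by its label, e.g. "Code#1".
    $codes['Code#' . $m[1]] = $m[2];
}
var_dump($codes);
```

Because the codes are matched as digits only, the match stops at the "C" of the next label, so no separator between entries is needed.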
Upvotes: 0
Reputation: 64
You can make use of HTML tags instead of stripping them, like this:
<?php
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br>";
if (preg_match_all('#<strong>([^\<]+)\s</strong>([^\<]+)<br>#', $license_str, $matches)) {
    // Use the labels as keys and the codes as values.
    $license_code = array_combine($matches[1], $matches[2]);
    print_r($license_code);
}
Upvotes: 0
Reputation: 3006
You can split the string on the pattern '/Code#(\d+):/', like this:
<?php
$license_str = "<strong>Code#1: </strong>11516331226428373002<br><strong>Code#2: </strong>11512231686337183002<br><strong>Code#3: </strong>11512231686337183008<strong>Code#4: </strong>11512231686337183007<br>";
$license_str = strip_tags($license_str );
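// Split on "Code#<number>:", e.g. Code#1:, Code#2:, Code#3:, etc.
// Each piece still starts with the space that followed the colon, so trim it.
$result = array_map('trim', preg_split("/Code#(\d+):/", $license_str, -1, PREG_SPLIT_NO_EMPTY));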
echo "<pre>";
print_r($result);
Upvotes: 2