Reputation: 11
Below is the data I'm trying to parse:
50‐59 1High300.00 Avg300.00
90‐99 11High222.00 Avg188.73
120‐1293High204.00 Avg169.33
The first section is a weight range, next is a count, followed by Highprice, ending with Avgprice.
As an example, I need to parse the data above into an array which would look like
[0]50-59
[1]1
[2]High300.00
[3]Avg300.00
[0]90-99
[1]11
[2]High222.00
[3]Avg188.73
[0]120‐129
[1]3
[2]High204.00
[3]Avg169.33
I thought about creating an array of what the possible weight ranges can be but I can't figure out how to use the values of the array to split the string.
$arr = array("10-19","20-29","30-39","40-49","50-59","60-69","70-79","80-89","90-99","100-109","110-119","120-129","130-139","140-149","150-159","160-169","170-179","180-189","190-199","200-209","210-219","220-229","230-239","240-249","250-259","260-269","270-279","280-289","290-299","300-309");
Any ideas would be greatly appreciated.
Upvotes: 1
Views: 81
Reputation: 47904
This is a pattern you can trust (Pattern Demo):
/^((\d{0,2})0‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/m
The other answers overlooked the digit pattern in the weight range substring. The range start integer always ends in 0, and the range end integer always ends in 9; the range always spans ten integers.
My pattern captures the digits that precede the 0 in the starting integer, references them immediately after the dash, then requires that captured number to be followed by a 9.
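As a rough illustration of that backreference, here is a minimal sketch that isolates the weight-range part of the pattern. It assumes ASCII hyphens for readability, and the candidate strings are made up for the demonstration rather than taken from the question.

```php
<?php
// Range-only version of the pattern; ASCII hyphens are assumed for readability.
// The backreference is wrapped as (?:\1) so the following 9 is not read as
// part of the group number.
$rangePattern = '/^(\d{0,2})0-(?:\1)9$/';

// Hypothetical candidates: the first three are consistent ranges, the last two are not.
$candidates = ['0-9', '50-59', '120-129', '50-69', '120-139'];

$valid = [];
foreach ($candidates as $candidate) {
    if (preg_match($rangePattern, $candidate) === 1) {
        $valid[] = $candidate;
    }
}

print_r($valid); // keeps 0-9, 50-59 and 120-129
```

Because the end of the range must repeat the captured leading digits, a row such as 120-1293 can only be read as the range 120-129 followed by the count 3.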
I want to point out that your sample input was a little bit tricky because your ‐ is not the standard - that is between the 0 and = on my keyboard. This was a sneaky little gotcha for me to solve.
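If you would rather not embed the special character in the pattern, one option is to normalize the input first. This is only a sketch: normalizeHyphens is a hypothetical helper name, and it assumes the character in the data is U+2010 (HYPHEN) or one of a few similar dashes.

```php
<?php
// Hypothetical helper (not part of the answer above): map a few Unicode dash
// look-alikes to the ASCII hyphen-minus. "\u{...}" escapes need PHP 7.0+.
function normalizeHyphens(string $text): string
{
    $lookAlikes = ["\u{2010}", "\u{2011}", "\u{2012}", "\u{2013}"];
    return str_replace($lookAlikes, '-', $text);
}

// Sample row from the question, written with U+2010 (assumed to be the character used).
$line = "120\u{2010}1293High204.00 Avg169.33";
$normalized = normalizeHyphens($line);

// After normalizing, the same pattern written with an ASCII hyphen matches the row.
$pattern = '/^((\d{0,2})0-(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/m';
preg_match($pattern, $normalized, $m);

echo $m[1], PHP_EOL; // 120-129
```

Normalizing up front also means the captured weight range contains an ordinary hyphen, which is easier to compare against a list like the one in the question.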
Method (Demo):
$text = '50‐59 1High300.00 Avg300.00
90‐99 11High222.00Avg188.73
120‐1293High204.00 Avg169.33';
preg_match_all(
    '/^((\d{0,2})0‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/m',
    $text,
    $matches,
    PREG_SET_ORDER
);
var_export(
    array_map(
        fn($captured) => [
            'weight range' => $captured[1],
            'count' => $captured[3],
            'Highprice' => $captured[4],
            'Avgprice' => $captured[5]
        ],
        $matches
    )
);
Output:
array (
  0 =>
  array (
    'weight range' => '50‐59',
    'count' => '1',
    'Highprice' => '300.00',
    'Avgprice' => '300.00',
  ),
  1 =>
  array (
    'weight range' => '90‐99',
    'count' => '11',
    'Highprice' => '222.00',
    'Avgprice' => '188.73',
  ),
  2 =>
  array (
    'weight range' => '120‐129',
    'count' => '3',
    'Highprice' => '204.00',
    'Avgprice' => '169.33',
  ),
)
Upvotes: 0
Reputation: 91430
$arr = array('50-59 1High300.00 Avg300.00',
'90-99 11High222.00 Avg188.73',
'120-129 3High204.00 Avg169.33');
$result = array();
foreach ($arr as $str) {
    if (preg_match('/^(\d+-\d{1,3})\s*(\d+)(High\d+\.\d\d) (Avg\d+\.\d\d)/i', $str, $m)) {
        array_shift($m); // remove group 0 (i.e. the whole match)
        $result[] = $m;
    }
}
print_r($result);
Output:
Array
(
[0] => Array
(
[0] => 50-59
[1] => 1
[2] => High300.00
[3] => Avg300.00
)
[1] => Array
(
[0] => 90-99
[1] => 11
[2] => High222.00
[3] => Avg188.73
)
[2] => Array
(
[0] => 120-129
[1] => 3
[2] => High204.00
[3] => Avg169.33
)
)
Explanation:
/ : regex delimiter
^ : beginning of string
( : start group 1
\d+-\d{1,3} : 1 or more digits, a dash, and 1 up to 3 digits, i.e. the weight range
) : end group 1
\s* : 0 or more whitespace characters
(\d+) : group 2, i.e. the count
(High\d+\.\d\d) : group 3, literal High followed by the price
(space) : a literal space between the two prices
(Avg\d+\.\d\d) : group 4, literal Avg followed by the price
/i : regex delimiter and case-insensitive modifier
To be more generic, you could replace High and Avg with [a-z]+
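A minimal sketch of that more generic variant follows. The labels Low and Mean in the sample row are invented for illustration and do not appear in the original data.

```php
<?php
// [a-z]+ stands in for the literal High and Avg labels; the i modifier
// makes the letters case-insensitive.
$pattern = '/^(\d+-\d{1,3})\s*(\d+)([a-z]+\d+\.\d\d) ([a-z]+\d+\.\d\d)/i';

// Hypothetical row: "Low" and "Mean" are made-up labels for this demonstration.
$row = '120-1293Low150.00 Mean160.50';

$fields = array();
if (preg_match($pattern, $row, $m)) {
    $fields = array_slice($m, 1); // drop group 0, the whole match
}

print_r($fields); // 120-129, 3, Low150.00, Mean160.50
```

The trade-off is that any run of letters is now accepted as a label, so a malformed row is less likely to be rejected.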
Upvotes: 0
Reputation: 19375
'#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#'
I don't think that will work because of the space you have in the regex between the weight and count. The thing I'm struggling with is a row like this where there is no space.
120‐1293High204.00 Avg169.33
that needs to be parsed like [0]120‐129 [1]3 [2]High204.00 [3]Avg169.33
You are right. That can be remedied by limiting the number of weight digits to three and making the space optional.
'#^(\d+-\d{1,3}) *…
Upvotes: 1
Reputation: 15141
Hope this will work:
$string='50-59 1High300.00 Avg300.00
90-99 11High222.00 Avg188.73
120-129 3High204.00 Avg169.33';
$requiredData = array();
$dataArray = explode("\n", $string);
$counter = 0;
foreach ($dataArray as $data) {
    if (preg_match('#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#', $data, $matches)) {
        $requiredData[$counter][] = $matches[1];
        $requiredData[$counter][] = $matches[2];
        $requiredData[$counter][] = $matches[3];
        $requiredData[$counter][] = $matches[4];
        $counter++;
    }
}
print_r($requiredData);
Upvotes: 1