Reputation: 897
I would like to scan a large piece of text using PHP and find all matches for a pattern, and for each match also capture the 2 lines above it and the 2 lines below it.
My text looks like this, but with some extra unnecessary text above and below this sample:
1
Description text
123.456.12
10.00
10.00
3
Different Description text
234.567.89
10.00
30.00
#Some footer text that is not needed and will change for each text file#
15
More description text
564.238.02
4.00
60.00
15
More description text
564.238.02
4.00
60.00
#Some footer text that is not needed and will change for each text file#
15
More description text
564.238.02
4.00
60.00
15
More description text
564.238.02
4.00
60.00
Using PHP, I am looking to match each item number (the third line of each record, e.g. 123.456.12; always the same format: 3 digits, dot, 3 digits, dot, 2 digits), return the 2 lines before it and the 2 lines after it, and ideally end up with an array so that I can use:
$contents[$i]["qty"] = "1";
$contents[$i]["description"] = "Description text";
$contents[$i]["price"] = "10.00";
$contents[$i]["total"] = "10.00";
etc...
Is this possible and would I use regex? Any help or advice would be greatly appreciated!
Thanks
ANSWERED BY vzwick
This is my final code that I used:
$items_array = array();
$counter = 0;
if (preg_match_all('/(\d+)\n\n(\w.*)\n\n(\d{3}\.\d{3}\.\d{2})\n\n(\d.*)\n\n(\d.*)/', $text_file, $matches)) {
    $items_string = $matches[0];
    foreach ($items_string as $value) {
        $item = explode("\n\n", $value);
        $items_array[$counter]["qty"] = $item[0];
        $items_array[$counter]["description"] = $item[1];
        $items_array[$counter]["number"] = $item[2];
        $items_array[$counter]["price"] = $item[3];
        $items_array[$counter]["total"] = $item[4];
        $counter++;
    }
}
else
{
    die("No matching patterns found");
}
print_r($items_array);
Upvotes: 0
Views: 1625
Reputation: 919
(.+)\n+(.+)\n+(\d{3}\.\d{3}\.\d{2})\n+(.+)\n+(.+)
If the file uses Windows line endings, it might be necessary to replace \n with \r\n. Make sure the regex is in a mode where "." does not match the newline character (in PHP this is the default; the s modifier changes it).
To reference groups by name, use named capturing groups:
(?P<name>regex)
example of named capturing groups.
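As a rough illustration of named groups applied to the format in the question, here is a minimal sketch. The sample string, the group names (qty, description, number, price, total) and the $contents variable are assumptions made for the example, and it assumes the fields are separated by one or more line breaks. It uses \R so that both \n and \r\n endings are accepted, and [^\r\n]+ instead of "." so a stray \r never ends up in the description.

```php
<?php
// Hypothetical sample input; the real text would come from the file.
$text = "1\n\nDescription text\n\n123.456.12\n\n10.00\n\n10.00\n\n"
      . "3\n\nDifferent Description text\n\n234.567.89\n\n10.00\n\n30.00\n";

// ^ with the m modifier anchors the quantity at the start of a line.
// \R+ matches one or more line breaks of any kind (\n, \r\n, ...).
$pattern = '/^(?P<qty>\d+)\R+'
         . '(?P<description>[^\r\n]+)\R+'
         . '(?P<number>\d{3}\.\d{3}\.\d{2})\R+'
         . '(?P<price>\d+\.\d{2})\R+'
         . '(?P<total>\d+\.\d{2})/m';

$contents = array();
// PREG_SET_ORDER yields one array per record instead of one per group.
if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $contents[] = array(
            'qty'         => $m['qty'],
            'description' => $m['description'],
            'number'      => $m['number'],
            'price'       => $m['price'],
            'total'       => $m['total'],
        );
    }
}
print_r($contents);
```

With this, $contents[$i]["qty"], $contents[$i]["description"] and so on can be read directly, without splitting the full match again.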
Upvotes: 1
Reputation: 11044
$filename = "yourfile.txt";
$fp = @fopen($filename, "r");
if (!$fp) die('Could not open file ' . $filename);
$i = 0; // element counter
$n = 0; // inner element counter
$field_names = array('qty', 'description', 'some_number', 'price', 'total');
$result_arr = array();
while (($line = fgets($fp)) !== false) {
    $result_arr[$i][$field_names[$n]] = trim($line);
    $n++;
    if ($n % count($field_names) == 0) {
        $i++;
        $n = 0;
    }
}
fclose($fp);
print_r($result_arr);
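The loop above assumes every line belongs to a record, so header or footer lines would shift the fields. A possible line-based variant, sketched under the assumption that the item number is always surrounded by exactly two record lines on each side, looks for the number pattern and then picks the 2 lines before and after it. The sample text here is made up for the example.

```php
<?php
// Hypothetical sample input, including header and footer lines to skip.
$text = "Header text\n1\nDescription text\n123.456.12\n10.00\n10.00\n"
      . "#Some footer text#\n15\nMore description text\n564.238.02\n4.00\n60.00\n";

// Split on any kind of line break, trim each line, drop empty lines.
$lines = array_values(array_filter(array_map('trim', preg_split('/\R/', $text)), 'strlen'));

$result_arr = array();
$last = count($lines) - 1;
foreach ($lines as $i => $line) {
    // Only lines that consist solely of the item number are of interest.
    if (!preg_match('/^\d{3}\.\d{3}\.\d{2}$/', $line)) {
        continue;
    }
    // Skip matches that do not have 2 lines on both sides.
    if ($i < 2 || $i + 2 > $last) {
        continue;
    }
    $result_arr[] = array(
        'qty'         => $lines[$i - 2],
        'description' => $lines[$i - 1],
        'some_number' => $line,
        'price'       => $lines[$i + 1],
        'total'       => $lines[$i + 2],
    );
}
print_r($result_arr);
```

Because blank lines are removed first, this works whether or not the fields are separated by empty lines.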
Edit: Well, regex then.
$filename = "yourfile.txt";
$file_contents = @file_get_contents($filename);
if (!$file_contents) die("Could not open file " . $filename . " or empty file");
if (preg_match_all('/(\d+)\n\n(\w.*)\n\n(\d{3}\.\d{3}\.\d{2})\n\n(\d.*)\n\n(\d.*)/', $file_contents, $matches)) {
    print_r($matches[0]);
    // do your matching to field names from here ..
}
else
{
    die("No matching patterns found");
}
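One way the mapping to field names could be done, as a sketch only: ask preg_match_all for PREG_SET_ORDER, drop the full match from each set, and pair the remaining five groups with the field names via array_combine. The sample string stands in for the file contents and is an assumption for the example.

```php
<?php
// Field names as used in the first snippet of this answer.
$field_names = array('qty', 'description', 'some_number', 'price', 'total');

// Hypothetical sample input standing in for the file contents.
$file_contents = "header\n\n15\n\nMore description text\n\n564.238.02\n\n4.00\n\n60.00\n\n#footer#\n";

$pattern = '/(\d+)\n\n(\w.*)\n\n(\d{3}\.\d{3}\.\d{2})\n\n(\d.*)\n\n(\d.*)/';

$result_arr = array();
if (preg_match_all($pattern, $file_contents, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        array_shift($match); // remove index 0, the full match
        // The five captured groups line up with the five field names.
        $result_arr[] = array_combine($field_names, $match);
    }
}
print_r($result_arr);
```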
Upvotes: 2
Reputation: 1901
You could load the file into an array and then use array_slice to cut it into blocks of 5 lines each.
<?php
$file = file("myfile");
$finalArray = array();
for ($i = 0; $i < sizeof($file); $i = $i + 5)
{
    $finalArray[] = array_slice($file, $i, 5);
}
print_r($finalArray);
?>
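The same idea can probably be written more compactly with array_chunk. The sketch below also passes FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES to file() so that trailing newlines and blank lines do not end up in the blocks. The temporary file and its contents are made up for the example, and it assumes the file contains nothing but complete 5-line records; any header or footer line would shift the blocks.

```php
<?php
// Hypothetical sample file with blank lines between the fields.
$path = tempnam(sys_get_temp_dir(), 'items');
file_put_contents(
    $path,
    "1\n\nDescription text\n\n123.456.12\n\n10.00\n\n10.00\n\n"
    . "3\n\nDifferent Description text\n\n234.567.89\n\n10.00\n\n30.00\n"
);

// Read the lines without their newline characters and without empty lines.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// Split into consecutive blocks of 5 lines, one block per record.
$finalArray = array_chunk($lines, 5);

unlink($path); // clean up the temporary file
print_r($finalArray);
```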
Upvotes: 0