Reputation: 2713
The data contained in the text file (actually a .dat) looks like:
LIN*1234*UP*abcde*33*0*EA
LIN*5678*UP*fghij*33*0*EA
LIN*9101*UP*klmno*33*23*EA
There are actually over 500,000 such lines in the file.
This is what I'm using now:
// retrieve file once
$file = file_get_contents('/data.dat');
$file = explode('LIN', $file);
...some code
foreach ($list as $item) { // an array containing 10 items
    foreach ($file as $line) { // checking if these items are on huge list
        $info = explode('*', $line);
        // field 3 is the item code, e.g. abcde
        if (isset($info[3]) && $info[3] == $item[0]) {
            ...do stuff...
            break; // stop checking if found
        }
    }
}
The problem is that it runs far too slowly: about 1.5 seconds per iteration. I separately confirmed that the '...do stuff...' part is not what affects the speed; rather, it's the search for the correct item.
How can I speed this up? Thank you.
Upvotes: 2
Views: 2408
Reputation: 889
When you call file_get_contents, it loads the whole file into memory, so you can imagine how resource-intensive that is. On top of that, you have a nested loop, so the work grows as O(n × m): each of the n items is checked against up to m lines.
You can either split the file, if possible, or use fopen, fgets and fclose to read it line by line.
If I were you, I'd use another language such as C++ or Go if I really needed the speed.
Upvotes: 0
Reputation: 173642
If each item is on its own line, then instead of loading the whole thing into memory, it might be better to use fgets():
$f = fopen('text.txt', 'rt');
// fgets() returns false at end of file, which ends the loop
while (($line = fgets($f)) !== false) {
    $line = rtrim($line, "\r\n");
    $info = explode('*', $line);
    // etc.
}
fclose($f);
PHP file streams are buffered (~8kB), so it should be decent in terms of performance.
The other piece of logic can be rewritten like this (instead of iterating the file multiple times):
if (in_array($info[3], $items)) // look up $info[3] inside the array of 10 things
Or, if $items is suitably indexed:
if (isset($items[$info[3]])) { ... }
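Putting the two pieces together, a single pass over the file with a keyed lookup might look like the sketch below. It is only an illustration under some assumptions: findItems() is a hypothetical helper name, $wanted is assumed to be a flat list of the item codes (for example, the $item[0] values from the question), and each record is assumed to sit on its own line.

```php
<?php
// Hypothetical helper: returns the parsed fields of every wanted code found in the file.
function findItems(string $path, array $wanted): array
{
    // Flip the list so the codes become keys; isset() on a key is a hash lookup.
    $pending = array_flip($wanted);
    $found = [];

    $f = fopen($path, 'r');
    if ($f === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Stop as soon as every wanted code has been seen, or at end of file.
    while ($pending && ($line = fgets($f)) !== false) {
        $info = explode('*', rtrim($line, "\r\n"));
        // $info[3] is the item code, e.g. "abcde" in LIN*1234*UP*abcde*33*0*EA
        if (isset($info[3], $pending[$info[3]])) {
            $found[$info[3]] = $info;
            unset($pending[$info[3]]); // first match wins, like the original break
        }
    }
    fclose($f);

    return $found;
}
```

Called as findItems('/data.dat', $codes), this reads the file at most once regardless of how many codes are being searched for, and only one line is held in memory at a time.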
Upvotes: 3
Reputation: 26076
file_get_contents loads the whole file into memory as a single string, and then your code acts on it. Adapting this sample code from the official PHP fgets documentation should work better:
$handle = @fopen("test.txt", "r");
if ($handle) {
    // read one line (up to 4096 bytes) at a time
    while (($buffer = fgets($handle, 4096)) !== false) {
        $file_data = explode('LIN', $buffer);
        foreach ($file_data as $line) {
            $info = explode('*', $line);
            // drops empty fields; note that it also drops fields equal to "0"
            $info = array_filter($info);
            if (!empty($info)) {
                echo '<pre>';
                print_r($info);
                echo '</pre>';
            }
        }
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
}
The output of the above code using your data is:
Array
(
    [1] => 1234
    [2] => UP
    [3] => abcde
    [4] => 33
    [6] => EA
)
Array
(
    [1] => 5678
    [2] => UP
    [3] => fghij
    [4] => 33
    [6] => EA
)
Array
(
    [1] => 9101
    [2] => UP
    [3] => klmno
    [4] => 33
    [5] => 23
    [6] => EA
)
Your omitted code is still unclear, though. Take the line that reads:
foreach ($list as $item) { //an array containing 10 items
That looks like another real choke point, since everything inside it runs once per item.
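One detail worth noting in the output above: index 5 is missing from the first two arrays because array_filter() without a callback removes every falsy value, and the string "0" counts as falsy. If that field matters, a callback that only rejects empty strings keeps it. A minimal sketch, using one record from the sample data (the variable names $loose and $strict are just for illustration):

```php
<?php
// One record as it looks after splitting on 'LIN'; field 5 is the string "0".
$info = explode('*', '*1234*UP*abcde*33*0*EA');

// Default array_filter() removes every falsy value, including "0".
$loose = array_filter($info);

// A callback that rejects only empty strings keeps the "0" field.
$strict = array_filter($info, function ($value) {
    return $value !== '';
});

var_dump(isset($loose[5]));  // bool(false)
var_dump(isset($strict[5])); // bool(true)
```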
Upvotes: 0