Scheda

Reputation: 526

PHP: Speeding up a very large loop

I've seen this asked before, and maybe this is something I need to pass off to another language (ideally not), but I'm stuck trying to optimize a loop over a large array.

I have a (potentially) large 2D array that looks something like this:

[
  ['i am a string'],
  ['i am also a string']
]

I need to loop through the array and count how many times each word occurs.

Here's the current loop for that.

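// Maps each word to the number of times it has been seen.
$words = [];

foreach ($rows as $row) {
    // Each row is itself an array, so split the string at index 0.
    $text = explode(' ', $row[0]);

    foreach ($text as $word) {
        if (isset($words[$word])) {
            $words[$word]++;
            continue;
        }

        // First occurrence of this word.
        $words[$word] = 1;
    }
}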

I've tested this with array_reduce, with array_map, and by converting everything to a single (massive) array of words and calling array_count_values, but so far this foreach loop is the fastest way to do it.
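For comparison, the array_count_values variant might look roughly like the sketch below. It assumes each row holds its text at index 0, and the $rows shown here is just sample data.

```php
<?php
// Sample data (assumed shape: each row is an array with the text at index 0).
$rows = [
    ['i am a string'],
    ['i am also a string'],
];

// Join every row into one string, split it once, and let PHP count the values.
$allText = implode(' ', array_column($rows, 0));
$words = array_count_values(explode(' ', $allText));

print_r($words);
```

This builds one large intermediate string and one large word array, which may be why it did not beat the plain foreach loop on bigger inputs.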

But I'm really hoping there's a faster way that I have yet to discover.

For reference, I'm going through about 250k words in this instance, but that number goes up by the day.

Any help is appreciated!

Upvotes: 3

Views: 5785

Answers (1)

elixenide

Reputation: 44851

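The first thing that jumps out is your use of foreach instead of count and a for loop. Older benchmarks (see PHPBench.com for some test results) showed for loops with pre-counting to be faster than foreach loops. In current PHP versions foreach over an array is generally at least as fast, so time both on your own data before switching.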
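A for loop with pre-computed counts could be sketched as below. The helper name countWordsFor is hypothetical, the sample $rows is made up, and the code assumes a sequentially indexed array with each row's text at index 0. Timing it against the foreach version on real data is the only reliable way to tell whether it helps.

```php
<?php
// Hypothetical helper: counts words using for loops with pre-computed lengths.
// Assumes $rows is a list (keys 0..n-1) and each row stores its text at index 0.
function countWordsFor(array $rows): array
{
    $words = [];
    $rowCount = count($rows); // counted once, not on every iteration

    for ($i = 0; $i < $rowCount; $i++) {
        $text = explode(' ', $rows[$i][0]);
        $wordCount = count($text);

        for ($j = 0; $j < $wordCount; $j++) {
            $word = $text[$j];
            // Start at 0 for unseen words, then add one.
            $words[$word] = ($words[$word] ?? 0) + 1;
        }
    }

    return $words;
}

$rows = [['i am a string'], ['i am also a string']];

$start = hrtime(true);          // nanoseconds; requires PHP 7.3+
$words = countWordsFor($rows);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d distinct words in %.3f ms\n", count($words), $elapsedMs);
```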

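Also, if memory becomes a problem, you could look at a different data structure, such as a binary tree, rather than an associative array. Keep in mind that PHP arrays are hash tables with constant average-time lookups, so with a few thousand distinct words the array itself is unlikely to be the bottleneck; measure before replacing it.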

Finally, as others have pointed out in comments, cache some of this if possible. That's a large calculation to repeat regularly when you can be sure that at least some of the data doesn't change.
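One way to cache, assuming old rows never change and only new rows are appended, is to keep the running totals and count just the new rows on each run. This is a minimal sketch: the helper addRowsToCounts, the cache file name, and the sample row are all hypothetical.

```php
<?php
// Hypothetical helper: merges counts for new rows into existing totals.
// Assumes each row stores its text at index 0.
function addRowsToCounts(array $counts, array $newRows): array
{
    foreach ($newRows as $row) {
        foreach (explode(' ', $row[0]) as $word) {
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }

    return $counts;
}

// Assumed cache location; any persistent store (database, APCu, Redis) would do.
$cacheFile = sys_get_temp_dir() . '/word_counts.json';

// Load previous totals if a cache exists, otherwise start empty.
$counts = is_file($cacheFile)
    ? (json_decode(file_get_contents($cacheFile), true) ?? [])
    : [];

// Only rows added since the last run need to be processed.
$newRows = [['i am a new string']];
$counts = addRowsToCounts($counts, $newRows);

// Persist the updated totals for the next run.
file_put_contents($cacheFile, json_encode($counts));
```

With this approach each run costs time proportional to the new rows only, not to the full 250k words.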

Upvotes: 2
