mickmackusa

Reputation: 48031

Why should native sorting functions be called by array_walk() with explicit sorting function parameters?

I ran a few tests on 2d arrays and found that in some contexts, native sorting functions can give unexpected results when array keys are 2, 5, 10, or 13.

$expected = range(0, 11);
$array = array_fill(0, 50, $expected);  // create a 2d array with 50 rows containing 12 integer values

array_walk($array, 'sort');  // sort the 12 integer values within each row

var_export(
    array_filter(
        $array,
        fn($row) => $row !== $expected  // only keep rows which are sorted incorrectly
    )
);

❌ Sorting rows of alphanumeric strings causes too many unintended sorting outcomes to list.

What is causing these inconsistencies?

Upvotes: 3

Views: 79

Answers (1)

mickmackusa

Reputation: 48031

This seldom-noticed issue arises because array_walk($array, 'sort'); or array_walk($array, sort(...)); calls sort() with an unintended second argument.

array_walk() always passes two arguments to its callback: the $value (by reference) and the $key. When the callback is a native sorting function given by name, the key lands in that function's second parameter.

The second parameter of sort() is meant to receive one consistent sorting flag, not an array key that changes on every iteration.

Numeric keys of 2, 5, 10, and 13 correspond to sorting flag constants which call for comparing values as strings instead of numeric values.
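To make the mechanism visible, here is a minimal sketch. The sample rows and the variable names $rows and $received are made up for illustration; it first records what array_walk() hands to a callback, then lets sort() receive the same arguments.

```php
<?php
// Illustrative data: three identical rows keyed 0, 1 and 2.
$rows = array_fill(0, 3, [10, 9, 2, 1]);

// Capture whatever array_walk() passes to its callback.
$received = [];
array_walk($rows, function (...$args) use (&$received) {
    $received[] = $args;   // each entry is [row, key]
});
echo count($received[0]), " arguments per call\n";

// Now let sort() receive those same two arguments.
array_walk($rows, 'sort');

echo json_encode($rows[0]), "\n"; // key 0 acted as SORT_REGULAR
echo json_encode($rows[1]), "\n"; // key 1 acted as SORT_NUMERIC
echo json_encode($rows[2]), "\n"; // key 2 acted as SORT_STRING
```

The rows at keys 0 and 1 come back as [1,2,9,10], while the row at key 2 comes back as [1,10,2,9] because its integers were compared as strings.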

The fundamental sorting constants list:

| Constant           | Integer Value | Description                                                     |
|--------------------|---------------|-----------------------------------------------------------------|
| SORT_REGULAR       | 0             | Standard comparison                                             |
| SORT_NUMERIC       | 1             | Numeric comparison                                              |
| SORT_STRING        | 2             | String comparison                                               |
| SORT_LOCALE_STRING | 5             | String comparison based on locale settings                      |
| SORT_NATURAL       | 6             | Natural order string comparison                                 |
| SORT_FLAG_CASE     | 8             | Case-insensitive sorting (used with SORT_STRING or SORT_NATURAL) |

Relevant combined flags:

| Constant                            | Integer Value |
|-------------------------------------|---------------|
| SORT_STRING \| SORT_FLAG_CASE        | 10 (2 + 8)    |
| SORT_LOCALE_STRING \| SORT_FLAG_CASE | 13 (5 + 8)    |
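The effect of a combined flag is easiest to see with mixed-case strings. The following is a small sketch with made-up data: the same row is stored under keys 0 and 10, so array_walk() sorts one copy with SORT_REGULAR and the other with SORT_STRING | SORT_FLAG_CASE.

```php
<?php
// The composite flag really is the integer 10.
var_dump((SORT_STRING | SORT_FLAG_CASE) === 10);

// Illustrative data: one row stored under two different keys.
$row  = ['apple', 'Banana', 'cherry'];
$rows = [0 => $row, 10 => $row];

// The key silently becomes the sorting flag.
array_walk($rows, 'sort');

echo json_encode($rows[0]), "\n";  // sorted case-sensitively (uppercase first)
echo json_encode($rows[10]), "\n"; // sorted case-insensitively
```

Identical input ends up in two different orders purely because of where each row sits in the outer array.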

Even if some of the following Stack Overflow posts work properly in their own context, it is better practice to keep these potential silent-bug producers out of your codebase.
💣 2021-02-17 answer: array_walk($pairs, 'sort');
💣 2020-04-28 answer: array_walk($sorted, 'sort');
💣 2018-08-02 question: array_walk($a, 'ksort');
💣 2017-05-24 answer: array_walk($array, 'asort');
💣 2015-08-08 answer: array_walk($data, 'sort');
💣 2015-03-12 answer: array_walk($return, 'sort');
💣 2014-11-03 answer: array_walk($arr, 'sort');
💣 2014-08-01 answer: // array_walk($array1, 'sort');
💣 2014-04-14 answer: array_walk($array, 'ksort');
💣 2014-01-16 answer: array_walk($data, sort);
💣 2012-11-06 answer: array_walk($array,'krsort');
💣 2012-08-15 answer: array_walk($dates, ksort);
💣 2012-07-09 question: array_walk($counts, 'ksort', SORT_NUMERIC);
💣 2012-01-03 answer: array_walk($complex, 'asort');
💣 2010-12-07 answer: array_walk($array, 'asort');
💣 2009-12-07 question: array_walk($array, 'sort');

⭐ To avoid these bugs, explicitly pass the desired parameter(s) to the sorting function:

  1. array_walk() with explicit callback signature:

    array_walk($array, static fn(&$v) => sort($v));
    
  2. modify by reference in a foreach loop:

    foreach ($array as &$row) {
        sort($row);
    }
    unset($row);  // break the lingering reference to the last row
    
  3. sort the key-accessed row in a foreach loop:

    foreach (array_keys($array) as $key) {
        sort($array[$key]);
    }
    
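If the same fix is needed in several places, the loop can be wrapped in a small helper. This is only a sketch: sortRows() is a hypothetical name, not a PHP built-in, and it returns a sorted copy instead of modifying the input. The flag is an explicit parameter, so it can never be confused with a key.

```php
<?php
/**
 * Hypothetical helper: sort the values of every row with one fixed flag.
 */
function sortRows(array $rows, int $flags = SORT_REGULAR): array
{
    foreach ($rows as $key => $row) {
        sort($row, $flags);   // the flag is the same for every row
        $rows[$key] = $row;   // write the sorted copy back
    }
    return $rows;
}

// Re-run the test from the question with the helper.
$expected = range(0, 11);
$sorted   = sortRows(array_fill(0, 50, $expected));

$incorrect = array_filter($sorted, fn($row) => $row !== $expected);
echo count($incorrect), " rows sorted incorrectly\n";
```

With the flag fixed, the rows at keys 2, 5, 10, and 13 are sorted like all the others.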

🔎 Find them and fix them in your codebase:

  • \barray_walk\([^,]+,\s*?(?:["'][ak]?r?sort["']|[ak]?r?sort\(\.{3}\))[^)]*\)
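As a rough illustration of how the pattern behaves, the sketch below runs it through preg_grep() against a few made-up source lines. The ~ delimiters and the sample lines are assumptions of this example; in practice you would more likely paste the pattern into your editor's or IDE's regex search.

```php
<?php
// The pattern from above, wrapped in ~ delimiters (nowdoc avoids quote escaping).
$pattern = <<<'REGEX'
~\barray_walk\([^,]+,\s*?(?:["'][ak]?r?sort["']|[ak]?r?sort\(\.{3}\))[^)]*\)~
REGEX;

// Made-up source lines to scan.
$lines = [
    "array_walk(\$array, 'sort');",
    'array_walk($rows, ksort(...));',
    'array_walk($a, "krsort", SORT_STRING);',
    'array_walk($array, static fn(&$v) => sort($v));',
];

// Keep only the lines that contain a risky call.
$flagged = array_values(preg_grep($pattern, $lines));
print_r($flagged);
```

The first three lines are flagged; the explicit arrow-function version is not. Note that the pattern does not catch unquoted names such as array_walk($data, sort);.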

A few more duds:

💣 Calling natsort() or natcasesort() by name inside of array_walk() will not misbehave silently; it will simply throw a Fatal error because these functions accept only one parameter.

Fatal error: Uncaught ArgumentCountError: natsort() expects exactly 1 argument, 2 given

💣 Trying to use array_map() (e.g. $array = array_map('sort', $array);) will return all values as true and emit Warnings.

Warning: sort(): Argument #1 ($array) must be passed by reference, value given
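If a non-mutating array_map() style is preferred, one possible approach is to sort a copy of each row inside a wrapper callback and return it. This is a sketch with made-up sample data, not the only way to do it.

```php
<?php
// Illustrative data.
$array = [[3, 1, 2], [9, 7, 8]];

$sorted = array_map(
    static function (array $row): array {
        sort($row);    // sorts the local copy; only one argument is passed
        return $row;   // return the sorted row instead of sort()'s boolean
    },
    $array
);

echo json_encode($sorted), "\n";
echo json_encode($array), "\n"; // the original is left untouched
```

Because the callback receives each row by value, the original $array keeps its order and $sorted holds the sorted rows.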

Upvotes: 4
