BraveButter

Reputation: 1458

Phonetic, case-insensitive search in an array

I have an array like this:

array
  0 => string 'Schmitt' (length=7)
  1 => string 'Maier' (length=5)
  2 => string 'Müller' (length=7)
  3 => string 'müller' (length=7)
  4 => string 'mueller' (length=7)
  5 => string 'Toll' (length=4)

and I want to get something like this:

array
  0 => string 'Schmitt' (length=7)
  1 => string 'Maier' (length=5)
  2 => string 'Müller' (length=7)
  3 => string 'Toll' (length=4)

The comparison should treat every umlaut ('ä', 'ö', 'ü') as equal to its transcription ('ae', 'oe', 'ue'), and it should be case-insensitive. The first letter of each result should be uppercase, but I can handle that part myself. I only need help with the phonetic matching, because I don't want to write a huge if...else chain.

Upvotes: 2

Views: 100

Answers (2)

Casimir et Hippolyte

Reputation: 89557

You can do this by storing the "phonetic version" of each name as a key in the result array. That way you can tell whether a name has already been added without searching with in_array; you only have to check whether the key exists:

$names = ['Schmitt', 'Maier', 'Müller', 'müller', 'mueller', 'Toll'];
$rules = ['ü' => 'ue', 'ä' => 'ae', 'ö' => 'oe', 'ß' => 'ss' ]; // etc.

$result = [];

foreach ($names as $name) {
    $phonetic = strtr(mb_strtolower($name), $rules);
    if ( !isset($result[$phonetic]) )
        $result[$phonetic] = $name; // put mb_ucfirst here
}

$result = array_values($result);

print_r($result);

Since you are dealing with multi-byte characters, you need mb_strtolower rather than strtolower, which does not reliably lowercase characters such as 'Ü'. For the same reason, if you need to make the first character uppercase, use the function posted by plemieux in the PHP manual:

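// Only needed before PHP 8.4, which ships a native mb_ucfirst()
if (!function_exists('mb_ucfirst')) {
    function mb_ucfirst($str) {
        $fc = mb_strtoupper(mb_substr($str, 0, 1));
        return $fc . mb_substr($str, 1);
    }
}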

Note: mb_ucfirst was added natively in PHP 8.4, so this user-defined function is only needed on older versions.
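
For completeness, here is a minimal sketch that puts the two pieces together: the key-based de-duplication and the multi-byte-safe capitalisation. The names ucfirst_mb and dedupe_phonetic are hypothetical (chosen so they cannot clash with the native mb_ucfirst in PHP 8.4), the input includes two extra names that are not in the question, and the code assumes the mbstring extension and a UTF-8 encoded source file.

```php
<?php
// Hypothetical helper: multi-byte-safe ucfirst under a different name,
// so it does not collide with the native mb_ucfirst() of PHP 8.4+.
function ucfirst_mb(string $str): string
{
    return mb_strtoupper(mb_substr($str, 0, 1, 'UTF-8'), 'UTF-8')
         . mb_substr($str, 1, null, 'UTF-8');
}

// Hypothetical helper: keeps the first name seen for each phonetic key.
function dedupe_phonetic(array $names, array $rules): array
{
    $result = [];
    foreach ($names as $name) {
        $lower = mb_strtolower($name, 'UTF-8');
        $key = strtr($lower, $rules);      // e.g. 'müller' becomes 'mueller'
        if (!isset($result[$key])) {
            $result[$key] = ucfirst_mb($lower);
        }
    }
    return array_values($result);
}

$rules = ['ü' => 'ue', 'ä' => 'ae', 'ö' => 'oe', 'ß' => 'ss'];
$names = ['Schmitt', 'Maier', 'MÜLLER', 'müller', 'mueller', 'Toll', 'öztürk'];

$unique = dedupe_phonetic($names, $rules);
print_r($unique); // Schmitt, Maier, Müller, Toll, Öztürk
```

The first spelling encountered wins, so if 'mueller' came before 'Müller' in the input, the result would contain 'Mueller' instead.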

Upvotes: 2

Mihai Matei

Reputation: 24276

You can try something like this:

$replacements = ['ü' => ['ue']];

$names = ['Schmitt', 'Maier', 'Müller', 'müller', 'mueller', 'Toll'];

$names = array_map('mb_strtolower', $names); // multi-byte safe, also lowercases 'Ü'

$names = array_reduce($names, function ($carry, $name) use ($replacements) {

    foreach ($replacements as $replaceWith => $replaceWhat) {
        $name = str_replace($replaceWhat, $replaceWith, $name);
    }

    if (!in_array($name, $carry)) {
        $carry[] = $name;
    }

    return $carry;

}, []);

// ucfirst works on bytes, so a name starting with an umlaut needs a multi-byte variant
$names = array_map('ucfirst', $names);

var_dump($names);

The result would be:

array(4) {
  [0]=>
  string(7) "Schmitt"
  [1]=>
  string(5) "Maier"
  [2]=>
  string(7) "Müller"
  [3]=>
  string(4) "Toll"
}
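
One caveat with this direction of replacement, shown as a small sketch: because 'ue' is turned into 'ü' wherever it occurs, names that merely contain those two letters are changed as well. The helper name to_umlaut_form and the test name 'bauer' are hypothetical and not part of the answer above; the loop body is the same as in the array_reduce callback.

```php
<?php
// Same mapping as in the answer: replace the digraph with the umlaut.
$replacements = ['ü' => ['ue']];

// Hypothetical helper wrapping the replacement loop from the callback.
function to_umlaut_form(string $name, array $replacements): string
{
    foreach ($replacements as $replaceWith => $replaceWhat) {
        $name = str_replace($replaceWhat, $replaceWith, $name);
    }
    return $name;
}

echo to_umlaut_form('mueller', $replacements), "\n"; // müller
echo to_umlaut_form('bauer', $replacements), "\n";   // baür
```

Whether that matters depends on the data; for de-duplication alone it is harmless as long as every name goes through the same replacement, but the displayed name is altered.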

Upvotes: 1
