jon_darkstar

Reputation: 16768

Loop over each character in a string

Is there a nice way to iterate on the characters of a string? I'd like to be able to do foreach, array_map, array_walk, array_filter etc. on the characters of a string.

Type casting/juggling didn't get me anywhere (it put the whole string into a single array element), and the best solution I've found is simply using a for loop to construct the array. It feels like there should be something better. I mean, if you can index into it, shouldn't you be able to iterate over it as well?

This is the best I've got

function stringToArray($s)
{
    $r = array();
    for($i=0; $i<strlen($s); $i++) 
         $r[$i] = $s[$i];
    return $r;
}

$s1 = "textasstringwoohoo";
$arr = stringToArray($s1); //$arr now has character array

$ascval = array_map('ord', $arr);  //so i can do stuff like this
foreach ($arr as $curChar) {....}
$evenAsciiOnly = array_filter($arr, function($x) {return ord($x) % 2 === 0;}); // array first, callback second

Is there either:

A) A way to make the string iterable
B) A better way to build the character array from the string (and if so, how about the other direction?)

I feel like I'm missing something obvious here.

Upvotes: 180

Views: 230533

Answers (10)

mickmackusa

Reputation: 47894

Depending on your needs/definition of "characters", it may be most helpful to keep multibyte "clusters" intact.

From PHP8.2.18, better handling of multi-component emojis has been implemented with grapheme_ functions.

Code: (Demo)

$text = 'Hey 🙇‍♂️ boy';
for ($i = 0, $len = grapheme_strlen($text); $i < $len; ++$i) {
    echo grapheme_substr($text, $i, 1) . "\n";
}

Output:

H
e
y
 
🙇‍♂️
 
b
o
y

Even using mb_ functions would have produced: (Demo)

H
e
y
 
🙇
‍
♂
️
 
b
o
y

To simplify this task, PHP 8.4 has added a new splitting function to the grapheme_ family: grapheme_str_split().

Code:

$text = 'Hey 🙇‍♂️ boy';
foreach (grapheme_str_split($text) as $g) {
    echo $g . "\n";
}
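
Where the intl extension (which provides the grapheme_ functions) is not available, a regex-based fallback might be worth trying. This is a sketch of an alternative, not something from the answer above: it relies on PCRE's \X escape, which matches one extended grapheme cluster. How multi-component emojis are grouped depends on the Unicode data in the PCRE library your PHP was built with, so the example sticks to a simple combining accent.

```php
<?php
// "e" followed by a combining acute accent, then "a":
// 3 code points, but 2 user-visible characters
$text = "e\u{0301}a";

// \X matches one extended grapheme cluster; /u treats the subject as UTF-8
preg_match_all('/\X/u', $text, $matches);
$clusters = $matches[0];

foreach ($clusters as $g) {
    echo $g . "\n";
}
```

The accent stays attached to its base letter, which mb_substr() or mb_str_split() would not guarantee.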

Upvotes: 1

Dairy Window

Reputation: 1425

Expanding on @SeaBrightSystems' answer, you could try this:

$s1 = "textasstringwoohoo";
$arr = str_split($s1); //$arr now has character array
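
For the other direction the question asks about, implode() with an empty separator should join the characters back into a string. A minimal sketch, reusing $s1 and $arr from above ($upper and $back are just illustrative names):

```php
<?php
$s1 = "textasstringwoohoo";
$arr = str_split($s1);      // string -> array of 1-byte strings

// Work on the array, e.g. uppercase every character
$upper = array_map('strtoupper', $arr);

// array -> string: join the elements with an empty separator
$back = implode('', $upper);
echo $back, "\n";
```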

Upvotes: 6

SeaBrightSystems

Reputation: 2712

Use str_split to iterate ASCII strings (since PHP 5.0)

If your string contains only ASCII (i.e. "English") characters, then use str_split.

$str = 'some text';
foreach (str_split($str) as $char) {
    var_dump($char);
}

Use mb_str_split to iterate Unicode strings (since PHP 7.4)

If your string might contain Unicode (i.e. "non-English") characters, then you must use mb_str_split.

$str = 'μυρτιὲς δὲν θὰ βρῶ';
foreach (mb_str_split($str) as $char) {
    var_dump($char);
}
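
Once the string is split into whole characters, the array functions from the question should work as expected. A small sketch, assuming UTF-8 input and PHP 7.4+ (mb_ord needs 7.2+); the variable names and the two-byte filter are only there for illustration:

```php
<?php
// Greek letters written as code point escapes (PHP 7.0+)
// so the bytes are unambiguous: β ρ ῶ
$str = "\u{03B2}\u{03C1}\u{1FF6}";

$chars = mb_str_split($str, 1, 'UTF-8'); // one array element per code point

// The array functions now see whole characters
$codepoints = array_map('mb_ord', $chars);
$twoByteOnly = array_filter($chars, function ($c) {
    return strlen($c) === 2;             // keep characters stored in 2 bytes
});

var_dump(count($chars));                 // int(3)
var_dump(count(str_split($str)));        // int(7), one element per byte
print_r($codepoints);
```

Note that array_filter() preserves keys, so wrap the result in array_values() if you need a reindexed list.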

Upvotes: 254

Accountant م

Reputation: 7483

Most of the answers forgot about non-English characters!

strlen counts BYTES, not characters. That is why it and its sibling functions work fine with English characters: they are stored in 1 byte each in both UTF-8 and ASCII. For anything else you need the multibyte string functions, mb_*.

This will work with any character encoded in UTF-8

// 8 characters in 12 bytes
$string = "abcdأبتث";

$charsCount = mb_strlen($string, 'UTF-8');
for($i = 0; $i < $charsCount; $i++){
    $char = mb_substr($string, $i, 1, 'UTF-8');
    var_dump($char);
}

This outputs

string(1) "a"
string(1) "b"
string(1) "c"
string(1) "d"
string(2) "أ"
string(2) "ب"
string(2) "ت"
string(2) "ث"
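
To see why byte-based access goes wrong, here is a short sketch (assuming UTF-8; the shortened sample string is my own) comparing byte offsets with character offsets:

```php
<?php
// "abcd" plus two Arabic letters, each 2 bytes in UTF-8
$string = "abcd\u{0623}\u{0628}";

var_dump(strlen($string));              // int(8): bytes
var_dump(mb_strlen($string, 'UTF-8'));  // int(6): characters

// Byte indexing lands in the middle of a character
$byte = $string[4];
var_dump(mb_check_encoding($byte, 'UTF-8')); // bool(false): only half a character

// mb_substr counts in characters and returns the whole one
$char = mb_substr($string, 4, 1, 'UTF-8');
var_dump(mb_check_encoding($char, 'UTF-8')); // bool(true)
```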

Upvotes: 9

Ash

Reputation: 71

Hmm... There's no need to complicate things. The basics work great always.

    $string = 'abcdef';
    $len = strlen( $string );
    $x = 0;

Forward Direction:

while ( $len > $x ) echo $string[ $x++ ];

Outputs: abcdef

Reverse Direction:

while ( $len ) echo $string[ --$len ];

Outputs: fedcba

Upvotes: 5

Owen

Reputation: 7577

Iterate string:

for ($i = 0; $i < strlen($str); $i++){
    echo $str[$i];
}

Upvotes: 131

Amir Hossein Baghernezad

Reputation: 4085

For those looking for the fastest way to iterate over strings in PHP, I've prepared a benchmark.
The first method accesses string characters directly by specifying the position in brackets, treating the string like an array:

$string = "a sample string for testing";
$char = $string[4]; // equals m

I thought this first method would be the fastest, but I was wrong.
The second method (which is used in the accepted answer) is:

$string = "a sample string for testing";
$string = str_split($string);
$char = $string[4]; // equals m

This method is going to be faster because we are indexing a real array rather than treating a string as one.

Running the last line of each of the above methods 1000000 times led to these benchmark results:

Using string[i]
0.24960017204285 Seconds

Using str_split
0.18720006942749 Seconds

Which means the second method took about 25% less time in this run.
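
The benchmark code itself is not shown, so the following is only a guess at how such a measurement could be set up. It assumes PHP 7.3+ for hrtime(), times just the indexing line as described above, and keeps str_split() outside the timed loop. Numbers will differ between PHP versions and machines.

```php
<?php
// Hypothetical micro-benchmark; names and iteration count are illustrative only
$string = "a sample string for testing";
$iterations = 1000000;

$start = hrtime(true);               // monotonic clock, nanoseconds
for ($i = 0; $i < $iterations; $i++) {
    $char = $string[4];              // direct offset into the string
}
$direct = (hrtime(true) - $start) / 1e9;

$array = str_split($string);         // split once, outside the timed loop
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $char2 = $array[4];              // offset into a real array
}
$viaArray = (hrtime(true) - $start) / 1e9;

printf("Using string[i]: %.4f seconds\n", $direct);
printf("Using str_split: %.4f seconds\n", $viaArray);
```

If you only walk the string once, remember that the cost of str_split() itself would have to be added to the second figure.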

Upvotes: 8

masakielastic

Reputation: 4630

// Unicode Codepoint Escape Syntax in PHP 7.0
$str = "cat!\u{1F431}";

// IIFE (Immediately Invoked Function Expression) in PHP 7.0
$gen = (function(string $str) {
    for ($i = 0, $len = mb_strlen($str); $i < $len; ++$i) {
        yield mb_substr($str, $i, 1);
    }
})($str);

var_dump(
    true === $gen instanceof Traversable,
    // PHP 7.1
    true === is_iterable($gen)
);

foreach ($gen as $char) {
    echo $char, PHP_EOL;
}

Upvotes: 3

Moritur

Reputation: 1748

You can also just access $s1 like an array, if you only need to access it:

$s1 = "hello world";
echo $s1[0]; // -> h

Upvotes: 14

Dawid Ohia

Reputation: 16445

If your strings are in Unicode, you should use preg_split with the /u modifier.

From the comments in the PHP documentation:

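// PHP 7.4+ ships a built-in mb_str_split; only define this fallback when it is missing
if (!function_exists('mb_str_split')) {
    function mb_str_split( $string ) {
        # Split at all positions not after the start: ^
        # and not before the end: $
        return preg_split('/(?<!^)(?!$)/u', $string );
    }
}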
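
A commonly used variant of the same idea, shown here as a sketch rather than as part of the quoted documentation comment, is to split on an empty pattern and let PREG_SPLIT_NO_EMPTY discard the empty pieces at both ends:

```php
<?php
$str = "\u{03B4}\u{03B5}\u{03BD}"; // δεν, written as code point escapes (PHP 7.0+)

// An empty pattern with /u matches between every code point;
// PREG_SPLIT_NO_EMPTY drops the empty strings at the start and end
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

foreach ($chars as $char) {
    echo $char, "\n";
}
```

This works without the mbstring extension, which can matter on older or minimal installs.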

Upvotes: 21
