Reputation: 16768
Is there a nice way to iterate on the characters of a string? I'd like to be able to do foreach, array_map, array_walk, array_filter etc. on the characters of a string.
Type casting/juggling didn't get me anywhere (it put the whole string into a single array element), and the best solution I've found is simply using a for loop to construct the array. It feels like there should be something better. I mean, if you can index into it, shouldn't you be able to iterate over it as well?
This is the best I've got:
function stringToArray($s)
{
    $r = array();
    for ($i = 0; $i < strlen($s); $i++) {
        $r[$i] = $s[$i];
    }
    return $r;
}

$s1 = "textasstringwoohoo";
$arr = stringToArray($s1); // $arr now has character array
$ascval = array_map('ord', $arr); // so I can do stuff like this
foreach ($arr as $curChar) {....}
// array_filter takes the array first, then the callback
$evenAsciiOnly = array_filter($arr, function($x) {return ord($x) % 2 === 0;});
Is there either:
A) A way to make the string iterable
B) A better way to build the character array from the string (and if so, how about the other direction?)
I feel like I'm missing something obvious here.
Upvotes: 180
Views: 230533
Reputation: 47894
Depending on your needs/definition of "characters", it may be most helpful to keep multibyte "clusters" intact.
From PHP8.2.18, better handling of multi-component emojis has been implemented with grapheme_ functions.
Code: (Demo)
$text = 'Hey 🙇♂️ boy';
for ($i = 0, $len = grapheme_strlen($text); $i < $len; ++$i) {
    echo grapheme_substr($text, $i, 1) . "\n";
}
Output:
H
e
y
🙇♂️
b
o
y
Even using mb_ functions would have produced: (Demo)
H
e
y
🙇
♂
️
b
o
y
To simplify this task, PHP8.4 has added a new splitting function to the grapheme_ family: grapheme_str_split().
Code:
$text = 'Hey 🙇♂️ boy';
foreach (grapheme_str_split($text) as $g) {
    echo $g . "\n";
}
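On PHP versions without grapheme_str_split(), a similar split can be sketched with PCRE's \X escape, which matches one extended grapheme cluster. The helper name splitGraphemes is hypothetical, and how multi-component emoji are grouped depends on the Unicode tables of the PCRE library PHP was built with, so treat this as an approximation rather than a drop-in replacement.

```php
<?php
// Hypothetical helper: split a UTF-8 string into grapheme clusters with PCRE.
function splitGraphemes(string $text): array
{
    // \X matches one extended grapheme cluster; /u enables UTF-8 mode.
    if (preg_match_all('/\X/u', $text, $matches) === false) {
        return []; // Invalid UTF-8 or another PCRE error.
    }
    return $matches[0];
}

// "e" followed by a combining acute accent stays together as one cluster.
foreach (splitGraphemes("cafe\u{0301}s") as $g) {
    echo $g, "\n";
}
```

A plain mb_ split of the same string would return the "e" and its accent as two separate elements.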
Upvotes: 1
Reputation: 1425
Expanding on @SeaBrightSystems' answer, you could try this:
$s1 = "textasstringwoohoo";
$arr = str_split($s1); //$arr now has character array
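A short sketch of how that array then works with the array functions from the question, and how implode() goes in the other direction. It assumes single-byte (ASCII) text; the variable names $roundTrip and $evenString are only illustrative.

```php
<?php
$s1 = "textasstringwoohoo";
$arr = str_split($s1);

// Map every character to its byte value.
$ascval = array_map('ord', $arr);

// Keep only characters whose byte value is even (array first, callback second).
$evenAsciiOnly = array_filter($arr, function ($x) {
    return ord($x) % 2 === 0;
});

// Other direction: join an array of characters back into a string.
$roundTrip = implode('', $arr);
$evenString = implode('', $evenAsciiOnly);

var_dump($roundTrip === $s1); // bool(true)
```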
Upvotes: 6
Reputation: 2712
If your string contains only ASCII (i.e. "English") characters, then use str_split.
$str = 'some text';
foreach (str_split($str) as $char) {
    var_dump($char);
}
If your string might contain multibyte (i.e. "non-English") characters, then use mb_str_split (available since PHP 7.4 with the mbstring extension).
$str = 'μυρτιὲς δὲν θὰ βρῶ';
foreach (mb_str_split($str) as $char) {
    var_dump($char);
}
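The difference is easy to see by counting the pieces each function returns. This sketch assumes UTF-8 text and a loaded mbstring extension; the sample string is written with code point escapes (PHP 7.0+) so the byte count does not depend on how the source file is saved.

```php
<?php
// Three Greek characters: beta (2 bytes), rho (2 bytes), omega with perispomeni (3 bytes).
$str = "\u{03B2}\u{03C1}\u{1FF6}";

$bytes = str_split($str);                 // splits per byte
$chars = mb_str_split($str, 1, 'UTF-8');  // splits per code point

echo count($bytes), "\n"; // 7
echo count($chars), "\n"; // 3
```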
Upvotes: 254
Reputation: 7483
Most of the answers forgot about non-English characters!
strlen counts bytes, not characters. That is why it and its sibling functions work fine with English characters: those are stored in 1 byte in both the UTF-8 and ASCII encodings. For anything else, you need to use the multibyte string functions mb_*.
This will work with any character encoded in UTF-8:
// 8 characters in 12 bytes
$string = "abcdأبتث";
$charsCount = mb_strlen($string, 'UTF-8');
for ($i = 0; $i < $charsCount; $i++) {
    $char = mb_substr($string, $i, 1, 'UTF-8');
    var_dump($char);
}
This outputs
string(1) "a"
string(1) "b"
string(1) "c"
string(1) "d"
string(2) "أ"
string(2) "ب"
string(2) "ت"
string(2) "ث"
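The byte/character mismatch can also be checked directly. A minimal sketch, assuming the mbstring extension is loaded; the same eight-character string is written with code point escapes, and $firstByte is just an illustrative name.

```php
<?php
// "abcd" followed by four Arabic letters, written as code point escapes (PHP 7.0+).
$string = "abcd\u{0623}\u{0628}\u{062A}\u{062B}";

echo strlen($string), "\n";             // 12 (bytes)
echo mb_strlen($string, 'UTF-8'), "\n"; // 8 (characters)

// Byte indexing lands on the first half of a two-byte character,
// so the result is not a valid UTF-8 string on its own.
$firstByte = $string[4];
var_dump(mb_check_encoding($firstByte, 'UTF-8')); // bool(false)
```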
Upvotes: 9
Reputation: 71
Hmm... There's no need to complicate things. The basics work great always.
$string = 'abcdef';
$len = strlen( $string );
$x = 0;
Forward Direction:
while ( $len > $x ) echo $string[ $x++ ];
Outputs: abcdef
Reverse Direction:
while ( $len ) echo $string[ --$len ];
Outputs: fedcba
Upvotes: 5
Reputation: 7577
Iterate string:
for ($i = 0; $i < strlen($str); $i++) {
    echo $str[$i];
}
Upvotes: 131
Reputation: 4085
For those who are looking for the fastest way to iterate over strings in PHP, I've prepared a benchmark.
In the first method, you access string characters directly by specifying the position in brackets, treating the string like an array:
$string = "a sample string for testing";
$char = $string[4]; // equals m
I thought this would be the fastest method, but I was wrong.
The second method (which is used in the accepted answer):
$string = "a sample string for testing";
$string = str_split($string);
$char = $string[4]; // equals m
This method turned out to be faster, apparently because we are indexing a real array rather than treating a string as one.
Calling the last line of each of the above methods 1000000 times led to these benchmarking results:
Using string[i]
0.24960017204285 Seconds
Using str_split
0.18720006942749 Seconds
Which means the second method was about 25% faster in this test.
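The answer does not show the benchmark harness itself. A minimal sketch of how such a measurement might be set up follows; the timeIt helper and the closure-based structure are assumptions, and absolute timings vary with PHP version and hardware, so the relative ordering may not match the figures above.

```php
<?php
// Hypothetical timing helper: runs $fn $iterations times, returns elapsed seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$string = "a sample string for testing";
$array = str_split($string); // split once, outside the timed loop
$iterations = 1000000;

$byOffset = timeIt(function () use ($string) { $char = $string[4]; }, $iterations);
$byArray  = timeIt(function () use ($array) { $char = $array[4]; }, $iterations);

printf("Using string[i]: %.4f seconds\n", $byOffset);
printf("Using str_split: %.4f seconds\n", $byArray);
```

Both measurements include the overhead of the closure call, and the one-time cost of str_split() is deliberately excluded, so this only compares the lookups themselves.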
Upvotes: 8
Reputation: 4630
// Unicode Codepoint Escape Syntax in PHP 7.0
$str = "cat!\u{1F431}";
// IIFE (Immediately Invoked Function Expression) in PHP 7.0
$gen = (function(string $str) {
    for ($i = 0, $len = mb_strlen($str); $i < $len; ++$i) {
        yield mb_substr($str, $i, 1);
    }
})($str);
var_dump(
    true === $gen instanceof Traversable,
    // PHP 7.1
    true === is_iterable($gen)
);
foreach ($gen as $char) {
    echo $char, PHP_EOL;
}
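If the goal is an object that can be passed directly to foreach (option A in the question), the same generator can be wrapped in a class implementing IteratorAggregate. The class name CharIterable is hypothetical; the sketch assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical wrapper that makes a string's characters iterable.
final class CharIterable implements IteratorAggregate
{
    private $str;

    public function __construct(string $str)
    {
        $this->str = $str;
    }

    // Generator method: yields one UTF-8 character per iteration.
    public function getIterator(): Traversable
    {
        for ($i = 0, $len = mb_strlen($this->str, 'UTF-8'); $i < $len; ++$i) {
            yield mb_substr($this->str, $i, 1, 'UTF-8');
        }
    }
}

foreach (new CharIterable("cat!\u{1F431}") as $char) {
    echo $char, PHP_EOL;
}

// iterator_to_array() turns it into a plain array for array_map() and friends.
$chars = iterator_to_array(new CharIterable("cat!"));
```

Unlike a bare generator, the wrapper can be iterated more than once, because each foreach asks getIterator() for a fresh generator.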
Upvotes: 3
Reputation: 1748
You can also just access $s1 like an array, if you only need to access it:
$s1 = "hello world";
echo $s1[0]; // -> h
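A few related details, sketched under the assumption of single-byte text: offsets address bytes rather than characters, negative offsets count from the end (PHP 7.1 and later), and isset() can be used to guard against an out-of-range offset. The variable $exists is only illustrative.

```php
<?php
$s1 = "hello world";

echo $s1[0], "\n";   // h (first byte)
echo $s1[-1], "\n";  // d (last byte; negative offsets need PHP 7.1+)

// isset() reports whether an offset is inside the string without raising a warning.
$exists = isset($s1[20]);
var_dump($exists);   // bool(false)
```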
Upvotes: 14
Reputation: 16445
If your strings are in Unicode, you should use preg_split with the /u modifier.
From the comments in the PHP documentation:
function mb_str_split( $string ) {
    # Split at all position not after the start: ^
    # and not before the end: $
    return preg_split('/(?<!^)(?!$)/u', $string );
}
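Because PHP 7.4 and later ship a built-in mb_str_split() as part of mbstring, declaring a function with that name can fail with a redeclaration error there. A sketch of the same idea under a hypothetical name, utf8Chars, using an empty pattern with PREG_SPLIT_NO_EMPTY in place of the lookarounds:

```php
<?php
// Hypothetical helper: split a UTF-8 string into code points with preg_split().
function utf8Chars(string $string): array
{
    // An empty /u pattern matches at every code point boundary;
    // PREG_SPLIT_NO_EMPTY drops the empty pieces at the start and end.
    $parts = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts; // false means invalid UTF-8 or a PCRE error
}

foreach (utf8Chars("d\u{0623}\u{0628}") as $char) {
    var_dump($char);
}
```

This needs only the PCRE extension, so it also works where mbstring is not installed.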
Upvotes: 21