user1030151

Reputation: 397

Convert byte position to character position in PHP

I have the byte position of a character in a UTF-8 string (obtained via preg_match with PREG_OFFSET_CAPTURE), but I need the character position. How can I get it?

Here is an example:

$x = 'öüä nice world';
preg_match('/nice/u', $x, $m, PREG_OFFSET_CAPTURE);
var_dump($m);

which results in:

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(4) "nice"
    [1]=>
    int(7)
  }
}

So I have the byte position, which is 7.

But I need the character position, which is 4. Is there a way to convert the byte position to the character position?

This example is highly simplified. Using mb_strpos or a similar function to find the position of the word "nice" is not an option for me: I need the regular expression, and I actually need preg_match_all rather than preg_match. So I think converting the position would be the best approach.

Upvotes: 1

Views: 317

Answers (1)

l'L'l

Reputation: 47219

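As mentioned, you could build upon one of the examples from a similar question. Use substr to take the bytes that precede each match, then count the characters in that prefix with mb_strlen:

$x = 'öüä nice öüä nice öüä nice öüä nice öüä nice';
// $r is the number of full-pattern matches; $m[0][$i][1] is the byte offset of match $i
$r = preg_match_all('/nice/u', $x, $m, PREG_OFFSET_CAPTURE);
for ($i = 0; $i < $r; $i++) {
    // Characters in the bytes before the match = character offset of the match
    var_dump(mb_strlen(substr($x, 0, $m[0][$i][1]), 'UTF-8'));
}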

Result:

int(4)
int(13)
int(22)
int(31)
int(40)

Each value is the zero-based character position at which a match of "nice" begins.
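If the conversion is needed in several places, the same idea can be wrapped in a small helper. The following is only a sketch: the function names byteOffsetToCharOffset and matchAllWithCharOffsets are made up for illustration and are not part of PHP, and it assumes PHP 7.1 or later, the mbstring extension, and a source file saved as UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): converts a byte offset into a
// character offset by counting the characters in the bytes that precede it.
function byteOffsetToCharOffset(string $subject, int $byteOffset, string $encoding = 'UTF-8'): int
{
    return mb_strlen(substr($subject, 0, $byteOffset), $encoding);
}

// Hypothetical wrapper: runs preg_match_all and returns [matched text, character offset]
// pairs for the full-pattern matches.
function matchAllWithCharOffsets(string $pattern, string $subject): array
{
    $result = [];
    if (preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as [$text, $byteOffset]) {
            $result[] = [$text, byteOffsetToCharOffset($subject, $byteOffset)];
        }
    }
    return $result;
}

$x = 'öüä nice öüä nice';
print_r(matchAllWithCharOffsets('/nice/u', $x));
```

Each call to the helper rescans the string from its start, which is fine for short subjects. For very long strings with many matches, it may be worth counting only the characters between consecutive matches and keeping a running total instead.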

Upvotes: 1
