Jannuzzo

Reputation: 169

Find a specific whole word in a string in PHP

I have some text in PHP stored in the variable $row. I'd like to find the position of a certain group of words, and that's quite easy. What's not so easy is making my code recognize whether the word it has found is exactly the word I'm looking for or just part of a larger word. Is there a way to do it?

Example of what I'd like to obtain

CODE:

$row= "some ugly text of some kind i'd like to find in someway";
$token= "some";
$pos= -1;
$counter= substr_count($row, $token);
for ($h=0; $h<$counter; $h++) {
     $pos= strpos($row, $token, $pos+1);
     echo $pos.' ';
}

OUTPUT:

what I obtain:

0 18 48

what I'd like to obtain:

0 18

Any hint?

Upvotes: 0

Views: 291

Answers (3)

rm-vanda

Reputation: 3168

Use preg_match():

if(preg_match("/some/", $row))
// [..]

The first argument is a regular expression, which can match almost any pattern you need. Be aware, though, that regular expressions are widely discouraged for parsing things like HTML.

Upvotes: -1

Amal

Reputation: 76646

Use preg_match_all() with word boundaries (\b):

$search = preg_quote($token, '/');
preg_match_all("/\b$search\b/", $row, $m, PREG_OFFSET_CAPTURE);

Here, preg_quote() escapes the user input so that it can be used safely inside our regular expression. Some characters have a special meaning in the regular expression language. Without proper escaping, those characters keep their special meaning instead of being matched literally, and your regex might not work as intended.
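To make the effect of preg_quote() concrete, here is a small sketch. The sample tokens are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical sample tokens, chosen only to show the escaping.
$tokens = ['some', 'a.b', '1/2'];

$quoted = [];
foreach ($tokens as $token) {
    // The second argument also escapes the delimiter we plan to use ('/').
    $quoted[$token] = preg_quote($token, '/');
}

// Unescaped, the dot is a wildcard, so "a.b" also matches "axb".
$unescaped = preg_match('/\ba.b\b/', 'axb');

// Escaped, the dot only matches a literal dot, so "axb" no longer matches.
$escaped = preg_match('/\b' . $quoted['a.b'] . '\b/', 'axb');

print_r($quoted);
var_dump($unescaped, $escaped);
```

A plain word such as some comes back unchanged, so quoting it costs nothing and protects you once the token comes from user input.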

In the preg_match_all() statement, we are supplying the following regex:

/\b$search\b/

Explanation:

  • / - starting delimiter
  • \b - word boundary. A word boundary, in most regex dialects, is a position between a word character (\w) and a non-word character (\W).
  • $search - escaped search term
  • \b - word boundary
  • / - ending delimiter
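The behaviour of \b described in the list above can be checked with a short sketch. The subject strings are made-up samples; each value is what preg_match() returns for that subject.

```php
<?php
$pattern = '/\bsome\b/';

// Made-up sample subjects, used only to show where a boundary exists.
$results = [
    'some kind' => preg_match($pattern, 'some kind'), // standalone word
    'some, and' => preg_match($pattern, 'some, and'), // punctuation is a non-word character
    'someway'   => preg_match($pattern, 'someway'),   // "some" is followed by a word character
    'awesome'   => preg_match($pattern, 'awesome'),   // "some" is preceded by a word character
    'some_way'  => preg_match($pattern, 'some_way'),  // underscore counts as a word character
];

print_r($results);
```

The first two subjects give 1 and the last three give 0. Note the last case: because \w includes the underscore, some_way is treated as a single word.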

In plain English, it means: find all occurrences of the given word (here, some) that stand alone as a whole word, not as part of a larger word.

Note that we're also passing the PREG_OFFSET_CAPTURE flag here. If this flag is passed, for every occurring match the appendant string offset will also be returned. See the PHP documentation for preg_match_all() for more information.

To obtain the results you want, you can simply loop through the $m array and extract the offsets:

$result = implode(' ', array_map(function($arr) {
    return $arr[1];
}, $m[0]));

echo $result;

Output:

0 18
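Putting the pieces of this answer together, one possible way to wrap them up is the sketch below. The helper name find_word_offsets() is hypothetical (it is not a PHP built-in), and the error handling is a minimal assumption rather than anything from the original answer.

```php
<?php
/**
 * Return the byte offsets of every whole-word occurrence of $word in $text.
 * find_word_offsets() is a hypothetical helper name, not a PHP built-in.
 */
function find_word_offsets(string $text, string $word): array
{
    // Escape the word so it is matched literally, including the '/' delimiter.
    $search = preg_quote($word, '/');

    $count = preg_match_all('/\b' . $search . '\b/', $text, $m, PREG_OFFSET_CAPTURE);
    if ($count === false || $count === 0) {
        return []; // regex error or no matches
    }

    // Each entry of $m[0] is [matched text, offset]; keep only the offset.
    $offsets = [];
    foreach ($m[0] as $match) {
        $offsets[] = $match[1];
    }
    return $offsets;
}

$row = "some ugly text of some kind i'd like to find in someway";
echo implode(' ', find_word_offsets($row, 'some')), "\n"; // prints "0 18"
```

The offsets are byte offsets into the subject string, which matters if the text contains multibyte characters.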


Upvotes: 3

Max

Reputation: 6801

What you're looking for is a regular expression with word boundaries (\b), combined with the flag that returns the offset of each match (PREG_OFFSET_CAPTURE).

PREG_OFFSET_CAPTURE

If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the value of matches into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.

$row= "some ugly text of some kind i'd like to find in someway";
$pattern= "/\bsome\b/i";
preg_match_all($pattern, $row, $matches, PREG_OFFSET_CAPTURE);

And we get something like this:

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => some
                    [1] => 0
                )
            [1] => Array
                (
                    [0] => some
                    [1] => 18
                )
        )
)

Then just loop through the matches and extract the offset at which the needle was found in the haystack.

// store the positions of the match
$offsets = array();
foreach($matches[0] as $match) {
    $offsets[] = $match[1];
}

// display the offsets
echo implode(' ', $offsets);
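As a possible shorthand for the loop, array_column() can pull out one element of every [text, offset] pair. In this sketch the sample string is altered to start with a capital "Some" (an assumption for illustration, not the string from the question) so that the effect of the /i modifier is visible.

```php
<?php
// Modified sample: capital "S" added to show the case-insensitive match.
$row = "Some ugly text of some kind i'd like to find in someway";
preg_match_all('/\bsome\b/i', $row, $matches, PREG_OFFSET_CAPTURE);

// Element 1 of each [text, offset] pair is the offset.
$offsets = array_column($matches[0], 1);

// Element 0 is the matched text, which keeps its original case.
$words = array_column($matches[0], 0);

echo implode(' ', $offsets), "\n"; // prints "0 18"
echo implode(' ', $words), "\n";   // prints "Some some"
```

Without the /i modifier, only the lowercase occurrence at offset 18 would be reported for this string.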

Upvotes: 2
