Reputation: 509
I am trying to do the following:
Grab the 5 words before the search phrase and the 5 words after it from a block of text (or Y words on either side, if only Y words are available). By "words" I mean words, numbers, or whatever else is in the block of text.
For example:
The block of text: "Welcome to Stack Overflow! Visit your user page to set your name and email."
If you were to search for "visit your", it would return: "Welcome to Stack Overflow! Visit your user page to set your"
I've tried using this:
$preg_safe = str_replace(" ", "\s", preg_quote($search));
$pattern = "/(\w*\S\s+){0,8}\S*\b($preg_safe)\b\S*(\s\S+){0,8}/ix";
if (preg_match_all($pattern, $full_text, $matches))
{
    $result = str_replace(strtolower($search), "<span class='searched-for'>$search</span>", strtolower($matches[0][0]));
}
else
{
    $result = false;
}
It works if the search phrase is in English, but I need it to work in other languages as well. It doesn't work for a Hebrew search phrase, for example.
I've tried changing the pattern to:
$pattern = "(*UTF8)/(\w*\S\s+){0,8}\S*\b($preg_safe)\b\S*(\s\S+){0,8}/i";
But it didn't work.
How can I make it work for other languages?
////////////////// EDIT //////////
As enrico.bacis suggested - I've changed the pattern to :
$pattern = "/(\w\p{Hebrew}*\S\s+){0,20}\S*\b($preg_safe)\b\S*(\s\S+){0,20}/ixu";
Now it works for English and Hebrew search phrases, but the result text is being cut when there is a special character (' for example).
How can I make the pattern return the text around the search phrase even if it contains special characters?
Upvotes: 0
Views: 495
Reputation: 31524
Your problem is with \w, which is not matching Hebrew characters. By default, \w is just a shortcut for a so-called "word" character: [A-Za-z0-9_].
To make the regex also capture Hebrew characters, you only need to make two changes:
Add u to the modifiers to handle UTF-8 characters (so your modifiers will be /ixu).
Replace every occurrence of \w in your pattern with [\w\p{Hebrew}].
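As a rough illustration of those two changes, here is a minimal sketch rather than a definitive implementation. The function name `context_around` and its `$n` parameter are hypothetical and not from the original post. It keeps the overall shape of the question's pattern, but uses non-capturing groups, `\s+` instead of a single `\s`, and passes the delimiter to `preg_quote`; those adjustments are my own assumptions. It also assumes PHP 7.1 or later with PCRE built with Unicode support.

```php
<?php
// Hypothetical helper (not from the original post): returns the search phrase
// with up to $n words of context on each side, or null when there is no match.
function context_around(string $text, string $search, int $n = 5): ?string
{
    // Escape regex metacharacters (and the "/" delimiter), then let each space
    // in the phrase match any run of whitespace in the text.
    $phrase = str_replace(' ', '\s+', preg_quote($search, '/'));

    // Change 2: [\w\p{Hebrew}] stands in for the bare \w of the original pattern.
    $word = '[\w\p{Hebrew}]';

    // Change 1: the u modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = "/(?:{$word}*\S\s+){0,{$n}}\S*\b({$phrase})\b\S*(?:\s+\S+){0,{$n}}/iu";

    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

// Usage with the example text from the question:
$text = "Welcome to Stack Overflow! Visit your user page to set your name and email.";
echo context_around($text, "visit your"), "\n";
// prints: Welcome to Stack Overflow! Visit your user page to set your
```

Only four words precede the phrase in this text, so the `{0,5}` quantifier simply takes what is available, which is the "or Y words" behaviour the question asks for.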
You can also check here for more answers on this topic.
Upvotes: 1