Reputation: 1672
I'm trying to make excerpts from long texts while deleting everything that comes after the last space if the string length is more than 110 characters.
$string = 'Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже перелистывается.';
if (mb_strlen($string) > 110) {
    $pos = mb_strpos($string, ' ', 110);
    $excerpt = rtrim(mb_substr($string, 0, $pos), '.,—-_!@\'"()*#~').'...';
}
If I print print_r(mb_strlen($pos)); the result is 0, but it works correctly if I change the line to $pos = mb_strpos($string, ' ', 99);.
The last word in this case is 16 characters long and the whole string is 116 characters long, so it makes perfect sense that an offset of 99 works while anything above it results in a $pos value of 0. Instead of making an excerpt, the code then just returns ... (based on the current example).
I have quite a few strings here with different string and word lengths, so I need a dynamic solution that will work in all cases. Any ideas?
Upvotes: 3
Views: 529
Reputation: 147146
A simple (and fast) way to cut a string to a fixed number of characters is with preg_replace:
$string = 'Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже перелистывается.';
$excerpt = preg_replace('/^(.{1,110})\s.*$/u', '$1...', $string);
echo $excerpt;
Output:
Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже...
The regex works by looking for some number of characters (from 1 to 110) from the start of the string up to a whitespace character: ^(.{1,110})\s. Since the quantifier is greedy, it takes as many characters as it can. Those characters are captured in a group. The rest of the string is then matched by .*$, and the whole string is replaced by the first capture group and three dots ($1...), giving just the first portion as desired. The u flag on the regex means it will count Unicode characters correctly. To adjust the length of the excerpt, simply change the 110 to whatever length you need. Note that the pattern also matches shorter strings that contain whitespace, so keep the mb_strlen check from the question around it if strings of 110 characters or fewer should be left untouched.
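As a rough sketch of how the pattern above could be made reusable, the function below wraps it in a length check and builds the limit into the regex. The name excerpt_by_regex and its parameters are hypothetical, and the added s modifier (so that . also matches newlines) is an assumption about the input, not part of the original answer.

```php
<?php
// Hypothetical helper: wraps the regex from the answer in a length check.
function excerpt_by_regex(string $text, int $limit = 110): string
{
    // Short strings are returned untouched; without this guard the pattern
    // would also shorten "hello world" to "hello...".
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    // Greedy .{1,$limit} backtracks to the last whitespace within the limit.
    // Assumption: the s modifier is added so multi-line text is handled too.
    $pattern = '/^(.{1,' . $limit . '})\s.*$/us';
    // If there is no whitespace in range, nothing matches and the text
    // comes back unchanged; null signals a regex error.
    return preg_replace($pattern, '$1...', $text) ?? $text;
}

$string = 'Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже перелистывается.';
echo excerpt_by_regex($string), PHP_EOL;
```

With the example string this should print the same excerpt as the answer, ending in "страница уже...".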
Edit
The regex can also be modified to strip off any trailing non-word characters (so you don't end up with the quick brown fox,...) by requiring that the last character of the capture group be a word character and allowing the following character to be any non-word character. The limit is shortened to 24 characters here to show the effect:
$string = 'Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже перелистывается.';
$excerpt = preg_replace('/^(.{1,23}\w)\W.*$/u', '$1...', $string);
echo $excerpt;
Output:
Стихи похожи на людей...
Upvotes: 1
Reputation: 639
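A lazy fix: step the search offset back from 110 until a space is found.
// Lazy fix: step the offset back from 110 until a space is found.
if (mb_strlen($string) > 110) {
    $p = 110;
    // Compare against false explicitly: a space at index 0 is a valid but falsy result.
    while ($p >= 0 && ($pos = mb_strpos($string, ' ', $p--)) === false) {}
    // No space at all: fall back to a hard cut at 110 characters.
    $excerpt = rtrim(mb_substr($string, 0, $pos === false ? 110 : $pos), '.,—-_!@\'"()*#~') . ' ... ';
}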
Upvotes: 0
Reputation: 164
This will cut the string at the last space without cutting words:
$excerpt = mb_substr($string, 0, mb_strrpos($string, ' ', -(mb_strlen($string) - 110)));
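strrpos and mb_strrpos look for the last occurrence, and with a negative offset they ignore that many characters at the end of the string, so you can find the last space before a given position. Keep the length check from the question around this line: for strings shorter than 110 characters the offset becomes positive and the search no longer does what is intended.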
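A minimal sketch of how this lookup could be combined with the length check and a fallback is shown below. The function name excerpt_by_last_space is hypothetical, and two choices are assumptions rather than part of the answer: the hard cut when no space is found, and the Unicode-aware pattern that replaces the rtrim character list from the question (rtrim works on bytes, so a multibyte character such as the dash in that list can damage trailing Cyrillic letters).

```php
<?php
// Hypothetical helper combining the length check with the mb_strrpos() lookup.
function excerpt_by_last_space(string $text, int $limit = 110): string
{
    $length = mb_strlen($text, 'UTF-8');
    if ($length <= $limit) {
        return $text; // nothing to cut
    }
    // Negative offset: ignore the last ($length - $limit) characters.
    $pos = mb_strrpos($text, ' ', $limit - $length, 'UTF-8');
    if ($pos === false) {
        $pos = $limit; // assumption: no space found, so hard-cut at the limit
    }
    $cut = mb_substr($text, 0, $pos, 'UTF-8');
    // Assumption: strip any trailing Unicode punctuation or whitespace,
    // which is broader than the character list used in the question.
    $cut = preg_replace('/[\p{P}\s]+$/u', '', $cut) ?? $cut;
    return $cut . '...';
}

$string = 'Стихи похожи на людей: помнят прошлое и ничего не знают о будущем, хотят жить вечно, a страница уже перелистывается.';
echo excerpt_by_last_space($string), PHP_EOL;
```

Strings at or under the limit are returned as they are, so the function can be applied to a mixed list of short and long texts.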
Upvotes: 1