Gerben Jacobs

Reputation: 4583

Break up long words in a UTF-8 text, with PHP

Horrible title, I know.

I want some kind of word wrap, but obviously cannot use wordwrap() because it messes up UTF-8, not to mention markup.

My issue is that I want to get rid of stuff like "eeeeeeeeeeeeeeeeeeeeeeeeeeee", only longer, of course. Some jokesters find it funny to put that stuff on my site.

So when I have a string like "Hello how areeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee you doing?", I want to break up the 'areeee' part with the zero-width space character (U+200B).

The long runs aren't always the same letter, and they are always embedded in larger strings, so strlen, substr and wordwrap don't really fit on their own.

Who can help me out?

Upvotes: 1

Views: 681

Answers (2)

bretterer

Reputation: 5781

Do this in three steps:

  1. Split the string on whitespace.
  2. Check the length of each word with strlen and trim it if it is too long.
  3. Concatenate the words back together.

The downside is that legitimate words longer than 10 characters would be cut off as well. So I would suggest adding a check to see whether the word is just the same letter repeated over and over.
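The same-letter check suggested above could be sketched with a backreference regex. This is only an illustration: the function name hasRepeatedRun and the threshold of 5 are assumptions, not part of the answer, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the answer): returns true if $word contains
// the same character repeated at least $minRun times in a row.
function hasRepeatedRun(string $word, int $minRun = 5): bool
{
    // (.) captures one character; \1{n,} requires n more copies of it.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $pattern = '/(.)\1{' . ($minRun - 1) . ',}/u';
    return preg_match($pattern, $word) === 1;
}

var_dump(hasRepeatedRun('areeeeeeeeee'));         // bool(true)
var_dump(hasRepeatedRun('internationalization')); // bool(false)
```

A word would then only be shortened when it is both too long and flagged by this check, which leaves ordinary long words alone.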

EXAMPLE

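$string = "Hello how areeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee you doing?";
$strArr = explode(" ", $string);
$wordArr = array();
foreach ($strArr as $word) {
    // mb_* functions count characters instead of bytes, so multi-byte
    // UTF-8 characters are not cut in half (requires the mbstring extension)
    if (mb_strlen($word, "UTF-8") > 10) {
        $word = mb_substr($word, 0, 10, "UTF-8");
    }

    $wordArr[] = $word;
}

$newString = implode(" ", $wordArr);
print $newString;  // Prints "Hello how areeeeeeee you doing?"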
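The example above truncates long words. If the goal is to keep the text and only make it breakable, as the question asks, a regex in UTF-8 mode could insert a zero-width space instead. This is a rough sketch, assuming PHP 7 or later; breakLongWords and the limit of 10 are hypothetical, and it does not skip HTML tags, so markup would need to be handled separately.

```php
<?php
// Hypothetical helper: inserts a zero-width space (U+200B) after every
// $maxLen consecutive non-whitespace characters so a browser can wrap there.
function breakLongWords(string $text, int $maxLen = 10): string
{
    $zwsp = "\u{200B}"; // Unicode escape syntax, available since PHP 7.0
    // \S{n} matches n non-whitespace characters; the lookahead (?=\S) avoids
    // adding a break when the word ends exactly there. /u enables UTF-8 mode.
    $pattern = '/\S{' . $maxLen . '}(?=\S)/u';
    $result = preg_replace($pattern, '$0' . $zwsp, $text);
    // preg_replace returns null on error (e.g. invalid UTF-8); fall back to the input.
    return $result ?? $text;
}

$in  = "Hello how areeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee you doing?";
$out = breakLongWords($in);
// Stripping the inserted characters gives back the original text.
var_dump(str_replace("\u{200B}", '', $out) === $in); // bool(true)
```

Because the match works on code points rather than bytes, multi-byte characters are never split.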

Upvotes: 1

Aurelio De Rosa

Reputation: 22162

Granted, this is not a PHP solution, but if your problem is only how the text is displayed, why not use the simple CSS3 property word-wrap?

Assuming your container is a div with id="example", you can write:

#example
{
  word-wrap: break-word;
}

Upvotes: 1
