Amit Verma

Reputation: 41219

How can I remove duplicate words for each sentence in a string?

I have the following string:

$string = "Russia Russia Today is my favorite TV channel.
           Boom Bust is is my favorite program on RT";

In the string above, the word Russia is immediately repeated in the first line, and the word is is immediately repeated in the second line.

I want to remove every word that directly repeats the word before it. I have looked at some similar questions on Stack Overflow, but they don't seem to help.

I tried this:

<?php

    $string = "Russia Russia Today is my favorite TV channel.Boom Bust is is my favorite program on RT";

    $arr = explode( " " , $string );
    $arr = array_unique( $arr );
    echo $string = implode(" " , $arr);

output:

Russia Today is my favorite TV channel.Boom Bust program on RT

Notice that the word "is" is missing from the second sentence of the output.

My expected output should be:

Russia Today is my favorite TV channel. Boom Bust is my favorite program on RT
                                                //^^

Upvotes: 0

Views: 1005

Answers (2)

Rizier123

Reputation: 59691

This should work for you:

Here I first explode() your string on the dot to get the individual sentences, then explode each sentence into words. After that, array_unique() keeps only the unique words of each sentence, and the sentences are printed again.

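<?php

    $string = "Russia Russia Today is my favorite TV channel.Boom Bust is is my favorite program on RT";

    // Split the text into sentences, then each sentence into words.
    $sentence = explode(".", $string);
    $words = array_map(function($v){
        return explode(" ", $v);
    }, $sentence);

    // Drop repeated words within each sentence (the first occurrence is kept).
    $uniqueWords = array_map("array_unique", $words);

    // Print each sentence again, restoring the dot.
    foreach($uniqueWords as $v)
        echo implode(" ", $v) . ".<br>";

?>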

output:

Russia Today is my favorite TV channel.
Boom Bust is my favorite program on RT.

EDIT:

If you only want to collapse words that are repeated directly after each other, you could use this:

$str = "That solves the users specific issue, but it wouldn't solve something like Russia Russia Today is my favourite TV channel in Russia. ";
echo $str = preg_replace("/\b(\S+)\b(\s+\g{1}\b)+/", "$1", $str);

output:

That solves the users specific issue, but it wouldn't solve something like Russia Today is my favourite TV channel in Russia. 
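
To make the pattern easier to follow, here is a sketch of the same regex written with the x (extended) flag so that each part carries a comment. The function name collapse_repeated_words is hypothetical, and the repeated part uses a non-capturing group instead of the second capture group above; the behaviour should otherwise be the same.

```php
<?php

// Hypothetical helper name; the pattern is the one shown above,
// spread over several lines with the x flag so it can be annotated.
function collapse_repeated_words(string $text): string
{
    $pattern = '/
        \b(\S+)\b       # group 1: a run of non-whitespace between word boundaries
        (?:             # followed, one or more times, by:
            \s+         #   whitespace
            \g{1}       #   the same text that group 1 captured
            \b          #   ending at a word boundary
        )+
    /x';

    // Replace the whole run (the word plus its repeats) with a single copy.
    return preg_replace($pattern, '$1', $text);
}

echo collapse_repeated_words("Boom Bust is is my favorite program on RT"), "\n";
// prints: Boom Bust is my favorite program on RT
```

The trailing \b is what stops a word from matching the start of a longer one, so "is island" is left alone.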

Upvotes: 2

Styphon

Reputation: 10447

One way to do this is to write a custom function. You need to explode the string as you are doing right now, then check each value in sequence. If it's the same as the previous word, remove it.

Something like this should do it:

function remove_duplicate_words($string)
{
    $arr = explode(" ", $string);
    $prev_word = '';
    foreach ($arr as $key => $val)
    {
        // skip the first word
        if ($key == 0) 
        {
            $prev_word = $val;
            continue;
        }

        if ($prev_word === $val)
        {
            unset($arr[$key]);
        }
        else
        {
            $prev_word = $val;
        }
    }
    return implode(" " , $arr);
}

$string = remove_duplicate_words("Russia Russia Today is my favorite TV channel. Boom Bust is is my favorite program on RT");
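
The string in the question spans two lines with indentation, so splitting on a single space leaves line breaks attached to words. A possible variant, sketched below, splits on any run of whitespace instead. The name remove_adjacent_duplicates is hypothetical, and note that this version normalises all whitespace to single spaces, which may or may not be what you want.

```php
<?php

// Hypothetical variant of the loop above: splits on any whitespace run
// (spaces, tabs, newlines) and compares words strictly.
function remove_adjacent_duplicates(string $text): string
{
    // PREG_SPLIT_NO_EMPTY drops empty tokens caused by leading or trailing whitespace.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $result = [];
    $previous = null;
    foreach ($words as $word) {
        // Keep a word only when it differs from the one directly before it.
        if ($word !== $previous) {
            $result[] = $word;
        }
        $previous = $word;
    }

    // The original whitespace, including line breaks, becomes single spaces.
    return implode(' ', $result);
}

$string = "Russia Russia Today is my favorite TV channel.
           Boom Bust is is my favorite program on RT";

echo remove_adjacent_duplicates($string), "\n";
// prints: Russia Today is my favorite TV channel. Boom Bust is my favorite program on RT
```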

Upvotes: 1
