Reputation: 156
I'm trying to find all the unique whole words from a body of text. Currently this is what I am using but it doesn't seem to be working:
$textDump = "cat dog monkey cat snake horse";
$wholeWord = "/[\w]*/";
$uniqueWords = (preg_match($wholeWord, $textDump, $matches));
Any help would be appreciated. Thanks!
Upvotes: 1
Views: 189
Reputation: 24551
The answers given so far all assume that by "find all the unique whole words" you really meant "remove duplicates". Your question is not very clear about this, since you don't specify the desired output for your example, but I'll take you at your word and provide a solution for "find all the unique whole words", that is, the words that occur exactly once.
This means, for the input:
"cat dog monkey cat snake horse"
You will get the output:
"dog monkey snake horse"
str_word_count is useful for this too, together with array_count_values, which counts how often each value occurs:
$wordCount = array_count_values(str_word_count($textDump,1));
$wordCount is now:
array(5) {
["cat"] => int(2)
["dog"] => int(1)
["monkey"] => int(1)
["snake"] => int(1)
["horse"] => int(1)
}
Next, remove the words with a count higher than 1. Note that the actual words are the array keys, so we use array_keys to get them:
$uniqueWords = array_keys(
    array_filter(
        $wordCount,
        function($count) {
            return $count === 1;
        }
    )
);
$uniqueWords is now:
array(4) {
[0] => string(3) "dog"
[1] => string(6) "monkey"
[2] => string(5) "snake"
[3] => string(5) "horse"
}
Complete code:
$textDump = "cat dog monkey cat snake horse";
$wordCount = array_count_values(str_word_count($textDump,1));
$uniqueWords = array_keys(
    array_filter(
        $wordCount,
        function($count) {
            return $count === 1;
        }
    )
);
echo join(' ', $uniqueWords);
//dog monkey snake horse
Upvotes: 1
Reputation: 16495
Why not achieve this using explode() and array_unique() in this case?
$text = "cat dog monkey cat snake horse";
$foo = explode(" ", $text);
print_r(array_unique($foo));
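Note that explode(" ", ...) splits on single spaces only, so double spaces, tabs, or newlines in the text would leave empty or merged entries. A minimal sketch, assuming the input may contain arbitrary whitespace, that swaps in preg_split() with the PREG_SPLIT_NO_EMPTY flag (the sample string below is made up for illustration and is not from the question):

```php
<?php
// Hypothetical input with irregular whitespace (not from the question).
$text = "cat  dog\nmonkey cat\tsnake horse";

// Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
$words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

// array_unique() keeps the first occurrence of each word and preserves keys,
// so array_values() is used to reindex the result from 0.
$deduplicated = array_values(array_unique($words));

print_r($deduplicated);
```

For the single-space example in the question, plain explode() gives the same list of words.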
Upvotes: 1
Reputation: 14681
You can use str_word_count together with array_unique:
$textDump = "cat dog monkey cat snake horse";
$uniqueWords = array_unique(str_word_count($textDump, 1));
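The regular-expression approach from the question can also be made to work. preg_match() stops after the first match and returns 1 or 0 rather than the matches themselves, and /[\w]*/ is allowed to match an empty string. A minimal sketch, assuming preg_match_all() with \w+ fits your definition of a word (note that \w also matches digits and underscores):

```php
<?php
$textDump = "cat dog monkey cat snake horse";

// preg_match_all() collects every match; $matches[0] holds the full matches.
preg_match_all('/\b\w+\b/', $textDump, $matches);

// Remove duplicates, then reindex the array from 0.
$uniqueWords = array_values(array_unique($matches[0]));

echo implode(' ', $uniqueWords);
// cat dog monkey snake horse
```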
Upvotes: 2