Waseem Senjer

Reputation: 1048

Validate that input string does not exceed word limit

I want to count the words in a specific string so that I can validate it and prevent users from writing more than, for example, 100 words.

I wrote this function, but I don't think it's effective enough. I used the explode function with space as a delimiter, but what if the user puts two spaces instead of one? Can you give me a better way to do that?

function isValidLength($text , $length){
  
   $text  = explode(" " , $text );
   if(count($text) > $length)
          return false;
   else
          return true;
}

Upvotes: 15

Views: 57659

Answers (10)

mickmackusa

Reputation: 47894

If you need more control over what counts as "a word" in the context of your application, preg_match_all() returns the number of matches it finds. If you need multibyte support, add the unicode pattern modifier (u). \pL and \pM match letters and letter marks, to err on the side of inclusivity. Consider this a starting point, and understand that the regex rules for what is "a word" can be tightened or loosened as needed.

This solution is multibyte-safe.

Code:

function isValidLength($text, $length) {
    // Valid when the number of matched words does not exceed the limit.
    return preg_match_all("~[\pL\pM'-]+~u", $text) <= $length;
}

Alternatively, if it is a required field and you only need to count space-delimited "non-whitespace substrings", then you can just write:

if (preg_match("~^\s*\S+(\s+\S+){0,99}\s*$~", $text)) { ... }

or

if (preg_match("~^\S+(\s+\S+){0,99}$~", trim($text))) { ... }
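The two patterns above hard-code a 100-word limit through the {0,99} quantifier. As a minimal sketch, assuming you want the limit as a parameter, the quantifier can be built from it. The helper name hasAtMostWords and its signature are hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function hasAtMostWords(string $text, int $limit): bool
{
    if ($limit < 1) {
        return false; // the pattern below always requires at least one word
    }
    // One word, then up to ($limit - 1) further whitespace-separated words.
    $pattern = '~^\S+(?:\s+\S+){0,' . ($limit - 1) . '}$~';
    return preg_match($pattern, trim($text)) === 1;
}

var_dump(hasAtMostWords("one  two\tthree", 3));    // bool(true)
var_dump(hasAtMostWords("one two three four", 3)); // bool(false)
var_dump(hasAtMostWords("   ", 3));                // bool(false): blank input is rejected
```

Like the original patterns, this treats the field as required, so an empty or whitespace-only string fails validation.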

Upvotes: 1

Francesco Laurita

Reputation: 23552

Maybe str_word_count could help

http://php.net/manual/en/function.str-word-count.php

$Tag  = 'My Name is Gaurav'; 
$word = str_word_count($Tag);
echo $word;
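A rough sketch of how this could be turned into the validation the question asks for. The wrapper name isWithinWordLimit is an assumption for illustration. Note that str_word_count() only treats letters, apostrophes and hyphens as word characters, so digits split a word in two:

```php
<?php
// Hypothetical wrapper around str_word_count(); the function name is illustrative.
function isWithinWordLimit(string $text, int $limit): bool
{
    return str_word_count($text) <= $limit;
}

var_dump(isWithinWordLimit("My  Name   is Gaurav", 4)); // bool(true): repeated spaces are ignored
var_dump(isWithinWordLimit("My Name is Gaurav", 3));    // bool(false)

// Digits are not word characters: "fri3nd" is counted as "fri" and "nd".
echo str_word_count("Hello fri3nd, you're looking good today!"), "\n"; // 7
```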

Upvotes: 25

Sean Gallagher

Reputation: 94

I wrote a function that works better than str_word_count, because that PHP function counts dashes and other characters as words.

My function also addresses the issue of double spaces, which many of the functions other people have written don't account for.

It handles HTML tags as well. If two tags sit directly next to each other and you simply call strip_tags, the adjacent text is counted as one word when it is really two. For example: <h1>Title</h1>Text or <h1>Title</h1><p>Text</p>

Additionally, I strip out JavaScript first; otherwise the code within the <script> tags would be counted as words.

Lastly, my function handles spaces at the beginning and end of a string, multiple spaces, line breaks, return characters, and tab characters.

###############
# Count Words #
###############
function count_words($str)
{
 $str = preg_replace("/[^A-Za-z0-9 ]/","",strip_tags(str_replace('<',' <',str_replace('>','> ',str_replace(array("\n","\r","\t"),' ',preg_replace('~<\s*\bscript\b[^>]*>(.*?)<\s*\/\s*script\s*>~is','',$str))))));
 while(substr_count($str,'  ')>0)
 {
  $str = str_replace('  ',' ',$str);
 }
 return substr_count(trim($str,' '),' ')+1;
}

Upvotes: 0

Fenn-CS

Reputation: 905

There are n-1 spaces between n words, so 100 words contain 99 spaces. You can pick an average word length, say 10 characters, multiply it by 100 (for 100 words), and add 99 (for the spaces). You can then base the limit on the number of characters (1099) instead of the number of words.

function isValidLength($text) {
    // 100 words * 10 characters + 99 spaces = 1099 characters
    return strlen($text) <= 1099;
}

Upvotes: 0

Amr

Reputation: 5159

Try this:

function get_num_of_words($string) {
    $string = preg_replace('/\s+/', ' ', trim($string));
    $words = explode(" ", $string);
    return count($words);
}

$str = "Lorem ipsum dolor sit amet";
echo get_num_of_words($str);

This will output: 5

Upvotes: 21

Mackraken

Reputation: 515

str_word_count has its flaws: it treats an underscore as a separator, so this_is counts as two words.

You can use the following function to count words separated by spaces, even if there is more than one space between them.

function count_words($str){

    while (substr_count($str, "  ")>0){
        $str = str_replace("  ", " ", $str);
    }
    return substr_count($str, " ")+1;
}


$str = "This   is  a sample_test";

echo $str;
echo count_words($str);
//This will return 4 words;

Upvotes: 4

Behzad-Ravanbakhsh

Reputation: 972

Use substr_count to count the occurrences of any substring. To estimate the number of words, set $needle to ' ' and add 1 to the result, since n words are separated by n-1 single spaces. Signature: int substr_count ( string $haystack , string $needle )

$text = 'This is a test';
echo substr_count($text, 'is'); // 2

echo substr_count($text, ' '); // 3 spaces, so 4 words

Upvotes: 0

Michael Irigoyen

Reputation: 22947

You can use the built in PHP function str_word_count. Use it like this:

$str = "This is my simple string.";
echo str_word_count($str);

This will output 5.

If you plan on using special characters in any of your words, you can supply any extra characters as the third parameter.

$str = "This weather is like el ninã.";
echo str_word_count($str, 0, 'àáã');

This will output 6.

Upvotes: 10

Arnaud Le Blanc

Reputation: 99909

This function uses a simple regex to split the input $text on any non-letter character:

function isValidLength($text, $length) {
    $words = preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $length;
}

This ensures that it works correctly with words separated by multiple spaces or any other non-letter character. It also handles Unicode (e.g. accented letters) correctly.

The function returns true when the word count is less than or equal to $length.
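To see what splitting on non-letters means in practice, here is a small sketch that exposes the count itself. The helper name countLetterWords is hypothetical, and the guard for a false return (which preg_split() gives on invalid UTF-8 when the u modifier is set) is an addition that the answer's function does not have:

```php
<?php
// Hypothetical helper: returns the number of letter-only runs in $text.
function countLetterWords(string $text): int
{
    $words = preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure, e.g. malformed UTF-8 input.
    return $words === false ? 0 : count($words);
}

echo countLetterWords("Ça  va, très bien?"), "\n"; // 4: accented letters stay inside words
echo countLetterWords("don't"), "\n";              // 2: the apostrophe is a non-letter
echo countLetterWords("42 apples"), "\n";          // 1: digits are non-letters
```

Whether apostrophes and digits should split or form words depends on your own definition of a word, so the character class may need adjusting.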

Upvotes: 4

Jeff Lamb

Reputation: 5865

Use preg_split() instead of explode(). preg_split() splits on a regular expression, so a run of whitespace can be treated as a single delimiter.
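A minimal sketch of that idea, assuming words are simply whitespace-separated chunks. The function name fitsWordLimit is illustrative:

```php
<?php
// Hypothetical helper: splits on runs of whitespace and compares the count to the limit.
function fitsWordLimit(string $text, int $limit): bool
{
    // \s+ treats consecutive spaces, tabs and newlines as a single delimiter;
    // PREG_SPLIT_NO_EMPTY drops the empty pieces from leading/trailing whitespace.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $limit;
}

var_dump(fitsWordLimit("one  two   three", 3)); // bool(true)
var_dump(fitsWordLimit(" a\tb\nc d ", 3));      // bool(false): four words
```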

Upvotes: 2
