Johnny

Reputation: 21

How to count all the words in a textfile with multiple space characters

I am trying to write a procedure that counts all the words in a text file in Pascal. I want it to handle multiple space characters, but I have no idea how to do it.

I tried adding a boolean function Space that tells whether a character is a space, and then doing this:

while not eof(file) do
begin    
  read(file,char);
  words:=words+1;
  if Space(char) then
    while Space(char) do
      words:=words;

but that doesn't work, and it basically just sums up my (probably bad) idea of what the procedure should look like. Any ideas?

Upvotes: 1

Views: 950

Answers (3)

rnso

Reputation: 24613

Another method is to read the whole file into a single string and then count the words with the following steps:

{$mode objfpc}
uses sysutils;

var
  fullstr: string = 'this is   a     test  string. ';
  ch: char;
  count: integer = 0;

begin
  {trim the string: remove spaces at the beginning and end}
  fullstr := trim(fullstr);

  {collapse every run of spaces into a single space}
  while pos('  ', fullstr) > 0 do
    fullstr := stringreplace(fullstr, '  ', ' ', [rfReplaceAll]);

  {count the remaining spaces}
  for ch in fullstr do
    if ch = ' ' then
      count += 1;

  {a non-empty string has one more word than it has spaces}
  if fullstr <> '' then
    count += 1;
  writeln('Number of words: ', count);
end.

The comments in the code above explain the steps.

Output:

Number of words: 5
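
The example above starts from a hard-coded string. A minimal sketch of the "read the whole file into one string" step, assuming Free Pascal with the Classes unit; the helper name ReadFileAsLine and the file name input.txt are placeholders for illustration, not part of the original answer. Lines are joined with a space so that line breaks also separate words:

```pascal
{$mode objfpc}{$H+}
program ReadWholeFile;

uses
  Classes, SysUtils;

{ Hypothetical helper: loads a text file and joins its lines with spaces,
  so that a line break separates words just like a space does. }
function ReadFileAsLine(const FileName: string): string;
var
  Lines: TStringList;
  i: Integer;
begin
  Result := '';
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    for i := 0 to Lines.Count - 1 do
      Result := Result + Lines[i] + ' ';
  finally
    Lines.Free;
  end;
end;

begin
  { 'input.txt' is a placeholder file name }
  if FileExists('input.txt') then
    WriteLn(ReadFileAsLine('input.txt'));
end.
```

The returned string can then go through the same trim, collapse and count steps. For very large files, reading character by character is likely to be the better fit.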

Upvotes: 0

Basically, as Tom outlines in his answer, you need a state machine with the two states In_A_Word and Not_In_A_Word and then count whenever your state changes from Not_In_A_Word to In_A_Word.

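Something along these lines (F is a text file that has already been opened with Reset):

function CountWords(var F: Text): Integer;
var
  InWord: Boolean;
  Ch: Char;
  Words: Integer;
begin
  Words := 0;
  InWord := False;
  while not Eof(F) do begin
    Read(F, Ch);
    if Ch in ['A'..'Z', 'a'..'z'] then begin
      { count only the change from Not_In_A_Word to In_A_Word }
      if not InWord then begin
        InWord := True;
        Words := Words + 1;
      end;
    end else
      InWord := False;  { any non-letter ends the current word }
  end;
  CountWords := Words;
end;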

Upvotes: 2

Tom Brunberg

Reputation: 21045

Use a boolean variable to indicate whether you are processing a word.

Set it to true (and increment the counter) only on the first non-space character of a word.

Set it to false on a space character.
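
The three steps above can be sketched as follows. This is only an illustration that works on a string and treats the space character as the sole separator; the names CountWords, InWord and Count are made up for the example:

```pascal
program CountWordsDemo;

{ Counts words in S, treating the space character as the only separator. }
function CountWords(const S: string): Integer;
var
  InWord: Boolean;
  i, Count: Integer;
begin
  Count := 0;
  InWord := False;
  for i := 1 to Length(S) do
    if S[i] = ' ' then
      InWord := False            { a space ends the current word }
    else if not InWord then
    begin
      InWord := True;            { first non-space character of a word }
      Count := Count + 1;
    end;
  CountWords := Count;
end;

begin
  { prints 5 }
  WriteLn(CountWords('this is   a     test  string. '));
end.
```

Because the counter only changes when a word starts, leading, trailing and repeated spaces need no special handling. When reading from a file, tabs and line breaks would probably have to be treated as separators as well.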

Upvotes: 2
