Darkmage

Reputation: 1603

Count a specific word in a text file in C#

If I have a text file containing

"dont run if you cant hide, or you will be broken in two strings, your a evil man"

and I want to count how many times the word "you" occurs in the file and put that value into an int variable,

how do I go about doing something like that?

Upvotes: 3

Views: 18600

Answers (7)

spender

Reputation: 120450

Assuming the file has regular line breaks, this approach reads one line at a time, so for a huge file it is less memory intensive than the approaches here that load the whole file. It uses Jason's counting method:

        var total = 0;
        using(StreamReader sr=new StreamReader("log.log"))
        {

            while (!sr.EndOfStream)
            {
                var counts = sr
                    .ReadLine()
                    .Split(' ')
                    .GroupBy(s => s)
                    .Select(g => new{Word = g.Key,Count = g.Count()});
                var wc = counts.SingleOrDefault(c => c.Word == "you");
                total += (wc == null) ? 0 : wc.Count;
            }
        }

Or, combining Scoregraphic's answer here with an IEnumerable method:

    static IEnumerable<string> Lines(string filename)
    {
        using (var sr = new StreamReader(filename))
        {
            while (!sr.EndOfStream)
            {
                yield return sr.ReadLine();
            }
        }
    }

you could get a nifty one-liner:

    Lines("log.log")
        .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
        .Sum();

[Edited because System.IO.File now supports enumerating the lines of a file, removing need for hand rolled method of doing the same thing described above]

Or, using the framework method File.ReadLines(), you could reduce this to:

File.ReadLines("log.log")
        .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
        .Sum();

Upvotes: 6

Dave

Reputation: 185

Try counting the occurrences using IndexOf and then moving on to the next match. E.g.

using System;

namespace CountOcc
{
 class Program
 {
  public static void Main(string[] args)
  {
   int         StartPos; // Current pos in file.

   String Str;
   // Dispose the reader as soon as the file has been read.
   using ( System.IO.StreamReader sr = new System.IO.StreamReader( "c:\\file.txt" ) )
   {
    Str = sr.ReadToEnd();
   }

   int Count = 0;
   StartPos = 0;
   do
   {
    StartPos = Str.IndexOf( "Services", StartPos );
    if ( StartPos >= 0 )
    {
     StartPos++;
     Count++;
    }
   } while ( StartPos >= 0 );

   Console.Write("File contained " + Count + " occurrences");
   Console.ReadKey(true);
  }
 }
}

Upvotes: 1

Daniel Brückner

Reputation: 59655

The following method will do the job.

public Int32 GetWordCountInFile(String fileName, String word, Boolean ignoreCase)
{
    return File
        .ReadAllText(fileName)
        .Split(new [] { ' ', '.', ',' })
        .Count(w => String.Compare(w, word, ignoreCase) == 0); // Compare returns 0 when the strings are equal
}

Maybe you will have to add a few other possible separators to the String.Split() call.
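As a rough sketch of what a wider separator list might look like: the class name `WordCounter`, the method name `CountWord`, and the particular set of separators below are all assumptions for illustration, not part of the answer above. `StringSplitOptions.RemoveEmptyEntries` drops the empty strings that appear when two separators are adjacent (for example a comma followed by a space).

```csharp
using System;
using System.Linq;

// Hypothetical helper, shown only to illustrate a wider separator list.
public static class WordCounter
{
    // Assumed separator set; extend it to suit the text you expect.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '"', '(', ')' };

    // Counts how many tokens in the text equal the given word, ignoring case.
    public static int CountWord(string text, string word)
    {
        return text
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Count(w => string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
    }
}
```

To use it on a file, something like `WordCounter.CountWord(File.ReadAllText(fileName), "you")` should work, with `using System.IO;` added.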

Upvotes: 1

Scoregraphic

Reputation: 7200

To say it with a Regex...

Console.WriteLine((new Regex(@"(?i)you")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);

or, if you need the word "you" as a stand-alone word:

Console.WriteLine((new Regex(@"(?i)\byou\b")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);

Edit: Replaced \s+you\s+ with (?i)\byou\b for the sake of correctness

Upvotes: 14

Donut

Reputation: 112825

Reading from a file:

int count;

using (StreamReader reader = File.OpenText("fileName"))
{
   string contents = reader.ReadToEnd();
   // Verbatim string: in a regular literal, "\b" would be a backspace character.
   MatchCollection matches = Regex.Matches(contents, @"\byou\b");
   count = matches.Count;
}

Note that the pattern "\byou\b" matches only the word "you" by itself. If you want to match "you" inside other words as well (for example, the "you" in "your"), use "you" as the pattern instead of "\byou\b".
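To make the difference between the two patterns concrete, here is a minimal sketch. The class `PatternDemo` and the method `CountMatches` are hypothetical names introduced for illustration, and the case-insensitive option is an assumption. The `@` prefix matters: in a regular C# string literal, `\b` is the backspace escape rather than a regex word boundary.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for comparing patterns; not part of the original answer.
public static class PatternDemo
{
    // Returns the number of regex matches of the pattern in the text, ignoring case.
    public static int CountMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

For the sentence in the question, `PatternDemo.CountMatches(sentence, @"\byou\b")` should give 2, while `PatternDemo.CountMatches(sentence, "you")` should give 3 because it also counts the "you" in "your".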

Upvotes: 4

Sk93

Reputation: 3718

Try regular expressions:

Regex r = new Regex("test");
MatchCollection matches = r.Matches("this is a test of using regular expressions to count how many times test is said in a string");
int iCount = matches.Count;

Upvotes: 2

jason

Reputation: 241641

string s = "dont run if you cant hide, or you will be broken in two strings, your a evil man";
var wordCounts = from w in s.Split(' ')
                 group w by w into g
                 select new { Word = g.Key, Count = g.Count() };

int youCount = wordCounts.Single(w => w.Word == "you").Count;
Console.WriteLine(youCount);

Ideally punctuation should be ignored. I'll let you handle a messy detail like that.
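One possible way to deal with the punctuation, as a sketch only: tokenize with the regex `\w+` instead of splitting on spaces, and tally into a case-insensitive dictionary. The names `WordTally` and `CountWords` are made up for this example, and treating every run of `\w` characters as a word is an assumption (it would split "don't" into "don" and "t", for instance).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; a sketch of punctuation-insensitive word counting.
public static class WordTally
{
    // Builds a case-insensitive table of word -> number of occurrences.
    public static Dictionary<string, int> CountWords(string text)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Regex.Matches(text, @"\w+"))
        {
            int current;
            counts.TryGetValue(m.Value, out current); // current stays 0 for a new word
            counts[m.Value] = current + 1;
        }
        return counts;
    }
}
```

Looking up a single word is then a matter of calling `TryGetValue("you", out youCount)` on the result, which leaves `youCount` at 0 when the word never occurs instead of throwing.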

Upvotes: 10
