FrankK

Reputation: 511

The best way to find how many times a string (or substring) occurs in a large string in C#

For school I had to do an assignment, which I have already handed in, but the code I wrote is awful and I don't like what I ended up with. So I'm curious: what would be considered the best possible way to solve the following question in C#?

'//4 How many times does “queen” occur in the Alice in Wonderland book? Write some code to count them.'

link to the book (pastebin): book

my code (pastebin): my code (ugly)

Please ignore my code when writing your answer. Also, explain what your code does and why you think it's the best possible solution. The number of times the word "queen" occurs in the book should be 76.

Upvotes: 4

Views: 211

Answers (5)

MethodMan

Reputation: 18863

Or you could use some LINQ to do the same thing:

string words = "Hi, Hi, Hello, Hi, Hello";  //"hello1 hello2 hello546 helloasdf";
// Split on spaces, then count the pieces that contain "Hi" (requires: using System.Linq;)
var countList = words.Split(new[] { " " }, StringSplitOptions.None);
int count = countList.Count(s => s.Contains("Hi"));
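
Note that the snippet above counts space-separated pieces that contain the search text, so a piece containing it twice still counts once. A minimal sketch of a whole-word variant follows, assuming that "word" means a run of letters and that case should be ignored. The class and method names (`LinqWordCount`, `CountWord`) are made up for illustration, and a whole-word count may differ from a plain substring count (for example, "Queens" is not counted here).

```csharp
using System;
using System.Linq;

public static class LinqWordCount
{
    // Hypothetical helper: splits on every non-letter character found in the
    // text, then counts the tokens equal to `word`, ignoring case.
    public static int CountWord(string text, string word)
    {
        char[] separators = text.Where(c => !char.IsLetter(c)).Distinct().ToArray();
        return text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Count(token => string.Equals(token, word, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        string sample = "The Queen said: 'Off with her head!' The queen's voice was loud.";
        Console.WriteLine(CountWord(sample, "queen")); // prints 2
    }
}
```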

Upvotes: 0

M.kazem Akhgary

Reputation: 19179

The shortest way to write this is to use Regex. It will find the matches for you; just get the count. Regex also has an ignore-case option, so you don't have to call ToLower on the big string. So, after you read the file:

string aliceFile = Path.Combine(Environment.CurrentDirectory, "bestanden\\alice_in_wonderland.txt");
string text = File.ReadAllText(aliceFile);

Regex r = new Regex("queen", RegexOptions.IgnoreCase);
var count = r.Matches(text).Count;

Also, because the input is very large but the pattern is simple, you can use RegexOptions.Compiled, which may make matching faster:

Regex r = new Regex("queen", RegexOptions.IgnoreCase | RegexOptions.Compiled);
var count = r.Matches(text).Count;
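
One thing to be aware of: the plain pattern "queen" also matches inside longer words such as "Queens". The sketch below compares it with a word-boundary pattern (`\bqueen\b`) on a small sample string. The class and method names are illustrative only, and which of the two counts is "right" for the assignment is an assumption you would need to check against the expected answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCountSketch
{
    // Counts every case-insensitive occurrence, including inside longer words.
    public static int CountSubstring(string text)
    {
        return Regex.Matches(text, "queen", RegexOptions.IgnoreCase).Count;
    }

    // Counts only occurrences delimited by word boundaries (\b).
    public static int CountWholeWord(string text)
    {
        return Regex.Matches(text, @"\bqueen\b", RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        string sample = "The Queen and two queens met the QUEEN.";
        Console.WriteLine(CountSubstring(sample)); // prints 3
        Console.WriteLine(CountWholeWord(sample)); // prints 2
    }
}
```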

Upvotes: 2

Hambone

Reputation: 16397

You could also use a regular expression:

 string s = "Hello my baby, Hello my honey, Hello my ragtime gal";
 int count = Regex.Matches(s, "Hello").Count;
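
If the search text can contain characters that are special in a regular expression (such as "." or "?"), passing it straight to Regex.Matches changes what is matched. A small sketch, assuming you want a literal match, wraps the term in Regex.Escape first. The helper name `CountLiteral` is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexLiteralCount
{
    // Hypothetical helper: counts non-overlapping literal occurrences of `term`.
    public static int CountLiteral(string text, string term)
    {
        if (string.IsNullOrEmpty(term))
            throw new ArgumentException("Search term must not be empty.", nameof(term));

        // Regex.Escape turns metacharacters such as '.' into literals.
        return Regex.Matches(text, Regex.Escape(term)).Count;
    }

    public static void Main()
    {
        string sample = "a.b axb a.b";
        Console.WriteLine(Regex.Matches(sample, "a.b").Count); // prints 3 ('.' matches any character)
        Console.WriteLine(CountLiteral(sample, "a.b"));        // prints 2
    }
}
```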

Upvotes: 1

AntDC

Reputation: 1917

You could write a string extension method to split on more than one character....

// Extension methods must be declared in a static class.
public static class StringExtensions
{
    public static string[] Split(this string s, string separator)
    {
        return s.Split(new string[] { separator }, StringSplitOptions.None);
    }
}

....And just use the string you are searching for as the separator; the result is then the length of the array minus 1.

string s = "How now brown cow";
string searchS = "ow";
int count = s.Split(searchS).Length - 1;

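The actual array returned by Split would be:

["H", " n", " br", "n c", ""]

That is five elements (the last one is empty because the string ends with "ow"), so the count is 4.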

And extension methods ALWAYS come in handy again in the future.
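
The split trick depends on keeping empty entries. The following sketch, which uses the built-in Split overload directly rather than the extension method, shows how StringSplitOptions.RemoveEmptyEntries would undercount when the text starts or ends with the search string. The class and method names are illustrative only.

```csharp
using System;

public static class SplitCountSketch
{
    // Counts occurrences as (number of pieces - 1), keeping empty pieces.
    public static int CountBySplit(string text, string term)
    {
        return text.Split(new[] { term }, StringSplitOptions.None).Length - 1;
    }

    // Same idea but dropping empty pieces: shown only to illustrate the pitfall.
    public static int CountBySplitDroppingEmpties(string text, string term)
    {
        return text.Split(new[] { term }, StringSplitOptions.RemoveEmptyEntries).Length - 1;
    }

    public static void Main()
    {
        string s = "How now brown cow";
        Console.WriteLine(CountBySplit(s, "ow"));                // prints 4
        Console.WriteLine(CountBySplitDroppingEmpties(s, "ow")); // prints 3 (undercounts)
    }
}
```

Keep in mind that Split allocates an array holding every piece of the text, which is wasteful if all you need is a number.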

Upvotes: 1

CompuChip

Reputation: 9232

I won't post the full code, as I think it is useful for you to try this as an exercise, but I would personally go for a solution with the IndexOf overload that takes a starting position.

So something like (note: intentionally incorrect):

int startingPosition = 0;
int numberOfOccurrences = 0;
do {
  startingPosition = fullText.IndexOf("queen", startingPosition);
  numberOfOccurrences++;
} while( matchFound );
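
For readers who want to check their own attempt afterwards, here is one possible completion of the idea, not necessarily the one the answer had in mind. It assumes non-overlapping, case-insensitive matching, and the names `IndexOfCount` and `CountOccurrences` are made up for this sketch. The key points are to test the result of IndexOf before counting and to move the starting position past each match.

```csharp
using System;

public static class IndexOfCount
{
    // Hypothetical helper: counts non-overlapping occurrences of `term` in `text`.
    public static int CountOccurrences(string text, string term)
    {
        if (string.IsNullOrEmpty(term))
            throw new ArgumentException("Search term must not be empty.", nameof(term));

        int count = 0;
        int position = text.IndexOf(term, 0, StringComparison.OrdinalIgnoreCase);
        while (position >= 0) // IndexOf returns -1 when there is no further match
        {
            count++;
            // Continue searching just after the match that was found.
            position = text.IndexOf(term, position + term.Length,
                                    StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }

    public static void Main()
    {
        string sample = "The Queen and two queens met the QUEEN.";
        Console.WriteLine(CountOccurrences(sample, "queen")); // prints 3
    }
}
```

Unlike the split and LINQ approaches, this loop allocates no intermediate arrays or strings.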

Upvotes: 4
