Wilson

Reputation: 8768

Optimizing counting characters within a string

I just created a simple method to count the occurrences of each character within a string, without taking case into account.

static List<int> charactercount(string input)
{
    char[] characters = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
    input = input.ToLower();

    List<int> counts = new List<int>();
    foreach (char c in characters)
    {
        int count = 0;
        foreach (char c2 in input)
        {
            if (c2 == c)
            {
                count++;
            }
        }

        counts.Add(count);
    }

    return counts;
}

Is there a cleaner way to do this (i.e. without creating a character array to hold every character in the alphabet) that would also take into account numbers, other characters, caps, etc?

Upvotes: 1

Views: 1513

Answers (4)

amiry jd

Reputation: 27585

Based on @Ran's answer, with a range check added to avoid an IndexOutOfRangeException:

static readonly int differ = 'a';
int[] CountCharacters(string text) {
    text = text.ToLower();
    var counts = new int[26];

    for (var i = 0; i < text.Length; i++) {
        var charIndex = text[i] - differ;
        // only count characters between 'a' and 'z'; anything else would index outside the array
        if(charIndex >= 0 && charIndex < 26)
            counts[charIndex] += 1;
    }
    return counts;
}

A Dictionary and/or LINQ will not be as fast as counting characters with a plain, low-level array.

Upvotes: 0

Ran

Reputation: 6159

Your code is kind of slow because it scans the whole input once for every letter in the range a-z (26 passes) instead of looping through the input just once.

If you only need to count letters (like your code suggests), the fastest way to do it would be:

int[] CountCharacters(string text)
{
    var counts = new int[26];

    for (var i = 0; i < text.Length; i++)
    {
        var charIndex = text[i] - (int)'a';
        counts[charIndex] = counts[charIndex] + 1;
    }

    return counts;
}  

Note that you still need to verify that each character is in the range a-z, and convert it to lowercase where needed, or this code might throw exceptions. I'll leave those for you to add. :)

Upvotes: 0

tvanfosson

Reputation: 532455

Conceptually, I would prefer to return a Dictionary<string,int> of counts. If it's OK for a character that occurs zero times to be indicated by its absence rather than by an explicit count of 0, you can do it via LINQ. @Oded has given you a good start on how to do that: all you need to do is replace the Select() with ToDictionary( k => k.Key, v => v.Count() ). See my comment on his answer about doing the case-insensitive grouping. Note: you should decide whether you care about cultural differences in characters and adjust the ToLower method accordingly.
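As a minimal sketch of the LINQ variant described above (not necessarily what the linked comment suggests): the class and method names are hypothetical, the key type is char rather than string, and lowercasing with char.ToLowerInvariant is only one assumed way of making the grouping case-insensitive.

```csharp
using System.Collections.Generic;
using System.Linq;

static class LinqCharCounter
{
    // Hypothetical helper: groups characters case-insensitively and
    // returns one entry per character that actually occurs in the input.
    public static Dictionary<char, int> CountCharacters(string input)
    {
        return input
            .GroupBy(c => char.ToLowerInvariant(c))      // assumption: invariant-culture lowercasing
            .ToDictionary(g => g.Key, g => g.Count());   // key = character, value = occurrences
    }
}
```

For example, LinqCharCounter.CountCharacters("Hello") maps 'l' to 2, and has no entry at all for characters that never occur.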

You can also do this without LINQ:

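// requires: using System; using System.Collections.Generic;
public static Dictionary<string,int> CountCharacters(string input)
{
     // string keys so that the built-in case-insensitive comparer can be used
     var counts = new Dictionary<string,int>(StringComparer.OrdinalIgnoreCase);

     foreach (var c in input)
     {
          var key = c.ToString();
          int count = 0;
          if (counts.ContainsKey(key))
          {
              count = counts[key];
          }
          counts[key] = count + 1;
     }

     return counts;
}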

Note: if you want a Dictionary<char,int> instead, you can easily get one by creating a case-insensitive character comparer and passing it as the IEqualityComparer<char> for a dictionary of that type. I've used string for simplicity in the example.

Again, adjust the type of the comparer to be consistent with how you want to handle culture.
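A rough sketch of what such a character comparer could look like. The names CaseInsensitiveCharComparer and CharKeyCounter are made up for illustration, and the use of char.ToUpperInvariant is an assumption; swap in a culture-aware conversion if that matters for your input.

```csharp
using System.Collections.Generic;

// Hypothetical comparer: treats 'A' and 'a' as the same dictionary key.
sealed class CaseInsensitiveCharComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        return char.ToUpperInvariant(x) == char.ToUpperInvariant(y);
    }

    // Must agree with Equals: characters that compare equal get equal hash codes.
    public int GetHashCode(char c)
    {
        return char.ToUpperInvariant(c).GetHashCode();
    }
}

static class CharKeyCounter
{
    public static Dictionary<char, int> CountCharacters(string input)
    {
        var counts = new Dictionary<char, int>(new CaseInsensitiveCharComparer());

        foreach (var c in input)
        {
            int count;
            counts.TryGetValue(c, out count); // count stays 0 when the key is missing
            counts[c] = count + 1;
        }

        return counts;
    }
}
```

With this comparer the dictionary keeps whichever casing it saw first as the stored key, but lookups with either casing find the same entry.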

Upvotes: 2

Oded

Reputation: 499002

Using GroupBy and Select:

aString.GroupBy(c => c).Select(g => new { Character = g.Key, Num = g.Count() })

The result is a sequence of anonymous-type objects, each holding a character and the number of times it appears in the string.

You can then filter it in any way you wish, using the static methods defined on Char.
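As an illustration of that kind of filtering, here is a small sketch that keeps only letters and digits and folds case before grouping. The class and method names are hypothetical, and it returns value tuples instead of the anonymous type because an anonymous type cannot be named as a method's return type.

```csharp
using System.Collections.Generic;
using System.Linq;

static class FilteredCharCounter
{
    // Hypothetical helper: counts letters and digits only, ignoring case.
    public static List<(char Character, int Num)> CountLettersAndDigits(string aString)
    {
        return aString
            .Where(c => char.IsLetterOrDigit(c))        // drop punctuation, whitespace, etc.
            .GroupBy(c => char.ToLowerInvariant(c))     // assumption: invariant-culture case folding
            .Select(g => (Character: g.Key, Num: g.Count()))
            .ToList();
    }
}
```

Other predicates such as char.IsLetter, char.IsDigit or char.IsPunctuation can be swapped into the Where() call in the same way.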

Upvotes: 1
