indago

Reputation: 2101

How to find all the double characters in a string in C#

I am trying to count all the double characters in a string using C#, i.e. "ssss" should count as two doubles, not three.

For example, right now I have to loop over the string like this:

string s = "shopkeeper";
int d = 0; // number of adjacent equal character pairs
for (int i = 1; i < s.Length; i++) if (s[i] == s[i - 1]) d++;

The value of d at the end should be 1.

Is there a shorter way to do this, in LINQ or regex? What are the performance implications, and what is the most efficient way? Thanks for your help.

I have read [How to check repeated letters in a string c#] and it's helpful, but it doesn't address double characters, which is what I am looking for.

Upvotes: 2

Views: 2484

Answers (3)

Ivan Stoev

Reputation: 205629

First, I would like to mention that there is no "natural" LINQ solution to this problem, so every standard LINQ-based solution will be ugly and highly inefficient compared to a simple for loop.
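For reference, a simple loop of that kind, one that counts "ssss" as two doubles rather than three, could look like the minimal sketch below. The names DoubleCounter and CountDoubles are hypothetical and chosen only for illustration. The one assumption is that a "double" is a non-overlapping pair of equal adjacent characters.

```csharp
using System;

public static class DoubleCounter
{
    // Counts non-overlapping pairs of identical adjacent characters,
    // so "ssss" yields 2 and "sss" yields 1.
    public static int CountDoubles(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        int count = 0;
        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] == s[i - 1])
            {
                count++;
                i++; // skip ahead so the second character of this pair is not reused
            }
        }
        return count;
    }
}
```

With this sketch, DoubleCounter.CountDoubles("shopkeeper") returns 1 and DoubleCounter.CountDoubles("ssss") returns 2.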

However, there is a solution in the "spirit" of LINQ to this and similar problems, like the linked How to check repeated letters in a string c#, or if you want to find not doubles but, say, triples, quadruples, etc.

The common subproblem is: given a sequence of elements, generate a new sequence of (value, count) pairs, one for each group of consecutive elements that have the same value.

It can be done with a custom extension method like this (the name of the method could be different, it's not essential for the point):

using System;
using System.Collections.Generic;

public static class EnumerableEx
{
    public static IEnumerable<TResult> Zip<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, int, TResult> resultSelector, IEqualityComparer<TSource> comparer = null)
    {
        if (comparer == null) comparer = EqualityComparer<TSource>.Default;
        using (var e = source.GetEnumerator())
        {
            for (bool more = e.MoveNext(); more;)
            {
                var value = e.Current;
                int count = 1;
                while ((more = e.MoveNext()) && comparer.Equals(e.Current, value)) count++;
                yield return resultSelector(value, count);
            }
        }
    }
}

Using this function in combination with standard LINQ, one can easily solve the original question:

var s = "shhopkeeperssss";
var countDoubles = s.Zip((value, count) => count / 2).Sum();

but also

var countTriples = s.Zip((value, count) => count / 3).Sum();

or

var countQuadruples = s.Zip((value, count) => count / 4).Sum();

or the question from the link

var repeatedChars = s.Zip((value, count) => new { Char = value, Count = count })
    .Where(e => e.Count > 1);

etc.
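To make the numbers concrete, here is a self-contained sketch of the same run-length idea, specialized to strings. The names RunLengthDemo, Runs, and CountGroups are hypothetical and not part of the extension method above. It assumes C# 7 or later for tuple syntax, and that size is a positive integer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunLengthDemo
{
    // Yields one (Value, Count) tuple per run of equal adjacent characters.
    public static IEnumerable<(char Value, int Count)> Runs(string s)
    {
        for (int i = 0; i < s.Length; )
        {
            int start = i;
            // Advance i to the end of the current run.
            while (i < s.Length && s[i] == s[start]) i++;
            yield return (s[start], i - start);
        }
    }

    // Number of non-overlapping groups of `size` equal adjacent characters.
    // Assumes size > 0; a size of 0 would cause a division by zero.
    public static int CountGroups(string s, int size)
    {
        return Runs(s).Sum(r => r.Count / size);
    }
}
```

For "shhopkeeperssss" the runs are (s,1), (h,2), (o,1), (p,1), (k,1), (e,2), (p,1), (e,1), (r,1), (s,4), so CountGroups gives 4 doubles, 1 triple, and 1 quadruple.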

Upvotes: 0

Enigmativity

Reputation: 117084

This works:

var doubles =
    text
        .Skip(1)
        .Aggregate(
            text.Take(1).Select(x => x.ToString()).ToList(),
            (a, c) =>
            {
                if (a.Last().Last() == c)
                    a[a.Count - 1] += c.ToString();
                else
                    a.Add(c.ToString());
                return a;
            })
        .Select(x => x.Length / 2)
        .Sum();

It gives me these results:

"shopkeeper" -> 1
"beekeeper" -> 2
"bookkeeper" -> 3
"boookkkeeeper" -> 3
"booookkkkeeeeper" -> 6

Upvotes: 0

Diligent Key Presser

Reputation: 4253

Try the following regex to extract any double characters: "(.)\1"

UPD: simple example:

foreach (var match in Regex.Matches("shhopkeeper", @"(.)\1"))
   Console.WriteLine(match);
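If only the count is needed, the same pattern can be wrapped as in the sketch below. The names RegexDoubleCounter and CountDoubles are hypothetical. Because regex matches do not overlap, "ssss" is reported as two doubles, which is what the question asks for. Note that by default "." does not match a newline, so doubled newlines would only be counted with RegexOptions.Singleline.

```csharp
using System.Text.RegularExpressions;

public static class RegexDoubleCounter
{
    // (.)\1 matches any character (except newline) followed by the same character.
    // Matches are found left to right without overlapping, so "ssss" gives 2.
    public static int CountDoubles(string s)
    {
        return Regex.Matches(s, @"(.)\1").Count;
    }
}
```

For example, RegexDoubleCounter.CountDoubles("shhopkeeper") returns 2, one match for "hh" and one for "ee".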

Upvotes: 1
