maxspan
maxspan

Reputation: 14177

getting a performance hit for nested for loop in C#

I have an array of string, a total of(100k). I need to check if the same string has occurred previously, if it occurs all I have to do is return that string. I have written the code using nested for loops, but unfortunately I am getting bad performance. It takes 1.9 mins to process the function correctly for (string[100K]) whereas I am expecting it to finish it within a couple of seconds.

My concern is not the logic. My concern is the LOOP. How to increase the efficiency of the loop.

public string StartMatchingProcess(string[]inputArray)
{
    string[] stringArray = inputArray;
    string result = string.Empty;

    for (long i = 0; i < stringArray.Length; i++)
    {
        for (long j = 0; j <= i; j++)
        {
            if(i == j) break;

            if (IsPrefix(stringArray[i], stringArray[j]))
            {
                return stringArray[i];
            }
        }
    }
    Console.WriteLine("GOOD SET");
    return result;
}

public bool IsPrefix(string string1,string string2)
{
    if (AreString1ValuesValid(string1, string2))
    {
        if (string1 == string2.Substring(0, string1.Length))
        {
            Console.WriteLine("BAD SET");
            Console.WriteLine(string1);
            return true;
        }
    }
    else if (AreString2ValuesValid(string1, string2))
    {
        if (string2 == string1.Substring(0, string2.Length))
        {
            Console.WriteLine("BAD SET");
            Console.WriteLine(string1);
            return true;
        }
    }
    return false;
}


public bool AreString1ValuesValid(string string1, string string2)
        => string1.Length <= string2.Length;

public bool AreString2ValuesValid(string string1, string string2)
       => string2.Length <= string1.Length;

Upvotes: 0

Views: 770

Answers (4)

maxspan
maxspan

Reputation: 14177

Here is a complete solution using linq.

public class Node
    {
        public char letter { get; }
        public int Index { get; set; }
        public List<Node> ChildList { get; set; } = new List<Node>();

        public Node(char item, int index)
        {
            Index = index;
            letter = item;
        }
    }       

    public class NoPrefixSet
    {
        public Dictionary<char, Node> ParentNode { get; set; } = new Dictionary<char, Node>();

        public string GenerateNodes(string[] inputArray)
        {
            for (int i = 0; i < inputArray.Length; i++)
            {
                if (IsWordPrefix(inputArray[i]))
                {
                    Console.WriteLine("BAD SET");
                    Console.WriteLine(inputArray[i]);
                    return inputArray[i];

                }
            }
            Console.WriteLine("Good Set");
            return "Good Set";
        }

        private void InsertNodeInParent(char item)
            => ParentNode.Add(item, new Node(item, 0));

        private bool IsWordPrefix(string word)
        {
            //Check parent
            Node parentNode = null;
            bool hasNotInserted = false;
            int similarCounter = 0;
            if (!ParentNode.Any(a => a.Key == word[0]))
            {
                InsertNodeInParent(word[0]);
            }
            parentNode = ParentNode.Where(a => a.Key == word[0]).FirstOrDefault().Value;

            for (int letterIndex = 0; letterIndex < word.Length; letterIndex++)
            {
                if (!parentNode.ChildList.Any(a => a.letter == word[letterIndex]))
                {
                    parentNode.ChildList.Add(new Node(word[letterIndex], letterIndex));
                }
                else 
                {
                    if (!parentNode.ChildList.Where(a => a.letter == word[letterIndex]).First().ChildList.Any() || word.Length == letterIndex+1)
                    {
                        if (similarCounter == letterIndex)
                            return hasNotInserted = true;
                    }
                    similarCounter++;

                }
                parentNode = parentNode.ChildList.Where(a => a.letter == word[letterIndex] && a.Index == letterIndex).First();
            }

            return hasNotInserted;
        }

        public void ReadInput()
        {
            long data = Convert.ToInt64(Console.ReadLine());
            string[] stringArray = new string[data];
            for (long i = 0; i < data; i++)
            {
                stringArray[i] = Console.ReadLine();
            }
            GenerateNodes(stringArray);
        }
    }

Upvotes: 0

Dmitrii Bychenko
Dmitrii Bychenko

Reputation: 186728

Sort the initial array, and you can check neighbors only:

public string StartMatchingProcess(string[] inputArray) {
  if (null == inputArray)
    throw new ArgumentNullException(nameof(inputArray));

  string[] sorted = inputArray.OrderBy(item => item).ToArray();

  for (int i = 1; i < sorted.Length; ++i) {
    string prior = sorted[i - 1];
    string current = sorted[i];

    if (current.StartsWith(prior))
      return prior;
  }

  return "";
}

So, you'll have O(n * log(n)) time complexity vs. O(n**2) (initial solution)

Upvotes: 6

Vadim Martynov
Vadim Martynov

Reputation: 8892

It's really bad idea to use nested loops for this task because you have O(n*n) complexity for the answer and need to make 10.000.000.000 calls of Substring() for 100k array.

There is a specific structures for strings. For example, you can use Trie:

public string StartMatchingProcess(string[] inputArray)
{
    var trie = new Trie();
    foreach(var w in inputArray)
        trie.AddWord(w);
    foreach(var w in inputArray)
        if(trie.HasPrefix(w) || trie.HasWord(w)
            return w;

    return string.Empty;
}

Upvotes: 1

Nick Proud
Nick Proud

Reputation: 395

If you are just trying to determine if your array has duplicate string values, consider LINQ to get the count of the occurences.

        string[] arrayTest = new string[] { "hello", "hello", "world"};
        string myValue = "hello";
        var stringCount = arrayTest.Where(n => n == myValue).Count();
        if (stringCount > 1) return myValue;

In the above, we check to see if "hello" is in the array more than once, and if it is, we return it.

Upvotes: 0

Related Questions