Reputation: 14177
I have an array of string, a total of(100k). I need to check if the same string has occurred previously, if it occurs all I have to do is return that string. I have written the code using nested for loops, but unfortunately I am getting bad performance. It takes 1.9 mins to process the function correctly for (string[100K]) whereas I am expecting it to finish it within a couple of seconds.
My concern is not the logic. My concern is the LOOP. How to increase the efficiency of the loop.
public string StartMatchingProcess(string[]inputArray)
{
string[] stringArray = inputArray;
string result = string.Empty;
for (long i = 0; i < stringArray.Length; i++)
{
for (long j = 0; j <= i; j++)
{
if(i == j) break;
if (IsPrefix(stringArray[i], stringArray[j]))
{
return stringArray[i];
}
}
}
Console.WriteLine("GOOD SET");
return result;
}
public bool IsPrefix(string string1,string string2)
{
if (AreString1ValuesValid(string1, string2))
{
if (string1 == string2.Substring(0, string1.Length))
{
Console.WriteLine("BAD SET");
Console.WriteLine(string1);
return true;
}
}
else if (AreString2ValuesValid(string1, string2))
{
if (string2 == string1.Substring(0, string2.Length))
{
Console.WriteLine("BAD SET");
Console.WriteLine(string1);
return true;
}
}
return false;
}
public bool AreString1ValuesValid(string string1, string string2)
=> string1.Length <= string2.Length;
public bool AreString2ValuesValid(string string1, string string2)
=> string2.Length <= string1.Length;
Upvotes: 0
Views: 770
Reputation: 14177
Here is a complete solution using linq.
public class Node
{
public char letter { get; }
public int Index { get; set; }
public List<Node> ChildList { get; set; } = new List<Node>();
public Node(char item, int index)
{
Index = index;
letter = item;
}
}
public class NoPrefixSet
{
public Dictionary<char, Node> ParentNode { get; set; } = new Dictionary<char, Node>();
public string GenerateNodes(string[] inputArray)
{
for (int i = 0; i < inputArray.Length; i++)
{
if (IsWordPrefix(inputArray[i]))
{
Console.WriteLine("BAD SET");
Console.WriteLine(inputArray[i]);
return inputArray[i];
}
}
Console.WriteLine("Good Set");
return "Good Set";
}
private void InsertNodeInParent(char item)
=> ParentNode.Add(item, new Node(item, 0));
private bool IsWordPrefix(string word)
{
//Check parent
Node parentNode = null;
bool hasNotInserted = false;
int similarCounter = 0;
if (!ParentNode.Any(a => a.Key == word[0]))
{
InsertNodeInParent(word[0]);
}
parentNode = ParentNode.Where(a => a.Key == word[0]).FirstOrDefault().Value;
for (int letterIndex = 0; letterIndex < word.Length; letterIndex++)
{
if (!parentNode.ChildList.Any(a => a.letter == word[letterIndex]))
{
parentNode.ChildList.Add(new Node(word[letterIndex], letterIndex));
}
else
{
if (!parentNode.ChildList.Where(a => a.letter == word[letterIndex]).First().ChildList.Any() || word.Length == letterIndex+1)
{
if (similarCounter == letterIndex)
return hasNotInserted = true;
}
similarCounter++;
}
parentNode = parentNode.ChildList.Where(a => a.letter == word[letterIndex] && a.Index == letterIndex).First();
}
return hasNotInserted;
}
public void ReadInput()
{
long data = Convert.ToInt64(Console.ReadLine());
string[] stringArray = new string[data];
for (long i = 0; i < data; i++)
{
stringArray[i] = Console.ReadLine();
}
GenerateNodes(stringArray);
}
}
Upvotes: 0
Reputation: 186728
Sort the initial array, and you can check neighbors only:
public string StartMatchingProcess(string[] inputArray) {
if (null == inputArray)
throw new ArgumentNullException(nameof(inputArray));
string[] sorted = inputArray.OrderBy(item => item).ToArray();
for (int i = 1; i < sorted.Length; ++i) {
string prior = sorted[i - 1];
string current = sorted[i];
if (current.StartsWith(prior))
return prior;
}
return "";
}
So, you'll have O(n * log(n))
time complexity vs. O(n**2)
(initial solution)
Upvotes: 6
Reputation: 8892
It's really bad idea to use nested loops for this task because you have O(n*n)
complexity for the answer and need to make 10.000.000.000 calls of Substring()
for 100k array.
There is a specific structures for strings. For example, you can use Trie
:
public string StartMatchingProcess(string[] inputArray)
{
var trie = new Trie();
foreach(var w in inputArray)
trie.AddWord(w);
foreach(var w in inputArray)
if(trie.HasPrefix(w) || trie.HasWord(w)
return w;
return string.Empty;
}
Upvotes: 1
Reputation: 395
If you are just trying to determine if your array has duplicate string values, consider LINQ to get the count of the occurences.
string[] arrayTest = new string[] { "hello", "hello", "world"};
string myValue = "hello";
var stringCount = arrayTest.Where(n => n == myValue).Count();
if (stringCount > 1) return myValue;
In the above, we check to see if "hello" is in the array more than once, and if it is, we return it.
Upvotes: 0