Praneet Nadkar
Praneet Nadkar

Reputation: 813

C# - Code optimization to get all substrings from a string

I was working on a code snippet to get all substrings from a given string.

Here is the code that I use

 var stringList = new List<string>();
 for (int length = 1; length < mainString.Length; length++)
 {
    for (int start = 0; start <= mainString.Length - length; start++)
    {
       var substring = mainString.Substring(start, length);
       stringList.Add(substring);
    }
 }

It looks not so great to me, with two for loops. Is there any other way that I can achieve this with better time complexity.

I am stuck on the point that, for getting a substring, I will surely need two loops. Is there any other way I can look into ?

Upvotes: 6

Views: 7698

Answers (4)

Lionia Vasilev
Lionia Vasilev

Reputation: 12748

In some cases you can significantly increase execution speed by reducing object allocations. In this case by using a single char[] and ArraySegment<of char> to process substrings. This will also lead to use of less address space and decrease in garbage collector load.

Relevant excerpt from Using the StringBuilder Class in .NET page on Microsoft Docs:

The String object is immutable. Every time you use one of the methods in the System.String class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new String object can be costly.

Example implementation:

static List<ArraySegment<char>> SubstringsOf(char[] value)
{
    var substrings = new List<ArraySegment<char>>(capacity: value.Length * (value.Length + 1) / 2 - 1);
    for (int length = 1; length < value.Length; length++)
        for (int start = 0; start <= value.Length - length; start++)
            substrings.Add(new ArraySegment<char>(value, start, length));
    return substrings;
}

For more information check Fundamentals of Garbage Collection page on Microsoft Docs, what is the use of ArraySegment class? discussion on StackOverflow, ArraySegment<T> Structure page on MSDN and List<T>.Capacity page on MSDN.

Upvotes: 2

Dmitrii Bychenko
Dmitrii Bychenko

Reputation: 186698

Well, O(n**2) time complexity is inevitable, however you can try impove space consumption. In many cases, you don't want all the substrings being materialized, say, as a List<string>:

public static IEnumerable<string> AllSubstrings(string value) {
  if (value == null) 
    yield break; // Or throw ArgumentNullException

  for (int length = 1; length < value.Length; ++length)
    for (int start = 0; start <= value.Length - length; ++start)
      yield return value.Substring(start, length);
}

For instance, let's count all substrings in "abracadabra" which start from a and longer than 3 characters. Please, notice that all we have to do is to loop over susbstrings without saving them into a list:

int count = AllSubstrings("abracadabra")
  .Count(item => item.StartsWith("a") && item.Length > 3);

If for any reason you want a List<string>, just add .ToList():

var stringList = AllSubstrings(mainString).ToList(); 

Upvotes: 0

TheGeneral
TheGeneral

Reputation: 81493

You do need 2 for loops

Demo here

var input = "asd sdf dfg";
var stringList = new List<string>();

for (int i = 0; i < input.Length; i++)
{
    for (int j = i; j < input.Length; j++)
    {
        var substring = input.Substring(i, j-i+1);
        stringList.Add(substring);
    }
}

foreach(var item in stringList)
{
    Console.WriteLine(item);
}

Update

You cannot improve on the iterations.

However you can improve performance, by using fixed arrays and pointers

Upvotes: 4

Phillip Ngan
Phillip Ngan

Reputation: 16106

The number of substrings in a string is O(n^2), so one loop inside another is the best you can do. You are correct in your code structure.

Here's how I would've phrased your code:

void Main()
{
    var stringList = new List<string>();
    string s = "1234";
    for (int i=0; i <s.Length; i++)
        for (int j=i; j < s.Length; j++)
            stringList.Add(s.Substring(i,j-i+1));
}

Upvotes: 9

Related Questions