Serhat Reis

Reputation: 23

C# big string array to string

I have a string array of about 20,000,000 values, and I need to convert it to a single string.

I've tried:

    string data = "";
    foreach (var i in tm)
    {
        data = data + i;
    }

But that takes far too long.

Does anyone know a faster way?

Upvotes: 2

Views: 2077

Answers (5)

Mert Gülsoy

Reputation: 2919

For data this large, unfortunately, in-memory approaches may fail and will be a real headache for the GC. Instead, create a file and write every string to it, like this:

using (StreamWriter sw = new StreamWriter("some_file_to_write.txt")){
    for (int i=0; i<tm.Length;i++)
        sw.Write(tm[i]);
}

I originally suggested avoiding "var" in performance-critical code like this. Correction: "var" does not affect performance, since it is resolved at compile time. "dynamic" does.

Upvotes: 0

Corey

Reputation: 16564

The answer is going to depend on the size of the output string and the amount of memory you have available and usable. The hard limit on string length appears to be 2^31-1 (int.MaxValue) characters, which at 2 bytes per character would occupy just under 4 GB of memory. Whether you can actually allocate that depends on your framework version, process bitness, etc.; as far as I know, the runtime's per-object size limit keeps a single string well below that in practice. If you're going to be producing a larger output then you can't put it into a single string anyway.

You've already discovered that naive concatenation is going to be tragically slow. The problem is that every pass through the loop creates a new string, then immediately discards it on the next iteration. This is going to fill up memory pretty quickly, forcing the Garbage Collector to work overtime finding old strings to clear out of memory, not to mention the amount of memory fragmentation and all that stuff that modern programmers don't pay much attention to.
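To put a rough number on that, here is a back-of-the-envelope sketch. The class and method names are made up for illustration, and the figure of 5 characters per element is purely an assumption, since the question doesn't say how long the strings are. It only counts characters copied and ignores GC overhead entirely.

```csharp
using System;

public static class ConcatCost
{
    // Characters copied when `count` pieces of `pieceLength` chars each are
    // appended with `+`: step k builds a new string of k * pieceLength chars,
    // so the total is pieceLength * (1 + 2 + ... + count).
    public static long NaiveCopies(long count, long pieceLength)
    {
        return pieceLength * count * (count + 1) / 2;
    }

    // Characters copied when the result is allocated once and filled in one pass.
    public static long SinglePassCopies(long count, long pieceLength)
    {
        return pieceLength * count;
    }
}
```

With 20,000,000 elements of 5 characters, NaiveCopies gives 1,000,000,050,000,000 characters copied, against 100,000,000 for a single pass. That quadratic growth is why the loop in the question never seems to finish.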

A StringBuilder is a reasonable solution. Internally it allocates blocks of characters that it then stitches together at the end using pointers and memory copies. Saves a lot of hassles that way and is quite speedy.
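If you go the StringBuilder route, one variation worth trying is to reserve the final size up front so the builder never has to grow. This is only a sketch: the class name is hypothetical, `tm` is the array from the question, and I haven't measured whether presizing makes a noticeable difference at this scale.

```csharp
using System;
using System.Text;

public static class PresizedBuilder
{
    public static string Join(string[] tm)
    {
        // First pass: add up the lengths. `checked` throws OverflowException
        // if the total would not fit in an int, i.e. cannot be one string.
        int total = 0;
        foreach (string s in tm)
        {
            if (s != null)
            {
                total = checked(total + s.Length);
            }
        }

        // Second pass: append into a builder that already has enough capacity.
        var sb = new StringBuilder(total);
        foreach (string s in tm)
        {
            sb.Append(s); // appending null is a no-op
        }
        return sb.ToString();
    }
}
```

Usage would be `string data = PresizedBuilder.Join(tm);`. The extra pass over the array is cheap compared to the copying itself.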

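As for String.Join and String.Concat: the overloads that take an IEnumerable<string> build the result with a StringBuilder, while the string[] overloads, as far as I can tell, add up the lengths and allocate the result once. Either way, String.Concat should be the quicker of the two when you aren't inserting separator characters.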

For simplicity I would use String.Concat and be done with it.

But then I'm not much for simplicity.

Here's an untested and possibly horribly slow answer using LINQ. When I get time I'll test it and see how it performs, but for now:

string result = new String(lines.SelectMany(l => (IEnumerable<char>)l).ToArray());

Obviously there is a potential overflow here since the ToArray call can potentially create an array larger than the String constructor can handle. Try it out and see if it's as quick as String.Concat.
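A cheap guard against that overflow is to add up the lengths as a long before building anything. The helper below is a sketch with made-up names. It only tells you that the result cannot fit in one string; passing the check doesn't guarantee the allocation will succeed.

```csharp
using System;

public static class LengthCheck
{
    // Sum of all element lengths, kept in a long so it cannot wrap around.
    public static long TotalLength(string[] lines)
    {
        long total = 0;
        foreach (string l in lines)
        {
            if (l != null)
            {
                total += l.Length;
            }
        }
        return total;
    }

    // False means the result definitely cannot be a single string.
    // True only means the length fits in an int; memory may still run out.
    public static bool MightFitInOneString(string[] lines)
    {
        return TotalLength(lines) <= int.MaxValue;
    }
}
```

If the check fails, writing the pieces to a file or stream is the only real option.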

Upvotes: 1

InBetween

Reputation: 32740

I can't check it right now, but I'm curious how this option would perform:

var data = String.Join(string.Empty, tm);

Is Join optimized to skip the concatenation when the separator is String.Empty?
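For anyone who wants to measure it, a small harness along these lines should do. The class and method names are hypothetical, and a single Stopwatch run is a crude measurement, so repeat it a few times and take the numbers with a grain of salt.

```csharp
using System;
using System.Diagnostics;

public static class JoinVersusConcat
{
    public static string ViaJoin(string[] tm)
    {
        return string.Join(string.Empty, tm);
    }

    public static string ViaConcat(string[] tm)
    {
        return string.Concat(tm);
    }

    // Elapsed milliseconds for one run of the given joiner over tm.
    public static long TimeMs(Func<string[], string> joiner, string[] tm)
    {
        Stopwatch sw = Stopwatch.StartNew();
        string result = joiner(tm);
        sw.Stop();
        GC.KeepAlive(result); // keep the result alive until timing is done
        return sw.ElapsedMilliseconds;
    }
}
```

Call `JoinVersusConcat.TimeMs(JoinVersusConcat.ViaJoin, tm)` and the same with `ViaConcat`, then compare. Both calls produce the same string, so any difference is purely in speed.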

Upvotes: 0

David Haxton

Reputation: 284

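You can do it in LINQ like this, although Aggregate still concatenates with + at every step, so don't expect it to be faster than your loop: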

string data = tm.Aggregate("", (current, i) => current + i);

Or you can use the string.Join function:

string data = string.Join("", tm);

Upvotes: 0

Hatted Rooster

Reputation: 36463

Try StringBuilder:

StringBuilder sb = new StringBuilder();
foreach (var i in tm)
{
    sb.Append(i);
}

To get the resulting String use ToString():

string result = sb.ToString();

Upvotes: 3
