DontFretBrett

Reputation: 1165

String Concatenation and Threads in .NET

(Out of pure curiosity) In VB.NET, I tested concatenating 100k strings and found that a single thread did it in 23 milliseconds. Two threads (each concatenating 50k), with the two results joined at the end, took 30 milliseconds. Performance-wise, using multiple threads didn't seem beneficial for only 100k concatenations. Then I tried 3 million string concatenations, and two threads each handling 1.5 million consistently beat one thread handling all 3 million. I imagine at some point using 3 threads becomes beneficial, then 4, and so on. Is there a more efficient way to concatenate millions of strings in .NET? Are threads worth using?

At around 1 million string concatenations, multiple threads appear to improve performance.

fyi, this is the code I wrote:

Imports System.Text
Imports System.Threading
Imports System.IO
Public Class Form1
    Dim sbOne As StringBuilder
    Dim sbTwo As StringBuilder
    Dim roof As Integer
    Dim results As DataTable
    Sub clicked(s As Object, e As EventArgs) Handles Button1.Click
        results = New DataTable
        results.Columns.Add("one thread")
        results.Columns.Add("two threads")
        results.Columns.Add("roof")

        For i As Integer = 1 To 3000000 Step 100000
            roof = i
            Dim test() As Double = runTest()
            results.Rows.Add(test(0), test(1), i)
            Console.WriteLine(roof)
        Next

        Dim output As New StringBuilder
        For Each C As DataColumn In results.Columns
            output.Append(C)
            output.Append(Chr(9))
        Next
        output.Append(vbCrLf)
        For Each R As DataRow In results.Rows
            For Each C As DataColumn In results.Columns
                output.Append(R(C))
                output.Append(Chr(9))
            Next
            output.Append(vbCrLf)
        Next
        File.WriteAllText("c:\users\username\desktop\sbtest.xls", output.ToString)
        Console.WriteLine("done")

    End Sub
    Function runTest() As Double()
        Dim sb As New StringBuilder
        Dim started As DateTime = Now
        For i As Integer = 1 To roof
            sb.Append(i)
        Next
        Dim result As String = sb.ToString
        Dim test1 As Double = Now.Subtract(started).TotalMilliseconds

        sbOne = New StringBuilder
        sbTwo = New StringBuilder
        Dim one As New Thread(AddressOf tOne)
        Dim two As New Thread(AddressOf tTwo)
        started = Now
        one.Start()
        two.Start()
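        ' Join blocks until each worker finishes, without busy-waiting on IsAlive.
        one.Join()
        two.Join()
        ' Combine the builders' contents (Thread.ToString would only return the type name).
        result = String.Concat(sbOne.ToString, sbTwo.ToString)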
        Dim test2 As Double = Now.Subtract(started).TotalMilliseconds
        Return {test1, test2}
    End Function
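    Sub tOne()
        ' First half: 1 to roof \ 2 (integer division).
        For i As Integer = 1 To roof \ 2
            sbOne.Append(i)
        Next
    End Sub
    Sub tTwo()
        ' Second half starts one past the midpoint so the midpoint is not appended twice.
        For i As Integer = roof \ 2 + 1 To roof
            sbTwo.Append(i)
        Next
    End Sub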
End Class

Upvotes: 1

Views: 1114

Answers (2)

SLaks

Reputation: 887937

Threads are designed for tasks more expensive than string concatenation.

String concatenation involves allocating and copying memory; it's not a very compute-intensive task.
Multi-threading should be used when dealing with computationally intensive tasks, and to avoid blocking the UI thread.

Threading can also be useful to parallelize tasks that wait on different things (e.g., network IO to multiple slow servers, or network vs. disk IO).
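
To illustrate the point about overlapping waits, here is a minimal sketch. `SimulatedWait` is a hypothetical stand-in (just a `Thread.Sleep`) for a slow network or disk request, and the 200 ms figure is an arbitrary assumption, not a measurement.

```vb
Imports System.Diagnostics
Imports System.Threading

Module OverlappingWaitsExample
    ' Hypothetical stand-in for a slow operation such as a network or disk request.
    Sub SimulatedWait()
        Thread.Sleep(200)
    End Sub

    Sub Main()
        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim first As New Thread(AddressOf SimulatedWait)
        Dim second As New Thread(AddressOf SimulatedWait)
        first.Start()
        second.Start()
        ' Join blocks until each thread finishes, without spinning the CPU.
        first.Join()
        second.Join()
        sw.Stop()
        ' The two waits overlap, so the elapsed time should be close to
        ' one wait rather than the sum of both.
        Console.WriteLine(sw.ElapsedMilliseconds)
    End Sub
End Module
```

Because both threads spend their time waiting rather than computing, running them side by side costs little extra CPU, which is the case where threading tends to pay off.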

Upvotes: 3

Bradley Uffner

Reputation: 17001

Check out the EnsureCapacity method on StringBuilder. If you are doing a lot of concatenation and know roughly how many characters you will end up with, you can allocate the StringBuilder's buffer all at once instead of letting it grow dynamically. You should see some further improvement.

http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.ensurecapacity.aspx
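
As a rough sketch of what that could look like for the loop in the question: the 6-characters-per-number estimate below is an assumption chosen for illustration (no number from 1 to 100000 has more than 6 digits), and the module name is made up.

```vb
Imports System.Text

Module EnsureCapacityExample
    Sub Main()
        Const count As Integer = 100000
        ' Rough upper bound (assumption): each number from 1 to 100000
        ' has at most 6 digits.
        Dim estimatedChars As Integer = count * 6

        Dim sb As New StringBuilder()
        ' Grow the internal buffer once, up front, instead of in many
        ' smaller steps while appending.
        sb.EnsureCapacity(estimatedChars)

        For i As Integer = 1 To count
            sb.Append(i)
        Next

        ' Length is the number of characters actually appended;
        ' Capacity should be at least the estimate requested above.
        Console.WriteLine(sb.Length)
        Console.WriteLine(sb.Capacity >= estimatedChars)
    End Sub
End Module
```

Passing the estimate to the constructor, `New StringBuilder(estimatedChars)`, should have a similar effect when the size is known at creation time.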

Upvotes: 2
