Liran Friedman

Reputation: 4287

How to work properly with strings in C#?

I know there is a rule about strings in C# that says:

Once we create a string of type string, we can never change its value! When we assign a different value to a string variable, the first string stays in memory and the variable (which is a kind of reference) just gets the address of the new string.

So doing something like this:

string a = "aaa";
a = a.Trim(); // Creates a new string

is not recommended. But what if I need to do some actions on the string according to user preferences, like so:

string a = "aaa";
if (doTrim)
   a = a.Trim();
if (doSubstring)
   a = a.Substring(...);

etc...

How can I do it without creating new strings on every action? I thought about passing the string to a function by ref, like so:

void DoTrim(ref string value)
{
  value = value.Trim(); // also creates new string
}

But this also creates a new string... Can someone please tell me if there is a way of doing it without wasting memory on each action?

Upvotes: 3

Views: 222

Answers (5)

David Peterson

Reputation: 741

Sticking my neck out here a bit, so I'll preface this by saying that in most cases Servy's answer is the correct answer. However, if you really do need lower-level access and fewer string allocations, you could consider creating a character buffer (a simple array, for instance) that is big enough to fit your processed string and allows you direct manipulation of the characters. There are some significant downsides to this, though: you'll probably have to write your own Substring() and Trim() modifiers, and your buffer will likely be bigger than your input strings in many cases to accommodate unexpected string sizes. Once you are done manipulating your buffer, you could then package the character array up as a String. Since all of your manipulations are done on a single buffer, you should save a lot of allocations.
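
A rough sketch of that idea, in case it helps. Everything here (BufferDemo, TrimInPlace, SubstringInPlace, Process) is a made-up name for illustration, and the "first two characters" substring is just an assumed stand-in for whatever Substring arguments you actually need. Note that it still allocates twice: once for the buffer and once for the final string.

```csharp
using System;

static class BufferDemo
{
    // Hypothetical helper: "trims" by shrinking the logical window
    // [start, start + length) over the buffer. No characters are moved.
    public static void TrimInPlace(char[] buffer, ref int start, ref int length)
    {
        while (length > 0 && char.IsWhiteSpace(buffer[start])) { start++; length--; }
        while (length > 0 && char.IsWhiteSpace(buffer[start + length - 1])) { length--; }
    }

    // Hypothetical helper: narrows the window like Substring(offset, count).
    public static void SubstringInPlace(ref int start, ref int length, int offset, int count)
    {
        if (offset < 0 || count < 0 || offset > length - count)
            throw new ArgumentOutOfRangeException();
        start += offset;
        length = count;
    }

    public static string Process(string input, bool doTrim, bool doSubstring)
    {
        char[] buffer = input.ToCharArray();        // allocation 1: the buffer
        int start = 0, length = buffer.Length;

        if (doTrim) TrimInPlace(buffer, ref start, ref length);
        // Assumed example operation: keep the first two characters.
        if (doSubstring && length >= 2) SubstringInPlace(ref start, ref length, 0, 2);

        return new string(buffer, start, length);   // allocation 2: the result
    }
}
```

However many operations you chain in Process, the intermediate steps only adjust two integers, which is where the savings would come from.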

I would seriously consider if the above is worth the hassle, but if you really need the performance, this is the best solution I can think of.

Upvotes: 1

chiccodoro

Reputation: 14716

Why do you feel uncomfortable creating new strings? There is a reason for the string API to be designed this way. For example, immutable objects are thread-safe (and they allow for a more functional programming style).

If you replace your simple string code with StringBuilders, your code might be more error-prone in multithreading scenarios (which are quite normal in a web application, for example).

StringBuilders are used for concatenating strings, inserting characters, removing characters, etc. But they will need to reallocate and copy their internal character arrays every now and then, too.

When you speak about memory consumption you have started to micro-optimize your code. Don't.

BTW: Have a look at the LINQ API. What does each operation do? Rats - it creates a new enumerator! A query like foos.Where(bar).Select(baz).FirstOrDefault() could certainly be memory-optimized by just creating a single enumerator object and modifying the criteria it applies when enumerating. </irony>

Upvotes: 0

Fabian Bigler

Reputation: 10915

How can I do it without creating new strings on every action?

You should only worry about that if you're handling big strings or if you're doing many string operations in a short period of time.

Even then, the performance loss due to creating more strings is minimal. The garbage collector has to collect all the unused string objects, but hey - that only really matters if you're doing MANY string operations.

So focus on readability in your code rather than trying to optimize its performance in the first place.


If you really have to keep the same reference of string, you can simply use a StringBuilder.
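
For what it's worth, here is a minimal sketch of what that could look like. StringBuilder has no built-in Trim, so TrimInPlace is a hypothetical helper of mine, and "keep the first two characters" is only an assumed stand-in for your real Substring call.

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    // Hypothetical helper: strips leading and trailing whitespace
    // from the builder itself, so the same object is reused.
    public static void TrimInPlace(StringBuilder sb)
    {
        int end = sb.Length;
        while (end > 0 && char.IsWhiteSpace(sb[end - 1])) end--;
        sb.Length = end;                  // cut trailing whitespace

        int start = 0;
        while (start < sb.Length && char.IsWhiteSpace(sb[start])) start++;
        sb.Remove(0, start);              // cut leading whitespace
    }

    public static string Process(string input, bool doTrim, bool doSubstring)
    {
        var sb = new StringBuilder(input);
        if (doTrim) TrimInPlace(sb);
        // Assumed example operation: keep the first two characters.
        if (doSubstring && sb.Length > 2) sb.Length = 2;
        return sb.ToString();             // one final string at the end
    }
}
```

The builder and the final ToString() call still allocate, so for a couple of short operations this is unlikely to beat plain string calls.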

Upvotes: 0

Servy

Reputation: 203848

You are correct in that the operations you're performing are creating new strings, and not mutating a single string.

You are incorrect in that this is generally problematic or something to be avoided.

If your strings are hundreds of thousands of characters, then sure, copying all of those just to remove a few leading spaces, or to add a few characters to the end of it (repeatedly, in a loop, in particular) can actually be a problem.

If your strings aren't large, and you're not performing many (as in thousands of) operations on the string, then you almost certainly don't have a problem.

Now there are a handful of contexts, generally rather rare, that do run into problems with string manipulation. Probably the most common of the problematic contexts is appending a bunch of strings together, as doing so means copying all of the previously appended data for each new addition. If you're in that situation consider using something like a StringBuilder or a single call to string.Concat (the overload accepting a sequence of strings to concat) to perform this operation.
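
To illustrate the appending case, here is a small sketch comparing the three approaches. The class and method names (ConcatDemo, JoinSlow, and so on) are made up for the example; StringBuilder.Append and string.Concat(IEnumerable<string>) are the framework calls referred to above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class ConcatDemo
{
    // Each += builds a new string and copies everything appended so far,
    // so the total copying grows quadratically with the number of parts.
    public static string JoinSlow(IEnumerable<string> parts)
    {
        string result = "";
        foreach (string part in parts) result += part;
        return result;
    }

    // Appends into a growable internal buffer; one string is created at the end.
    public static string JoinWithBuilder(IEnumerable<string> parts)
    {
        var sb = new StringBuilder();
        foreach (string part in parts) sb.Append(part);
        return sb.ToString();
    }

    // A single call that concatenates the whole sequence.
    public static string JoinWithConcat(IEnumerable<string> parts)
    {
        return string.Concat(parts);
    }
}
```

All three return the same text; the difference only starts to matter once the number of parts or their total length gets large.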

Other contexts are, for example, programs dealing with processing DNA strands. They'll often be taking strings of millions of characters and creating hundreds of thousands of substrings of that string, each many thousands of characters long. Using standard C# string operations would therefore result in a lot of unnecessary copying. People writing such programs end up creating objects that can represent a substring of another string without copying the data, instead referring to the existing string's underlying data source with an offset.
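
A minimal sketch of such an object might look like the following. StringSlice is a hypothetical type written for this answer, not something from the framework; newer .NET versions offer ReadOnlySpan<char> (via AsSpan on a string) for much the same purpose.

```csharp
using System;

// Hypothetical type: a view over part of an existing string.
// It stores a reference plus an offset and length; no characters are copied.
struct StringSlice
{
    private readonly string _source;
    private readonly int _offset;
    public int Length { get; }

    public StringSlice(string source, int offset, int length)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (offset < 0 || length < 0 || offset > source.Length - length)
            throw new ArgumentOutOfRangeException();
        _source = source;
        _offset = offset;
        Length = length;
    }

    // Reads a character straight out of the original string.
    public char this[int index]
    {
        get
        {
            if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
            return _source[_offset + index];
        }
    }

    // Taking a sub-slice only adjusts the offset and length.
    public StringSlice Slice(int offset, int length)
    {
        if (offset < 0 || length < 0 || offset > Length - length)
            throw new ArgumentOutOfRangeException();
        return new StringSlice(_source, _offset + offset, length);
    }

    // Copies characters only when a real string is actually needed.
    public override string ToString() => _source.Substring(_offset, Length);
}
```

Usage would be along the lines of new StringSlice(genome, 1000, 5000).Slice(10, 20), where genome is whatever large string you already hold. One trade-off: every slice keeps the whole original string alive for as long as the slice is reachable.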

Upvotes: 11

openshac

Reputation: 5165

It will depend on what your exact use case is, but you might want to explore using the StringBuilder class which you can use to build and modify strings.

Upvotes: -1
