Reputation: 21292
We're developing a performance-sensitive text serialization class, and we'd like to avoid converting value-types into reference-types wherever possible.
The String.Insert
method appears to require you to provide a string parameter, and does not have an overload allowing a single character to be passed in as a value-type.
We're running into this scenario quite frequently, so I want to make sure there isn't another way to accomplish this without converting the character into its own string, and then passing it to String.Insert.
We've considered treating the parent string as a basic array, and inserting a single character from that angle - but this doesn't seem to work either (unless we're doing something wrong).
The major problem with this approach is that it appears to require us to use the String.ToCharArray
method, which produces a copy of the string as a separate reference object - which is what we're trying to avoid in the first place.
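For reference, the two approaches described above might look roughly like this. This is only a sketch; the class and method names (InsertBaseline, ViaInsert, ViaCharArray) are made up for illustration, and both versions allocate at least one temporary object besides the result.

```csharp
using System;

public static class InsertBaseline
{
    // Approach 1: convert the char to a one-character string,
    // then hand it to String.Insert. Allocates the temporary
    // string plus the result.
    public static string ViaInsert(string s, int index, char c)
    {
        return s.Insert(index, c.ToString());
    }

    // Approach 2: treat the string as an array. ToCharArray
    // copies the string, and a second array is needed because
    // arrays cannot grow in place.
    public static string ViaCharArray(string s, int index, char c)
    {
        char[] src = s.ToCharArray();
        char[] dst = new char[src.Length + 1];
        Array.Copy(src, 0, dst, 0, index);
        dst[index] = c;
        Array.Copy(src, index, dst, index + 1, src.Length - index);
        return new string(dst);
    }
}
```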
Upvotes: 3
Views: 620
Reputation: 21292
StringBuilder
appears to be the standard solution.
It wraps a mutable char buffer, which you can manipulate repeatedly without allocating a new string on every operation.
Then, when you are done manipulating the StringBuilder
object, you can convert it into a standard string object, allocating memory for the string only once more.
This still allocates memory for the string twice: once for the StringBuilder
, and again for the final string object.
But this is the best you can do with the limitations of the platform.
At least memory allocation is no longer dependent on how many iterations you go through in the serialization process.
That was the main priority, and StringBuilder
addresses that problem nicely.
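As a minimal sketch of that pattern: StringBuilder has Insert and Append overloads that take a char directly, so no one-character string is needed. The class and method names here (BuilderSketch, QuoteAndTerminate) are hypothetical and only serve the example.

```csharp
using System;
using System.Text;

public static class BuilderSketch
{
    // Wraps a value in double quotes and appends a ';' terminator.
    // All edits happen in the builder's buffer; the only string
    // allocated is the one returned by ToString().
    public static string QuoteAndTerminate(string value)
    {
        // Reserve room for the three extra characters up front.
        var sb = new StringBuilder(value, value.Length + 3);
        sb.Insert(0, '"');   // StringBuilder.Insert(int, char)
        sb.Append('"');      // StringBuilder.Append(char)
        sb.Append(';');
        return sb.ToString();
    }
}
```

In a serializer you would typically keep one builder for the whole output and call ToString() once at the end, rather than creating a builder per value as this small example does.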
<rant>
Passing strings around by-reference (or by-const-reference) was the only method that made any sense in C++, from a performance and functionality standpoint.
So the fact that .NET made strings into immutable reference-types that are passed around by-value just seems so backwards to me as a C++ developer.
They're already reference types, right?
Why can't we just pass around the reference, like any other object?
Geez! :)
My advice to Microsoft:
If your string objects don't support basic string operations, so you have to build a "hack" object StringBuilder
, encapsulating a standard char array that works like a real string object, to provide the extra features, that's a pretty clear sign that your managed string objects are terrible, and need to be corrected themselves.
</rant>
Upvotes: 0
Reputation: 74277
It probably doesn't get much simpler than this:
public static string InsertChar( this string s , char c , int i )
{
    // create a buffer of the desired length
    int len = s.Length + 1 ;
    StringBuilder sb = new StringBuilder( len ) ;
    sb.Length = len ;

    int j = 0 ; // pointer to sb
    int k = 0 ; // pointer to s

    // copy the prefix to the buffer
    while ( k < i )
    {
        sb[j++] = s[k++] ;
    }

    // copy the desired char to the buffer
    sb[j++] = c ;

    // copy the suffix to the buffer
    while ( k < s.Length )
    {
        sb[j++] = s[k++] ;
    }

    // stringify it
    return sb.ToString();
}
Or maybe this:
public static string InsertChar( this string s , char c , int i )
{
    StringBuilder sb = new StringBuilder( s.Length+1 ) ;
    // Append returns the StringBuilder, so convert explicitly to a string
    return sb.Append( s , 0 , i ).Append( c ).Append( s , i , s.Length-i ).ToString() ;
}
You can probably make it faster by using unsafe code like this (so as to avoid the compares for range checks):
unsafe public static string InsertChar( this string s , char c , int i )
{
    if ( s == null ) throw new ArgumentNullException("s");
    if ( i < 0 || i > s.Length ) throw new ArgumentOutOfRangeException("i");

    char[] buf = new char[s.Length+1];

    fixed ( char *src = s )
    fixed ( char *tgt = buf )
    {
        int j = 0 ; // offset in source
        int k = 0 ; // offset in target

        while ( j < i )
        {
            tgt[k++] = src[j++];
        }

        tgt[k++] = c ;

        while ( j < s.Length )
        {
            tgt[k++] = src[j++] ;
        }
    }

    return new string( buf ) ;
}
And if you know the strings are relatively short, you could speed things up a little more by using stackalloc
to allocate the working buffer on the stack instead of on the heap.
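A sketch of the stackalloc idea, assuming a runtime with Span<char> support (.NET Core 2.1 or later), which lets you use stackalloc without an unsafe block. The class name and the MaxStackChars threshold are arbitrary choices for illustration, not tuned values; inputs above the threshold fall back to a heap array so a long string cannot overflow the stack.

```csharp
using System;

public static class StackallocSketch
{
    // Hypothetical cutoff: buffers up to this many chars live on the stack.
    private const int MaxStackChars = 256;

    public static string InsertChar(string s, char c, int i)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (i < 0 || i > s.Length) throw new ArgumentOutOfRangeException(nameof(i));

        int len = s.Length + 1;

        // Working buffer: stack for short strings, heap otherwise.
        Span<char> buf = len <= MaxStackChars
            ? stackalloc char[len]
            : new char[len];

        s.AsSpan(0, i).CopyTo(buf);            // prefix
        buf[i] = c;                            // inserted char
        s.AsSpan(i).CopyTo(buf.Slice(i + 1));  // suffix

        // The only heap allocation for short inputs is the result string.
        return new string(buf);
    }
}
```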
Upvotes: 1
Reputation: 4418
"which produces a copy of the string as a separate reference object - which is what we're trying to avoid in the first place."
There is no way of modifying a string without creating a new one; even Replace returns a new string rather than changing the original. Inserting a character would mean resizing a string whose memory is already allocated. That's why all string methods return a string and don't modify the original.
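A quick way to see this for yourself, as a small sketch (the class and method names are made up): Insert and Replace both hand back a new string and leave the original untouched.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns { original, inserted, replaced } after calling
    // Insert and Replace on the same source string.
    public static string[] Run()
    {
        string original = "hello";
        string inserted = original.Insert(5, "!");     // new string
        string replaced = original.Replace('l', 'L');  // also a new string
        return new[] { original, inserted, replaced };
    }

    public static void Print()
    {
        string[] r = Run();
        Console.WriteLine(r[0]); // hello
        Console.WriteLine(r[1]); // hello!
        Console.WriteLine(r[2]); // heLLo
    }
}
```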
Upvotes: 4