Giffyguy

Reputation: 21292

Is it possible to insert a character into a string without converting the character into its own string first?

We're developing a performance-sensitive text serialization class, and we'd like to avoid converting value-types into reference-types wherever possible.

The String.Insert method appears to require you to provide a string parameter, and does not have an overload allowing a single character to be passed in as a value-type.

We're running into this scenario quite frequently, so I want to make sure there isn't another way to accomplish this without converting the character into its own string and then passing it to String.Insert.

We've considered treating the parent string as a basic array, and inserting a single character from that angle - but this doesn't seem to work either (unless we're doing something wrong).
The major problem with this approach is that it appears to require us to use the String.ToCharArray method, which produces a copy of the string as a separate reference object - which is what we're trying to avoid in the first place.

Upvotes: 3

Views: 620

Answers (3)

Giffyguy

Reputation: 21292

StringBuilder appears to be the standard solution.
It provides a more basic string object, as a standard char array, which you can manipulate repeatedly without allocating memory over and over.
Then, when you are done manipulating the StringBuilder object, you can convert it into a standard string object, allocating memory for the string only once more.

This still allocates memory for the string twice: once for the StringBuilder, and again for the final string object.
But this is the best you can do with the limitations of the platform.

At least memory allocation no longer depends on how many iterations you go through in the serialization process.
That was the main priority, and StringBuilder addresses that problem nicely.
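As a rough illustration of the approach described above, StringBuilder has an Insert overload that accepts a char directly, so no one-character string has to be created by the caller. The sketch below is only an assumption about how this might look in a serializer; the class name `SerializerSketch` and the method `WrapAndInsert` are hypothetical and not part of any library.

```csharp
using System;
using System.Text;

public static class SerializerSketch
{
    // Hypothetical helper: inserts c at index, then wraps the result in brackets.
    // All edits happen in one StringBuilder; only ToString() creates the final string.
    public static string WrapAndInsert(string value, char c, int index)
    {
        // Capacity leaves room for the inserted char and the two brackets.
        var sb = new StringBuilder(value, value.Length + 3);

        sb.Insert(index, c);  // StringBuilder.Insert(int, char) overload
        sb.Insert(0, '[');
        sb.Append(']');

        return sb.ToString();
    }
}
```

For example, `SerializerSketch.WrapAndInsert("hllo", 'e', 1)` should produce `[hello]`.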

<rant>
Passing strings around by-reference (or by-const-reference) was the only method that made any sense in C++, from a performance and functionality standpoint.
So the fact that .NET made strings into immutable reference-types that are passed around by-value just seems so backwards to me as a C++ developer.
They're already reference types, right?
Why can't we just pass around the reference, like any other object? Geez! :)

My advice to Microsoft:
If your string objects don't support basic string operations, so you have to build a "hack" object StringBuilder, encapsulating a standard char array that works like a real string object, to provide the extra features, that's a pretty clear sign that your managed string objects are terrible, and need to be corrected themselves.
</rant>

Upvotes: 0

Nicholas Carey

Reputation: 74277

It probably doesn't get much simpler than this:

public static string InsertChar( this string s , char c , int i )
{

  // create a buffer of the desired length
  int len = s.Length + 1 ;
  StringBuilder sb = new StringBuilder( len ) ;
  sb.Length = len ;

  int j = 0 ; // pointer to sb
  int k = 0 ; // pointer to s

  // copy the prefix to the buffer
  while ( k < i )
  {
    sb[j++] = s[k++] ;
  }

  // copy the desired char to the buffer
  sb[j++] = c ;

  // copy the suffix to the buffer
  while ( k < s.Length )
  {
    sb[j++] = s[k++] ;
  }

  // stringify it
  return sb.ToString();
}

Or maybe this:

public static string InsertChar( this string s , char c , int i )
{
  StringBuilder sb = new StringBuilder( s.Length+1 ) ;
  // Append returns the StringBuilder itself, so ToString() is needed to get a string
  return sb.Append( s , 0 , i ).Append( c ).Append( s , i , s.Length-i ).ToString() ;
}

You can probably make it faster by using unsafe code like this (so as to avoid the comparisons done for array bounds checks):

unsafe public static string InsertChar( this string s , char c , int i )
{
  if ( s == null ) throw new ArgumentNullException("s");
  if ( i < 0 || i > s.Length ) throw new ArgumentOutOfRangeException("i");

  char[] buf = new char[s.Length+1];

  fixed ( char *src = s )
  fixed ( char *tgt = buf )
  {
    int j = 0 ; // offset in source
    int k = 0 ; // offset in target

    while ( j < i )
    {
      tgt[k++] = src[j++];
    }

    tgt[k++] = c ;

    while ( j < s.Length )
    {
      tgt[k++] = src[j++] ;
    }

  }

  return new string( buf ) ;
}

And if you know the strings are relatively short, you could speed things up a little more by using stackalloc to allocate the working buffer on the stack instead of on the heap.
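A minimal sketch of the stackalloc idea, assuming a runtime with Span<char> support (.NET Core 2.1 or later) so that no unsafe block is needed. The method name `InsertCharStackalloc` and the 256-character threshold are illustrative assumptions, not measured values; longer inputs fall back to a heap array.

```csharp
using System;

public static class StackallocSketch
{
    // Assumed cut-off for using the stack; tune for your own workload.
    private const int MaxStackChars = 256;

    public static string InsertCharStackalloc(this string s, char c, int i)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (i < 0 || i > s.Length) throw new ArgumentOutOfRangeException(nameof(i));

        int len = s.Length + 1;

        // Short results use a stack buffer; longer ones use a heap array.
        Span<char> buf = len <= MaxStackChars ? stackalloc char[len] : new char[len];

        s.AsSpan(0, i).CopyTo(buf);            // prefix
        buf[i] = c;                            // inserted character
        s.AsSpan(i).CopyTo(buf.Slice(i + 1));  // suffix

        // The only heap allocation on the short path is the resulting string.
        return new string(buf);
    }
}
```

For example, `"hllo".InsertCharStackalloc('e', 1)` should return `hello`.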

Upvotes: 1

Philippe Paré

Reputation: 4418

"which produces a copy of the string as a separate reference object - which is what we're trying to avoid in the first place."

There is no way of modifying a string without creating a new one; strings in .NET are immutable, and even Replace returns a new string rather than changing the original. You're trying to resize a string whose memory is already allocated. That's why all string methods return a string and don't modify the original.
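Since a new string is unavoidable, the remaining goal is to allocate only that one string. On newer runtimes (assuming .NET Core 2.1 or later), `string.Create` lets you fill the new string's characters in place through a callback. The following is a sketch under that assumption; the method name `InsertCharCreate` and the tuple field names are hypothetical.

```csharp
using System;

public static class StringCreateSketch
{
    public static string InsertCharCreate(this string s, char c, int i)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (i < 0 || i > s.Length) throw new ArgumentOutOfRangeException(nameof(i));

        // The inputs are passed as a value tuple so the lambda captures nothing.
        return string.Create(s.Length + 1, (str: s, ch: c, index: i), (dest, state) =>
        {
            state.str.AsSpan(0, state.index).CopyTo(dest);                  // prefix
            dest[state.index] = state.ch;                                   // inserted character
            state.str.AsSpan(state.index).CopyTo(dest.Slice(state.index + 1)); // suffix
        });
    }
}
```

The original string is still left untouched; the callback writes directly into the newly allocated result, so no intermediate char array or one-character string is created.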

Upvotes: 4
