Reputation: 6249
I'm working on high-performance code in which this construct is part of the performance-critical section.
This is what happens in some section:
1. A string is 'scanned' and metadata is stored efficiently.
2. Based on that metadata, chunks of the string are split out into a char[][].
3. That char[][] should be transferred into a string[].
Now, I know you can just call new string(char[]), but then the result would have to be copied.
To avoid this extra copy, I guess it must be possible to write directly to the string's internal buffer, even though this would be an unsafe operation (and I know this brings lots of implications, such as overflow and forward compatibility).
I've seen several ways of achieving this, but none I'm really satisfied with.
Does anyone have concrete suggestions as to how to achieve this?
Extra information:
The actual process doesn't necessarily include converting to char[]; it's practically a 'multi-substring' operation, like 3 indexes and their lengths appended.
The StringBuilder has too much overhead for the small number of concats.
EDIT:
Because some aspects of what I'm asking were vague, let me reformulate it.
This is what happens:
1. The main string is scanned and indexed.
2. Parts of the main string are copied to a char[].
3. The char[] is converted to a string.
What I'd like to do is merge steps 2 and 3, resulting in:
1. The main string is scanned and indexed.
2. Parts of the main string are copied directly to a string (and the GC can keep its hands off of it during the process by proper use of the fixed keyword?).
Note that I cannot change the output type from string[]: this is an external library, and other projects depend on it (backward compatibility).
Upvotes: 7
Views: 1846
Reputation: 51349
I think that what you are asking to do is to 'carve up' an existing string in-place into multiple smaller strings without re-allocating character arrays for the smaller strings. This won't work in the managed world.
For one reason why, consider what happens when the garbage collector comes by and collects or moves the original string during a compaction: all of those other strings 'inside' of it are now pointing at some arbitrary other memory, not the original string you carved them out of.
EDIT: In contrast to the character-poking involved in Ben's answer (which is clever but IMHO a bit scary), you can allocate a StringBuilder with a pre-defined capacity, which eliminates the need to re-allocate the internal arrays. See http://msdn.microsoft.com/en-us/library/h1h0a5sy.aspx.
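A minimal sketch of the pre-sized StringBuilder approach, assuming the substrings are described by (index, length) pairs as in the question. The class name PresizedBuilder and the method JoinSlices are hypothetical names made up for illustration. Note that ToString() still copies the builder's contents into the final string, so this avoids re-allocation while appending, not the last copy.

```csharp
using System;
using System.Text;

static class PresizedBuilder
{
    // Hypothetical helper: appends several (index, length) slices of source.
    public static string JoinSlices(string source, (int Index, int Length)[] slices)
    {
        // Work out the final length first so the builder never has to grow.
        int total = 0;
        foreach (var s in slices) total += s.Length;

        var sb = new StringBuilder(total);
        foreach (var s in slices)
        {
            // Append(string, int, int) copies the characters without
            // allocating an intermediate substring.
            sb.Append(source, s.Index, s.Length);
        }
        return sb.ToString();
    }
}
```

For example, JoinSlices("hello world", new[] { (0, 5), (6, 5) }) appends "hello" and "world" into a builder whose capacity was fixed at 10 up front.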
Upvotes: 3
Reputation: 6850
In .NET, there is no way to create an instance of String which shares data with another string. Some discussion on why that is appears in this comment from Eric Lippert.
Upvotes: 0
Reputation: 283694
What happens if you do:
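// GetBuffer() is assumed to return a string of at least 6 characters.
string s = GetBuffer();
unsafe
{
    // Pin the string so the GC cannot move it, then overwrite it in place.
    fixed (char* pch = s)
    {
        pch[0] = 'R';
        pch[1] = 'e';
        pch[2] = 's';
        pch[3] = 'u';
        pch[4] = 'l';
        pch[5] = 't';
    }
}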
I think the world will come to an end (or at least the .NET managed portion of it), but that's very close to what StringBuilder does.
Do you have profiler data to show that StringBuilder isn't fast enough for your purposes, or is that an assumption?
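For comparison, here is a sketch of writing into a new string's buffer without pointers, assuming a runtime that provides string.Create (.NET Core 2.1 or later). The class name DirectStringWriter, the method JoinSlices, and the (index, length) input shape are hypothetical and only for illustration; this is not a claim about what the library in the question does.

```csharp
using System;

static class DirectStringWriter
{
    // Hypothetical helper: builds one string from several slices of source,
    // copying each slice straight into the new string's buffer.
    public static string JoinSlices(string source, (int Index, int Length)[] slices)
    {
        int total = 0;
        foreach (var s in slices) total += s.Length;

        // string.Create allocates a string of the given length and passes its
        // buffer to the callback as a writable Span<char>. The string is only
        // returned to the caller after the callback has finished filling it.
        return string.Create(total, (source, slices), (dest, state) =>
        {
            int pos = 0;
            foreach (var s in state.slices)
            {
                state.source.AsSpan(s.Index, s.Length).CopyTo(dest.Slice(pos));
                pos += s.Length;
            }
        });
    }
}
```

Each character is copied once, from the source string into the result, with no intermediate char[] and no unsafe block.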
Upvotes: 2
Reputation: 24344
Just create your own addressing system instead of trying to use unsafe code to map to an internal data structure.
Mapping a string (which is also readable as a char[]) to an array of smaller strings is no different from building a list of address information (index and length of each substring). So make a new List<Tuple<int,int>> instead of a string[] and use that data to return the correct string from your original, unaltered data structure. This could easily be encapsulated into something that exposes string[].
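A rough sketch of what such an encapsulation might look like. The class name SubstringTable and its members are hypothetical; the only parts taken from the answer are the List<Tuple<int,int>> of index/length pairs and the idea of handing out strings from the unaltered original.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: records (index, length) pairs and only builds
// a string when one is actually requested.
sealed class SubstringTable
{
    private readonly string _source;
    private readonly List<Tuple<int, int>> _ranges = new List<Tuple<int, int>>();

    public SubstringTable(string source) { _source = source; }

    public int Count { get { return _ranges.Count; } }

    // Record a substring by position only; no characters are copied here.
    public void Add(int index, int length)
    {
        _ranges.Add(Tuple.Create(index, length));
    }

    // Materialize a single substring on demand.
    public string this[int i]
    {
        get { return _source.Substring(_ranges[i].Item1, _ranges[i].Item2); }
    }

    // For callers that still need the string[] shape.
    public string[] ToArray()
    {
        var result = new string[_ranges.Count];
        for (int i = 0; i < result.Length; i++) result[i] = this[i];
        return result;
    }
}
```

With this shape, the copy is deferred until a caller asks for a particular element, and ToArray() remains available where an actual string[] has to be returned.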
Upvotes: 2