Reputation: 43862
I'm porting some code from Java to C, and so far things have gone well.
However, I have a particular function in Java that makes liberal use of StringBuilder, like this:
StringBuilder result = new StringBuilder();
// .. build string out of variable-length data
for (SolObject object : this) {
    result.append(object.toString());
}
// .. some parts are conditional
if (freezeCount < 0) result.append("]");
else result.append(")");
I realize SO is not a code translation service, but I'm not asking for anyone to translate the above code.
I'm wondering how to efficiently perform this type of mass string concatenation in C. It's mostly small strings, but each is determined by a condition, so I can't combine them into a simple sprintf call.
How can I reliably do this type of string concatenation?
Upvotes: 5
Views: 662
Reputation: 7271
The cause of poor performance when concatenating strings is the reallocation of memory. Joel Spolsky discusses this in his article Back to basics. He describes the naive method of concatenating strings:
Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. "That's pretty good!" says his boss, "you're a fast worker!" and pays him a kopeck.
The next day Shlemiel only gets 150 yards done. "Well, that's not nearly as good as yesterday, but you're still a fast worker. 150 yards is respectable," and pays him a kopeck.
The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts his boss. "That's unacceptable! On the first day you did ten times that much work! What's going on?"
"I can't help it," says Shlemiel. "Every day I get farther and farther away from the paint can!"
If you can, you want to know how large your destination buffer needs to be before allocating it. The only realistic way to do this is to call strlen on all of the strings you want to concatenate. Then allocate the appropriate amount of memory and use a slightly modified version of strncpy that returns a pointer to the end of the destination buffer.
#include <stddef.h>  // size_t

// Copies src to dest and returns a pointer to the terminating null
// it wrote, which is where the next string should be appended.
// Ensures that a null terminator is at the end of dest. If
// src is larger than size then size - 1 bytes are copied
char* StringCopyEnd( char* dest, const char* src, size_t size )
{
    size_t pos = 0;
    if ( size == 0 ) return dest;
    while ( pos < size - 1 && *src )
    {
        *dest = *src;
        ++dest;
        ++src;
        ++pos;
    }
    *dest = '\0';
    return dest;
}
Note how you have to set the size parameter to be the number of bytes left until the end of the destination buffer.
Here's a sample test function:
#include <stdio.h>   // printf
#include <stdlib.h>  // malloc, free
#include <string.h>  // memset, strlen

void testStringCopyEnd( char* str1, char* str2, size_t size )
{
    // Create an oversized buffer and fill it with A's so that
    // if a string is not null terminated it will be obvious.
    char* dest = (char*) malloc( size + 10 );
    if ( dest == NULL ) return;
    memset( dest, 'A', size + 10 );
    char* end = StringCopyEnd( dest, str1, size );
    // Only the bytes not yet used are available to the second copy.
    end = StringCopyEnd( end, str2, size - ( end - dest ) );
    printf( "length: %zu - '%s'\n", strlen( dest ), dest );
    free( dest );
}
int main( void )
{
    // Test with a large enough buffer size to concatenate 'Hello World'.
    // and then reduce the buffer size from there
    for ( size_t i = 12; i > 0; --i )
    {
        testStringCopyEnd( "Hello", " World", i );
    }
    return 0;
}
Which produces:
length: 11 - 'Hello World'
length: 10 - 'Hello Worl'
length: 9 - 'Hello Wor'
length: 8 - 'Hello Wo'
length: 7 - 'Hello W'
length: 6 - 'Hello '
length: 5 - 'Hello'
length: 4 - 'Hell'
length: 3 - 'Hel'
length: 2 - 'He'
length: 1 - 'H'
length: 0 - ''
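The measure-first approach described above (call strlen on every piece, allocate once, then copy while tracking the end) can be sketched as follows. This is only an illustration: concat_all is a hypothetical helper name, and it assumes all the pieces are available in an array up front, which may not match how the original Java loop produces them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins `count` strings into one freshly allocated
 * buffer. Returns NULL if allocation fails; the caller frees the result. */
char *concat_all(const char *const *parts, size_t count)
{
    assert(parts != NULL || count == 0);

    /* Pass 1: measure every piece so a single allocation is enough. */
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += strlen(parts[i]);

    char *out = malloc(total + 1);          /* +1 for the terminator */
    if (out == NULL)
        return NULL;

    /* Pass 2: copy each piece at the current end; `out` is never rescanned. */
    char *end = out;
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len);
        end += len;
    }
    *end = '\0';
    return out;
}
```

Each piece is still scanned twice (once to measure, once to copy), but the destination is never walked from the start, which is the part that makes the naive approach quadratic.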
Upvotes: 2
Reputation: 44250
If operations like these are very frequent, you could implement them in your own buffer class. Example (error handling omitted for brevity ;-):
struct buff {
    size_t used;   /* bytes in use, not counting the terminator */
    size_t size;   /* bytes allocated for data */
    char *data;
};
struct buff * buff_new(size_t size)
{
    struct buff *bp;
    bp = malloc (sizeof *bp);
    bp->data = malloc (size);
    bp->size = size;
    bp->used = 0;
    return bp;
}
/* To be implemented: grows bp->data to at least size bytes. */
void buff_resize(struct buff *bp, size_t size);
void buff_add_str(struct buff *bp, const char *add)
{
    size_t len;
    len = strlen(add);
    if (bp->used + len + 1 >= bp->size) buff_resize(bp, bp->used + 1 + len);
    /* Copy the string together with its terminator. */
    memcpy(bp->data + bp->used, add, len + 1);
    bp->used += len;
    return;
}
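For reference, here is one way the omitted pieces could be filled in. Treat it as a sketch under assumptions rather than the intended implementation: the doubling growth policy, the int return codes, the initial size of 16 and the buff_free function are all choices made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct buff {
    size_t used;   /* bytes in use, not counting the terminator */
    size_t size;   /* bytes allocated for data */
    char *data;
};

/* Grows the buffer to at least `needed` bytes by doubling (assumed policy).
 * Returns 0 on success, -1 on failure; on failure the buffer is unchanged. */
static int buff_resize(struct buff *bp, size_t needed)
{
    size_t new_size = bp->size ? bp->size : 16;
    while (new_size < needed) {
        if (new_size > SIZE_MAX / 2)
            return -1;                      /* doubling would overflow */
        new_size *= 2;
    }
    char *p = realloc(bp->data, new_size);
    if (p == NULL)
        return -1;
    bp->data = p;
    bp->size = new_size;
    return 0;
}

struct buff *buff_new(size_t size)
{
    struct buff *bp = malloc(sizeof *bp);
    if (bp == NULL)
        return NULL;
    if (size == 0)
        size = 1;                           /* room for the terminator */
    bp->data = malloc(size);
    if (bp->data == NULL) {
        free(bp);
        return NULL;
    }
    bp->data[0] = '\0';                     /* start as an empty string */
    bp->size = size;
    bp->used = 0;
    return bp;
}

/* Appends `add`, growing the buffer if needed. Returns 0 or -1. */
int buff_add_str(struct buff *bp, const char *add)
{
    assert(bp != NULL && add != NULL);
    size_t len = strlen(add);
    if (bp->used + len + 1 > bp->size &&
        buff_resize(bp, bp->used + len + 1) != 0)
        return -1;
    memcpy(bp->data + bp->used, add, len + 1);  /* includes the '\0' */
    bp->used += len;
    return 0;
}

void buff_free(struct buff *bp)
{
    if (bp != NULL) {
        free(bp->data);
        free(bp);
    }
}
```

Doubling keeps the number of realloc calls logarithmic in the final length, which is roughly how StringBuilder-style buffers avoid the repeated-reallocation cost.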
Upvotes: 1
Reputation: 263487
The performance problem with strcat() is that it has to scan the destination string to find the terminating '\0' before it can start appending to it.
But remember that strcat() doesn't take strings as arguments, it takes pointers.
If you maintain a separate pointer that always points to the terminating '\0' of the string you're appending to, you can use that pointer as the first argument to strcat(), and it won't have to re-scan it every time. For that matter, you can use strcpy() rather than strcat().
Maintaining the value of this pointer and ensuring that there's enough room are left as an exercise.
NOTE: you can use strncat() to avoid overwriting the end of the destination array (though it will silently truncate your data). I don't recommend using strncpy() for this purpose. See my rant on the subject.
If your system supports them, the (non-standard) strlcpy() and strlcat() functions can be useful for this kind of thing. They both return the total length of the string they tried to create. But their use makes your code less portable; on the other hand, there are open-source implementations that you can use anywhere.
Another solution is to call strlen() on the string you're appending and advance the pointer by that amount. This isn't ideal, since that string is then scanned twice, once by strcat() and once by strlen(), but at least it avoids re-scanning the entire destination string.
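A minimal sketch of the end-pointer idea, including the room check, might look like this. The name append_at and its parameters are made up for this example; it uses strlen() plus memcpy() rather than strcat(), so the appended string is scanned once for its length and then copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src at *end, where *end points at the
 * terminating '\0' of a string stored in a buffer that stops at `limit`
 * (one past the buffer's last byte). On success, advances *end to the
 * new terminator and returns 0. If src does not fit, returns -1 and
 * leaves the buffer and *end untouched. */
int append_at(char **end, const char *limit, const char *src)
{
    assert(end != NULL && *end != NULL && src != NULL && *end < limit);

    size_t len = strlen(src);               /* the only scan of src's length */
    size_t room = (size_t)(limit - *end);   /* bytes left, terminator included */
    if (len >= room)
        return -1;

    memcpy(*end, src, len + 1);             /* copy src and its '\0' */
    *end += len;                            /* now points at the new '\0' */
    return 0;
}
```

The caller starts with an empty string (buf[0] = '\0', end = buf) and passes buf + sizeof buf as limit. Unlike strncat(), this version reports a failed append instead of truncating silently.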
Upvotes: 2
Reputation: 129454
A rather "clever" way to convert a number of "objects" to a string is:
char buffer[100];
char *str = buffer;
str += sprintf(str, "%06d", 123);
str += sprintf(str, "%s=%5.2f", "x", 1.234567);
This is fairly efficient, since sprintf returns the number of characters it wrote (not counting the terminating '\0'), so we can "move" str forward by the return value, and keep filling in. Note that sprintf does no bounds checking, so the buffer must be large enough for everything you write into it.
Of course, if there are true Java Objects, then you'll need to figure out how to make a Java style toString function into "%something" in C's printf family.
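A bounds-checked variation on the same trick can be built on vsnprintf. The sketch below is an assumption-laden illustration, not part of the standard library: appendf is a hypothetical wrapper that tracks the write offset and refuses to truncate.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: formats into buf starting at offset *pos, where
 * buf holds `size` bytes in total. On success, advances *pos by the
 * number of characters written and returns 0. If the output would not
 * fit (or formatting fails), restores the terminator at *pos and
 * returns -1, so the string is left as it was before the call. */
int appendf(char *buf, size_t size, size_t *pos, const char *fmt, ...)
{
    assert(buf != NULL && pos != NULL && fmt != NULL && *pos < size);

    va_list ap;
    va_start(ap, fmt);
    /* vsnprintf returns the length the full output would have had. */
    int n = vsnprintf(buf + *pos, size - *pos, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= size - *pos) {
        buf[*pos] = '\0';                   /* discard the partial output */
        return -1;
    }
    *pos += (size_t)n;
    return 0;
}
```

With this, the conditional part of the question's Java code becomes a call such as appendf(buf, sizeof buf, &pos, "%s", freezeCount < 0 ? "]" : ")"), where buf, pos and freezeCount are the caller's variables.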
Upvotes: 4
Reputation: 29266
Given that the strings look so small, I'd be inclined just to use strcat and revisit if performance becomes an issue.
You could make your own function that remembers the string length so it doesn't need to iterate through the string to find the end (which is potentially the slow bit of strcat if you are doing lots of appends to long strings).
Upvotes: 0