Reputation: 405955
I know that buffer overruns are one potential hazard of using C-style strings (char arrays). If I know my data will fit in my buffer, is it okay to use them anyway? Are there other drawbacks inherent to C-style strings that I need to be aware of?
EDIT: Here's an example close to what I'm working on:
char buffer[1024];
char * line = NULL;
while ((line = fgets(fp)) != NULL) { // this won't compile, but that's not the issue
// parse one line of command output here.
}
This code is taking data from a FILE pointer that was created using a popen("df") command. I'm trying to run Linux commands and parse their output to get information about the operating system. Is there anything wrong (or dangerous) with setting the buffer to some arbitrary size this way?
Upvotes: 8
Views: 4051
Reputation: 4307
There are a few disadvantages to C strings:
Upvotes: 21
Reputation: 8815
Another consideration is who will be maintaining your code. What about in two years? Will that person be as comfortable with C-style strings as you are? As the STL gets more mature, it seems like people will be increasingly more comfortable with STL strings than with C-style strings.
Upvotes: 0
Reputation: 45533
In your specific case, it's not the c-string that's dangerous so much as reading an indeterminate amount of data into a fixed-size buffer. Don't ever use gets(char*), for example.
Looking at your example though, it doesn't seem at all correct - try this:
char buffer[1024];
char * line = NULL;
while ((line = fgets(buffer, sizeof(buffer), fp)) != NULL) {
// parse one line of command output here.
}
This is a perfectly safe use of c-strings, although you'll have to deal with the possibility that line does not contain an entire line, but was rather truncated to 1023 characters (plus a null terminator).
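One way to notice that truncation is to check whether the chunk fgets returned ends in a newline. The following is only a sketch: read_chunk is a hypothetical helper name, not a library function, and it assumes the input contains no embedded NUL bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one chunk with fgets and reports whether it
 * ended with a newline. Returns 1 for a complete line, 0 for a truncated
 * (or final, unterminated) chunk, and -1 on end of file or read error. */
int read_chunk(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    size_t len = strlen(buf);
    return (len > 0 && buf[len - 1] == '\n') ? 1 : 0;
}
```

When it returns 0, the rest of the same line is still waiting in the stream, so the next call picks up where this one stopped.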
Upvotes: 3
Reputation: 14961
C strings lack the following aspects of their C++ counterparts:
Upvotes: 17
Reputation: 101476
You may know that today 1024 bytes is enough to contain any input, but you don't know how things will change tomorrow or next year.
If premature optimization is the root of all evil, magic numbers are the stem.
Upvotes: 8
Reputation: 66662
C strings, like many other aspects of C, give you plenty of room to hang yourself. They are simple and fast, but unsafe in situations where assumptions such as the null terminator can be violated or input can overrun the buffer. To use them reliably you have to observe fairly hygienic coding practices.
There used to be a saying that the canonical definition of a high-level language was "anything with better string handling than C".
Upvotes: 0
Reputation: 4171
Well, to comment on your specific example, you don't know that the data returned by your call to df will fit into your buffer. Never trust unsanitized input into your application, even when it is supposedly from a known source like df.
For example, if a program named 'df' is placed somewhere in your search path so that it is executed instead of the system df it could be used to exploit your buffer limit. Or if df is replaced by a malicious program.
When reading input from a file, use a function that lets you specify the maximum number of bytes to read. Under OS X and Linux, fgets() is actually defined as char *fgets(char *s, int size, FILE *stream); so it would be safe to use on those systems.
Upvotes: 6
Reputation: 3138
This question doesn't really have an answer.
If you are writing in C, what other options do you have?
If you are writing in C++, why are you asking? What is the reason not to use C++ primitives?
The only reason I can think of is linking C and C++ code and having char * somewhere in the interfaces. Sometimes it is just easier to use char * than to convert back and forth all the time (especially if it's really 'good' C++ code that has 3 different C++ string object types).
Upvotes: 0
Reputation: 11573
Imho, the hardest part of cstrings is the memory management, because you need to be careful about whether you must pass a copy of a cstring or can pass a literal to a function, i.e. will the function free the passed string, or will it keep a reference for longer than the function call? The same applies to cstring return values.
So it is not possible to share cstring copies without a big effort. In many cases this ends with unnecessary copies of the same cstring in memory.
Upvotes: 0
Reputation: 39520
Not having the length accessible in constant-time is a serious overhead in many applications.
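A common workaround is to carry the length next to the pointer so that the scan happens only once. This is just a sketch; struct str_view, view_from_cstr and view_equal are made-up names for illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a view that carries its length alongside the bytes. */
struct str_view {
    const char *data;
    size_t len;
};

/* Pays the O(n) strlen scan once, when the view is created. */
struct str_view view_from_cstr(const char *s)
{
    assert(s != NULL);
    struct str_view v = { s, strlen(s) };
    return v;
}

/* Rejects strings of different length without scanning either one;
 * only equal-length strings are compared byte by byte. */
int view_equal(struct str_view a, struct str_view b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

After construction, reading v.len is a plain field access, which is roughly what a string class does for you behind the scenes.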
Upvotes: 14
Reputation: 3295
I think IT IS OKAY to use them, people've been using them for years. But I would rather use std::string if possible because 1) you don't have to be so cautious every time and can think about problems of your domain, instead of thinking that you need to add another parameter every time...memory management and that kinda stuff...it is just safer to code on a higher level... 2) there are probably some other small concerns which are not big deal but still...like people already mentioned...encoding, unicode...all those "related" kinda stuff people creating std::string thought of...:)
Update
I worked on a project for half a year. Somehow I was stupid enough to never compile in release mode before delivery....:) Well...luckily there was just one error I found after 3 hours. It was a very simple string buffer overrun.
Upvotes: 2
Reputation: 28872
c strings have opportunities for misuse, due to the fact that one has to scan the string to determine where it ends.
strlen - to find the length, scan the string, until you hit the NUL, or access protected memory
strcat - has to scan to find the NUL, in order to determine where to begin concatenating. There is no knowledge within a c string, to tell if there will be a buffer overrun or not.
c strings are risky, but generally faster than string objects.
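To illustrate the strcat point, here is a sketch of an append that tracks the current length and the buffer capacity itself, so it neither rescans the destination nor writes past its end. append_checked is a hypothetical helper, and the caller is assumed to keep *len in sync with the buffer contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dst, where *len is the current
 * length of dst and cap is the total size of the dst array.
 * Returns 0 on success, or -1 (leaving dst unchanged) if src does not fit. */
int append_checked(char *dst, size_t *len, size_t cap, const char *src)
{
    assert(dst != NULL && len != NULL && src != NULL && *len < cap);
    size_t n = strlen(src);
    if (n >= cap - *len)              /* need room for n bytes plus the NUL */
        return -1;
    memcpy(dst + *len, src, n + 1);   /* copies the terminator too */
    *len += n;
    return 0;
}
```

The extra bookkeeping is exactly the information a plain c string does not carry with it.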
Upvotes: 0
Reputation: 338316
There is no way to embed NUL characters (if you need them for something) into C style strings.
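A small sketch of what that means in practice: the array below holds seven data bytes, but anything that treats it as a C string stops at the first NUL. visible_length is a made-up helper name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns how many of the n bytes in data a C string function would see,
 * i.e. the offset of the first NUL, or n if there is none. */
size_t visible_length(const char *data, size_t n)
{
    assert(data != NULL);
    const char *nul = memchr(data, '\0', n);
    return nul ? (size_t)(nul - data) : n;
}
```

For a record such as "abc\0def", strlen reports 3 and the remaining bytes are invisible to strcpy, printf("%s") and friends, so such data has to be handled with an explicit length and the mem* functions instead.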
Upvotes: 6
Reputation: 6116
The memory management etc. needed to grow a string (char array), if necessary, is kinda boring to reinvent.
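For a sense of how much boilerplate that is, here is a rough sketch of reading one line of arbitrary length by doubling a heap buffer. read_line_alloc is a hypothetical name, the starting size of 16 is arbitrary, and the sketch assumes no embedded NUL bytes and lines short enough that the size fits in an int.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp into a heap
 * buffer that doubles as needed. The trailing newline is removed.
 * Returns NULL at end of file or on allocation failure; the caller
 * frees the result. */
char *read_line_alloc(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }
        /* No newline yet: the buffer was full (or input ended); grow it. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len == 0) {       /* nothing read: end of file or error */
        free(buf);
        return NULL;
    }
    return buf;           /* final line without a trailing newline */
}
```

On POSIX systems getline() already does this kind of growing for you, and in C++ std::getline with a std::string hides it completely.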
Upvotes: 7
Reputation: 338316
Character encoding issues tend to surface when you have an array of bytes instead of a string of characters.
Upvotes: 3