Reputation: 5960
I'm a little confused as to how to use size_t when other data types like int, unsigned long int and unsigned long long int are present in a program. I'll try to illustrate my confusion minimally. Imagine a program where I use
void *calloc(size_t nmemb, size_t size)
to allocate an array (one- or multidimensional). Let the call to calloc() depend on nrow and sizeof(unsigned long int). sizeof(unsigned long int) is obviously fine because it yields a size_t. But let nrow be such that it needs to have type unsigned long int. What do I do in such a case? Do I cast nrow in the call to calloc() from unsigned long int to size_t?
Another case would be
char *fgets(char *s, int size, FILE *stream)
fgets() expects type int as its second parameter. But what if I pass it an array, let's say save, as its first parameter and use sizeof(save) to pass it the size of the array? Do I cast the result of sizeof to int? That would be dangerous since int isn't guaranteed to hold all possible results of sizeof.
What should I do in these two cases? Cast, or just ignore possible warnings from tools such as splint?
Here is an example regarding calloc() (I explicitly omit error-checking for clarity!):
long int **arr;
unsigned long int mrow;   /* set elsewhere, e.g. parsed from input */
unsigned long int ncol;   /* set elsewhere */
unsigned long int i;
arr = calloc(mrow, sizeof(long int *));
for (i = 0; i < mrow; i++) {
    arr[i] = calloc(ncol, sizeof(long int));
}
Here is an example for fgets() (error-handling again omitted for clarity!):
char save[22];
char *ptr_save;
unsigned long int mrow;
if (fgets(save, sizeof(save), stdin) != NULL) {
    save[strcspn(save, "\n")] = '\0';   /* strip the trailing newline */
    mrow = strtoul(save, &ptr_save, 10);
}
Upvotes: 3
Views: 5215
Reputation: 47952
My other answer got waaaaaaay too long, so here's a short one.
Use size_t for anything that's involved in the sizes of objects. (Similarly, if you have something that's involved in file sizes or offsets, use off_t.)
assert(size_i_need <= SIZE_MAX);
char *buf = malloc((size_t)size_i_need);
Upvotes: 2
Reputation: 70931
When to cast size_t
You shouldn't.
Use it where it's appropriate.
If you doubt that the type suits your program's needs, you can add an assertion as in Steve Summit's answer; if it fails, revisit your program's design.
More on this here by Dan Saks: "Why size_t matters" and "Further insights into size_t"
Upvotes: 2
Reputation: 84561
I'm a little confused as to how to use size_t when other data types like int, unsigned long int and unsigned long long int are present in a program.
It is never a good idea to ignore warnings. Warnings are there to direct your attention to areas of your code that may be problematic. It is much better to take a few minutes to understand what the warning is telling you -- and fix it -- than to get bitten by it later when you hit a corner case and stumble off into undefined behavior.
size_t itself is just a data type like any other. Its width varies by platform, but it is always an unsigned integer type wide enough to hold the size of any object (commonly unsigned int on 32-bit systems and unsigned long on 64-bit Linux). Your choice of data type is a basic and fundamental part of programming. You choose the type based on the range of values your variable can represent (or should be limited to representing). So if whatever you are dealing with can't be negative, then an unsigned type or size_t is the proper choice. The choice then allows the compiler to help identify areas where your code would violate that.
When you compile with warnings enabled (e.g. -Wall -Wextra), which you should do on every compile, you will be warned about possible conflicts in your data-type use (e.g. comparison between signed and unsigned values). These are important!
Virtually all modern computers, including x86 and x86_64, use the two's-complement representation for signed values. In simple terms, it means that if the leftmost bit of a signed number is 1, the value is negative. Herein lie the subtle traps you may fall into when mixing, casting or comparing numbers of different types. If you cast an unsigned number to a signed number of the same width and that number happens to have the most significant bit set, your large number just became a negative number.
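To make that trap concrete, here is a minimal sketch. The function names are made up for illustration and are not from any library. The first function spells out what the compiler does implicitly when it evaluates a mixed `a < b`; the other two show one way to guard the comparison and the narrowing.

```c
#include <limits.h>

/* Hypothetical: what `a < b` means when a is int and b is unsigned int.
 * The int is converted to unsigned first, so -1 becomes UINT_MAX. */
int naive_less(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* Hypothetical: handle the negative case before converting. */
int safe_less(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}

/* Hypothetical: narrow unsigned to int only when the value fits.
 * Returns 1 and stores the result on success, 0 otherwise. */
int to_int_checked(unsigned int u, int *out)
{
    if (u > (unsigned int)INT_MAX)
        return 0;
    *out = (int)u;
    return 1;
}
```

With these, naive_less(-1, 1u) reports that -1 is not less than 1, which is exactly the surprise that -Wsign-compare is warning you about.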
What should I do in these two cases? Cast, or just ignore possible warnings...
You do what you do each time you are faced with warnings from the compiler. You analyze what is causing the warning, and then you fix it. If you can't fix it (e.g. it comes from some library you don't have access to), you understand the warning well enough that you can make an educated decision to disregard it, knowing you will not hit any corner cases that would lead to undefined behavior.
In your examples (while neither should produce a warning, they may on some compilers):
arr = calloc (mrow, sizeof(long int *));
What is the range of sizeof(long int *)? Well, it's the range of what the pointer size can be. So, what's that? 4 bytes on x86 or 8 bytes on x86_64. So the range of values is 4-8, and sizeof already yields a size_t, so that argument needs no cast. If a tool complains about mrow, it can be cast to size_t once you know its value fits. Better still, take the element size from the object itself:
arr = calloc (mrow, sizeof *arr);
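Putting that together for the two-dimensional case in the question, a sketch might look like the following. alloc_table and free_table are hypothetical helper names, and the range check assumes the dimensions arrive as unsigned long, as in the question.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed mrow x ncol table of long int.
 * Returns NULL if a dimension does not fit in size_t or allocation fails.
 * Note: a zero dimension may also yield NULL, since calloc(0, n) may. */
long int **alloc_table(unsigned long mrow, unsigned long ncol)
{
    if (mrow > SIZE_MAX || ncol > SIZE_MAX)
        return NULL;                      /* would not fit in size_t */

    size_t rows = (size_t)mrow;
    size_t cols = (size_t)ncol;

    long int **arr = calloc(rows, sizeof *arr);
    if (arr == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        arr[i] = calloc(cols, sizeof *arr[i]);
        if (arr[i] == NULL) {
            while (i > 0)                 /* undo the rows already allocated */
                free(arr[--i]);
            free(arr);
            return NULL;
        }
    }
    return arr;
}

/* Hypothetical helper: release a table created by alloc_table. */
void free_table(long int **arr, unsigned long mrow)
{
    if (arr == NULL)
        return;
    for (size_t i = 0; i < (size_t)mrow; i++)
        free(arr[i]);
    free(arr);
}
```

On platforms where unsigned long and size_t have the same width the range check can never fail, and some compilers will say so with a warning; it only earns its keep where size_t is narrower.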
Looking at the next example:
char save[22];
...
fgets(save, sizeof(save), stdin)
Here again, what is the possible range of sizeof save? From 22 to 22. So yes, if a warning is produced complaining about the fact that sizeof yields long unsigned and fgets calls for int, 22 can safely be cast to int.
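If you want the compiler to confirm that reasoning instead of trusting it, one option (assuming a C11 compiler) is a static assertion next to the cast. The sketch below wraps the question's fgets/strtoul snippet in a hypothetical helper named read_ulong.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` and parse it as an
 * unsigned long. Returns 1 on success, 0 on read or parse failure. */
int read_ulong(FILE *in, unsigned long *out)
{
    char save[22];
    char *end;

    /* Compile-time proof that the cast below cannot truncate. */
    _Static_assert(sizeof save <= INT_MAX, "buffer too large for fgets");

    if (fgets(save, (int)sizeof save, in) == NULL)
        return 0;

    save[strcspn(save, "\n")] = '\0';     /* strip the trailing newline */

    errno = 0;
    *out = strtoul(save, &end, 10);

    /* Require at least one digit, no trailing junk, and no overflow. */
    return end != save && *end == '\0' && errno == 0;
}
```

For a buffer whose size is only known at run time, the same idea would have to become a run-time comparison against INT_MAX before the cast.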
Upvotes: 4
Reputation: 47952
In general, you're right, you should not ignore the warnings! And in general, if you can, you should shy away from explicit casts, because they can make your code less reliable, or silence warnings that are really trying to tell you something important.
Most of the time, I believe, the compiler should do the right thing for you. For example, malloc() expects a size_t, and the compiler knows from the function prototype that it does, so if you write
int size_i_need = 10;
char *buf = malloc(size_i_need);
the compiler will insert the appropriate conversion from int to size_t, as necessary. (I don't believe I've had warnings here I had to worry about, either.)
If the variables you're using are already unsigned, so much the better!
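One caveat worth spelling out: the implicit conversion from int to size_t only preserves the value when the int is non-negative. A negative int silently becomes a huge size_t. A minimal sketch of a guard, using a hypothetical wrapper named malloc_int:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuse negative sizes instead of letting them
 * convert to a huge size_t (for example, -1 would become SIZE_MAX). */
void *malloc_int(int n)
{
    if (n < 0)
        return NULL;
    return malloc((size_t)n);
}
```

This is only needed when the int could plausibly be negative, for instance when it comes from a calculation or from user input.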
Similarly, if you were to write
fgets(buf, sizeof(buf), ifp);
the compiler will again insert an appropriate conversion. Here, I guess I see what you're getting at: a 64-bit compiler might emit a warning about the narrowing conversion from unsigned long to int. Now that I think about it, I'm not sure why I haven't had that problem, because this is a common idiom.
(You also asked about passing unsigned long to malloc, and on a machine where size_t is smaller than long, I suppose that might get you warnings, too. Is that what you were worried about?)
If you've got a downsize that you can't avoid, and your compiler or some other tool is warning about it, and you want to get rid of the warning safely, you could use a cast and an assertion. That is, if you write
unsigned long long size_i_need = 23;
char *buf = malloc(size_i_need);
this might get a warning on a machine where size_t is 32 bits. So you could silence the warning with a cast (on the assumption that your unsigned long long values will never actually be too big), and then back up your assumption with a call to assert:
unsigned long long size_i_need = 23;
assert(size_i_need <= SIZE_MAX);
char *buf = malloc((size_t)size_i_need);
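Keep in mind that assert compiles to nothing when NDEBUG is defined, so in release builds the check disappears. If the value can come from outside the program, a run-time check that reports failure may be the safer variant. A sketch, with to_size_t as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: narrow unsigned long long to size_t only if the
 * value fits. Returns true and stores the result on success. */
bool to_size_t(unsigned long long v, size_t *out)
{
    if (v > SIZE_MAX)
        return false;          /* too big for this platform's size_t */
    *out = (size_t)v;
    return true;
}
```

The caller would then pass the converted value to malloc only when to_size_t returns true, and handle the failure like any other allocation error.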
In my experience, the biggest nuisance is printing these things out. If you write
printf("int size = %d\n", sizeof(int));
or
printf("string length = %d\n", strlen("abc"));
on a 64-bit machine, a modern compiler will typically (and correctly) warn you that "format specifies type 'int' but the argument has type 'unsigned long'", or something to that effect. You can fix this in two ways: cast the value to match the printf format, or change the printf format to match the value:
printf("int size = %d\n", (int)sizeof(int));
printf("string length = %lu\n", strlen("abc"));
In the first case, you're assuming that sizeof's result will fit in an int (which is probably a safe bet). In the second case, you're assuming that size_t is in fact unsigned long, which may be true on a 64-bit compiler but may not be true on some other. So it's actually safer to use an explicit cast in the second case, too:
printf("string length = %lu\n", (unsigned long)strlen("abc"));
The bottom line is that abstract types like size_t don't work so well with printf; this is where we can see that the C++ output style of cout << "string length = " << strlen("abc") << endl has its advantages.
To solve this problem, there are some special printf modifiers that are guaranteed to match size_t and a few other abstract types such as ptrdiff_t and intmax_t (though not off_t, which has none), although they're not so well known. (I wasn't sure where to look them up, but while I've been composing this answer, some commenters have already reminded me.) So the best way to print one of these things (if you can remember, and unless you're using old compilers) would be
printf("string length = %zu\n", strlen("abc"));
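For reference, here is a small sketch of the C99 length modifiers in use: z for size_t, t for ptrdiff_t, and j for intmax_t. The standard defines no modifier for off_t, so the usual workaround is to cast to intmax_t and print with %jd. The function name format_sizes and the long long stand-in for a file offset are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: format a size_t, a ptrdiff_t and a wide offset
 * into `out`. Returns whatever snprintf returns. */
int format_sizes(char *out, size_t outlen)
{
    const char *s = "abc";
    size_t len = strlen(s);               /* %zu */
    ptrdiff_t diff = (s + len) - s;       /* %td */
    long long offset = 1234567890123LL;   /* stand-in for an off_t value */

    return snprintf(out, outlen, "%zu %td %jd",
                    len, diff, (intmax_t)offset);
}
```

Called with a 64-byte buffer, this writes "3 3 1234567890123".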
Bottom line:
You generally don't have to worry about passing plain int or plain unsigned to a function like calloc that expects size_t.
When a call might involve a narrowing conversion, such as passing size_t to fgets where size_t is 64 bits but int is 32, or passing unsigned long long to calloc where size_t is only 32 bits, you might get warnings. If you can't make the passed-in types smaller (which in the general case you're not going to be able to do), you'll have little choice but to insert a cast to silence the warnings. In this case, to be strictly correct, you might want to add some assertions.
With all of that said, I'm not sure I've actually answered your question, so if you'd like further clarification, please ask.
Upvotes: 0