Trollhorn

Reputation: 229

Any guaranteed minimum sizes for types in C?

Can you generally make any assumptions about the minimum size of a data type?

What I have read so far:

float: ???
double: ???

Are the values in float.h and limits.h system dependent?

Upvotes: 18

Views: 13372

Answers (9)

chux

Reputation: 153457

Nine years and still no direct answer about the minimum sizes for float, double, and long double.


Any guaranteed minimum sizes for types in C?

For floating point type ...

From a practical point of view, float's minimum size is 32 bits and double's is 64 bits. C allows double and long double to share similar characteristics, so a long double could be as small as a double (see example 1 below), or 80-bit, or 128-bit, or ...

I could imagine that a C-compliant 48-bit double may have existed, yet I do not know of any.


Now, let us imagine our rich uncle has died and left us a fortune to pay for the development and cultural promotion of www.smallest_C_float.com.

C specifies:

  1. float's finite range is at least [1E-37 ... 1E+37]. See FLT_MIN, FLT_MAX.

  2. (1.0f + FLT_EPSILON) - 1.0f <= 1E-5 (in other words, FLT_EPSILON <= 1E-5).

  3. float supports positive and negative values.

  • Let X: digit 1-9
  • Let Y: digit 0-9
  • Let E: exponent value, -37 to 36
  • Let S: sign, + or -
  • Let b: bit, 0 or 1

Our float could minimally represent all the combinations, using base 10, of SX.YYYYY*10^E.

0.0 and ±1E+37 are also needed (3 more). We do not need -0.0, sub-normals, ±infinity nor not-a-numbers.

That is 2*9*10^5*74 + 3 combinations, or 133,200,003, which needs at least 27 bits to encode - somehow. Recall the goal is minimal size.
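A quick sketch in C can double-check that count and the bit requirement (the numbers are just the ones derived above, nothing more):

#include <math.h>
#include <stdio.h>

int main(void) {
    /* 2 signs * 9 lead digits * 10^5 trailing-digit combinations
       * 74 exponent values, plus 0.0 and +/-1E+37. */
    double combos = 2.0 * 9 * 100000 * 74 + 3;
    printf("combinations: %.0f\n", combos);             /* 133200003 */
    printf("bits needed:  %.0f\n", ceil(log2(combos))); /* 27 */
    return 0;
}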

With a classic base 2 approach, we can assume an implied 1 and get S1.bbbb_bbbb_bbbb_bbbb_b*2^e, or 2*2^17*2^8 = 2^26 combinations, or 26 bits.

If we try base 16, we then need about 2*15*16^(4 or 5)*57 combinations, or at least 26 to 30 bits.

Conclusion: A C float needs at least 26 bits of encoding.


A C double need not express a greater exponential range than float; it only has a tighter minimal precision requirement: 1E-9.

S1.bbbb_bbbb_bbbb_bbbb_bbbb_bbbb_bbbb_bb*2^e --> 2*2^30*2^8 = 2^39 combinations, or 39 bits.
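The choices of 17, 30, and 8 bits can be verified numerically with a small sketch:

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Precision: 2^-17 and 2^-30 meet the 1E-5 and 1E-9 requirements. */
    printf("2^-17 = %e (<= 1E-5, float)\n",  pow(2, -17));
    printf("2^-30 = %e (<= 1E-9, double)\n", pow(2, -30));
    /* Exponent: covering 1E-37..1E+37 in base 2 takes about
       74/log10(2) ~ 246 exponent values, hence 8 bits. */
    printf("exponent values: %.0f\n", ceil(74 / log10(2.0)));
    return 0;
}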


On our imagine-if-you-will computer, we could have a 13-bit char and so encode float, double, and long double without padding. Thus we could realize a non-padded 26-bit float (two chars) and a 39-bit double and long double (three chars).


Example 1: Microsoft Visual C++ for x86, which makes long double a synonym for double.


[Edit] 2020

Additional double requirements may require 41 bits. We may have to use a 42-bit double and a 28-bit float. I will need to review. Uncle will not be happy.

Upvotes: 10

C99 N1256 standard draft

http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

C99 specifies two types of integer guarantees:

  • minimum size guarantees
  • relative sizes between the types

Relative guarantees

6.2.5 Types:

8 For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.

and 6.3.1.1 Boolean, characters, and integers determines the relative conversion ranks:

1 Every integer type has an integer conversion rank defined as follows:

  • The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
  • The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
  • For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.
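These relative guarantees can be checked mechanically. A translation unit like the following sketch, using only <limits.h> (whose macros are required to be usable in #if directives), must compile cleanly on any conforming implementation:

#include <limits.h>

/* The subrange rule orders the guaranteed ranges by conversion rank,
   so a conforming <limits.h> can never trip these. */
#if SHRT_MAX > INT_MAX
#error "short must be a subrange of int"
#endif
#if INT_MAX > LONG_MAX
#error "int must be a subrange of long"
#endif
#if LONG_MAX > LLONG_MAX
#error "long must be a subrange of long long"
#endif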

Absolute minimum sizes

Mentioned at https://stackoverflow.com/a/1738587/895245; here is the quote for convenience.

5.2.4.2.1 Sizes of integer types <limits.h>:

1 [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown [...]

  • UCHAR_MAX 255 // 2^8 − 1
  • USHRT_MAX 65535 // 2^16 − 1
  • UINT_MAX 65535 // 2^16 − 1
  • ULONG_MAX 4294967295 // 2^32 − 1
  • ULLONG_MAX 18446744073709551615 // 2^64 − 1
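The actual values on a given implementation may of course exceed these minimums; a short sketch to print what your system really provides:

#include <limits.h>
#include <stdio.h>

int main(void) {
    printf("UCHAR_MAX  = %u\n",   (unsigned)UCHAR_MAX);
    printf("USHRT_MAX  = %u\n",   (unsigned)USHRT_MAX);
    printf("UINT_MAX   = %u\n",   UINT_MAX);
    printf("ULONG_MAX  = %lu\n",  ULONG_MAX);
    printf("ULLONG_MAX = %llu\n", ULLONG_MAX);
    return 0;
}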

Floating point

If the __STDC_IEC_559__ macro is defined, then IEEE types are guaranteed for each C type, although long double has a few possibilities: Is it safe to assume floating point is represented using IEEE754 floats in C?
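A portability check can then be as simple as this sketch:

#include <stdio.h>

int main(void) {
#ifdef __STDC_IEC_559__
    puts("float/double follow IEC 60559 (IEEE 754) binary32/binary64");
#else
    puts("no IEEE 754 guarantee; inspect FLT_MANT_DIG etc. in <float.h>");
#endif
    return 0;
}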

Upvotes: 2

Suppressingfire
Suppressingfire

Reputation: 3286

This is covered in the Wikipedia article:

A short int must not be larger than an int.
An int must not be larger than a long int.

A short int must be at least 16 bits long.
An int must be at least 16 bits long.
A long int must be at least 32 bits long.
A long long int must be at least 64 bits long.

The standard does not require that any of these sizes be necessarily different. It is perfectly valid, for example, if all four types are 64 bits long.
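If a C11 compiler is available, these rules can be written down directly as compile-time checks (a sketch; on older compilers the enum trick shown in another answer below works instead):

#include <limits.h>

/* Relative rules: each type is no larger than the next. */
_Static_assert(sizeof(short) <= sizeof(int),  "short <= int");
_Static_assert(sizeof(int)   <= sizeof(long), "int <= long");

/* Minimum widths, expressed via the guaranteed ranges. */
_Static_assert(SHRT_MAX >= 32767,      "short: at least 16 bits");
_Static_assert(INT_MAX  >= 32767,      "int: at least 16 bits");
_Static_assert(LONG_MAX >= 2147483647, "long: at least 32 bits");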

Upvotes: 38

Jed Smith
Jed Smith

Reputation: 15934

Yes, the values in float.h and limits.h are system dependent. You should never make assumptions about the width of a type, but the standard does lay down some minimums. See §6.2.5 and §5.2.4.2.1 in the C99 standard.

For example, the standard only says that a char should be large enough to hold every character in the execution character set. It doesn't say how wide it is.

For the floating-point case, the standard hints at the order in which the widths of the types are given:

§6.2.5.10

There are three real floating types, designated as float, double, and long double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.

This implicitly defines which type is wider than which, but not specifically how wide they are. "Subset" itself is loose, because a long double can have the exact same range as a double and still satisfy this clause.

This is pretty typical of how C goes, and a lot is left to each individual environment. You can't assume, you have to ask the compiler.
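For instance, a small sketch that asks the compiler what it actually provides:

#include <float.h>
#include <stdio.h>

int main(void) {
    printf("float:       %zu chars, %d decimal digits, max %g\n",
           sizeof(float), FLT_DIG, (double)FLT_MAX);
    printf("double:      %zu chars, %d decimal digits, max %g\n",
           sizeof(double), DBL_DIG, DBL_MAX);
    printf("long double: %zu chars, %d decimal digits\n",
           sizeof(long double), LDBL_DIG);
    return 0;
}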

Upvotes: 13

J S
J S

Reputation: 89

However, C99 additionally specifies (in stdint.h) minimum-width types such as uint_least8_t, int_least32_t, and so on
(see the Wikipedia article on stdint.h).
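For example (a minimal sketch; these types are guaranteed to be at least that wide, but may be wider):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint_least8_t flags   = 0x7F;    /* at least 8 bits  */
    int_least32_t counter = 100000;  /* at least 32 bits */
    printf("flags=%u counter=%ld\n", (unsigned)flags, (long)counter);
    return 0;
}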

Upvotes: 6

S.C. Madsen
S.C. Madsen

Reputation: 5246

If you want to check that the size (in multiples of chars) of any type on your system/platform really is the size you expect, you could do:

enum CHECK_FLOAT_IS_4_CHARS
{
   /* Dividing by zero is invalid in a constant expression, so this
      fails to compile whenever sizeof(float) != 4. */
   IF_THIS_FAILS_FLOAT_IS_NOT_4_CHARS = 1/(sizeof(float) == 4)
};
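The same check reads more directly with C11's static_assert, if that is available:

#include <assert.h>   /* defines the static_assert macro in C11 */

static_assert(sizeof(float) == 4, "float is not 4 chars wide");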

Upvotes: 4

wallyk
wallyk

Reputation: 57774

Often developers asking this kind of question are dealing with arranging a packed struct to match a defined memory layout (as for a message protocol). The assumption is that the language should directly specify laying out 16-, 24-, 32-bit, etc. fields for the purpose.

That is routine and acceptable for assembly languages and other application-specific languages closely tied to a particular CPU architecture, but is sometimes a problem in a general purpose language which might be targeted at who-knows-what kind of architecture.

In fact, the C language was not intended for a particular hardware implementation. It was specified generally so a C compiler implementer could properly adapt to the realities of a particular CPU. A Frankenstein hardware architecture consisting of 9-bit bytes, 54-bit words, and 72-bit memory addresses is easily and unambiguously mapped to C features. (char is 9 bits; short int, int, and long int are 54 bits.)

This generality is why the C specification says something to the effect of "don't expect much about the sizes of ints beyond sizeof (char) <= sizeof (short int) <= sizeof (int) <= sizeof (long int)." That implies that chars could be the same size as longs!
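Given that, code aiming at a fixed wire layout is better served by the exact-width types of <stdint.h> (where the implementation provides them) than by assumptions about int or short. A hypothetical sketch of such a message header:

#include <stdint.h>

/* Hypothetical wire format: exact-width fields instead of int/short.
   Padding and byte order still need separate, explicit handling. */
struct message_header {
    uint8_t  version;
    uint8_t  flags;
    uint16_t payload_len;
    uint32_t sequence;
};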

The current reality is, and the future seems to hold, that software demands architectures provide 8-bit bytes and memory words addressable as individual bytes. This wasn't always so. Not too long ago, I worked on the CDC Cyber architecture, which features 6-bit "bytes" and 60-bit words. A C implementation on that would be interesting. In fact, that architecture is responsible for the weird packing semantics of Pascal, if anyone remembers that.

Upvotes: 3

DigitalRoss
DigitalRoss

Reputation: 146053

Quoting the standard does give what is defined to be "the correct answer" but it doesn't actually reflect the way programs are generally written.

People make assumptions all the time that char is 8 bits, short is 16, int is 32, long is either 32 or 64, and long long is 64.

Those assumptions are not a great idea but you will not get fired for making them.

In theory, <stdint.h> can be used to specify fixed-bit-width types, but you have to scrounge one up for Microsoft compilers. (See here for an MS stdint.h.) One of the problems is that C++ technically only needs C89 compatibility to be a conforming implementation; even for plain C, C99 was not fully supported in 2009.

It's also not accurate to say there is no width specification for char. There is, the standard just avoids saying whether it is signed or not. Here is what C99 actually says:

  • number of bits for smallest object that is not a bit-field (byte)
    CHAR_BIT 8
  • minimum value for an object of type signed char
    SCHAR_MIN -127 // -(2^7 - 1)
  • maximum value for an object of type signed char
    SCHAR_MAX +127 // 2^7 - 1
  • maximum value for an object of type unsigned char
    UCHAR_MAX 255 // 2^8 - 1

Upvotes: 1

Pierre
Pierre

Reputation: 35246

Most of the libraries define something like this:

#ifdef MY_ARCHITECTURE_1
/* These typedefs encode the sizes that char/short/int/long happen
   to have on this particular architecture. */
typedef unsigned char u_int8_t;
typedef short int16_t;
typedef unsigned short u_int16_t;
typedef int int32_t;
typedef unsigned int u_int32_t;
typedef unsigned char u_char;
typedef unsigned int u_int;
typedef unsigned long u_long;
typedef unsigned short u_short;
#endif

You can then use those typedefs in your programs instead of the standard types.

Upvotes: 0
