Reputation: 268

Why does itoa expect a signed character instead of an unsigned?

Learning embedded C while working in MPLAB X with a PIC24FJ128GB204.

So far, I've mostly heard that you should use unsigned types as much as possible (especially?) on embedded devices, so I've started to use uint8_t arrays to hold strings. However, if I call itoa from stdlib.h, it expects a pointer to a signed char (int8_t) array:

extern char * itoa(char * buf, int val, int base);

This is made specifically clear when I try to compile after using itoa on an unsigned array:

main.c:317:9: warning: pointer targets in passing argument 1 of 'itoa' differ in signedness
c:\program files (x86)\microchip\xc16\v1.36\bin\bin\../..\include/stdlib.h:131:15: note: expected 'char *' but argument is of type 'unsigned char *'

Searching for implementations of itoa on other platforms, that seems to be the common case.

Why is that?

(I've also noticed that most implementations expect value/pointer/radix whereas, for some reason, the stdlib.h from Microchip expects the pointer first. It took me a while to realize this.)

Upvotes: 1

Answers (4)

chux

Reputation: 154075

Leaving plain char as either signed or unsigned is a compromise made decades ago. It made sense then to bring a level of consistency to the compilers of the day.

itoa(), although not a standard C library function, follows the convention, in that the string is made up of char followed by a null character.

Many library functions use a string pointer. itoa() does too and handles the internal workings as unsigned char. Keep in mind that a string is meant to represent text, not numbers, so the signedness of the char in itself is not a great concern. Of course, the point of itoa() is to take a number (int) and form a string.

The C library treats char functionally "as if" it were unsigned char in many cases.

int fgetc() returns a value of EOF or in the unsigned char range.
printf() "%c": "the int argument is converted to an unsigned char, and the resulting character is written."
<string.h> "For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value)."
<ctype.h> "In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF."
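As a small illustration of the <ctype.h> rule quoted above, here is a minimal sketch. The helper name upcase is hypothetical and not part of any library. It converts each char to unsigned char before calling toupper(), so a negative value is never passed in on a platform where plain char is signed.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases a null-terminated string in place. */
void upcase(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Convert to unsigned char first: toupper() requires a value
           representable as unsigned char (or EOF), and a plain char
           holding an extended character may be negative. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Calling upcase() on a writable array such as char s[] = "abc1"; leaves "ABC1" in it in the default "C" locale.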

Upvotes: 6

Lundin

Reputation: 214495

So far, I've mostly heard that you should use unsigned types as much as possible (especially?) on embedded devices

This is true mainly for the reason that (accidentally or intentionally) signed operands mixed with the bitwise operators create havoc. But also there aren't many cases in low level programming where you actually need to use signed types.

For example, MISRA-C requires you to always use unsigned variables, operands and integer constants unless the intention is to actually use a signed type. So this isn't just opinion: MISRA-C is the de facto industry standard for most professional embedded systems.

so I've started to use uint8_t arrays to hold strings

That's ok but it isn't wrong to use char for that purpose either. The only time when it is ok to use char is when you intend to store text. Note that char is especially nasty, because unlike all other types in the language, it has unknown signedness. Each compiler can make char either signed or unsigned and still conform with the C standard. So code relying on char being either signed or unsigned is broken. However, for text strings this doesn't matter since they are always positive.
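To see which choice a given compiler made, a sketch like the following can help. Both function names are hypothetical. The first reports the signedness of plain char through CHAR_MIN from <limits.h>. The second shows one way to avoid depending on it when a character is used as a number, for example as a table index.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widens a character to int through unsigned char, so the result is in
   the range 0..UCHAR_MAX whether plain char is signed or not. */
int char_to_index(char c)
{
    return (unsigned char)c;
}
```

Code that goes through unsigned char like this behaves the same on both kinds of compiler, whereas returning c directly would yield a negative value for extended characters wherever char is signed.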

However, if I call itoa from stdlib.h, it expects a pointer to a signed char (int8_t) array:

Your compiler apparently treats char as signed then. First of all please note that itoa isn't standard C and isn't allowed to exist inside stdlib.h when strict C standard conformance is desired. But more importantly, different compilers might implement the function differently since it isn't standardized.

As it turns out, you can safely cast wildly between the various character types: char, unsigned char, signed char, int8_t and uint8_t (the stdint.h 8 bit types are pretty much dead certain to be character types even though the standard doesn't say so explicitly). The character types specifically have various special rules associated with them, meaning that you can always cast something to a character type.

You can safely cast your uint8_t array to a char*, as long as there are no qualifiers (const etc) present.
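A minimal sketch of such a cast is below. Since itoa is non-standard and its argument order varies between toolchains, the standard snprintf() stands in for it here. The helper name format_decimal is an assumption made for illustration only.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes val in decimal into a uint8_t buffer.
   Returns what snprintf() returns: the number of characters the full
   result needs, not counting the terminating null character. */
int format_decimal(uint8_t *buf, size_t size, int val)
{
    /* The cast silences the "differ in signedness" warning; it is the
       same cast you would apply when passing buf to itoa(). */
    return snprintf((char *)buf, size, "%d", val);
}
```

With a buffer declared as uint8_t buf[12];, the call format_decimal(buf, sizeof buf, -42) stores the string "-42" and returns 3.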

Upvotes: 2

John Bode

Reputation: 123558

So far, I've mostly heard that you should use unsigned types as much as possible (especially?) on embedded devices,

Have the people you heard this from explained why? Is that explanation grounded in solid analysis and engineering, or is it pulled out of thin air?

The problem with rules of thumb is that they often get applied unthinkingly in the wrong situation. Use unsigned types when you need to use unsigned types, use signed types when you need to use signed types.

I've started to use uint8_t arrays to hold strings.

Don't. That's not what it's there for.

Plain char may be signed or unsigned, depending on the environment. The character encodings for the basic character set (upper- and lower-case Latin alphabet, decimal digits, and the basic set of graphical characters) are always going to be non-negative, but extended characters may have positive or negative encodings.

6.2.5 Types
...
3 An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

C 2011 Online Draft

The C library functions that handle strings expect pointers to char, not unsigned char or uint8_t or anything else. While it's highly likely that for any platform that offers it uint8_t is simply a typedef name for unsigned char, that's not a guarantee. char must be at least 8 bits wide, but there are platforms where it could be wider (one of the old PDPs used 9-bit bytes and 36-bit words, and depending on the application I can see some special-purpose embedded systems using wonky sizes).
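If you want the compiler to confirm those assumptions rather than trusting them, something like the following sketch can be used. It assumes a C11 compiler, since _Static_assert and _Generic are not available in older toolchains. The function name is hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Compilation fails on a platform whose bytes are wider than 8 bits. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* Hypothetical helper: returns 1 if uint8_t is the same type as
   unsigned char on this implementation, 0 otherwise. */
int uint8_is_unsigned_char(void)
{
    return _Generic((uint8_t)0, unsigned char: 1, default: 0);
}
```

On mainstream desktop and embedded compilers this is expected to return 1, but as noted above the standard does not promise it.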

Upvotes: 4

0___________

Reputation: 67820

So far, I've mostly heard that you should use unsigned types as much as possible

Firstly, that is not true at all. You should use the correct type. What is the correct type? It is the type that best suits your needs. How can you know which type is best? That depends on what you use it for. It should be a type that can store all possible values your program might want to store in it.

So you should not listen to this person anymore.

Upvotes: 1
