Every C learner runs into the same trouble at least once: **the difference between char, unsigned char, and signed char is not clear!**
It started when I noticed that functions such as memcpy, memcmp, and memset receive their arguments as generic pointers (void *) and then convert them to unsigned char * before operating on them.
memset.c
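#include <stddef.h>  /* size_t */

void *memset(void *dst, int val, size_t len)
{
    unsigned char *ptr = dst; // working through the unsigned char * type here!
    while (len-- > 0)
        *ptr++ = (unsigned char)val; // val is stored as an unsigned char
    return dst;
}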
At that point I wondered, "Why unsigned char? Why not plain char?", so I did a lot of research.
First of all, `char`, `unsigned char`, and `signed char` are all distinct types.
I think this is the first pitfall that beginners are likely to fall into.
The implementation must define char as having the same range of values, the same representation, and the same behavior as either signed char or unsigned char. Regardless of which is chosen, **char is a distinct type from signed char and unsigned char and is not compatible with them.** (Quote: JPCERT CC)
In other words, the standard does not fix whether char is signed or unsigned; this is implementation-defined, that is, left to the implementation (the compiler). Whichever is chosen, the three types are not compatible with each other.
So the claim I sometimes see online that "the range of the char type is -128 to 127" is wrong as a general statement.
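One way to see that the three types really are distinct is C11's `_Generic`, which requires every listed type to be different from the others. The following is only a minimal sketch, assuming a C11 compiler; the macro name `CHAR_KIND` is a hypothetical name chosen for illustration.

```c
/* Hypothetical helper macro: names the type of an expression.
   Listing char, signed char and unsigned char as separate associations
   compiles only because they are three distinct types. */
#define CHAR_KIND(x) _Generic((x),      \
    char:          "char",              \
    signed char:   "signed char",       \
    unsigned char: "unsigned char",     \
    default:       "other")

/* For the same reason, a pointer assignment such as
       unsigned char uc = 0;
       char *p = &uc;
   mixes incompatible pointer types and draws a diagnostic from the compiler. */
```

With `char c = 'a';`, the expression `CHAR_KIND(c)` selects "char" whether or not plain char is signed on the implementation. Note that `CHAR_KIND('a')` selects "other", because a character constant has type int in C.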
Note that this differs from int: int and signed int are the same type, but char and signed char are not.
Data type name | Bytes | Other names | Range of values |
---|---|---|---|
int | 2 or 4 | signed | -2,147,483,648 to 2,147,483,647 |
unsigned int | 2 or 4 | unsigned | 0 to 4,294,967,295 |
char | 1 | - | 0 to 255 / -128 to 127 |
signed char | 1 | - | -128 to 127 |
unsigned char | 1 | - | 0 to 255 |
Reference: Data Type Ranges (Microsoft Docs)
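The values in the table are typical ones, and the actual limits on a given implementation can be read from `<limits.h>`. The following is a minimal sketch; the function names `plain_char_is_signed` and `print_ranges` are hypothetical, while the macros themselves are standard.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if plain char is signed on this implementation, 0 if unsigned. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Prints the ranges that actually apply on the current implementation. */
static void print_ranges(void)
{
    printf("bits per byte : %d\n", CHAR_BIT);
    printf("char          : %d to %d\n", (int)CHAR_MIN, (int)CHAR_MAX);
    printf("signed char   : %d to %d\n", (int)SCHAR_MIN, (int)SCHAR_MAX);
    printf("unsigned char : 0 to %u\n", (unsigned int)UCHAR_MAX);
    printf("int           : %d to %d\n", INT_MIN, INT_MAX);
    printf("unsigned int  : 0 to %u\n", UINT_MAX);
}
```

Calling `print_ranges()` from `main` shows which of the two ranges in the char row applies to your compiler.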
To state the conclusion first: the basic rule is to use char when dealing with plain character data as a character set, and signed char or unsigned char when dealing with numerical values.
In particular, unsigned char is the usual choice when the target is accessed through a generic pointer type (void *, etc.) and you want to reach all of its bits, as the mem-family functions do (described later).
Considering the specifications of the standard library functions for string processing, which take char *, it is preferable to use **char** when simply representing character data.
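As a small illustration of why plain char fits the string functions, here is a minimal sketch; `upcase` is a hypothetical helper, not a standard function. One caveat is worth noting: the `<ctype.h>` functions expect a value representable as unsigned char (or EOF), so the character is converted before the call in case plain char is signed.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: upper-cases a NUL-terminated string in place. */
static char *upcase(char *s)
{
    size_t n = strlen(s);   /* strlen takes const char *, so char * fits as-is */
    for (size_t i = 0; i < n; i++) {
        /* Convert to unsigned char first: passing a negative char value
           to toupper would be undefined behavior. */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
    return s;
}
```

With `char buf[] = "char";`, calling `upcase(buf)` leaves "CHAR" in the buffer, and no pointer casts are needed because the string is stored as plain char.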
When dealing with numerical values, use signed char or unsigned char. In either case you have to be careful to keep the values within range; the range of signed char in particular is as narrow as -128 to 127, so it is not easy to use.
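What "staying within range" means in practice differs between the two types. The following is a minimal sketch with hypothetical helper names (`add_u8`, `fits_in_schar`): conversion to unsigned char wraps in a well-defined way, while converting an out-of-range value to signed char is implementation-defined, so it is safer to check first.

```c
#include <limits.h>

/* Hypothetical helper: adds two small values stored in unsigned char.
   The sum is computed after integer promotion; converting it back to
   unsigned char wraps modulo UCHAR_MAX + 1 (256 when CHAR_BIT is 8). */
static unsigned char add_u8(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Hypothetical helper: returns 1 if v can be stored in a signed char
   without changing its value, 0 otherwise. */
static int fits_in_schar(int v)
{
    return v >= SCHAR_MIN && v <= SCHAR_MAX;
}
```

On an implementation with 8-bit bytes, `add_u8(250, 10)` yields 4, and `fits_in_schar(200)` returns 0.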
I wonder if there is a good example of "this is when signed char comes in!" ...
At the beginning, I raised the question of why the mem-family functions convert the argument passed as a generic pointer (void *) to unsigned char * before operating on it. A closer look reveals that it has to do with how the standard specifies the representation of unsigned char.
C Standard [ISO/IEC 9899:2011] 6.2.6.
Values stored in [...] objects of type unsigned char shall be represented using a pure binary notation. ... (Pure binary notation) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position.
Therefore, the representation of any object that is not a bit-field, passed in as a generic pointer (void *), can be examined byte by byte through unsigned char.
... This is hard for a humanities person like me to follow, but objects of type unsigned char have no padding bits. So for functions that handle the memory area itself, like the mem-family functions, unsigned char seems to be the most suitable representation when you want to access every bit of the target, including any bit-fields.
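The byte-by-byte access described above can be sketched as follows. This is only an illustration under the assumption that looking at raw bytes is what you want; `dump_bytes` and `count_set_bits` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints the object representation of any object,
   one byte at a time, in hexadecimal. */
static void dump_bytes(const void *obj, size_t size)
{
    const unsigned char *p = obj;   /* void * converts implicitly in C */
    for (size_t i = 0; i < size; i++)
        printf("%02x ", (unsigned int)p[i]);
    printf("\n");
}

/* Hypothetical helper: counts the 1 bits in an object's representation.
   Every bit of every byte is visible through unsigned char. */
static size_t count_set_bits(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    size_t count = 0;
    for (size_t i = 0; i < size; i++)
        for (unsigned char byte = p[i]; byte != 0; byte >>= 1)
            count += byte & 1u;
    return count;
}
```

For example, `float f = 1.0f; dump_bytes(&f, sizeof f);` prints the bytes that make up `f`; the exact output depends on the byte order and floating-point format of the platform.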
That is why functions such as memcpy and memcmp convert the arguments passed as void * to unsigned char * before manipulating them.
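The choice also affects results, not just access. The standard specifies that memcmp compares the bytes interpreted as unsigned char, so a byte such as 0x80 must compare greater than 0x01. The following is a minimal sketch of a memcmp-like function under the hypothetical name `my_memcmp`; it is not how any particular C library implements it.

```c
#include <stddef.h>

/* A memcmp-like sketch (hypothetical name). Returns a negative value,
   zero, or a positive value, like memcmp. */
static int my_memcmp(const void *s1, const void *s2, size_t len)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;   /* compared as unsigned values */
    }
    return 0;
}
```

If the pointers were signed char * instead, the byte 0x80 would typically be read as -128 and sort before 0x01, which is the opposite of the required order.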
It seems there is still a long way to go before I fully understand unsigned char...