The story of how the unsigned char type in C turned out to be swampier than I imagined

Introduction

Every learner of C runs into this trouble at least once: **the difference between char, unsigned char, and signed char just isn't clear!**

It all started when I noticed that functions such as memcpy, memcmp, and memset seem to take their arguments as generic pointers (void *), convert them to unsigned char *, and then operate on them.

memset.c


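#include <stddef.h> // size_t

void *memset(void *dst, int val, size_t len)
{
    unsigned char *ptr = dst; // the unsigned char * type is being used here!
    while (len-- > 0)
        *ptr++ = (unsigned char)val; // val is converted to unsigned char before it is stored
    return dst;
}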

At that point I wondered, "Why unsigned char? Wouldn't plain char do?", so I did a lot of research.

There are three char types, and they are all different

First of all, `char`, `unsigned char`, and `signed char` are all different types. I think this is the first pitfall that beginners are likely to fall into.

The implementation must define char to have the same range of values, the same representation, and the same behavior as either signed char or unsigned char. **Regardless of the choice made, char is a distinct type from signed char and unsigned char and is not compatible with either of them.** (Quote: JPCERT CC)
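One way I found to convince myself that these really are three separate types is C11's `_Generic`, which refuses to compile if two of its associations name compatible types. The following is only a minimal sketch; `CHAR_KIND` and the `kind_of_*` functions are names I made up for illustration.

```c
#include <assert.h>

/* All three character types occupy exactly one byte... */
static_assert(sizeof(char) == 1 && sizeof(signed char) == 1 &&
              sizeof(unsigned char) == 1, "character types are one byte");

/* ...yet they are distinct types. This hypothetical macro compiles only
   because _Generic does not allow two associations with compatible types. */
#define CHAR_KIND(x) _Generic((x),   \
    char:          "char",           \
    signed char:   "signed char",    \
    unsigned char: "unsigned char")

/* Hypothetical helpers that report which association was selected. */
const char *kind_of_plain(void)    { char c = 'a';          return CHAR_KIND(c); }
const char *kind_of_signed(void)   { signed char c = 'a';   return CHAR_KIND(c); }
const char *kind_of_unsigned(void) { unsigned char c = 'a'; return CHAR_KIND(c); }
```

If char were merely an alias of one of the other two, the macro itself would be rejected at compile time, so the fact that it builds is the demonstration.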

In other words, the standard does not fix whether char is signed or unsigned; the choice is implementation-defined, that is, left to the implementation (the compiler). Whichever is chosen, the three types are not compatible with one another.

So the claim I sometimes see online that "the range of the char type is -128 to 127" is wrong in general, because it only holds where char is signed. Note that this differs from int, where plain int is always signed.
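Which choice a given compiler made can be checked through `<limits.h>`. The sketch below assumes a C11 compiler (for `static_assert`); `plain_char_is_signed` is a hypothetical helper name, not something from the standard library.

```c
#include <assert.h>
#include <limits.h>

/* Whichever signedness the implementation picked, the limits of plain char
   must coincide with those of signed char or of unsigned char. */
static_assert((CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX) ||
              (CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX),
              "char must mirror signed char or unsigned char");

/* Hypothetical helper: returns 1 if plain char is signed here, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

On many desktop compilers this returns 1, but some targets and compiler options make char unsigned, which is exactly why code should not depend on either answer.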

| Type name | Bytes | Other names | Range of values |
|---|---|---|---|
| int | 2 or 4 | signed | -2,147,483,648 ~ 2,147,483,647 (when 4 bytes) |
| unsigned int | 2 or 4 | unsigned | 0 ~ 4,294,967,295 (when 4 bytes) |
| char | 1 | - | 0 ~ 255 / -128 ~ 127 |
| signed char | 1 | - | -128 ~ 127 |
| unsigned char | 1 | - | 0 ~ 255 |

Reference: Data Type Ranges (Microsoft Docs)

Basic usage of 3 types of char

To state the conclusion first: as a rule, use char for plain character data, and use signed char or unsigned char when handling numerical values.

In particular, unsigned char is the usual choice when the target of an operation is reached through a generic pointer (void * and the like) and you want to access every bit of it, as in the mem- family of functions (described later).

1. When dealing with character sets

Given how the standard library's string-handling functions are specified (they take char *), it is preferable to use **char** when simply representing character data.
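Here is a small sketch of what that looks like in practice. The string itself stays a plain char array, but there is one trap I want to note: the `<ctype.h>` functions such as `toupper` expect a value representable as unsigned char (or EOF), so each character is cast before the call. `upcase_in_place` is a hypothetical helper name for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases a NUL-terminated string in place
   and returns the same pointer. */
char *upcase_in_place(char *s)
{
    assert(s != NULL);
    for (char *p = s; *p != '\0'; p++) {
        /* Where char is signed, *p may be negative; passing a negative
           value other than EOF to toupper is undefined behavior, hence
           the cast to unsigned char. */
        *p = (char)toupper((unsigned char)*p);
    }
    return s;
}
```

So even when char is the right type for the data, unsigned char still shows up at the boundary with the character-classification functions.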

2. When treating char as a numerical value

In this case, use signed char or unsigned char. Either way you have to take care to stay within the range of values; the range of signed char in particular is as narrow as -128 to 127, so it is not easy to use.

So when does signed char actually come into play? I wonder if there is a good example...
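To make the range concern concrete, here is a minimal sketch with two hypothetical helpers of my own naming: one shows that converting back to unsigned char wraps around, the other checks a value against the signed char range before storing it.

```c
#include <assert.h>
#include <limits.h>

/* The standard guarantees at least these ranges for the two types. */
static_assert(SCHAR_MAX >= 127 && UCHAR_MAX >= 255,
              "minimum ranges required by the C standard");

/* Hypothetical helper: adds two byte-sized values. The operands are
   promoted before the addition, and converting the sum back to
   unsigned char reduces it modulo UCHAR_MAX + 1. */
unsigned char add_wrapping(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Hypothetical helper: returns 1 if value can be stored in a signed char
   without an out-of-range conversion, 0 otherwise. */
int fits_in_signed_char(int value)
{
    return value >= SCHAR_MIN && value <= SCHAR_MAX;
}
```

The wraparound for unsigned char is well defined, whereas storing an out-of-range value into a signed char gives an implementation-defined result, so a range check like the second helper is the safer habit there.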

Features of unsigned char

At the beginning I raised the question of why the mem- functions convert an argument passed as a generic pointer (void *) to unsigned char * before operating on it. A closer look reveals that it has to do with how the standard specifies the representation of unsigned char.

C Standard [ISO/IEC 9899:2011] 6.2.6

Values stored in [...] objects of type unsigned char shall be represented using a pure binary notation. ... (Pure binary notation) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position.

Therefore, the representation of any object that is not a bit-field, including one passed as a generic pointer (void *), can be examined byte by byte.
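As a rough illustration of examining an object byte by byte, the sketch below walks over any object through an unsigned char pointer and writes two hex digits per byte. `bytes_to_hex` is a hypothetical helper, and masking each half to four bits assumes the common case of 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the object representation of obj as lowercase
   hex into out (always NUL-terminated) and returns how many bytes were
   formatted. Stops early if out is too small. */
size_t bytes_to_hex(const void *obj, size_t size, char *out, size_t out_size)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *p = obj;   /* every byte is readable this way */
    size_t i = 0;

    assert(obj != NULL && out != NULL && out_size > 0);
    while (i < size && 2 * i + 2 < out_size) {
        out[2 * i]     = digits[(p[i] >> 4) & 0x0f];  /* high half */
        out[2 * i + 1] = digits[p[i] & 0x0f];         /* low half  */
        i++;
    }
    out[2 * i] = '\0';
    return i;
}
```

Passing `&some_int, sizeof some_int` shows the byte order of the machine, which is a nice reminder that what you see here is the representation and not the value.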

...This is hard to follow for someone from a humanities background like me, but objects of type unsigned char have no padding bits. So for functions that handle the memory area itself, like the mem- family, unsigned char seems to be the most suitable type when you want to access every bit of the object being operated on.

That is why functions such as memcpy and memcmp convert the arguments they receive as void * to unsigned char * before manipulating them.

There still seems to be a long way to go before I fully understand unsigned char...
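The choice of type also changes results, not just what is allowed. The C standard says memcmp interprets each byte as unsigned char, so the byte 0x80 compares greater than 0x01; read through a signed 8-bit char it would be -128 and compare smaller. Below is a minimal sketch of a memcmp-like function under that rule. `my_memcmp` is my own illustrative name, not the code of any actual library.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a memcmp-like comparison: returns a negative value, zero, or a
   positive value depending on the first differing byte, with every byte
   compared as unsigned char. */
int my_memcmp(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    assert(len == 0 || (a != NULL && b != NULL));
    for (size_t i = 0; i < len; i++) {
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    }
    return 0;
}
```

With plain char pointers the ordering of bytes at or above 0x80 would depend on whether the compiler made char signed, which is the kind of surprise the mem- functions avoid by using unsigned char.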
