[PYTHON] The rough difference between Unicode and UTF-8 (and their friends)

[Figure: Unicode and UTF-8]

What is Unicode

A character set. Unicode manages characters by assigning each one an integer value called a code point.
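As a minimal sketch of the character-to-code-point mapping, the snippet below uses Python's built-in `ord` and `chr` together with `unicodedata.name`. The helper name `describe` and the sample characters are chosen only for illustration.

```python
import unicodedata

def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character (hypothetical helper)."""
    code_point = ord(ch)              # character -> code point (an int)
    assert chr(code_point) == ch      # code point -> character round-trips
    return f"U+{code_point:04X} {unicodedata.name(ch)}"

for ch in ["A", "あ", "鬱"]:
    print(ch, describe(ch))
```

Note that nothing here involves bytes yet: a code point is just a number, independent of how it is later stored.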

What are UTF-8, UTF-16, UTF-32?

[Character encodings](https://ja.wikipedia.org/wiki/%E6%96%87%E5%AD%97%E7%AC%A6%E5%8F%B7%E5%8C%96%E6%96%B9%E5%BC%8F). Each one converts a code point's integer value into a byte sequence that a computer can store or transmit.
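To make the "same code point, different bytes" idea concrete, here is a small sketch that encodes two sample characters with three encodings and prints the resulting byte counts. The helper name `encoded_hex` is hypothetical, and the big-endian variants are used only so that no byte order mark appears in the output.

```python
def encoded_hex(text, encoding):
    """Encode `text` with `encoding` and return the bytes as a hex string."""
    return text.encode(encoding).hex()

for ch in ["A", "鬱"]:
    for enc in ["utf-8", "utf-16-be", "utf-32-be"]:
        hex_bytes = encoded_hex(ch, enc)
        # Two hex digits correspond to one byte.
        print(f"{ch!r} {enc:<9} {len(hex_bytes) // 2} byte(s): {hex_bytes}")
```

UTF-8 uses a variable number of bytes per character (1 for `A`, 3 for `鬱`), while UTF-32 always uses 4.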

What is endian?

The order in which bytes are arranged when multi-byte data is stored in memory or sent and received over a network. It is also called **byte order**. Big endian places the most significant byte first, and little endian places the least significant byte first.

[Figure: Endianness]
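The two byte orders can be sketched with Python's `int.to_bytes` and `int.from_bytes`. The value `0x9B31` is just an example of a number that needs two bytes.

```python
value = 0x9B31  # example two-byte value

big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # least significant byte first

print(big.hex())     # 9b31
print(little.hex())  # 319b

# Reading the bytes back with the matching byte order restores the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value

# Reading with the wrong byte order silently yields a different number.
assert int.from_bytes(little, "big") == 0x319B
```

This is why the byte order has to be agreed on, or signalled in the data itself, before multi-byte encodings such as UTF-16 and UTF-32 can be decoded.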

Verification code

This is the Python code I used to verify the above while writing this article.

In [1]: import unicodedata

In [2]: import binascii

In [3]: unicodedata.name('鬱')  # Look up the character name
Out[3]: 'CJK UNIFIED IDEOGRAPH-9B31'

In [4]: ord('鬱')  # Get the code point
Out[4]: 39729

In [5]: binascii.hexlify('鬱'.encode('UTF-8'))  # Encode to bytes as UTF-8, then convert to hexadecimal
Out[5]: b'e9acb1'

In [6]: binascii.hexlify('鬱'.encode('UTF-16'))  # No byte order given: output starts with a BOM (fffe here)
Out[6]: b'fffe319b'

In [7]: binascii.hexlify('鬱'.encode('UTF-16LE'))
Out[7]: b'319b'

In [8]: binascii.hexlify('鬱'.encode('UTF-16BE'))
Out[8]: b'9b31'

In [9]: binascii.hexlify('鬱'.encode('UTF-32'))
Out[9]: b'fffe0000319b0000'

In [10]: binascii.hexlify('鬱'.encode('UTF-32LE'))
Out[10]: b'319b0000'

In [11]: binascii.hexlify('鬱'.encode('UTF-32BE'))
Out[11]: b'00009b31'

In [12]: binascii.hexlify('快感'.encode('UTF-16'))
Out[12]: b'fffeeb5f1f61'

In [13]: binascii.hexlify('快感'.encode('UTF-16LE'))
Out[13]: b'eb5f1f61'

In [14]: binascii.hexlify('快感'.encode('UTF-16BE'))
Out[14]: b'5feb611f'
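The leading `fffe` in the plain `UTF-16` results above is a byte order mark (BOM). The following sketch, which assumes CPython's standard `codecs` constants, checks that encoding with `'utf-16'` is the BOM for the machine's native byte order followed by the bytes from the explicit-endian codec. The variable names are illustrative only.

```python
import codecs
import sys

text = "鬱"
encoded = text.encode("utf-16")

# Pick the BOM and explicit codec that match this machine's byte order.
if sys.byteorder == "little":
    bom, explicit = codecs.BOM_UTF16_LE, "utf-16-le"
else:
    bom, explicit = codecs.BOM_UTF16_BE, "utf-16-be"

# 'utf-16' output is the native-order BOM plus the explicit-endian bytes.
assert encoded == bom + text.encode(explicit)

# When decoding, the BOM tells the 'utf-16' codec which byte order to use,
# so data from either kind of machine decodes to the same text.
assert encoded.decode("utf-16") == text
assert (codecs.BOM_UTF16_BE + text.encode("utf-16-be")).decode("utf-16") == text
```

The `UTF-16LE` and `UTF-16BE` codecs state the byte order in their name, which is why their output carries no BOM.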
