[PYTHON] Basics of Quantum Information Theory: Quantum Error Correction (Classical Linear Code)

\def\bra#1{\mathinner{\left\langle{#1}\right|}} \def\ket#1{\mathinner{\left|{#1}\right\rangle}} \def\braket#1#2{\mathinner{\left\langle{#1}\middle|#2\right\rangle}}

Introduction

In the previous article, I looked at the general framework on which quantum error correction is built. Now that we have a view of Shor's code, the natural question is what other quantum error correction schemes exist. Before getting to those, though, it seems necessary to understand "classical linear codes": knowledge of classical linear codes is the foundation on which many quantum error correction methods are developed. So this time we will study classical linear codes. After understanding the whole picture, I would like to confirm how a classical linear code behaves with a Python program.

Let me first outline what the following sections cover. The article is roughly divided into a "theory" part and a "simulation" part. In the theory part, after explaining the basic concepts of classical linear codes, namely the "generator matrix", the "parity check matrix", and the "code distance", we look at two concrete coding methods, the "Hamming code" and the "horizontal-vertical parity check code". In addition, we touch on two slightly more advanced topics, the "Gilbert-Varshamov bound" and "dual codes". The simulation part simulates the behavior of the Hamming code: looking at a Python implementation and its output, we confirm that errors are indeed corrected.

The following documents were used as references.

  1. Nielsen, Chuang "Quantum Computer and Quantum Communication (3)" Ohmsha (2005)
  2. Wikipedia: Linear code
  3. [Wikipedia: Hamming code](https://ja.m.wikipedia.org/wiki/%E3%83%8F%E3%83%9F%E3%83%B3%E3%82%B0%E7%AC%A6%E5%8F%B7)
  4. Quantum Native Dojo: Chapter 9 Quantum Error Correction
  5. Proof that the minimum distance of the Hamming code is generally 3
  6. Gilbert-Varshamov bound

Explanation of theory

Generator matrix

Let's start with a simple example. A very primitive classical linear code is one that simply repeats the original information (hereinafter the "repetition code"). For example, each bit of a binary bit sequence

\begin{align}
0 \rightarrow 000 \\
1 \rightarrow 111 \tag{1}
\end{align}

is transmitted with threefold redundancy, as above. Since the information that was originally 0 becomes 000, even if one of those bits is flipped to give, say, 010, we can guess that one bit of 000 was inverted and the original information was 0 (that is, we can correct the error). If 2 bits are flipped the correction will be wrong, but if that probability is low, it is still better than doing nothing. In general, an error correcting code works by making the original bit information redundant, expanding it into a longer sequence. This redundant expansion is called "encoding", and there are various encoding methods.
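As a minimal illustration (not from the original article; just a sketch of the majority-vote idea described above), the 3-bit repetition code can be simulated in a few lines of Python:

import numpy as np

def encode(bit):                 # 0 -> [0,0,0], 1 -> [1,1,1]
    return np.array([bit] * 3)

def decode(word):                # majority vote over the 3 received bits
    return int(word.sum() >= 2)

word = encode(1)                 # [1, 1, 1]
word[1] ^= 1                     # a single bit flip: [1, 0, 1]
print(decode(word))              # 1 -- the original bit is recovered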

To put it a little more abstractly, a bit sequence consisting of $k$ values in $\{0,1\}$ is an element of the set $\{0,1\}^{k}$. Similarly, an $n$-bit sequence is an element of the set $\{0,1\}^{n}$. Encoding therefore corresponds to a mapping from a $k$-dimensional vector $x \in \{0,1\}^{k}$ to an $n$-dimensional vector $y \in \{0,1\}^{n}$. A code in which this mapping is restricted to a linear transformation is called a "linear code" [^1]. The space spanned by the codewords after this linear transformation is called the "code space".

[^ 1]: In order to make the contrast with quantum codes clear, the title of this article says "classical linear code", but since that is cumbersome to repeat in the text, I will simply call it a "linear code".

For example, the encoding that expands the earlier 1 bit into 3 bits

G =
\begin{pmatrix}
1\\
1\\
1
\end{pmatrix}  \tag{2}

can be regarded as the linear transformation defined by the matrix above. Applying the transformation matrix $G$ to $x = (0)$ and $x = (1)$ gives

G
\begin{pmatrix}
0
\end{pmatrix} =
\begin{pmatrix}
1\\
1\\
1
\end{pmatrix}
\begin{pmatrix}
0
\end{pmatrix}
=
\begin{pmatrix}
0\\
0\\
0
\end{pmatrix}, \space
G
\begin{pmatrix}
1
\end{pmatrix}
 =
\begin{pmatrix}
1\\
1\\
1
\end{pmatrix}
\begin{pmatrix}
1
\end{pmatrix}
=
\begin{pmatrix}
1\\
1\\
1
\end{pmatrix}  \tag{3}

so you can see that it does the same thing as equation (1). Thus the repetition code is a linear code. This $G$ is called the "generator matrix", in the sense that it is the matrix that generates the code from the original information. Various linear operations will appear from here on, but all algebraic calculations are performed modulo 2; otherwise each bit would not stay within $\{0,1\}$ [^2].

[^ 2]: By the way, this article only deals with codes over binary values, but general linear code theory is built so that any base can be used; a code over q-ary values is specifically called a "q-ary linear code".

So, here is one example. What kind of encoding operation does the generator matrix $G$ below represent?

\begin{pmatrix}
1 & 0 \\
1 & 0 \\
1 & 0 \\
0 & 1 \\
0 & 1 \\
0 & 1
\end{pmatrix} \tag{4}

It's easy.

\begin{align}
(0,0)^T \rightarrow (0,0,0,0,0,0)^T  \\
(0,1)^T \rightarrow (0,0,0,1,1,1)^T  \\
(1,0)^T \rightarrow (1,1,1,0,0,0)^T  \\
(1,1)^T \rightarrow (1,1,1,1,1,1)^T  \tag{5}
\end{align}

This is the encoding. Earlier, 1 bit was expanded to 3 bits; this time 2 bits are expanded to 6 bits. In general, a code that encodes k bits into n bits is called an [n,k] code. The first example is the [3,1] code and this one is the [6,2] code. As you can easily see, the generator matrix of an [n,k] code is an $n \times k$ matrix [^3].

[^ 3]: Some literature defines the generator matrix $G$ as a $k \times n$ matrix and the information $x$ as a row vector, writing the encoding operation as $xG$. In the "theory" part of this article, following Nielsen and Chuang, the information $x$ is a column vector, the generator matrix is an $n \times k$ matrix, and the encoding operation is written $Gx$. As explained later, in the "simulation" part this is reversed for convenience. For now, think of the original information and the codeword as column vectors.
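As a minimal sketch (my own illustration, following the column-vector convention $Gx$ used in this theory part), the two repetition codes above can be checked with numpy; all arithmetic is mod 2:

import numpy as np

G31 = np.array([[1], [1], [1]])                      # eq. (2): [3,1] repetition code
G62 = np.array([[1, 0], [1, 0], [1, 0],
                [0, 1], [0, 1], [0, 1]])             # eq. (4): [6,2] repetition code

x = np.array([1, 0])                                 # original 2-bit information
y = (G62 @ x) % 2                                    # encode: y = Gx (mod 2)
print(y)                                             # [1 1 1 0 0 0], as in eq. (5)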

You might think that you could create any linear code you like by choosing the generator matrix arbitrarily, but that is not the case. For the matrix $G$ to be a generator matrix, the column vectors $\{g_1, g_2, \cdots, g_k\}$ that make it up must be linearly independent. If they were linearly dependent, different information $x$ and $x^{\prime}$ could be mapped to the same codeword $y$.

What do you mean? This is explained below.

The codeword $y$ is generated from the original information $x$ as

y = Gx = \sum_{i=1}^{k} g_{i} x_{i}  \tag{6}

by applying the generator matrix $G$. Now suppose that $x^{\prime}$, which is different from $x$, is encoded as

y^{\prime} = Gx^{\prime} = \sum_{i=1}^{k} g_{i} x_{i}^{\prime}  \tag{7}

If $\{g_i\}$ is linearly independent, then subtracting Eq. (6) from Eq. (7) gives

y^{\prime} - y = \sum_{i=1}^{k} g_{i} (x_{i}^{\prime} - x_{i})  \tag{8}

and the left side is $0$ only when $x_{i}^{\prime} - x_{i} = 0$ for all $i$ (by the definition of linear independence). That is, $y$ and $y^{\prime}$ are equal only when $x$ and $x^{\prime}$ are equal. Conversely, if the vectors $x$ and $x^{\prime}$ differ, then $y$ and $y^{\prime}$ always differ.

On the other hand, consider what happens if $\{g_i\}$ is linearly dependent. For example, suppose for simplicity that the relation

g_1 = g_2 + g_3  \tag{9}

holds. This is a linearly dependent set [^4]. Then equation (6) becomes

[^ 4]: Here I happened to take the 2nd and 3rd vectors and assume that their sum equals exactly the 1st, but the following argument holds for any other combination as well.

\begin{align}
y &= G x = \sum_{i=1}^{k} g_{i} x_{i}  \\
&= (g_2 + g_3) x_1 + g_2 x_2 + g_3 x_3 + \cdots + g_k x_k \\
&= g_2 (x_1 + x_2) + g_3 (x_1 + x_3) + g_4 x_4 + \cdots + g_k x_k \tag{10}
\end{align}

so $x = (x_1, x_2, x_3, x_4, \cdots, x_k)^T$ and $x^{\prime} = (\bar{x_1}, \bar{x_2}, \bar{x_3}, x_4, \cdots, x_k)^T$ generate the same codeword (where $\bar{x_i}$ denotes the bit inversion of $x_i$). So each column vector making up the generator matrix must be linearly independent [^5].

[^ 5]: This completes "Exercise 10.15" of Nielsen and Chuang (probably).

Parity check matrix

Now that we have encoded the original information by applying a generator matrix, let's consider detecting noise added to that code. First, a concrete example. The generator matrix of the [3,1] repetition code shown first was Eq. (2). To the codewords it generates, let us apply the matrix

H =
\begin{pmatrix}
1 & 1 & 0 \\
0 & 1 & 1
\end{pmatrix}  \tag{11}

What happens? Only when the codeword $x$ is $(0,0,0)^T$ or $(1,1,1)^{T}$ (that is, only when there is no error) do we get

H x = 0  \tag{12}

For any other vector the result is not $0$, so we can tell that an error occurred. Which bit is in error is determined by an algorithm that depends on the coding method. In this example, if $Hx$ is $(1,0)^{T}$ the first bit is in error, if $(1,1)^{T}$ the second bit, and if $(0,1)^{T}$ the third bit. A matrix $H$ constructed to check for errors in this way is called a "parity check matrix", and it is defined by the requirement that it satisfy equation (12) on the code space. Since this example is a repetition code, one could also decide by majority vote over the 3 bits, but checking the result with a linear operation is the more general approach that carries over to arbitrary linear codes.

Next, the generator matrix of the [6,2] repetition code given as an example was Eq. (4). What is the parity check matrix for it? Take a minute to think about it.

How did it go? Did you get it?

Now let's give the answer (though I think you could see it).

\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 1
\end{pmatrix}  \tag{13}

That's right [^6]. I think it's easy to see that this works. Transforming the 6-dimensional vectors on the right-hand side of equation (5) with the above matrix gives $0$, but if any single bit is in error the result is no longer $0$. Which bit is in error can be determined by the same rule as before. And while the generator matrix of an [n,k] code was an $n \times k$ matrix, the corresponding parity check matrix is an $(n-k) \times n$ matrix.

Let's state this a little more generally. Suppose noise is added to the codeword $y$ and it changes to $y^{\prime} = y + e$. Multiplying by the parity check matrix gives $Hy^{\prime} = Hy + He = He$, leaving only the contribution of the error added by the noise. This $Hy^{\prime}$ is called the "error syndrome", and the error is then corrected concretely based on its value. If the error occurred in the $j$-th bit, $e$ is the vector whose $j$-th component is $1$ and whose other components are all $0$, so $Hy^{\prime}$ is exactly equal to the $j$-th column vector of $H$. Therefore, by determining which column of $H$ the error syndrome equals, you can find out which bit is wrong; inverting that bit corrects the error.

[^ 6]: This is the answer to "Exercise 10.17" of Nielsen and Chuang.
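As a minimal sketch of the procedure just described (my own illustration, not the article's code), here is syndrome decoding for the [3,1] repetition code with the $H$ of equation (11): the syndrome $He$ is matched against the columns of $H$ to locate the flipped bit.

import numpy as np

H = np.array([[1, 1, 0],
              [0, 1, 1]])                  # eq. (11)

y = np.array([0, 0, 0])                    # a valid codeword
e = np.array([0, 1, 0])                    # noise: flip the 2nd bit
y_err = (y + e) % 2

syndrome = (H @ y_err) % 2                 # = He, here (1,1)^T
# the syndrome equals the column of H at the error position
err_idx = [tuple(H[:, j]) for j in range(H.shape[1])].index(tuple(syndrome))
y_corrected = y_err.copy()
y_corrected[err_idx] ^= 1                  # flip the identified bit back
print(syndrome, err_idx, y_corrected)      # [1 1] 1 [0 0 0]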

Now, it would be nice to be able to find the parity check matrix from a generator matrix you have come up with, or conversely to find the generator matrix from a given parity check matrix, wouldn't it? (Think about it!) Actually, you can, in the following way.

To get the parity check matrix from the generator matrix, choose $n-k$ linearly independent vectors $y_1, y_2, \cdots, y_{n-k}$ orthogonal to the columns of $G$, and let the rows of $H$ be $y_1, y_2, \cdots, y_{n-k}$. Conversely, to find the generator matrix from the parity check matrix, choose $k$ linearly independent vectors $y_1, y_2, \cdots, y_k$ spanning the kernel of $H$, and let the columns of $G$ be $y_1, y_2, \cdots, y_k$.

Hmm? That may sound abrupt, so let me chew on it a little.

If you apply $G$ to the original information $x$ to encode it and then apply $H$ (with no error added), you get $0$. Since this holds for any $x$,

HG = 0  \tag{14}

The relation holds [^7]. If you want to find $H$ from a given $G$, equation (14) must be satisfied, so the inner product of every column vector of $G$ with every row vector of $H$ must be $0$; in other words, they must be orthogonal. The whole code space lives in $n$ dimensions, and there are $k$ column vectors of $G$. If you choose linearly independent vectors orthogonal to all of them, there are necessarily $n-k$ of them. So I hope you now have a picture of how to get $H$ from $G$. Next, we want to find $G$ from a given $H$. First, consider the kernel of $H$. The kernel of a linear operator $A$ is the subspace of all vectors $x$ such that $Ax = 0$. In this case, the space that $H$ maps to $0$ is precisely the code space generated by $G$, so if you find linearly independent vectors spanning the kernel of $H$, they are linearly independent vectors spanning the code space. Since $H$ has $n-k$ rows acting on an $n$-dimensional space, we can take $k$ such linearly independent vectors [^8]. Setting them as the column vectors of $G$ completes the construction. Earlier I mentioned that the parity check matrix is an $(n-k) \times n$ matrix; I think you can now see why.

[^ 7]: This is "Practice Exercise 10.18" from Nielsen, Chan. I think it is obvious from the definition of the parity check matrix.

[^ 8]: It may be easier to understand if you think of $Hx = 0$ as a system of $n-k$ equations in $n$ variables. Since $n-k$ of the $n$ degrees of freedom are constrained by the $n-k$ equations, the net degrees of freedom are $k$-dimensional after all.

That would be enough to finish this section, but there is one more important thing about the parity check matrix: it can always be rewritten into a "standard form" while preserving its effect. The standard form is

H = (A|I_{n-k})  \tag{15}

that is, a matrix of this shape, where $A$ is some $(n-k) \times k$ matrix and $I_{n-k}$ is the $(n-k)$-dimensional identity matrix, the two being placed side by side. If $H$ is a parity check matrix, then $Hx = 0$ holds for every vector $x$ in the code space. Viewing this as a system of $n-k$ equations in $n$ variables, you can see why it can be rewritten in the form of equation (15). Recall junior high school math: when solving simultaneous equations, you gradually bring them into a simpler form by adding and subtracting multiples of the equations (that is, elimination); imagine the same operation here. However, since there are fewer equations than variables, an exact solution cannot be obtained; the best you can do is express each of $n-k$ of the variables as a linear sum of the other $k$ variables. That is exactly what equation (15), applied to $Hx = 0$, expresses.

When the parity check matrix is put into standard form in this way, the generator matrix can be obtained directly from it. Using the $k$-dimensional identity matrix $I_k$, just set

G =
\begin{pmatrix}
I_k \\
-A
\end{pmatrix}  \tag{16}

That's all there is to it. Since $HG = A - A = 0$, this is indeed a generator matrix for the parity check matrix of equation (15) [^9]. How about that? Easy, isn't it? Earlier I went through kernels and linear independence, but with the standard form it takes only a moment!

[^ 9]: The $-A$ in equation (16) can be replaced by $A$ if, as in this article, the code is binary, that is, if we consider a binary linear code: all calculations are mod 2, so the minus sign can be dropped.

With a generator matrix constructed this way, the original information appears as the first $k$ components of the encoded $n$-dimensional vector. So once the error has been corrected by the procedure described above, the original information can be restored simply by extracting those first $k$ components (easy!).
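As a minimal sketch of this construction (my own illustration; the small matrix A below is an arbitrary example, not taken from the article), given a parity check matrix in standard form $H = (A|I_{n-k})$, the generator matrix of equation (16) can be built and $HG = 0$ verified directly (mod 2, so $-A = A$):

import numpy as np

A = np.array([[1, 1],                      # an arbitrary (n-k) x k block: n=4, k=2
              [0, 1]])
n_minus_k, k = A.shape

H = np.concatenate([A, np.eye(n_minus_k, dtype=int)], axis=1)   # eq. (15)
G = np.concatenate([np.eye(k, dtype=int), A], axis=0)           # eq. (16)

print((H @ G) % 2)                         # the zero matrix: HG = 0, eq. (14)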

Code distance

Now that we have explained the generator matrix and the parity check matrix, let us explain another important concept, the "code distance".

We can define a distance $d(y, y^{\prime})$ between two vectors $y, y^{\prime}$ in the space $\{0,1\}^{n}$: compare the elements of $y$ and $y^{\prime}$ and take the number of positions in which they differ as the distance (here the count is an ordinary sum, not a mod-2 calculation). The distance defined in this way is called the "Hamming distance". Also, the "Hamming weight" $wt(y)$ of a codeword $y$ is defined as its distance from the $0$ vector, $wt(y) \equiv d(y, 0)$.

Using this Hamming distance, we define the "code distance" $d(C)$ of the code space $C$ obtained by the encoding $y = Gx$ as follows.

d(C) = \min_{x,y \in C, x \neq y} d(x,y)  \tag{17}

In other words, the code distance is the minimum Hamming distance between any two distinct codewords (so it is also called the "minimum distance" of the code). Since $d(x, y) = wt(x + y)$ holds [^10], equation (17) can equivalently be rewritten as

[^ 10]: $d(x, y)$ was the number of positions in which the elements of $x$ and $y$ differ. This is equivalent to computing $x + y \ (\mathrm{mod}\ 2)$ and counting the number of $1$s among its elements. On the other hand, $wt(x + y)$ is, by definition, exactly the number of $1$s among its elements. Therefore $d(x, y) = wt(x + y)$ holds.

d(C) = \min_{x \in C, x \neq 0} wt(x)  \tag{18}

The two expressions give the same value.

Since the code distance is a very important quantity characterizing the code space, an [n,k] code with code distance $d = d(C)$ is often written as an [n,k,d] code.

One reason it is so important is that it shows how many bit errors can be corrected. Recalling the [3,1] repetition code, its distance by the present definition is 3. Since the information is repeated three times, correcting errors by majority vote can definitely handle up to a 1-bit error. For the [5,1] repetition code the distance is 5, and majority voting can handle up to 2-bit errors. Extending this, a code with code distance $2t+1$ can definitely correct errors on up to $t$ bits. The code distance is therefore closely related to the number of correctable bit errors. In general, with a code of distance $2t+1$, simply decoding a corrupted word $y^{\prime}$ to the unique codeword $y$ satisfying $d(y, y^{\prime}) \leq t$ corrects errors on up to $t$ bits.
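As a minimal sketch (my own illustration), the code distance of the [6,2] repetition code of equation (4) can be computed by enumerating all codewords and applying equation (18), the minimum Hamming weight over the nonzero codewords:

import itertools
import numpy as np

G = np.array([[1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 1]])     # [6,2] repetition code, eq. (4)

codewords = [(G @ np.array(x)) % 2 for x in itertools.product([0, 1], repeat=2)]
d_C = min(int(c.sum()) for c in codewords if c.any())   # min Hamming weight wt(c)
print(d_C)                                 # 3 = 2t+1 with t=1: 1-bit errors are correctable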

There are two more important things to say about the code distance, which I will discuss in turn.

First, the first one.

There is a way to read off the code distance of a given [n,k] code from its parity check matrix. Suppose that any $d-1$ of the column vectors of the parity check matrix are always linearly independent, while some choice of $d$ of them is linearly dependent. Then the code distance equals $d$ [^11]. Let's prove it.

[^ 11]: This is "Practice Exercise 10.20" from Nielsen, Chan.

[Proof]

Let $H$ be the parity check matrix of the [n,k] code, and let its column vectors be $\{h_1, h_2, \cdots, h_n\}$. Any codeword $x$ in the code space $C$ satisfies $Hx = 0$, so

h_1 x_1 + h_2 x_2 + \cdots + h_n x_n = 0  \tag{19}

holds. Now, for a positive integer $s (\leq n)$, suppose

\begin{align}
& x_{i_1}, x_{i_2}, \cdots , x_{i_s} \neq 0 \\
& x_{i_{s+1}}, x_{i_{s+2}}, \cdots , x_{i_n} = 0 \tag{20}
\end{align}

and take $x$ whose nonzero components are exactly these:

x = (x_{i_1}, x_{i_2}, \cdots , x_{i_s}, \cdots , x_{i_n})^{T}  \tag{21}

If this $x$ is a codeword of the code space, substituting it into equation (19) leaves only the nonzero terms, so

h_{i_1} x_{i_1} + h_{i_2} x_{i_2} + \cdots + h_{i_s} x_{i_s} = 0  \tag{22}

must hold. By assumption, any $d-1$ of the $\{h_i\}$ are linearly independent, so $s$ must be larger than $d-1$: if $s$ were at most $d-1$, then by the definition of linear independence $x_{i_1}, x_{i_2}, \cdots, x_{i_s}$ would all have to be $0$, a contradiction. Hence $s \geq d$. Since $s$ is the number of $1$s contained in the codeword (that is, its Hamming weight), every nonzero codeword has weight at least $d$, and therefore the code distance satisfies $d(C) \geq d$.

Since we want to prove $d(C) = d$, it remains to show that there is at least one codeword $x$ with Hamming weight $wt(x) = d$. By assumption there is a set of $d$ linearly dependent column vectors among the columns of $H$; call them $h_{i_1}, h_{i_2}, \cdots, h_{i_d}$. Then

a_{i_1} h_{i_1} + a_{i_2} h_{i_2} + \cdots + a_{i_d} h_{i_d} = 0  \tag{23}

there exist coefficients $a_{i_1}, a_{i_2}, \cdots, a_{i_d} \neq 0$ satisfying this (linear dependence gives a nontrivial combination, and since any $d-1$ of these columns are linearly independent, all $d$ coefficients must be nonzero). Using these $a_{i_1}, a_{i_2}, \cdots, a_{i_d}$, define

x = (0, \cdots , 0, a_{i_1}, 0, \cdots , 0, a_{i_2}, 0, \cdots , 0, a_{i_d}, 0 , \cdots , 0) \tag{24}

where $a_{i_j}$ is placed in position $i_j$. This $x$ satisfies $Hx = 0$, so it is an element of the code space under consideration, and its Hamming weight is $wt(x) = d$. This proves $d(C) = d$. (End of proof)

Next is the second important thing.

An [n,k,d] code must satisfy the relation $n-k \geq d-1$. In other words, once $n$ and $k$ are fixed, the range of possible $d$ is automatically constrained; alternatively, fixing $k$ and $d$ constrains the possible $n$. This relation is called the "Singleton bound" [^12]. Let's prove it.

[^ 12]: This is "Practice Exercise 10.21" from Nielsen, Chan.

[Proof]

The parity check matrix of an [n,k] code is an $(n-k) \times n$ matrix, so its rank satisfies

rank(H) \leq n-k  \tag{25}

The rank is the maximum number of linearly independent column vectors (equivalently, row vectors) of the matrix [^13]. This means that any $rank(H)+1$ column vectors chosen from $H$ are necessarily linearly dependent. From what we proved earlier, the code distance can be at most the maximum number of linearly independent column vectors contained in $H$ plus $1$ [^14]. That is,

d \leq rank(H) + 1  \tag{26}

holds. Eliminating $rank(H)$ from equations (25) and (26),

n-k \geq d-1 \tag{27}

is obtained. (End of proof)

[^ 13]: Hmm? If that puzzles you, please consult a suitable linear algebra textbook.

[^ 14]: It may be hard to follow this part on a first reading. Recall the precondition proved earlier: "any $d-1$ columns chosen are always linearly independent, and some choice of $d$ columns may be linearly dependent." Saying "any $d-1$ columns are always linearly independent" is a different statement from saying that some $d$ columns are linearly dependent; it means that $d-1$ can be at most the maximum number of linearly independent vectors in $H$ (that is, $rank(H)$).

Concrete example

Now that the basic explanation is done, I would like to introduce two concrete examples of coding methods. One is the "Hamming code" and the other is the "horizontal-vertical parity check code".

Hamming code

The Hamming code is a $[2^{r}-1, 2^{r}-r-1]$ code characterized by an integer $r \geq 2$. The parity check matrix $H$ for $r = 3$ is shown below.

H =
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}  \tag{28}

It may look like a random pattern of $0$s and $1$s with no regularity, but it is not: reading the column vectors from the left, they are the integers from $1$ to $7$ written as 3-bit binary numbers. Since the parity check matrix of an [n,k] code is an $(n-k) \times n$ matrix, the parity check matrix of the Hamming code is an $r \times (2^{r}-1)$ matrix. For $r = 3$ this is a $3 \times 7$ matrix, and equation (28) indeed has that shape.

Let's bring this into the standard form. All we have to do is swap column vectors [^15].

[^ 15]: This corresponds to changing the order in which the bits of the codeword are defined.

H =
\begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{pmatrix}  \tag{29}

This is the result. In this form, the corresponding generator matrix $G$ can be written down immediately:

G =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1
\end{pmatrix}  \tag{30}

This is the generator matrix.

As explained earlier, the code distance can be read off from the parity check matrix. The form of equation (28) is easier to inspect, so look at that one. Any two column vectors chosen from it are linearly independent: for example, the first and second columns are linearly independent (and so is any other pair you choose). However, adding the third column to those two gives a linearly dependent set. So the distance of this [7,4] Hamming code is 3 [^16].

[^ 16]: This is "Practice Exercise 10.22" from Nielsen, Chan.

Since the maximum number of correctable bit errors is the $t$ in $3 = 2t+1$, i.e. $t = 1$, any 1-bit error can be corrected.

Encode with the standard-form generator matrix of equation (30), compute the error syndrome with equation (29), and compare it with the column vectors of the parity check matrix: this tells you in which bit the error occurred, so just bit-flip that position [^17].

[^ 17]: It may actually be easier to compute the error syndrome with equation (28) instead of the standard form, because the error syndrome read as a binary number equals the position of the erroneous bit, which makes implementation simple. When the error syndrome is computed with the standard form, you have to determine each time which column of the parity check matrix it matches. You could also create a look-up table in advance, which I think is the practical choice (although it is a bit more cumbersome to implement).

Since decoding was set up for the standard form, the original information can be restored by extracting the first 4 bits.
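As a minimal sketch of the trick in footnote 17 (my own illustration, not the article's implementation): with the non-standard $H$ of equation (28), whose $j$-th column is the binary expansion of $j$, the syndrome read as a binary number directly gives the position (1-origin) of the flipped bit.

import numpy as np

r = 3
n = 2**r - 1
# columns of H are the binary expansions of 1..7 (eq. (28))
H = np.array([[(j >> (r - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(r)])

y = np.zeros(n, dtype=int)          # the all-zero codeword
y[4] ^= 1                           # flip the 5th bit (1-origin position 5)

syndrome = (H @ y) % 2
position = int("".join(map(str, syndrome)), 2)
print(position)                     # 5 -> invert the 5th bit to correct the error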

Horizontal-vertical parity check code

The horizontal-vertical parity check code is a coding method that arranges the original information into a block and keeps its parities. For a $\{0,1\}$ sequence, the parity is defined as $0$ (even parity) if the number of $1$s it contains is even and $1$ (odd parity) if it is odd. These parities are kept separately from the original information data. For example, suppose the original information is a 6-bit $\{0,1\}$ sequence. Arrange it in a $2 \times 3$ block, for example

1 0 1
0 1 1

Now, for each row and each column, compute the parity and place it to the right or at the bottom. Then we get

1 0 1 | 0
0 1 1 | 0
-------
1 1 0

This gives a total of 11 values in $\{0,1\}$, transmitted as a one-dimensional sequence. If you scan the original information from top left to bottom right and then append the row-parity part and the column-parity part, the whole code sequence becomes $(1,0,1,0,1,1,0,0,1,1,0)^{T}$, which is an [11,6] code.

If one bit of the original information is wrong, it becomes inconsistent with the parity part. For example, if the bit at the bottom right of the information block is inverted,

1 0 1 | 0
0 1 0 | 0
-------
1 1 0

then the parity check of the second row and the parity check of the third column no longer match. From the mismatching row and column you can see exactly where the error occurred. Next, suppose an error occurs in the parity part itself and the parity value of the first row is inverted:

1 0 1 | 1
0 0 0 | 0
-------
1 0 1

In that case all the column parity checks come out correct, but only the parity of the first row has an inconsistent value. It can then be concluded that the parity bit of the first row itself was inverted.

That is how the code is constructed, and it is in fact a kind of linear code.

First, consider the generator matrix.

We want a matrix that takes the original information, say $(1,0,1,0,0,0)^T$, and produces the codeword in which its parity part $(0,0,1,0,1)^T$ is appended to it. Since the information part is passed through unchanged, an identity matrix suffices for it. The parity part is computed from the original information by

A =
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1
\end{pmatrix}  \tag{31}

Multiplying by this matrix generates the parity part (the calculation is of course mod 2). The first row of $A$ computes the parity of the first row of the information block, the second row of $A$ the parity of the second row, and the remaining rows, in order, the parities of the first, second, and third columns. The generator matrix $G$ therefore combines the identity matrix with this $A$:

G =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1
\end{pmatrix}  \tag{32}

This is the generator matrix. Since it is in the standard form, the parity check matrix is also easy:

H =
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}  \tag{33}

This is the parity check matrix.

By computing the error syndrome with this $H$ and seeing which column of $H$ the result corresponds to, the erroneous bit can be identified, and inverting that bit corrects the error. After that, the original information is recovered by extracting the first 6 bits.
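As a minimal sketch (my own illustration), the matrices of equations (31)-(33) can be used to encode the example block above, inject a 1-bit error, and correct it:

import numpy as np

A = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1],
              [1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1]])                                # eq. (31)
G = np.concatenate([np.eye(6, dtype=int), A], axis=0)             # eq. (32)
H = np.concatenate([A, np.eye(5, dtype=int)], axis=1)             # eq. (33)

x = np.array([1, 0, 1, 0, 1, 1])           # the 2x3 block example from the text
y = (G @ x) % 2                            # [1 0 1 0 1 1 0 0 1 1 0]
y[5] ^= 1                                  # flip one information bit

syndrome = (H @ y) % 2
# the syndrome equals the column of H at the error position
err_idx = [tuple(H[:, j]) for j in range(H.shape[1])].index(tuple(syndrome))
y[err_idx] ^= 1                            # correct the error
print(y[:6])                               # decode: first 6 bits -> [1 0 1 0 1 1]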

Gilbert-Varshamov bound

Now that we have a basic understanding of linear codes, including concrete examples, let's take up a slightly more advanced topic. There is an interesting theorem called the "Gilbert-Varshamov bound", which I will now prove [^18].

[^ 18]: "Exercise 10.23" of Nielsen and Chuang (though the wording has been changed with reference to other documents). The book says "the proof is easy, so we leave it as an exercise", but it is not easy at all. I often run into this when reading math books: don't take the words "it is easy to see" or "obviously" at face value.

[Theorem: Gilbert-Varshamov bound]

For $0 \leq \delta \leq \frac{1}{2}$, if a sufficiently large code length $n$ is taken, there exists a code with code distance $d = \delta n$ and code rate $R \geq 1 - H(\delta)$. Here $H$ is the binary entropy, defined by $H(x) \equiv -x \log(x) - (1-x) \log(1-x)$.

In other words, this theorem says that for a sufficiently large $n$ there always exists a code realizing the given distance $d = \delta n$ with coding rate at least $1 - H(\delta)$. Let's prove it.

[Proof of the theorem]

First, after defining a "Hamming sphere", I will prove two lemmas related to it.


Definition: Hamming sphere

For $x \in \{0,1\}^{n}$ and a real number $r$, the "Hamming sphere" $B(x, r)$ centered at $x$ is defined as

B(x,r) = \{y \in \{ 0,1 \}^{n}: d(x,y) \leq r \}  \tag{34}

Here $d(x, y)$ is the Hamming distance, so $B(x, r)$ is, in short, the set of points that differ from $x$ in at most $r$ positions. Counting how many such points there are gives the volume of the Hamming sphere. As long as the radius $r$ is fixed, this volume is actually the same regardless of the center $x$: whatever value $x$ takes, the points of $B(x, r)$ are obtained by choosing at most $r$ of the $n$ elements of $x$ and inverting them, and the number of ways to do that does not depend on $x$. The volume $Vol(n, r)$ can therefore be calculated as

Vol(n,r) = \sum_{j=0}^{r}
\begin{pmatrix}
n \\
j
\end{pmatrix}  \tag{35}


Lemma (1)

For sufficiently large $n$ and $0 \leq p \leq \frac{1}{2}$,

Vol(n,pn) \leq 2^{H(p)n}  \tag{36}

holds.


Proof of Lemma (1)

\begin{align}
\frac{Vol(n,pn)}{2^{H(p)n}} &= \frac{\sum_{j=0}^{pn} \begin{pmatrix} n \\ j \end{pmatrix}}{p^{-pn} \cdot (1-p)^{-(1-p)n}} \\
&= \sum_{j=0}^{pn} \begin{pmatrix} n \\ j \end{pmatrix} p^{pn} (1-p)^{(1-p)n} \\
&= \sum_{j=0}^{pn} \begin{pmatrix} n \\ j \end{pmatrix} (1-p)^{n} (\frac{p}{1-p})^{pn} \tag{37}
\end{align}

Here, since $p \leq \frac{1}{2}$ implies $\frac{p}{1-p} \leq 1$, and every term in the sum has $j \leq pn$,

\begin{align}
\frac{Vol(n,pn)}{2^{H(p)n}} &\leq \sum_{j=0}^{pn} \begin{pmatrix} n \\ j \end{pmatrix} (1-p)^{n} (\frac{p}{1-p})^{j} \\
&= \sum_{j=0}^{pn} \begin{pmatrix} n \\ j \end{pmatrix} (1-p)^{n-j} p^{j} \\
&\leq ((1-p)+p)^{n} = 1^{n} = 1 \tag{38}
\end{align}

From this, equation (36) can be derived. (End of proof of lemma (1))


Lemma (2)

|C| \geq \frac{2^{n}}{Vol(n,d-1)} \tag{39}

holds. Here $|C|$ is the total number of codewords contained in the code space $C$, where $C$ is taken to be a maximal code of distance $d$, that is, one to which no further word can be added without the distance dropping below $d$.


Proof of Lemma (2)

For every $x \in \{0,1\}^{n}$ not contained in $C$, there is at least one codeword $c_x \in C$ such that

d(x,c_x) \leq d-1 \tag{40}

holds. If it did not, $x$ could be added to the code space $C$ without the distance dropping below $d$, which contradicts the maximality assumption.

Then the whole set $\{0,1\}^{n}$ equals the union of all spheres of radius $d-1$ centered at the codewords $c \in C$. That is,

\{ 0,1 \}^{n} = \bigcup_{c \in C} B(c,d-1) \tag{41}

Therefore,

\begin{align}
2^{n} &= | \{ 0,1 \}^{n} | = | \bigcup_{c \in C} B(c,d-1)| \\
&\leq \sum_{c \in C} | B(c,d-1) | \\
&= |C| \sum_{j=0}^{d-1} \begin{pmatrix} n \\ j \end{pmatrix} \\
&= |C| Vol(n,d-1)  \tag{42}
\end{align}

From this,

|C| \geq \frac{2^{n}}{Vol(n,d-1)} \tag{39}

follows. (End of proof of Lemma (2))

Now let's prove the theorem itself; having come this far, it only takes a moment. Using Lemmas (1) and (2), with code rate $R$, $d = \delta n$, and binary entropy $H(\cdot)$,

\begin{align}
R &= \frac{\log |C|}{n} \geq \frac{n - \log Vol(n,d-1)}{n} \\
&\geq \frac{n - H(\delta - 1/n)n}{n} \\
&= 1 - H(\delta - \frac{1}{n}) \\
&\geq 1 - H(\delta)  \tag{43}
\end{align}

holds. (End of proof of the theorem)

This theorem can also be rephrased as follows.

For a sufficiently large $n$, if you choose $k$ such that

\frac{k}{n} \geq 1 - H(\frac{2t}{n})  \tag{44}

then there exists an [n,k] error-correcting code that protects against errors on up to $t$ bits [^19]. This follows from the relation $d = 2t+1$ between the code distance $d$ and the number of correctable bit errors $t$.

[^ 19]: Nielsen and Chuang state it in this form.
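As a minimal numerical sketch (my own illustration; the parameters n and t below are arbitrary examples), equation (44) can be evaluated to see what rate the Gilbert-Varshamov bound guarantees:

import numpy as np

def binary_entropy(p):             # H(p), with H(0) = H(1) = 0
    if p <= 0 or p >= 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

n, t = 1000, 50                    # code length, number of bit errors to protect against
rate_bound = 1 - binary_entropy(2 * t / n)      # right-hand side of eq. (44)
print(rate_bound)                  # about 0.531: for large enough n, a rate k/n of at least
                                   # this value is achievable while correcting t errors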

Dual code

To conclude the theory part, I will explain the notion of a "dual code". It is an important key concept in constructing the quantum codes called CSS codes.

Suppose $C$ is an [n,k] code with generator matrix $G$ and parity check matrix $H$. The dual $C^{\perp}$ of $C$ is then defined as the [n,n-k] code whose generator matrix is $H^{T}$ and whose parity check matrix is $G^{T}$. In other words, the dual of $C$ consists of the words orthogonal to all the codewords contained in $C$. If $C \subseteq C^{\perp}$, the code $C$ is said to be "weakly self-dual", and if $C = C^{\perp}$, "strictly self-dual".

Here, I will explain two properties that hold in relation to the dual code.

First, the first one.

A code with generator matrix $G$ is weakly self-dual if and only if $G^{T}G = 0$ [^20]. That is,

G^{T}G = 0 \Leftrightarrow C \subseteq C^{\perp}  \tag{45}

holds. Let's prove it.

[^ 20]: This is "Practice Exercise 10.24" from Nielsen, Chan.

[Proof]

(\Rightarrow)

Applying $y = Gx$ to every $x \in \{0,1\}^{k}$ generates every $y \in C$ without exception. Choose any $y \in C$ and apply the parity check matrix of $C^{\perp}$, namely $G^{T}$: we get $G^{T}y = G^{T}Gx$. Since $G^{T}G = 0$ was assumed, $G^{T}y = 0$, so the arbitrarily chosen $y$ is also an element of $C^{\perp}$. Therefore $C \subseteq C^{\perp}$ holds.

(\Leftarrow)

If $y \in C$, then by the premise $y \in C^{\perp}$ as well. Therefore applying the parity check matrix of $C^{\perp}$ to $y$ yields $0$, that is, $G^{T}y = 0$. Since $y = Gx$, this means $G^{T}Gx = 0$, and since it must hold for every $x$, we get $G^{T}G = 0$.

(End of proof)
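As a minimal sketch (my own illustration), the condition $G^{T}G = 0$ of equation (45) can be checked numerically; the [2,1] repetition code passes the test while the [3,1] repetition code does not:

import numpy as np

G2 = np.array([[1], [1]])          # [2,1] repetition code
G3 = np.array([[1], [1], [1]])     # [3,1] repetition code

print((G2.T @ G2) % 2)             # [[0]] -> weakly (in fact strictly) self-dual
print((G3.T @ G3) % 2)             # [[1]] -> not weakly self-dual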

Next is the second one.

For the linear code $ C $

\begin{align}
x \in C^{\perp} &\Rightarrow \sum_{y \in C} (-1)^{xy} = |C| \\
x \notin C^{\perp} &\Rightarrow \sum_{y \in C} (-1)^{xy} = 0  \tag{46}
\end{align}

holds [^21]. Let's prove it.

[^ 21]: This is "Practice Exercise 10.25" from Nielsen, Chan. Where $ xy $ is written is the inner product of $ x $ and $ y $. So you should really write $ x ^ {T} y $, but omit $ T $.

[Proof]

First, the first half.

Let $G, H$ be the generator and parity check matrices of $C$, and $H^{T}, G^{T}$ those of $C^{\perp}$. Let $a \in \{0,1\}^{k}$ be the information mapped to $y \in C$, and $b \in \{0,1\}^{n-k}$ the information mapped to $x \in C^{\perp}$. Then $x = H^{T}b$ and $y = Ga$, so $xy = b^{T}HGa = 0$, and therefore

x \in C^{\perp} \Rightarrow \sum_{y \in C} (-1)^{xy} = \sum_{y \in C} 1 = |C|  \tag{47}

holds.

Next is the second half.

Any $y \in C$ can be written as $y = Ga$ using some $a \in \{0,1\}^{k}$ and the generator matrix $G$ of $C$. Writing $a$ in terms of its components as

a = (a_1, a_2, \cdots , a_k)^{T}  \tag{48}

and representing $G$ by its column vectors $g_1, g_2, \cdots, g_k$ as

G = (g_1, g_2, \cdots , g_k)  \tag{49}

the inner product of an arbitrary $x \notin C^{\perp}$ with $y$ is

xy = xGa = a_1 (x g_1) + a_2 (x g_2) + \cdots + a_k (x g_k)  \tag{50}

Here, at least one of the $(x g_{i})$ is always $1$: if all of them were $0$, then $x$ would be orthogonal to every codeword $y$, contradicting the assumption $x \notin C^{\perp}$. Now suppose $(x g_1) = 1$ (the argument below holds no matter which $(x g_{i})$ equals 1). Then

xy = xGa = a_1 + a_2 (x g_2) + \cdots + a_k (x g_k)  \tag{51}

and therefore

\begin{align}
\sum_{y \in C} (-1)^{xy}
&= \sum_{a_1, a_2, \cdots , a_k = 0,1} (-1)^{a_1} (-1)^{a_2 (xg_2) + \cdots + a_k (xg_k)} \\
&= \sum_{a_2, \cdots , a_k = 0,1} (1+(-1)) (-1)^{a_2 (xg_2) + \cdots + a_k (xg_k)} \\
&= 0 \tag{52}
\end{align}

(End of proof)
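As a minimal sketch (my own illustration), equation (46) can be checked numerically for the [3,1] repetition code $C = \{000, 111\}$: the sum is $|C| = 2$ exactly for the vectors $x$ orthogonal to every codeword, and $0$ otherwise.

import itertools
import numpy as np

C = [np.array([0, 0, 0]), np.array([1, 1, 1])]
for x in itertools.product([0, 1], repeat=3):
    x = np.array(x)
    s = sum((-1) ** (int(x @ y) % 2) for y in C)        # sum over y in C of (-1)^(x.y)
    in_dual = all(int(x @ y) % 2 == 0 for y in C)       # is x in C_perp?
    print(x, s, in_dual)                                 # s == 2 iff in_dual is True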

Simulation

The explanation of the theory has become quite long, but now that it is complete, let's check how a linear code operates with a Python program. Having explained three codes, the "repetition code", the "Hamming code", and the "horizontal-vertical parity check code", I would like to try all of them, but I am pretty tired, so I will do only one. Any of them would have been fine, but somehow I settled on the "Hamming code".

Hamming code implementation

The program randomly generates information data, encodes it with the Hamming code, and inverts one randomly selected bit. It then corrects and decodes the result to confirm that the error is properly fixed. The whole Python code is below.

import numpy as np

def decimal_to_binlist(decimal, digits): # ex) 6,3 --> [1,1,0]

    bin_str = "{:0{digits}b}".format(decimal, digits=digits)
    return  [int(s) for s in list(bin_str)]
    
def binlist_to_decimal(bin_list): # ex) [0,1,1] --> 3

    return int("".join([str(i) for i in bin_list]), 2)

def make_hamming_matrix(r):

    # parity check matrix (H)
    A = []
    for x in range(1,2**r):
        bin_list = decimal_to_binlist(x, r)
        if sum(bin_list) == 1: continue
        A.append(bin_list)
    A = np.array(A)
    I_H = np.eye(r, dtype=int)
    H = np.concatenate([A, I_H])
    
    # represent each row of the H matrix as an integer (for the error correction algorithm)
    H_int = [binlist_to_decimal(row) for row in H]
    
    # generator matrix (G)
    I_G = np.eye(2**r-r-1, dtype=int)
    G = np.concatenate([I_G, A], 1)
    
    return G, H, H_int

def generate_data(k, N):  # random k-bits data

    for _ in range(N):
        yield np.random.randint(2, size=k)

def add_noise(d_in):  # bit flip to one bit (select randomly)

    idx = np.random.randint(len(d_in))
    err = np.array([1 if i == idx else 0 for i in range(len(d_in))])
    d_out = (d_in + err) % 2
    return d_out
    
def correct_error(d_in, H, H_int):

    d_out = d_in.copy()
    p = (d_out @ H) % 2
    x = binlist_to_decimal(p)
    err_idx = H_int.index(x)
    d_out[err_idx] = (d_out[err_idx] + 1) % 2  # bit flip (recover)
    return d_out
    
if __name__ == '__main__':

    r = 3
    n = 2**r - 1
    k = 2**r - r - 1
    N = 10

    G, H, H_int  = make_hamming_matrix(r)

    print("* input(random) -> encode -> add noise(random 1-bit flip) -> correct -> decode:")
    err_count = 0
    for x in generate_data(k, N):
        y = (x @ G)%2
        y_error = add_noise(y)
        y_correct = correct_error(y_error, H, H_int)
        x_correct = y_correct[0:k]  # decode (= extract 1st to k-th elements)
        print("{0:} -> {1:} -> {2:} -> {3:} -> {4:}".format(x,y,y_error,y_correct,x_correct))
        
        if sum((x + x_correct) % 2) != 0: # if x != x_correct
            err_count += 1

    err_rate = err_count / N
    print("* error rate = {0:} (count:{1:} / total:{2:})".format(err_rate, err_count, N))

Let me explain what the code is doing. Look at the main processing section.

r = 3
n = 2**r - 1
k = 2**r - r - 1
N = 10

With these lines we set the parameters. Once you set the value of r (3 in the example above, but you can change it), the n and k of the [n,k] code are determined automatically. N specifies how many data words to generate at random.

G, H, H_int  = make_hamming_matrix(r)

Here we compute the generator matrix G and the parity check matrix H of the Hamming code. One note: in the theory part above, the generator matrix of an [n,k] code was an $n \times k$ matrix, the parity check matrix an $(n-k) \times n$ matrix, and encoding was written $Gx$. In the program, however, each matrix is transposed for convenience, so encoding takes the form $xG$, multiplying from the other side. With numpy it is more natural to treat vectors as row vectors rather than column vectors, which is why I did it this way. It may be a little confusing, but please switch conventions here (sorry).

H_int is a list in which the bit pattern of each row of the parity check matrix (each column, in the notation of the theory part) is converted to a decimal number. It is convenient to have this when doing error correction, so it is returned as a by-product.

Now, take a look at the definition of the function make_hamming_matrix.

# parity check matrix (H)
A = []
for x in range(1,2**r):
    bin_list = decimal_to_binlist(x, r)
    if sum(bin_list) == 1: continue
    A.append(bin_list)
A = np.array(A)
I_H = np.eye(r, dtype=int)
H = np.concatenate([A, I_H])

Here we create the parity check matrix in standard form: we build the r-dimensional identity matrix I_H and the remaining block A, and merge them vertically with numpy's concatenate. Each row of the Hamming code parity check matrix (each column in the notation of the theory part) was the binary expansion of an integer from 1 to 2**r - 1. Since we want the standard form, we take only the integers other than those corresponding to the identity part (the powers of two), turn each one into a binary sequence, and stack them vertically to form the matrix A. That is what the for loop above does: bin_list = decimal_to_binlist(x, r) computes the binary sequence of the decimal number x and assigns it to bin_list (see the function definition of decimal_to_binlist for the details). When the for loop finishes, A is turned into a numpy array and merged with the identity matrix created by numpy's eye to form the parity check matrix H.

# represent each row of the H matrix as an integer (for the error correction algorithm)
H_int = [binlist_to_decimal(row) for row in H]

Here we compute the list of decimal values of the rows of H. The function binlist_to_decimal converts a binary sequence into a decimal number; see the function definition for its internal processing.

# generator matrix (G)
I_G = np.eye(2**r-r-1, dtype=int)
G = np.concatenate([I_G, A], 1)

Then we create the generator matrix G. Since it is the standard form, this is easy: just merge the identity matrix and the matrix A horizontally.

Now, let's return to the main processing section.

err_count = 0
for x in generate_data(k, N):
    ...

Here we initialize the error counter err_count to 0 and enter the for loop. generate_data(k, N) randomly generates N pieces of k-bit data; each one is assigned to x in turn and processed inside the loop. For the processing inside the function generate_data, see its definition.

Below is the contents of the loop.

y = (x @ G)%2

Here x is multiplied by the generator matrix G to produce the codeword y. Since the calculation has to be done mod 2, '% 2' is appended.

y_error = add_noise(y)

Then noise is added to y at a random position to produce y_error: a bit is chosen at random and inverted. For the processing of add_noise, see the function definition.

y_correct = correct_error(y_error, H, H_int)

Performs error correction. Let's take a look at the contents of the function correct_error.

d_out = d_in.copy()
p = (d_out @ H) % 2
x = binlist_to_decimal(p)
err_idx = H_int.index(x)
d_out[err_idx] = (d_out[err_idx] + 1) % 2  # bit flip (recover)
return d_out

The first line copies the input data d_in to d_out. The second line applies the parity check matrix; the result is a binary sequence (the error syndrome). We then determine which row of the parity check matrix it corresponds to: line 3 converts it to a decimal number with binlist_to_decimal, and line 4 uses the index method to find the matching position in H_int, assigning it to err_idx. Line 5 inverts the err_idx-th bit of d_out. Finally, d_out is returned. (I search with the index method every time, but a look-up table would be better; it is a bit of a hassle, so I omitted that implementation.)

Now, let's go back to the main processing section.

x_correct = y_correct[0:k]  # decode (= extract 1st to k-th elements)

Here the original information is decoded from the error-corrected codeword y_correct. Since the code is in standard form, this is easy: just take the first k bits. Now that the whole series of processing is complete, we use

print("{0:} -> {1:} -> {2:} -> {3:} -> {4:}".format(x,y,y_error,y_correct,x_correct))

to display the data at each stage.

if sum((x + x_correct) % 2) != 0: # if x != x_correct
    err_count += 1

Here, if the error could not be corrected, 1 is added to the error counter.

Finally,

err_rate = err_count / N
print("* error rate = {0:} (count:{1:} / total:{2:})".format(err_rate, err_count, N))

to display the error rate.

Confirmation of operation

Now let's see the result of running the above code.

* input(random) -> encode -> add noise(random 1-bit flip) -> correct -> decode:
[1 0 1 1] -> [1 0 1 1 0 1 0] -> [1 0 1 1 1 1 0] -> [1 0 1 1 0 1 0] -> [1 0 1 1]
[1 0 1 1] -> [1 0 1 1 0 1 0] -> [1 1 1 1 0 1 0] -> [1 0 1 1 0 1 0] -> [1 0 1 1]
[0 1 0 1] -> [0 1 0 1 0 1 0] -> [1 1 0 1 0 1 0] -> [0 1 0 1 0 1 0] -> [0 1 0 1]
[0 1 0 1] -> [0 1 0 1 0 1 0] -> [1 1 0 1 0 1 0] -> [0 1 0 1 0 1 0] -> [0 1 0 1]
[1 0 1 0] -> [1 0 1 0 1 0 1] -> [1 0 1 1 1 0 1] -> [1 0 1 0 1 0 1] -> [1 0 1 0]
[0 1 1 1] -> [0 1 1 1 1 0 0] -> [0 1 1 1 1 0 1] -> [0 1 1 1 1 0 0] -> [0 1 1 1]
[0 1 1 1] -> [0 1 1 1 1 0 0] -> [0 1 1 1 0 0 0] -> [0 1 1 1 1 0 0] -> [0 1 1 1]
[0 1 1 1] -> [0 1 1 1 1 0 0] -> [1 1 1 1 1 0 0] -> [0 1 1 1 1 0 0] -> [0 1 1 1]
[1 1 1 1] -> [1 1 1 1 1 1 1] -> [0 1 1 1 1 1 1] -> [1 1 1 1 1 1 1] -> [1 1 1 1]
[0 0 1 0] -> [0 0 1 0 1 1 0] -> [0 0 1 1 1 1 0] -> [0 0 1 0 1 1 0] -> [0 0 1 0]
* error rate = 0.0 (count:0 / total:10)

Since the original information was randomly generated and the noise was also random, the input and output data change on every run, but the error was always corrected completely. Congratulations!

In conclusion

Based on Nielsen and Chuang, I tried to fill in the gaps between the lines carefully while consulting other documents, so this turned into quite a long article. Before I knew it, I had worked through almost all of the exercises in Nielsen and Chuang (whether my answers were correct is another matter). Thanks to that, my understanding of the basics has improved considerably. That said, this has all been a story about "classical" error correction; the "quantum" story built on top of it is even deeper and broader (probably). I will keep working on it, capriciously and at my own pace.

That's all.
