# Basics of Quantum Information Theory: Entropy (2)

\def\bra#1{\mathinner{\left\langle{#1}\right|}} \def\ket#1{\mathinner{\left|{#1}\right\rangle}} \def\braket#1#2{\mathinner{\left\langle{#1}\middle|#2\right\rangle}}

## Introduction

In the previous article, I reviewed entropy in classical information theory, so this time I will study entropy in quantum information theory. After explaining its definition and properties, I would like to actually calculate and confirm its important properties using the quantum computation simulator qlazy.

The following documents were used as references.

  1. Nielsen, Chuang "Quantum Computer and Quantum Communication (3)" Ohmsha (2005)
  2. Ishizaka, Ogawa, Kawachi, Kimura, Hayashi "Introduction to Quantum Information Science" Kyoritsu Shuppan (2012)
  3. Tomita "Quantum Information Engineering" Morikita Publishing (2017)
  4. Memomemo, "An attempt to compare the properties of quantum entropy (von Neumann entropy) with classical entropy (Shannon entropy)"

## Definition

### Von Neumann entropy

Entropy (Shannon entropy) in classical information theory is defined as the average (expected) value of the uncertainty carried by each event in a given set of events. A quantum system is generally represented as an ensemble of pure states. Taking an orthonormal system $\{\ket{i}\}$ as the pure states, the ensemble is written $\{p_i, \ket{i}\}$. Entropy in the quantum system can be defined by regarding the weight $p_i$ of each pure state as the probability of occurrence of an event. That is,

```math
S(\{p_i, \ket{i}\}) = - \sum_{i=1}^{n} p_i \log p_i  \tag{1}
```

This expression can be rewritten as follows using the density operator:

```math
S(\rho) = - Tr (\rho \log \rho)  \tag{2}
```

I think it is almost self-evident, but let me confirm it just in case.

```math
\begin{align}
Tr(\rho \log \rho) &= \sum_{i,j} \bra{i} \rho \ket{j} \bra{j} \log \rho \ket{i} \\
&= \sum_{i,j} \bra{i} (\sum_{k} p_k \ket{k} \bra{k}) \ket{j} \bra{j} \log (\sum_{l} p_l \ket{l} \bra{l}) \ket{i} \\
&= \sum_{i,j,k} p_k \braket{i}{k} \braket{k}{j} \bra{j} \log (\sum_{l} p_l \ket{l} \bra{l}) \ket{i} \\
&= \sum_{i,j} p_i \delta_{ij} (\log p_i) \braket{j}{i} \\
&= \sum_{i} p_i \log p_i  \tag{3}
\end{align}
```

so equation (2) indeed reduces to equation (1).
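Since this reduction underlies everything that follows, here is a minimal NumPy sketch (independent of qlazy; the helper name von_neumann_entropy is my own) that evaluates equation (2) by diagonalizing $\rho$. I use base-2 logarithms, matching the qubit examples later in this article.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -sum_i lambda_i log2(lambda_i) over the nonzero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)   # rho is Hermitian
    evals = evals[evals > 1e-12]      # drop zeros: 0 log 0 -> 0
    return float(-np.sum(evals * np.log2(evals)))

rho_pure  = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
rho_mixed = np.eye(2, dtype=complex) / 2               # I/2
print(von_neumann_entropy(rho_pure))    # 0.0
print(von_neumann_entropy(rho_mixed))   # 1.0
```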

The entropy defined by equation (2) is called "von Neumann entropy". In what follows I will simply call it "entropy" for brevity.

One point to note: in equation (2) the argument of the entropy is the density operator $\rho$, but another notation is also common. For example, to emphasize that we mean the entropy of the density operator of quantum system A, we write $S(A)$, with the label of the quantum system as the argument. This is probably not confusing, but I mention it just in case, since I will use it frequently in this article.

### Entanglement entropy

When we have a composite system AB, the entropy defined for one of its subsystems is called "entanglement entropy". The entanglement entropy $S(A)$ of subsystem A is defined as

```math
\begin{align}
\rho^{A} &= Tr_{B} (\rho^{AB}) \\
S(A) &= S(\rho^{A}) = - Tr(\rho^{A} \log \rho^{A})  \tag{4}
\end{align}
```
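To make equation (4) concrete, here is a minimal NumPy sketch (partial_trace_B and the base-2 convention are my own choices, not part of qlazy): tracing out B from a Bell pair gives a maximally mixed subsystem.

```python
import numpy as np

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def partial_trace_B(rho_AB, dim_A, dim_B):
    """rho^A = Tr_B(rho^AB), done by reshaping into (A,B,A',B') indices."""
    t = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum('ijkj->ik', t)

# Bell state (|00> + |11>)/sqrt(2): pure as a whole, maximally mixed in part
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_AB = np.outer(psi, psi.conj())
rho_A = partial_trace_B(rho_AB, 2, 2)
print(von_neumann_entropy(rho_AB))  # 0.0  (the composite state is pure)
print(von_neumann_entropy(rho_A))   # 1.0  (entanglement entropy S(A))
```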

### Joint entropy

The entropy $S(A,B)$ of the density operator $\rho^{AB}$ of the composite system AB, obtained by combining quantum systems A and B, is called "joint entropy" and is defined as

```math
S(A,B) = - Tr (\rho^{AB} \log \rho^{AB})  \tag{5}
```

### Conditional entropy

In quantum systems no conditional probability is defined, but by analogy with the classical case we formally define the "conditional entropy" as follows.

```math
S(B|A) = S(A,B) - S(A) = S(\rho^{AB}) - S(\rho^{A})  \tag{6}
```

### Relative entropy

"Relative entropy" in quantum information theory is defined as follows.

S(A || B) = S(\rho^{A} || \rho^{B}) = Tr(\rho^{A} \log \rho^{A}) - Tr(\rho^{A} \log \rho^{B}) \tag{7}

It has a shape similar to the relative entropy (Kullback-Leibler distance) in classical information theory.
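A minimal NumPy sketch of equation (7) (the helper names logm2 and quantum_relative_entropy are mine; it assumes $\sigma$ has full rank so the matrix logarithm stays finite). The positive output anticipates Klein's inequality, property (5) below.

```python
import numpy as np

def logm2(rho):
    """Base-2 matrix logarithm of a positive-definite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(rho)
    return evecs @ np.diag(np.log2(evals)) @ evecs.conj().T

def quantum_relative_entropy(rho, sigma):
    """S(rho||sigma) = Tr(rho log rho) - Tr(rho log sigma)."""
    return float(np.real(np.trace(rho @ (logm2(rho) - logm2(sigma)))))

rho   = np.array([[0.75, 0.0], [0.0, 0.25]], dtype=complex)
sigma = np.eye(2, dtype=complex) / 2
print(quantum_relative_entropy(rho, sigma))   # about 0.19 > 0
print(quantum_relative_entropy(rho, rho))     # 0.0
```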

### Mutual information

As in classical information theory, the "mutual information" of quantum system A and quantum system B is defined as the sum of the individual entropies minus the joint entropy.

```math
I(A:B) = S(A) + S(B) - S(A,B)  \tag{8}
```

## Properties

Now that we have defined the various entropies of quantum information theory, let us go through the properties that hold among them, adding proofs. From those listed in the references, I have picked out, by arbitrary judgment and prejudice, the ones that seem basic and important. There are 14 in total, grouped as follows.

[Entropy]

- (1) Entropy is non-negative
- (2) The maximum value of entropy is log(n)
- (3) Entropy is unitarily invariant
- (4) Entropy is a concave function

[Relative entropy and mutual information]

- (5) Relative entropy is non-negative (Klein's inequality)
- (6) Mutual information is non-negative

[Joint entropy and subsystem entropy]

- (7) Subsystems of a pure state
- (8) Subadditivity
- (9) Triangle inequality (Araki-Lieb inequality)
- (10) Strong subadditivity

[Properties derived from strong subadditivity]

- (11) Conditioning reduces entropy
- (12) Mutual information decreases when part of the system is discarded
- (13) Mutual information is reduced by quantum channels

[Measurement]

- (14) Projective measurement increases entropy

Let's look at them in order.

### Entropy

#### Property (1) Entropy is non-negative

From equation (1), this can be proved in the same way as in classical information theory (see the previous article). Entropy is zero only when the ensemble consists of a single state, that is, for a pure state.

#### Property (2) The maximum value of entropy is log(n)

From equation (1), this can be proved in the same way as in classical information theory (see the previous article). The difference from the classical case is the meaning of $n$: for quantum entropy, $n$ is the dimension of the Hilbert space of the quantum system under consideration. For example, for a 3-qubit system the dimension of the Hilbert space is $2^3 = 8$, so the entropy does not exceed $\log 8 = 3$.
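A quick numeric check of this bound, under the same base-2 convention as the sketches above: the maximally mixed state of $n$ qubits attains exactly $n$.

```python
import numpy as np

n = 3                                     # 3 qubits -> Hilbert space dimension 8
rho = np.eye(2**n, dtype=complex) / 2**n  # maximally mixed state I/8
evals = np.linalg.eigvalsh(rho)
print(-np.sum(evals * np.log2(evals)))    # 3.0 = log2(8)
```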

#### Property (3) Entropy is unitarily invariant

For any unitary transformation $ U $

S(\rho) = S(U \rho U^{\dagger})  \tag{9}

Is established.

[Proof]

```math
S(U \rho U^{\dagger}) = - Tr((U \rho U^{\dagger}) \log (U \rho U^{\dagger})) = - Tr(U \rho U^{\dagger} \cdot U (\log \rho) U^{\dagger}) = - Tr(\rho \log \rho) = S(\rho)  \tag{10}
```

(End of proof)
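Equation (9) is easy to spot-check numerically with a Haar-random unitary from scipy's unitary_group (the same generator the qlazy demo at the end of this article uses); entropy is my own helper, as before.

```python
import numpy as np
from scipy.stats import unitary_group

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

rho = np.diag([0.5, 0.3, 0.2, 0.0]).astype(complex)   # an arbitrary density matrix
U = unitary_group.rvs(4)                              # Haar-random unitary
print(entropy(rho))                   # about 1.4855
print(entropy(U @ rho @ U.conj().T))  # same value up to rounding
```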

#### Property (4) Entropy is a concave function

Like its classical counterpart, entropy is a concave function. That is, for $\{p_i\}$ with $\sum_{i} p_i = 1$ and a set of density operators $\{\rho_i\}$,

```math
S(\sum_{i} p_i \rho_i) \geq \sum_{i} p_i S(\rho_i)  \tag{11}
```

holds [^1].

[^1]: In the previous article (https://qiita.com/SamN/items/4abc9e3399a0aabadb2c), I proved this for two classical event systems $p, q$ mixed with probabilities $x$ and $1-x$, but it can be extended to three or more. This time we prove the general inequality that holds for an arbitrary number of quantum systems (density operators), not just two.

[Proof]

Let A be the quantum system described by the $\{\rho_i\}$, and introduce a quantum system B spanned by a newly defined orthonormal system $\{\ket{i}^{B}\}$. Then define the following state $\rho^{AB}$ [^2].

[^2]: This looks like a very artificial construction, but adding an auxiliary system in order to prove something is a very common technique in the world of quantum information, and this is a case in point. Note, however, that the state is not purified; the auxiliary system is simply attached as in equation (12).

```math
\rho^{AB} = \sum_{i} p_i \rho_i \otimes \ket{i}^{B} \bra{i}^{B}  \tag{12}
```

Here, denote the eigenvalues of $\rho_i$ by $\lambda_{i}^{k}$ and the corresponding eigenvectors by $\ket{k}^{A}$.

Let us calculate the joint entropy of this state. Since the $\ket{i}^{B}$ are orthonormal, $\rho^{AB}$ is block diagonal with blocks $p_i \rho_i$, so $\log \rho^{AB} = \sum_{i} \log (p_i \rho_i) \otimes \ket{i}^{B} \bra{i}^{B}$, and

```math
\begin{align}
S(A,B) &= S(\rho^{AB}) = S(\sum_{i} p_i \rho_i \otimes \ket{i}^{B} \bra{i}^{B}) \\
&= -Tr((\sum_{i} p_i \rho_i \otimes \ket{i}^{B} \bra{i}^{B}) \log (\sum_{j} p_j \rho_j \otimes \ket{j}^{B} \bra{j}^{B})) \\
&= -Tr(\sum_{i} (p_i \rho_i \log (p_i \rho_i)) \otimes \ket{i}^{B} \bra{i}^{B}) \\
&= -\sum_{i} Tr_{A} (p_i \rho_i \log (p_i \rho_i)) \\
&= -\sum_{i,k} p_i \lambda_{i}^{k} \log (p_i \lambda_{i}^{k}) \\
&= -\sum_{i,k} p_i \lambda_{i}^{k} \log p_i - \sum_{i,k} p_i \lambda_{i}^{k} \log \lambda_{i}^{k} \\
&= -\sum_{i} p_i \log p_i + \sum_{i} p_i S(\rho_i) \\
&= H(p_i) + \sum_{i} p_i S(\rho_i) \tag{13}
\end{align}
```

On the other hand,

```math
S(A) = S(\sum_i p_i \rho_i)  \tag{14}
```

```math
\begin{align}
S(B) &= S(Tr_{A}(\sum_{i} p_i \rho_i \otimes \ket{i}^{B} \bra{i}^{B})) \\
&= S(\sum_{i} p_i \ket{i}^{B} \bra{i}^{B}) = H(p_i) \tag{15}
\end{align}
```

can be calculated. Now, applying the subadditivity $S(A,B) \leq S(A) + S(B)$ (proved later as property (8)) to equations (13), (14), and (15),

```math
\begin{align}
& H(p_i) + \sum_{i} p_i S(\rho_i) \leq S(\sum_{i} p_i \rho_i) + H(p_i) \\
& \sum_{i} p_i S(\rho_i) \leq S(\sum_{i} p_i \rho_i) \tag{16}
\end{align}
```

This proves that entropy is a concave function. (End of proof)
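A numeric spot check of inequality (11) with three random qubit density matrices (random_density_matrix is my own helper; the mixing weights come from a Dirichlet draw so they sum to 1):

```python
import numpy as np

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def random_density_matrix(dim):
    a = np.random.randn(dim, dim) + 1j * np.random.randn(dim, dim)
    rho = a @ a.conj().T
    return rho / np.trace(rho)

p = np.random.dirichlet([1.0, 1.0, 1.0])                # p_i >= 0, sum p_i = 1
rhos = [random_density_matrix(2) for _ in range(3)]

lhs = entropy(sum(pi * ri for pi, ri in zip(p, rhos)))  # S(sum_i p_i rho_i)
rhs = sum(pi * entropy(ri) for pi, ri in zip(p, rhos))  # sum_i p_i S(rho_i)
print(lhs >= rhs - 1e-9)   # True: mixing never decreases entropy
```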


### Relative entropy and mutual information

#### Property (5) Relative entropy is non-negative (Klein's inequality)

 Relative entropy is non-negative. That is,

```math
S(\rho || \sigma) \geq 0  \tag{17}
```

holds. This inequality is called "Klein's inequality".

 [Proof]

Suppose $\rho$ and $\sigma$ can be diagonalized by the orthonormal systems $\{\ket{\phi_i}\}$ and $\{\ket{\psi_i}\}$ respectively. That is,

```math
\begin{align}
\rho &= \sum_{i}^{n} p_i \ket{\phi_i} \bra{\phi_i} \\
\sigma &= \sum_{i}^{n} q_i \ket{\psi_i} \bra{\psi_i} \tag{18}
\end{align}
```

From the definition of relative entropy,

```math
\begin{align}
S(\rho || \sigma) &= Tr(\rho \log \rho) - Tr(\rho \log \sigma) \\
&= \sum_{i} p_i \log p_i - \sum_{i} \bra{\phi_i} \rho \log \sigma \ket{\phi_i} \\
&= \sum_{i} p_i \log p_i - \sum_{i} \sum_{j} \bra{\phi_i} \rho \ket{\phi_j} \bra{\phi_j} \log \sigma \ket{\phi_i} \\
&= \sum_{i} p_i \log p_i - \sum_{i} p_i \bra{\phi_i} \log \sigma \ket{\phi_i} \tag{19}
\end{align}
```

The factor $\bra{\phi_i} \log \sigma \ket{\phi_i}$ in the second term on the right-hand side is

```math
\begin{align}
&\bra{\phi_i} \log \sigma \ket{\phi_i} \\
&= \sum_{j,k} \braket{\phi_i}{\psi_j} \bra{\psi_j} \log \sigma \ket{\psi_k} \braket{\psi_k}{\phi_i} \\
&= \sum_{j} \braket{\phi_i}{\psi_j} \log (q_j) \braket{\psi_j}{\phi_i} \tag{20}
\end{align}
```

Here, defining

```math
P_{ij} \equiv \braket{\phi_i}{\psi_j} \braket{\psi_j}{\phi_i}  \tag{21}
```

equation (19) becomes

```math
S(\rho || \sigma) = \sum_{i} p_i (\log p_i - \sum_{j} P_{ij} \log (q_j))  \tag{22}
```

Since $P_{ij} \geq 0$, $\sum_{i} P_{ij} = \sum_{j} P_{ij} = 1$, and $\log (\cdot)$ is a concave function,

```math
\sum_{j} P_{ij} \log (q_j) \leq \log (\sum_{j} P_{ij} q_j)  \tag{23}
```

holds. Putting $r_i \equiv \sum_{j} P_{ij} q_j$ and substituting equation (23) into equation (22),

```math
S(\rho || \sigma) \geq \sum_{i} p_i \log p_i - \sum_{i} p_i \log r_i  \tag{24}
```

From the definition of $r_i$ we have $0 \leq r_i \leq 1$, and the right-hand side of equation (24) is precisely the classical relative entropy $D(p_i||r_i)$. Since $D(p_i||r_i) \geq 0$, we conclude that

```math
S(\rho || \sigma) \geq 0  \tag{25}
```

holds.

The equality holds only when, for each $i$, there exists a $j$ satisfying $P_{ij} = 1$, that is, when $P_{ij}$ is a permutation matrix. (End of proof)


#### Property (6) Mutual information is non-negative

```math
I(A:B) \geq 0  \tag{26}
```

holds.

 [Proof]

```math
\begin{align}
& S(\rho^{AB} || \rho^{A} \otimes \rho^{B}) \\
&= Tr(\rho^{AB} \log \rho^{AB}) - Tr(\rho^{AB} \log (\rho^{A} \otimes \rho^{B})) \\
&= Tr(\rho^{AB} \log \rho^{AB}) - Tr(\rho^{AB} \log (\rho^{A} \otimes I^{B})) - Tr(\rho^{AB} \log (I^{A} \otimes \rho^{B})) \\
&= Tr(\rho^{AB} \log \rho^{AB}) - Tr(\rho^{A} \log \rho^{A}) - Tr(\rho^{B} \log \rho^{B}) \\
&= S(A) + S(B) - S(A,B) \\
&= I(A:B)  \tag{27}
\end{align}
```

Since $S(\rho^{AB} || \rho^{A} \otimes \rho^{B}) \geq 0$ by Klein's inequality, equation (26) holds.

The equality holds only for $\rho^{AB} = \rho^{A} \otimes \rho^{B}$, in other words, only when the composite system AB is in a product state of subsystem A and subsystem B [^3]. (End of proof)

[^3]: From this equality condition we can derive $S(\rho^{A} \otimes \rho^{B}) = S(\rho^{A}) + S(\rho^{B})$: the entropy of a product state is the sum of the individual entropies. It is worth remembering as a handy little formula.
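A numeric check of this argument (my own NumPy helpers, not qlazy): a Bell pair gives $I(A:B) = 2$, while a product state gives $0$, consistent with the equality condition.

```python
import numpy as np

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def marginals(rho_AB, dim_A, dim_B):
    t = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    rho_A = np.einsum('ijkj->ik', t)   # trace out B
    rho_B = np.einsum('ijik->jk', t)   # trace out A
    return rho_A, rho_B

# Bell pair: maximal correlation
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_AB = np.outer(psi, psi.conj())
rho_A, rho_B = marginals(rho_AB, 2, 2)
print(entropy(rho_A) + entropy(rho_B) - entropy(rho_AB))  # 2.0

# product state: no correlation (the equality condition)
rho_prod = np.kron(rho_A, rho_B)
rA, rB = marginals(rho_prod, 2, 2)
print(entropy(rA) + entropy(rB) - entropy(rho_prod))      # 0.0
```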


### Joint entropy and subsystem entropy

#### Property (7) Subsystems of a pure state

When the composite system AB is in a pure state, the subsystem entropies $S(A)$ and $S(B)$ always coincide.

 [Proof]

Since the composite system AB is in a pure state, it admits a Schmidt decomposition

```math
\ket{\Psi}^{AB} = \sum_{i=1}^{R} \sqrt{r_i} \ket{i}^{A} \ket{i}^{B}  \tag{28}
```

where $R$ is the Schmidt rank. The density operator of the composite system AB is then

```math
\rho^{AB} = \ket{\Psi}^{AB} \bra{\Psi}^{AB} = \sum_{i=1}^{R} \sum_{j=1}^{R} \sqrt{r_i r_j} \ket{i}^{A} \bra{j}^{A} \otimes \ket{i}^{B} \bra{j}^{B}  \tag{29}
```

The density operators of the subsystems are

```math
\begin{align}
\rho^{A} &= Tr_{B} \rho^{AB} = \sum_{i=1}^{R} r_i \ket{i}^{A} \bra{i}^{A}  \\
\rho^{B} &= Tr_{A} \rho^{AB} = \sum_{i=1}^{R} r_i \ket{i}^{B} \bra{i}^{B}  \tag{30}
\end{align}
```

Therefore, the subsystem entropies (the entanglement entropies) are

```math
\begin{align}
S(A) &= - Tr(\rho^{A} \log \rho^{A}) = - \sum_{i=1}^{R} r_i \log r_i \\  
S(B) &= - Tr(\rho^{B} \log \rho^{B}) = - \sum_{i=1}^{R} r_i \log r_i \tag{31}
\end{align}
```

Therefore it is proved that $S(A) = S(B)$ holds. (End of proof)

This is a very interesting property. First, a pure state is fixed as a single quantum state, so its entropy is zero. Yet if we divide it into two subsystems, each of them is generally in a mixed state, and calculating their entropies shows that they always coincide. In other words, if you divide a pure-state system with no uncertainty into two, uncertainty somehow springs up in each part, and by exactly the same amount. This uncertainty corresponds to the image of the "quantum correlation" = "entanglement" that existed between the two systems, which is why the entropy of a subsystem is called entanglement entropy. Needless to say, this has no counterpart in classical information theory: even if you focus on part of a definitely determined event, its entropy remains zero.


#### Property (8) Subadditivity

```math
S(A,B) \leq S(A) + S(B) \tag{32}
```

holds.

 [Proof]

Since mutual information is non-negative (property (6)), this can be proved immediately:

```math
I(A:B) = S(A) + S(B) - S(A,B) \geq 0  \tag{33}
```

 Therefore, equation (32) holds.

The equality holds only when the composite system AB is in a product state of subsystem A and subsystem B. (End of proof)


#### Property (9) Triangle inequality (Araki-Lieb inequality)

```math
S(A,B) \geq |S(A) - S(B)|  \tag{34}
```

holds.

 [Proof]

Purify the composite system AB by adding an auxiliary system R. From the subadditivity of property (8),

```math
S(R) + S(A) \geq S(A,R)  \tag{35}
```

holds, and since the composite system ABR is in a pure state, property (7) gives

```math
\begin{align}
& S(A,R) = S(B) \\
& S(R) = S(A,B)  \tag{36} 
\end{align}
```

hold. Substituting equation (36) into equation (35),

```math
S(A,B) \geq S(B) - S(A)  \tag{37}
```

can be derived. Exchanging A and B, the same argument gives

```math
S(A,B) \geq S(A) - S(B)  \tag{38}
```

Since equations (37) and (38) must hold simultaneously,

```math
S(A,B) \geq |S(A) - S(B)|  \tag{34}
```

must hold. (End of proof)

The equality condition of this triangle inequality is not self-evident; in fact, it seems to be a fairly hard problem. [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/) poses it as "Exercise 11.16: the equality condition for $S(A,B) \geq S(B) - S(A)$". According to it, writing the spectral decomposition of $\rho^{AB}$ as $\rho^{AB} = \sum_{i} p_{i} \ket{i}^{AB} \bra{i}^{AB}$, the equality $S(A,B) = S(B) - S(A)$ holds only when the operators $\rho_{i}^{A} \equiv Tr_{B} (\ket{i}^{AB} \bra{i}^{AB})$ have a common eigenbasis (hereafter "condition 1") and the operators $\rho_{i}^{B} \equiv Tr_{A} (\ket{i}^{AB} \bra{i}^{AB})$ have orthogonal supports ("condition 2") [^4], but no proof is given (proving it is the exercise). So I tried it myself [^5] [^6].

[^4]: For the convenience of the proof below, I have slightly changed the notation from [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/).

[^5]: In the end, however, I only managed one direction of the if-and-only-if, and there may be some questionable steps. I expose it here anyway, for reference (sweat).

[^6]: And, needless to say, one more note just in case: when the composite system AB is in a pure state, this equality condition reduces to property (7) itself.

[Proof of the equality condition]

Since $\ket{i}^{AB}$ is a pure state, it can be Schmidt-decomposed as follows using bases $\{\ket{a_{i}^{j}}\}$ and $\{\ket{b_{i}^{j}}\}$ of system A and system B ($R_i$ is the Schmidt rank).

```math
\ket{i}^{AB} = \sum_{j=1}^{R_i} \sqrt{r_{i}^{j}} \ket{a_{i}^{j}} \ket{b_{i}^{j}}  \tag{39}
```

Then $\rho_{i}^{A}$ and $\rho_{i}^{B}$ are, respectively,

```math
\begin{align}
\rho_{i}^{A} &= Tr_{B} (\ket{i}^{AB} \bra{i}^{AB}) \\
&= Tr_{B} (\sum_{j=1}^{R_i} \sum_{k=1}^{R_i} \sqrt{r_{i}^{j} r_{i}^{k}} \ket{a_{i}^{j}} \bra{a_{i}^{k}} \otimes \ket{b_{i}^{j}} \bra{b_{i}^{k}}) \\
&= \sum_{j=1}^{R_i} \sum_{k=1}^{R_i} \sqrt{r_{i}^{j} r_{i}^{k}} \ket{a_{i}^{j}} \bra{a_{i}^{k}} \space Tr(\ket{b_{i}^{j}} \bra{b_{i}^{k}}) \\
&= \sum_{j=1}^{R_i} r_{i}^{j} \ket{a_{i}^{j}} \bra{a_{i}^{j}} \\
\rho_{i}^{B} &= \sum_{j=1}^{R_i} r_{i}^{j} \ket{b_{i}^{j}} \bra{b_{i}^{j}} \tag{40}
\end{align}
```

Here, the coefficients $r_{i}^{j}$ are the diagonal components (eigenvalues) of both operators (we used $Tr(\ket{b_{i}^{j}} \bra{b_{i}^{k}}) = \delta_{jk}$ above); that they coincide between system A and system B is the interesting point of the Schmidt decomposition [^7].

 [^ 7]: [Previous article](https://qiita.com/SamN/items/e894be584dddb69ec1e2) also emphasized the Schmidt decomposition.

What about $\rho^{A}$ and $\rho^{B}$? Since $\rho^{AB}$ can be written as

```math
\rho^{AB} = \sum_{i=1}^{R^{AB}} \sum_{j=1}^{R_i} \sum_{k=1}^{R_i} p_i \sqrt{r_{i}^{j} r_{i}^{k}} \ket{a_{i}^{j}} \bra{a_{i}^{k}} \otimes \ket{b_{i}^{j}} \bra{b_{i}^{k}} \tag{41}
```

we obtain

```math
\begin{align}
\rho^{A} &= Tr_{B} (\rho^{AB}) \\
&= \sum_{i=1}^{R^{AB}} \sum_{j=1}^{R_i} p_{i} r_{i}^{j} \ket{a_{i}^{j}} \bra{a_{i}^{j}} \\
&= \sum_{i=1}^{R^{AB}} p_{i} \rho_{i}^{A} \\
\rho^{B} &= Tr_{A} (\rho^{AB}) \\
&= \sum_{i=1}^{R^{AB}} \sum_{j=1}^{R_i} p_{i} r_{i}^{j} \ket{b_{i}^{j}} \bra{b_{i}^{j}} \\
&= \sum_{i=1}^{R^{AB}} p_{i} \rho_{i}^{B} \tag{42}
\end{align}
```


Here, if condition 1 is satisfied, $\{\ket{a_{1}^{j}}\}, \{\ket{a_{2}^{j}}\}, \cdots, \{\ket{a_{R^{AB}}^{j}}\}$ are all one and the same orthonormal system, and if condition 2 is satisfied, the $\{\ket{b_{i}^{j}}\}$ form an orthonormal system as a whole [^8].

[^8]: I believe this follows from the "common eigenbasis" and "orthogonal supports" assumptions, but strictly speaking it may be questionable (I may be assuming something too strong).

If so, equation (40) becomes

```math
\begin{align}
\rho_{i}^{A} &= \sum_{j=1}^{R} r^{j} \ket{a^{j}} \bra{a^{j}} \\
\rho_{i}^{B} &= \sum_{j=1}^{R} r^{j} \ket{b_{i}^{j}} \bra{b_{i}^{j}} \tag{43}
\end{align}
```

(By condition 1, the Schmidt coefficients and the Schmidt rank are independent of $i$; they are at the same time the Schmidt coefficients and Schmidt rank of system B.)

Under these assumptions, $\rho^{A}$ and $\rho^{B}$ become

```math
\begin{align}
\rho^{A} &= \sum_{i=1}^{R^{AB}} \sum_{j=1}^{R} p_{i} r^{j} \ket{a^{j}} \bra{a^{j}} \\
&= \sum_{j=1}^{R} r^{j} \ket{a^{j}} \bra{a^{j}} \\
\rho^{B} &= \sum_{i=1}^{R^{AB}} \sum_{j=1}^{R} p_{i} r^{j} \ket{b_{i}^{j}} \bra{b_{i}^{j}} \tag{44}
\end{align}
```


On this premise, let us calculate the subsystem entropies $S(A)$ and $S(B)$.

```math
\begin{align}
S(A) &= S(\rho^{A}) = -Tr(\rho^{A} \log \rho^{A}) \\
&= - \sum_{l=1}^{R} \sum_{m=1}^{R} \bra{a^{l}} \rho^{A} \ket{a^{m}} \bra{a^{m}} \log \rho^{A} \ket{a^{l}} \\
&= - \sum_{l=1}^{R} \sum_{m=1}^{R} r^{m} \braket{a^{l}}{a^{m}} \bra{a^{m}} \log r^{l} \ket{a^{l}} \\
&= - \sum_{l=1}^{R} r^{l} \log r^{l} \\
S(B) &= S(\rho^{B}) = -Tr(\rho^{B} \log \rho^{B}) \\
&= - \sum_{k=1}^{R^{AB}} \sum_{l=1}^{R} \sum_{m=1}^{R^{AB}} \sum_{n=1}^{R} \bra{b_{k}^{l}} \rho^{B} \ket{b_{m}^{n}} \bra{b_{m}^{n}} \log \rho^{B} \ket{b_{k}^{l}} \\
&= - \sum_{k=1}^{R^{AB}} \sum_{l=1}^{R} \sum_{m=1}^{R^{AB}} \sum_{n=1}^{R} p_{m} r^{n} \braket{b_{k}^{l}}{b_{m}^{n}} \bra{b_{m}^{n}} \log(p_{k} r^{l}) \ket{b_{k}^{l}} \\
&= - \sum_{k=1}^{R^{AB}} \sum_{l=1}^{R} p_{k} r^{l} \log (p_{k} r^{l}) \tag{45} 
\end{align}
```

Calculating $S(B)-S(A)$ while noting $\sum_{k} p_{k} = 1$ and $\sum_{l} r^{l} = 1$,

```math
\begin{align}
S(B)-S(A) &= - \sum_{k=1}^{R^{AB}} \sum_{l=1}^{R} p_{k} r^{l} \log (p_{k} r^{l}) + \sum_{l=1}^{R} r^{l} \log r^{l} \\
&= - \sum_{k=1}^{R^{AB}} \sum_{l=1}^{R} p_{k} r^{l} (\log p_{k} + \log r^{l}) + \sum_{l=1}^{R} r^{l} \log r^{l} \\
&= - \sum_{k=1}^{R^{AB}} p_{k} \log p_{k} - \sum_{l=1}^{R} r^{l} \log r^{l} + \sum_{l=1}^{R} r^{l} \log r^{l} \\
&= - \sum_{k=1}^{R^{AB}} p_{k} \log p_{k} = S(A,B)  \tag{46}
\end{align}
```

Thus it is proved that when condition 1 and condition 2 are satisfied, $S(A,B) = S(B) - S(A)$ holds. (End of proof)

That is how I cleared (only half of) "Exercise 11.16" of [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/) (I think). Once I work out the other direction of the if-and-only-if, I will add it in a follow-up article.


#### Property (10) Strong subadditivity

When there are three quantum systems A, B, and C, the following inequalities hold. This property is called "strong subadditivity". According to [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/), it is "one of the most important and useful results of quantum information theory". Rather than dismissing it with "what is this?", let us understand it well.

```math
S(A,B,C) + S(B) \leq S(A,B) + S(B,C) \tag{47}
```

```math
S(A) + S(B) \leq S(A,C) + S(B,C)  \tag{48}
```

 [Proof]

 The proof using "monotonicity of relative entropy" seems to be quick, so I will do it [^ 9].

[^9]: [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/) gives another, more intricate proof. On the other hand, [Introduction to Quantum Information Science](https://www.kyoritsu-pub.co.jp/bookdetail/9784320122994) uses the "monotonicity of relative entropy" without proof (because the proof is difficult for beginners). The proof via monotonicity is intuitively easier to understand, so here I describe the proof with reference to [Introduction to Quantum Information Science](https://www.kyoritsu-pub.co.jp/bookdetail/9784320122994).

 "Relative entropy monotonicity" is the property that the relative entropy for the quantum states of two systems decreases by passing through some physical process (quantum channel). Relative entropy is a concept equivalent to the "Kullback-Leibler distance" in classical information theory, so roughly speaking, it is a quantum expression of the image that the distance becomes smaller. In general, two different quantum states become indistinguishable over time, so I think they are intuitively easy to understand. Expressed as an expression

```math
S(\rho || \sigma) \geq S(\Gamma(\rho) || \Gamma(\sigma))  \tag{49}
```

Here, $\Gamma$ is the CPTP map representing the physical process.

So, taking equation (49) for granted, let us prove strong subadditivity.

 First, let's start with equation (47).

Consider the mutual information between system A and the composite system BC. From equation (27), which we used to prove that mutual information is non-negative (property (6)),

```math
\begin{align}
I(A:B,C) &= S(A)+S(B,C)-S(A,B,C) \\
&= S(\rho^{ABC} || \rho^{A} \otimes \rho^{BC})  \tag{50}
\end{align}
```

holds. Now trace out system C on the right-hand side. Since that operation is equivalent to applying a CPTP map, we can use equation (49):

```math
S(\rho^{ABC} || \rho^{A} \otimes \rho^{BC}) \geq S(\rho^{AB} || \rho^{A} \otimes \rho^{B}) = S(A)+S(B)-S(A,B)  \tag{51}
```

From equations (50) and (51),

```math
S(A,B,C) + S(B) \leq S(A,B) + S(B,C)  \tag{47}
```

Thus equation (47) is proved.

 Next is equation (48).

Consider the system ABCD obtained by purifying the composite system ABC with an auxiliary system D. From property (7),

```math
S(A,B) = S(C,D), \space S(A,B,C) = S(D)  \tag{52}
```

hold. Substituting these into equation (47) gives

```math
S(D) + S(B) \leq S(C,D) + S(B,C) \tag{53}
```

and relabeling D as A,

```math
S(A) + S(B) \leq S(A,C) + S(B,C) \tag{48}
```

Thus equation (48) is proved.

The equality holds only when system C is in a product state with the rest of the system. (End of proof)
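Strong subadditivity (47) can also be spot-checked numerically; in this sketch the partial traces over A, C, and both are my own einsum constructions on a random 3-qubit mixed state.

```python
import numpy as np

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

# random full-rank 3-qubit mixed state rho_ABC
a = np.random.randn(8, 8) + 1j * np.random.randn(8, 8)
rho_ABC = a @ a.conj().T
rho_ABC /= np.trace(rho_ABC)

t = rho_ABC.reshape(2, 2, 2, 2, 2, 2)                 # indices (a,b,c,a',b',c')
rho_AB = np.einsum('abcxyc->abxy', t).reshape(4, 4)   # trace out C
rho_BC = np.einsum('abcaxy->bcxy', t).reshape(4, 4)   # trace out A
rho_B  = np.einsum('abcayc->by', t)                   # trace out A and C

lhs = entropy(rho_ABC) + entropy(rho_B)
rhs = entropy(rho_AB) + entropy(rho_BC)
print(lhs <= rhs + 1e-9)   # True on every run
```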

From this strong subadditivity, properties (11), (12), and (13) shown below can be proved.


### Properties derived from strong subadditivity

#### Property (11) Conditioning reduces entropy

```math
S(A|B,C) \leq S(A|B)  \tag{54} 
```

holds.

 [Proof]

From strong subadditivity,

```math
\begin{align}
& S(A,B,C) + S(B) \leq S(A,B) + S(B,C) \\
& S(A,B,C) - S(B,C) \leq S(A,B) - S(B) \\
& S(A|B,C) \leq S(A|B)  \tag{55}
\end{align}
```


The equality holds only when system C is in a product state with the rest of the system. (End of proof)


#### Property (12) Mutual information decreases when part of the system is discarded

```math
I(A:B) \leq I(A:B,C)  \tag{56}
```

holds.

 [Proof]

From strong subadditivity,

```math
\begin{align}
& S(B) + S(A,B,C) \leq S(A,B) + S(B,C) \\
& S(A) + S(B) - S(A,B) \leq S(A) + S(B,C) - S(A,B,C) \\
& I(A:B) \leq I(A:B,C)  \tag{57}
\end{align}
```


The equality holds only when system C is in a product state with the rest of the system. (End of proof)


#### Property (13) Mutual information is reduced by quantum channels

Assuming that systems $A, B$ are mapped to systems $A^{\prime}, B^{\prime}$ by a quantum channel,

```math
I(A:B) \geq I(A^{\prime}:B^{\prime})  \tag{58}
```

holds [^10].

[^10]: With $\Gamma$ the CPTP map corresponding to the quantum channel under consideration, think of $\rho^{A^{\prime}}, \rho^{B^{\prime}}$ as $\Gamma(\rho^{A}), \Gamma(\rho^{B})$.

 [Proof]

Now suppose the three systems A, B, and C satisfy $\rho^{ABC} = \rho^{AB} \otimes \rho^{C}$. This corresponds to the equality condition of property (12), so

```math
I(A:B) = I(A:B,C) \tag{59}
```

holds. Suppose that A, B, C change to A′, B′, C′ as a result of applying a unitary transformation to system BC. From the unitary invariance of entropy (property (3)),

```math
I(A:B,C) = I(A^{\prime}:B^{\prime},C^{\prime})  \tag{60}
```

Since mutual information decreases when system C′ is discarded (property (12)),

```math
I(A^{\prime}:B^{\prime},C^{\prime}) \geq I(A^{\prime}:B^{\prime})  \tag{61}
```

From equations (59), (60), and (61),

```math
I(A:B) \geq I(A^{\prime}:B^{\prime})  \tag{62}
```

holds.


The equality holds when the quantum channel is a unitary transformation. (End of proof)


### Measurement

#### Property (14) Projective measurement increases entropy

Entropy increases under a non-selective projective measurement, that is, a projective measurement whose result is not recorded (or is forgotten).

If the projection operators are $\{P_i\}$ ($\sum_{i} P_i = I$, $P_{i}^{2} = P_{i}$), a non-selective measurement changes the state $\rho$ to

```math
\rho^{\prime} = \sum_{i} P_i \rho P_i  \tag{63}
```

At this time,

```math
S(\rho) \leq S(\rho^{\prime})  \tag{64}
```

holds.

 [Proof]

From Klein's inequality and the definition of relative entropy,

```math
\begin{align}
0 & \leq S(\rho || \rho^{\prime}) = Tr(\rho \log \rho) - Tr(\rho \log \rho^{\prime}) \\
&= - S(\rho) - Tr(\rho \log \rho^{\prime})  \tag{65}
\end{align}
```

So it suffices to show that $-Tr(\rho \log \rho^{\prime}) = S(\rho^{\prime})$. Using $\sum_{i} P_i = I$ and the cyclic property of the trace,


```math
\begin{align}
- Tr(\rho \log \rho^{\prime}) &= - Tr (\sum_{i} P_i \rho \log \rho^{\prime}) \\
&= -Tr(\sum_{i} P_i \rho \log \rho^{\prime} P_i)  \tag{66}
\end{align}
```

Now, $\rho^{\prime} P_i = P_i \rho^{\prime} P_i = P_i \rho^{\prime}$, that is, $P_i$ and $\rho^{\prime}$ commute, so $P_i$ and $\log \rho^{\prime}$ also commute. Then

```math
\begin{align}
- Tr(\rho \log \rho^{\prime}) &= -Tr(\sum_{i} P_i \rho P_i \log \rho^{\prime}) \\
&= -Tr(\rho^{\prime} \log \rho^{\prime}) = S(\rho^{\prime})  \tag{67}
\end{align}
```

Therefore, substituting this into equation (65),

```math
S(\rho) \leq S(\rho^{\prime})  \tag{64}
```

follows.

The equality holds only for $\rho = \rho^{\prime}$. (End of proof)
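A tiny numeric check of property (14) (entropy is my helper from the earlier sketches): a non-selective Z-basis measurement of the pure state $\ket{+}$ raises the entropy from 0 to 1, which is exactly the situation discussed next.

```python
import numpy as np

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)   # |+> = (|0> + |1>)/sqrt(2)
rho = np.outer(plus, plus.conj())                     # pure state, S(rho) = 0
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]        # projectors |0><0|, |1><1|
rho_prime = sum(p @ rho @ p for p in P)               # equation (63)
print(entropy(rho))        # 0.0
print(entropy(rho_prime))  # 1.0: the non-selective measurement raised it
```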

So, for example, if you (non-selectively) projectively measure a pure state with zero entropy, the outcome becomes uncertain and the entropy increases. If you liken measurement in classical information theory to observing which event occurred, seeing the result should reduce uncertainty, so the entropy should decrease; it feels strange that the entropy increases even though a measurement was made. Quantum information theory is built on the premise that a pure state is a definite state, but in reality a pure state is a superposition of several eigenstates, which is quite different from a definite state in classical statistics. I hope this lets you taste a little of that difference [^11].

[^11]: In other words, it may be understood that a "non-selective projective measurement" unravels the entanglement that existed in the original quantum system and exposes its uncertainty. One more note: since a "non-selective" measurement is premised here, this may seem strange at first glance, but in the case of a "selective projective measurement", where the result is duly confirmed, the entropy becomes zero just as in the classical theory.


## Confirmation by simulator

Now, from the properties shown above, let me pick out, by dogma and prejudice, property (7) and the equality condition of property (9), and use a simulator to confirm that they actually hold. Property (7) is the entropy of the subsystems of a pure state; I think it is the property that lets you enjoy the fun of quantum entropy in the simplest way, and I chose it because it is easy to implement with [qlazy](https://github.com/samn33/qlazy). Property (9) is the triangle inequality; I struggled to prove its equality condition, so I take it up to check whether I got it wrong.

### Property (7) Entropy of the subsystems of a pure state

First, property (7). As explained earlier, the entropy of a pure state is zero, but the entropies of the two subsystems obtained by splitting it in two are in general nonzero and equal to each other. Let us experience this mystery of quantum entanglement.

 The whole Python code is below.

```python
import numpy as np
from scipy.stats import unitary_group
from qlazypy import QState, DensOp

def random_qstate(qnum):  # random pure state

    dim = 2**qnum
    vec = np.array([0.0]*dim)
    vec[0] = 1.0
    mat = unitary_group.rvs(dim)
    vec = np.dot(mat, vec)
    qs = QState(vector=vec)

    return qs

if __name__ == '__main__':

    qnum_A = 2
    qnum_B = 2

    id_A = list(range(qnum_A))
    id_B = [i+qnum_A for i in range(qnum_B)]

    qs = random_qstate(qnum_A+qnum_B)
    de = DensOp(qstate=[qs], prob=[1.0])

    ent = de.entropy()
    ent_A = de.entropy(id_A)
    ent_B = de.entropy(id_B)

    print("** S(A,B)  = {:.4f}".format(ent))
    print("** S(A)    = {:.4f}".format(ent_A))
    print("** S(B)    = {:.4f}".format(ent_B))

    qs.free()
    de.free()
```

It is short, so the explanation is brief. The random_qstate function creates a random pure state qs, and a density operator de is created from it. Entropy is calculated with the entropy method of the density operator class (added in v0.0.28). If you pass a list of qubit ids specifying a subsystem as an argument, the corresponding entanglement entropy is calculated. The execution result is as follows.

```
** S(A,B)  = 0.0000
** S(A)    = 1.4852
** S(B)    = 1.4852
```

Since we start from a randomly generated pure state, the entropy values change on every run, but S(A,B) is zero no matter how many times we run it, and the values of S(A) and S(B) always match. The same was true when I changed the qubit numbers (qnum_A, qnum_B) of system A and system B. So property (7) is confirmed. Incidentally, in this example system A and system B each have 2 qubits, so by property (2) the entropy does not exceed 2; if it did, it would be a bug.


### Property (9) The equality condition of the triangle inequality

Next is property (9), the equality condition of the triangle inequality that was "Exercise 11.16" of [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/). If we construct the density operator of the composite system AB following the procedure used in the proof, we should be able to confirm whether the equality really holds. In fact, "Exercise 11.17" of [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/) asks for a concrete example that satisfies the equality, so this also serves as an example answer to that exercise [^12].

[^12]: Although the exercises in [Nielsen, Chuang](https://www.ohmsha.co.jp/book/9784274200090/) are meant to be worked out by hand, so this answer may not be in line with the questioner's intention.

 Now let's look at the whole Python code.

```python
import random
import math
import numpy as np
from qlazypy import QState, DensOp

def computational_basis(dim,rank):

    return np.array([[1 if j==i else 0 for j in range(dim)] for i in range(rank)])

def normalized_random_list(dim):

    rand_list = np.array([random.random() for _ in range(dim)])
    return rand_list / sum(rand_list)

def is_power_of_2(N):

    if math.log2(N) == int(math.log2(N)):
        return True
    else:
        return False
    
if __name__ == '__main__':

    # settings
    mixed_num = 3  # mixed number of pure states
    qnum_A = 2     # qubit number of system A

    # system A
    dim_A = 2**qnum_A
    rank = dim_A
    id_A = list(range(qnum_A))

    # system B
    dim_B = mixed_num*rank
    if is_power_of_2(dim_B):
        qnum_B = int(math.log2(dim_B))
    else:
        qnum_B = int(math.log2(dim_B)) + 1
        dim_B = 2**qnum_B
    id_B = [i+qnum_A for i in range(qnum_B)]

    # basis of system A,B
    basis_A = computational_basis(dim_A, rank)
    basis_B = computational_basis(dim_B, mixed_num*rank)

    # random schmidt coefficients
    coef = normalized_random_list(rank)

    # random probabilities for mixing the pure states
    prob = normalized_random_list(mixed_num)

    # basis for system A+B
    dim_AB = dim_A * dim_B
    basis_AB = [None]*mixed_num
    for i in range(mixed_num):
        basis_AB[i] = np.zeros(dim_AB)
        for j in range(dim_A):
            basis_AB[i] = basis_AB[i] + \
                math.sqrt(coef[j]) * np.kron(basis_A[j],basis_B[i*dim_A+j])

    # construct the density operator
    matrix = np.zeros((dim_AB,dim_AB))
    for i in range(mixed_num):
        matrix = matrix + prob[i] * np.outer(basis_AB[i],basis_AB[i])
    de = DensOp(matrix=matrix)

    # calculate the entropies
    ent = de.entropy()
    ent_A = de.entropy(id_A)
    ent_B = de.entropy(id_B)

    print("** S(A,B)    = {:.4f}".format(ent))
    print("** S(A)      = {:.4f}".format(ent_A))
    print("** S(B)      = {:.4f}".format(ent_B))
    print("** S(B)-S(A) = {:.4f}".format(ent_B-ent_A))

    de.free()
```

Assuming condition 1 and condition 2 explained earlier, in

```math
\ket{i}^{AB} = \sum_{j=1}^{R_i} \sqrt{r_{i}^{j}} \ket{a_{i}^{j}} \ket{b_{i}^{j}}  \tag{39}
```

the Schmidt coefficients $r_{i}^{j}$ and the vectors $\ket{a_{i}^{j}}$ are independent of $i$, and the $\ket{b_{i}^{j}}$ (as a set of vectors indexed by both $i$ and $j$) form an orthonormal system as a whole. That is,

```math
\ket{i}^{AB} = \sum_{j=1}^{R} \sqrt{r^{j}} \ket{a^{j}} \ket{b_{i}^{j}}  \tag{68}
```

can be written. Using this, we only need to prepare the density operator

```math
\rho^{AB}=\sum_{i=1}^{R^{AB}} p_{i} \ket{i}^{AB} \bra{i}^{AB}  \tag{69}
```


In the code shown above, we first set $R^{AB}$ (variable mixed_num) to 3 and the qubit number of system A (variable qnum_A) to 2. The dimension of system A's Hilbert space is then $2^2 = 4$ (variable dim_A). System B must carry an orthonormal system indexed by both i and j, so its dimension should be $R^{AB}$ times the dimension of system A, that is, 12 (= 3 * 4). However, it is inconvenient for a Hilbert space dimension in quantum information theory not to be a power of 2, so it is padded to 16 (= 2 ** 4).

With the dimensions and qubit numbers of systems A and B fixed, we construct an orthonormal system for each. Anything will do, so we use the computational basis, which is the easiest to implement; it is created with the function computational_basis.

What we also need are the coefficients $r^{j}$ in equation (68) and the coefficients $p_{i}$ in equation (69). These are determined randomly (but normalized so that each set sums to 1), with the function normalized_random_list.

Now that the ingredients are ready, we construct the density operator $\rho^{AB}$ (variable de). Finally, the entropy method calculates and displays the entropy of the composite system, the entropies of system A and system B, and their difference, and the program ends.

 The results are as follows.

```
** S(A,B)    = 1.3598
** S(A)      = 1.8018
** S(B)      = 3.1616
** S(B)-S(A) = 1.3598
```

Indeed, it turns out that S(B) - S(A) = S(A,B) holds. The numbers differ on every run, but this equation always holds. I tried changing some of the initial settings, and it was the same.


## In conclusion

I wanted to fully enjoy the fun and mystery of entropy in quantum information theory, and it has turned into quite a long article. There are still things I have not covered, so in the future, if necessary, or if I feel like it, I would like to take them up in separate articles.

Well, as for what comes next, I am still wondering: "quantum cryptography" or "quantum error correction", or perhaps a somewhat more basic topic before that. The schedule is undecided.


That's all.

