[PYTHON] I tried to dig deeper about safety while calculating the stochastic finality of Proof of Work

The public blockchain is called the public chain. Many blockchains of this type use the "Proof of Work" as the process of uniquely determining the correct data. And this mechanism has the property that there is no finality (settlement completion). In short, this is the nature of sending an asset to someone and not admitting that it has certainly been deposited. The question may arise, "Well, can it be used as money?", But the concept of "stochastic finality" has been introduced as a solution. This time, we will dig deeper while actually calculating the stochastic finality.

What is finality (payment completion)?

Finality means "it can be considered that the settlement has been confirmed". If you use the money you normally use, if you hand the 100-yen coin to the cashier, the ownership of the 100-yen coin will be transferred at that point, so you can consider that the settlement has been confirmed there. Even in the case of bank transfer, the settlement will be completed as long as the bank that manages the account approves the transfer.

However, in the public chain, its finality is ** stochastically ** fixed. In other words, it's okay to admit that the payment has been confirmed after this amount of time (after mining has progressed). Therefore, if mining is successful, it is not safe and there is always the possibility that it will be overturned.

Hash power ≒ computing power

Let's take a look at the concept of "hash power" before considering stochastic finality. Hash power represents the computing power of the machines participating in mining in the sense of computing power. Of course, the higher the computing power, the faster you can succeed in mining because you can do more calculations.

Also, if you start mining on two machines with different hash powers at the same time, the machine with the higher hash power will be able to mine faster and win the competition.

Satoshi Nakamoto's way of thinking

Even in the paper "Bitcoin: A Peer-to-Peer Electronic Cash System", which is said to be the original text that summarizes the core ideas of Bitcoin, it is a stochastic finality. Indirectly mentions.

In Chapter 11, "Calculations", there is a part that actually calculates and verifies how strong the Proof of Work is against malicious attacks. An attack here is an attack in which a block that has been successfully mined is revoked by a chain longer than that. Proof of Work has a rule that the longest chain is the correct information, so even if you think it is the longest at the moment, if a longer chain comes out, that is "justice". This is one of the important parts, as if the phenomenon of frequent block expiration and invalidation of transactions occurs frequently, it will be completely useless as a currency, let alone finality.

By the way, in Satoshi Nakamoto's paper, the Poisson distribution of statistics is used to calculate as follows. (If you are allergic to mathematical formulas, skip to the next chapter ...)

CodeCogsEqn (2).png

The meaning of each variable is as follows.

Variables / constants meaning
p Probability of a well-meaning miner succeeding in mining
q Probability of malicious miners succeeding in mining
z The number of blocks after a transaction is stored in a block
λ CodeCogsEqn (1).png
e Napier number (base of natural logarithm)

By the way, mining is always successful for either good or bad miners, so $ p + q = 1. It is complicated with case classification, but it is roughly as follows. First, the probability that an attacker who is delayed by z blocks will succeed in mining and catch up with the correct blockchain is * $ (q / p) ^ z $ *. $ (q / p) $ is the probability that only one block can be mined, and if q = 0.1, it will be 1/9. Since it continues z times (z blocks), it takes z times to multiply by z.

The probability that both of the above two events will occur can be calculated by multiplying by * $ \ frac {λ ^ ke ^ {-λ}} {k!} (\ Frac {q} {p}) ^ z $ *. ..

Then, the above formula is divided into sigma when the value of z is larger than k and when it is smaller than k. After that, it is expressed by subtracting the probability that the event does not occur from the total probability (1) as shown on the right side. This way of thinking is called a complementary event. By doing this, infinity disappears and you can think of sigma from 0 to z.

CodeCogsEqn (3).png

After that, arrange the right side of the above formula like the right side of the lower formula ...

CodeCogsEqn (4).png

The formula below is completed. This formula also appears in Satoshi Nakamoto's paper, but it is easier to handle in the program because there is no infinity and there is only one sigma.

CodeCogsEqn.png

Also, in this paper, this formula is incorporated into the code (C language) for calculation.

probability.c


#include <math.h>
double AttackerSuccessProbability(double q, int z)
{
  double p = 1.0 - q;
  double lambda = z * (q / p);
  double sum = 1.0;
  int i, k;
  for (k = 0; k <= z; k++)
    {
      double poisson = exp(-lambda);
      for (i = 1; i <= k; i++)
        poisson *= lambda / i;
      sum -= poisson * (1 - pow(q / p, z - k));
    }
    return sum;
}

I calculated it with Python

It is possible to calculate in C language as it is in the paper, but (personally) I am good at scientific calculation, so let's convert it to Python and calculate.

Probability of successful attacker

In Python, it looks like this:

probability.py


import numpy as np

def attcker_success(q, z):    
    p = 1.0 - q
    l = z * (q/p)
    sum = 1.0

    for k in range(0, z+1):
        poisson = np.exp(-l)
        for i in range(1, k+1):
            poisson *= l / i
        sum -= poisson * (1 - np.power(q/p, z-k))
    return sum

The usage of variables and the calculation procedure are used almost as they are. If you execute the following code, there is a probability of q = 0.1, z = 10, that is, if a malicious miner succeeds in successful mining with a probability of 10% and blocks are connected 10 blocks longer than a well-meaning miner. You can calculate the probability.

probability.py


print('{:.50f}'.format(attcker_success(0.1, 10)))

Output result


0.00000124140217479795642434325410319306826067986549

Let's calculate the probability with various patterns a little more! In the for statement, calculate the probability when the value of z is 0 to 10 and output it.

probability.py


for z_i in range(0,11):
    print("z=",z_i,'  {:.50f}'.format(attcker_success(0.1, z_i)),sep='')

Output result


z=0  1.00000000000000000000000000000000000000000000000000
z=1  0.20458727394278242162073411236633546650409698486328
z=2  0.05097789283933862325426389361382462084293365478516
z=3  0.01317224167889648189788687204782036133110523223877
z=4  0.00345524346648517360902630457530904095619916915894
z=5  0.00091368218792791224339144839916571072535589337349
z=6  0.00024280274536281863662079416599226533435285091400
z=7  0.00006473531692709972641154581030065173763432539999
z=8  0.00001729980418716505960723822665769944251223932952
z=9  0.00000463116397250268184871292015403199116008181591
z=10  0.00000124140217479795642434325410319306826067986549

If you compare it with the probability in Satoshi Nakamoto's paper, you can see that it matches. satoshinakamoto-report.png

I tried to make a graph

So far, we have calculated the probability, but let's draw it on a graph and visualize it. While inheriting the code so far, let's write the code as follows.

plot.py


import numpy as np
import matplotlib.pyplot as plt

def attcker_success(q, z):    
    p = 1.0 - q
    l = z * (q/p)
    sum = 1.0
    for k in range(0, z+1):
        poisson = np.exp(-l)
        for i in range(1, k+1):
            poisson *= l / i
        sum -= poisson * (1 - np.power(q/p, z-k))
    return sum

def attcker_plot(q, z):
    probability = []
    progress = []
    for z_prg in range(z+1):
        progress.append(z_prg)
        probability.append(attcker_success(q, z_prg))
    plt.plot(progress, probability)
    plt.show()

Let's draw a graph for the case of q = 0.1 and z = 10 for which the probability was calculated earlier. It is okay if you specify it as an argument as follows.

plot.py


attcker_plot(0.1, 10)

The output graph is as follows. The horizontal axis is z and the vertical axis is the success rate of malicious miners. You can see that the larger the value of z, the closer the probability is to 0, and from around ** 4 it is ** as close to 0 as possible. download.png

From here, let's draw a graph with a little more patterns. Let's rewrite the code as follows and consider a total of 5 patterns with q values in 0.05 increments from 0.1 to 0.25. Also, the value of z has been increased to 20. The graphs are overlaid. The graph above was very simple, so let's use seaborn to make it a little "gorgeous".

comparison.py


import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

def attcker_success(q, z):    
    p = 1.0 - q
    l = z * (q/p)
    sum = 1.0
    for k in range(0, z+1):
        poisson = np.exp(-l)
        for i in range(1, k+1):
            poisson *= l / i
        sum -= poisson * (1 - np.power(q/p, z-k))
    return sum

def attcker_data(q, z):
    probability = []
    for z_prg in range(z+1):
        probability.append(attcker_success(q, z_prg))
    return probability

probability_list = []
q_value_list = np.arange(0.05, 0.3, 0.05)

for q_value in q_value_list:
    probability_list.append(attcker_data(q_value, 20))
df = pd.DataFrame(probability_list, index=["q=0.05","q=0.10","q=0.15","q=0.20","q=0.25"])
df = df.T

sns.set(style="darkgrid")
sns.set_context("talk")
plt.figure(figsize=(15, 10))
plt.xlabel('z')
plt.ylabel('probability')
plt.xticks( np.arange(0, 22, 1) )
plt.yticks( np.arange(0, 1.1, 0.1) )
sns.lineplot(data=df)

The graph drawn by executing this code is as follows. figure-comparison.jpg

You can see that the higher the value of q (probability of success of the attacker), the slower the change in the graph. In this way, the higher the hash power of the attacker and the higher the probability of successful mining, the more blocks can be connected continuously to create a unique chain.

In the end, how long should I wait?

So far, we've been thinking about how probabilistically malicious miners will succeed. An important point when considering finality is the belief that a transaction once made will not be overturned. Looking back at the discussion so far, we have to think about how likely it is that a malicious miner will overturn the blockchain. In that sense, it is necessary to determine the finality stochastically, but as you can see from the graph, the larger z, the smaller the probability of exponential flipping. Therefore, even if the value of q (success rate of attacker) is high to some extent, the probability approaches 0 as long as z exceeds a certain level.

** If it approaches 0 infinitely, it may be said that it is realistic to confirm the transaction and consider that the settlement is completed **, which is the basis of the stochastic finality. I am.

In fact, in the case of Bitcoin, the transaction is considered to be settled only after 6 blocks have passed. In the case of Bitcoin, one block is mined in 10 minutes on average, so it is necessary to wait about 1 hour to complete the payment. Looking at the graph above, the probability that the chain will be overturned when 6 blocks are connected is close to 0. (Although it is not 0 ...)

In addition, you can receive a mining reward when a miner succeeds in mining, but this is not available unless you connect 100 blocks from the block mined by the miner.

Summary

This time, I tried to confirm the stochastic finality statistically. In the public chain, we don't know who is participating in the network and how many people, so we are trying to prevent fraud by using the computing power of the machine. As introduced here, if you think probabilistically, it seems that you can guarantee the correctness even in a system without an administrator.

However, if the conditions are met to some extent, it will be possible to cheat, such as a successful attack and a non-existent transaction, or a non-existent transaction. I'll talk about this in another article.

References </ b> N.Satoshi(2008)"Bitcoin: A Peer-to-Peer Electronic Cash System"

Recommended Posts

I tried to dig deeper about safety while calculating the stochastic finality of Proof of Work
I tried to improve the efficiency of daily work with Python
I tried to summarize the logical way of thinking about object orientation.
I tried to touch the API of ebay
I tried to correct the keystone of the image
I tried to predict the price of ETF
I tried to vectorize the lyrics of Hinatazaka46!
I tried to automate the face hiding work of the coordination image for wear
I tried to summarize the basic form of GPLVM
I tried to visualize the spacha information of VTuber
I tried to erase the negative part of Meros
I tried to implement automatic proof of sequence calculation
I tried to classify the voices of voice actors
I tried to summarize the string operations of Python
I tried to find the entropy of the image with python
I tried to find out the outline about Big Gorilla
I tried to get the location information of Odakyu Bus
I tried to find the average of the sequence with TensorFlow
[Python] I tried to visualize the follow relationship of Twitter
[Machine learning] I tried to summarize the theory of Adaboost
I tried to fight the Local Minimum of Goldstein-Price Function
[Linux] I tried to summarize the command of resource confirmation system
I tried to get the index of the list using the enumerate function
I tried to automate the watering of the planter with Raspberry Pi
I tried to fix "I tried stochastic simulation of bingo game with Python"
I tried to expand the size of the logical volume with LVM
I tried to summarize the frequently used implementation method of pytest-mock
I tried to visualize the common condition of VTuber channel viewers
I calculated the stochastic integral (I to integral)
I tried to organize about MCMC.
I tried to move the ball
I tried to estimate the interval.
I tried to take the difference of Config before and after work with pyATS / Genie self-made script
When I tried to write about logistic regression, I ended up finding the mean and variance of the logistic distribution.
I tried to transform the face image using sparse_image_warp of TensorFlow Addons
I tried to visualize the age group and rate distribution of Atcoder
I tried transcribing the news of the example business integration to Amazon Transcribe
zoom I tried to quantify the degree of excitement of the story at the meeting
I tried to estimate the similarity of the question intent using gensim's Doc2Vec
I wanted to be careful about the behavior of Python's default arguments
I tried how to improve the accuracy of my own Neural Network
I tried to solve the 2020 version of 100 language processing [Chapter 3: Regular expressions 25-29]
I tried to get the authentication code of Qiita API with Python.
I tried to automatically extract the movements of PES players with software
(Python) I tried to analyze 1 million hands ~ I tried to estimate the number of AA ~
I tried to find the optimal path of the dreamland by (quantum) annealing
I tried to extract and illustrate the stage of the story using COTOHA
I tried to verify and analyze the acceleration of Python by Cython
I tried to analyze the negativeness of Nono Morikubo. [Compare with Posipa]
I tried to streamline the standard role of new employees with Python
I tried to visualize the text of the novel "Weathering with You" with WordCloud
[Linux] I tried to verify the secure confirmation method of FQDN (CentOS7)
I tried to get the RSS of the top song of the iTunes store automatically
I tried to get the movie information of TMDb API with Python
I tried to display the altitude value of DTM in a graph
I tried the common story of using Deep Learning to predict the Nikkei 225
Using COTOHA, I tried to follow the emotional course of Run, Melos!
I tried to verify the result of A / B test by chi-square test
I tried to predict the behavior of the new coronavirus with the SEIR model.
I tried the asynchronous server of Django 3.0
I tried to summarize the umask command