I tried "Brute force of MD5 hash value of 6-digit password" with Python

It looked interesting, so I decided to join in.

This is a project (?) that originated from "Making a password inquiry system (clojure reducers)": speculation about how JAL's password management system could be implemented within a realistic range.

Please don't tell me this is old news or outdated.

Past achievements (in no particular order)

Please let me know if anything is missing from the list, and I will add it.

A record that doesn't stand out at all is being added here:

"Brute force of MD5 hash value of 6 digit password" by Python

Execution environment

OS: Windows 7 Home Premium SP1
CPU: i7-4770T @ 2.50GHz
Python: 3.4.1 (Anaconda 2.0.1 (64-bit))

Source code

Since this is Python 3, the character string has to be converted to a byte string before hashing. The CPU has 8 logical cores (4 physical), but I used 4 parallel processes. The code I used can be found here: https://github.com/soyiharu/md5_time_trial
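As a minimal sketch of that conversion step: in Python 3, `hashlib.md5` accepts only bytes-like objects, so the `salt$NNNNNN` string must be encoded first. The helper name `candidate_digest` is made up for this illustration and does not appear in the repository.

```python
import hashlib

def candidate_digest(salt, number):
    """Return the MD5 hex digest of 'salt$NNNNNN' (hypothetical helper)."""
    text = "{0}${1:06d}".format(salt, number)  # e.g. "hoge$000123"
    data = text.encode("utf-8")                # str -> bytes
    return hashlib.md5(data).hexdigest()

try:
    hashlib.md5("hoge$000123")  # passing a str directly fails in Python 3
except TypeError as exc:
    print("TypeError:", exc)

print(candidate_digest("hoge", 123))
```

The same helper can be used to produce a target hash for a known password when trying out the scripts below.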

Single thread version

hash_single.py


import time
import hashlib
import sys

def main():
    argv = sys.argv[1:]
    if len(argv) != 2:
        sys.exit(0)

    salt = argv[0]
    hash = argv[1]
    
    start = time.time()
    for i in range(1000000):
        pw = "{0}${1:06d}".format(salt, i).encode("utf-8")
        tmp = hashlib.md5(pw).hexdigest()
        if hash == tmp:
            print("match[{0:06d}]".format(i))
    end = time.time()
    print("elapsed time:{0}s".format(end - start))
    
if __name__ == "__main__":
    main()

Parallel version (multiprocessing)

hash_parallel.py


import time
import hashlib
import sys
from multiprocessing import Pool
from itertools import repeat

def calc_hash(arg):
    hash, salt, i = arg
    tmp = hashlib.md5("{0}${1:06d}".format(salt, i).encode("utf-8")).hexdigest()
    return hash == tmp
    
def main():
    argv = sys.argv[1:]
    if len(argv) != 2:
        sys.exit(0)

    salt = argv[0]
    hash = argv[1]
    
    start = time.time()
    
    pool = Pool(4)
    result = pool.map(calc_hash, zip(repeat(hash), repeat(salt), range(1000000)))
    if True in result:  # list.index would raise ValueError when nothing matches
        index = result.index(True)
        print("match[{0:06d}]".format(index))
    
    end = time.time()
    print("elapsed time:{0}s".format(end - start))
    
if __name__ == "__main__":
    main()

Execution method

I ran each script 5 times and measured the time.

python hash_single.py hoge 4b364677946ccf79f841114e73ccaf4f
python hash_parallel.py hoge 4b364677946ccf79f841114e73ccaf4f

Measurement result

Version  1st          2nd          3rd          4th          5th          Average      Std. dev.
Single   1.724097967  1.736099005  1.729099035  1.733099937  1.739099026  1.732298994  0.005891065
Multi    1.086061954  1.098062992  1.080061913  1.113064051  1.085062027  1.092462587  0.013278723

All times are in seconds.

The parallel execution time is about 63% of the non-parallel time.

※ Reference: a slightly modified version of the code from the article "'Brute force of MD5 hash value of 6-digit password' got down to 0.70 seconds", run with OpenMP (4 parallel threads), took 0.912 s. (Source Code)
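For anyone who wants to check the arithmetic, here is a small sketch that recomputes the averages, the standard deviations, and the ratios from the five measurements. It assumes the sample standard deviation (n - 1 denominator), since that is what reproduces the figures in the table; the variable names are only for illustration.

```python
import statistics

# Measured times in seconds, copied from the table
single = [1.724097967, 1.736099005, 1.729099035, 1.733099937, 1.739099026]
multi = [1.086061954, 1.098062992, 1.080061913, 1.113064051, 1.085062027]

single_mean = statistics.mean(single)
multi_mean = statistics.mean(multi)

# statistics.stdev is the sample standard deviation (n - 1 denominator)
single_sd = statistics.stdev(single)
multi_sd = statistics.stdev(multi)

ratio = multi_mean / single_mean  # parallel time relative to single
c_gap = 1 - 0.912 / multi_mean    # how much shorter the OpenMP run (0.912 s) was

print("single: mean={0:.9f} sd={1:.9f}".format(single_mean, single_sd))
print("multi:  mean={0:.9f} sd={1:.9f}".format(multi_mean, multi_sd))
print("parallel / single: {0:.1%}".format(ratio))
print("gap to OpenMP run: {0:.1%}".format(c_gap))
```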

Summary

Parallelization only brought the execution time down to about 63%, so as parallelization it is not effective at all. Still, I think Python did well to come within about 16% of the C version. You could also say the C version is just not very good.
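One possible cause of the weak scaling (an assumption; I have not measured it) is that `pool.map` has to pickle a million argument tuples and send back a million booleans, so part of the time goes into moving data between processes rather than hashing. A sketch of an alternative, where each worker scans one contiguous range and returns only the match, might look like this. The names `search_chunk`, `split_ranges`, and `find_password` are hypothetical and not from the repository.

```python
import hashlib
from multiprocessing import Pool

def search_chunk(args):
    """Scan candidates in [start, stop) and return the match, or None."""
    salt, target, start, stop = args
    for i in range(start, stop):
        pw = "{0}${1:06d}".format(salt, i).encode("utf-8")
        if hashlib.md5(pw).hexdigest() == target:
            return i
    return None

def split_ranges(total, parts):
    """Split range(total) into contiguous (start, stop) pairs."""
    step = (total + parts - 1) // parts  # ceiling division
    return [(s, min(s + step, total)) for s in range(0, total, step)]

def find_password(salt, target, processes=4, total=1000000):
    """Search all candidates using one task per worker process."""
    tasks = [(salt, target, s, e) for s, e in split_ranges(total, processes)]
    with Pool(processes) as pool:
        for found in pool.map(search_chunk, tasks):
            if found is not None:
                return found
    return None
```

As with hash_parallel.py, `find_password(salt, hash)` should be called from under an `if __name__ == "__main__":` guard, which multiprocessing requires on Windows. Whether this actually closes the gap would need to be measured.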

Postscript (2014/09/23)

I learned about the existence of bytearray. Python's str and bytes objects cannot be modified, but a bytearray can be modified in place. Using this, I thought the single thread version would get faster if I rewrote only the 6-digit number part instead of recreating the entire byte string every time, but it was only about 0.1 seconds faster, which was less than I expected.

I'll post just the changes, enough to show what I did.

python


from itertools import product
pw = bytearray("{0}${1:06d}".format(salt, 0).encode("utf-8"))  # Prepare the bytearray once
for i, value in enumerate(product(b'0123456789', repeat=6)):
    pw[-6:] = value  # Overwrite only the 6-digit number part
    # ... hash pw and compare, as in hash_single.py
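Put together as a complete function, the fragment above might look like the following sketch. The function name `find_password_bytearray` and the early return on a match are my additions for illustration; the measured script may differ.

```python
import hashlib
from itertools import product

def find_password_bytearray(salt, target):
    """Return the matching 6-digit number, or None (hypothetical helper)."""
    # Build "salt$000000" once; only the last 6 bytes change per candidate.
    pw = bytearray("{0}${1:06d}".format(salt, 0).encode("utf-8"))
    # Iterating over bytes yields ints, so product() yields tuples of
    # ASCII codes in the same order as counting from 000000 to 999999.
    for i, digits in enumerate(product(b"0123456789", repeat=6)):
        pw[-6:] = digits  # overwrite the digits in place
        if hashlib.md5(pw).hexdigest() == target:
            return i
    return None

target = hashlib.md5(b"hoge$000042").hexdigest()
print(find_password_bytearray("hoge", target))
```

`hashlib.md5` accepts a bytearray directly, so no copy back to bytes is needed before hashing.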

In the end, if you want to pursue speed, C/C++ is probably the way to go.
