[PYTHON] How to create a large amount of test data in MySQL? ??

After defining DB / table during development using MySQL, you may want test data. Here, we will introduce the procedure from database creation to test data registration.

Executing the script file

When creating a large amount of test data, it is necessary to automate the work. In MySQL, you can execute a file in which SQL commands are written in addition to manually entering SQL commands.

For example, if there is a file that describes the following SQL command, the execution method is as follows.

SQLFile.sql


CREATE DATABASE CreateTestData;
SHOW DATABASES;

Terminal


#Log in to mysql
mysql -u root -p
Enter password: 
・
・
・
# SQLFile.Run sql
mysql> SOURCE SQLFile.sql
+--------------------+
| Database           |
+--------------------+
| information_schema |
| CreateTestData     |
| mysql              |
| performance_schema |
+--------------------+
5 rows in set (0.03 sec)

Since it is possible to read an external file and execute SQL commands in this way, if you can generate a file that describes a large amount of SQL commands that generate test data, you can register a large amount of test data.

Creating a script file for registering test data

Table to register data

This time, define the following user table and create sample data

CREATE.sql


CREATE TABLE user (
  user_id INT PRIMARY KEY AUTO_INCREMENT,
  user_name VARCHAR(45),
  user_age INT,
  created_at DATETIME
 );

Creating a program that generates test data like that

From the viewpoint of understanding and reading the contents, the program that generates the sql script with python3 is shown below.

createSQL.py


# coding:utf-8
import random
import time

#Generate a name from a list of surnames and names
def randomName():
    myouji = ["Tanaka", "Suzuki", "Sato", "Takahashi"]
    namae = ["Taro", "Jiro", "Saburo", "Hanako"]
    return random.choice(myouji) + random.choice(namae)

# 0~Generate 100 random numbers
def randomAge():
    return random.randint(0, 100)

#Random date generation between start and end
def randomDate(start, end):
    format = '%Y-%m-%d %H:%M:%S'
    stime = time.mktime(time.strptime(start, format))
    etime = time.mktime(time.strptime(end, format))
    ptime = stime + random.random() * (etime - stime)
    return time.strftime(format, time.localtime(ptime))

#File name to output
OUTPUT_FILE = "TestData.sql"

#Number of data to be registered
RECORD_COUNT = 10

#SQL command string to execute
sqlCommands = ""

#Specify the database to use(This time CreateTestData)
sqlCommands += "USE CreateTestData;\n"

#Generate INSERT statements for the number of data to be registered
for _ in range(RECORD_COUNT):

    #Random data generation to register
    name = randomName()
    age  = randomAge()
    date = randomDate("2014-6-28 00:00:00", "2015-6-28 00:00:00")

    #Generate Insert statement from random data
    sqlCommands += "INSERT INTO user " \
                   "(user_name, user_age, created_at) " \
                   "VALUES ('{}', '{}', '{}');\n"\
                   .format(name, age, date)

#Write the generated SQL command to a file
f = open(OUTPUT_FILE, 'w')
f.write(sqlCommands)
f.close()

Running the above program will generate a file called TestData.sql in the same directory. If it can be generated successfully, the contents of TestData.sql will be as follows, and INSERT statements for the number of registered data are described.

TestData.sql


USE CreateTestData;
INSERT INTO user (user_name, user_age, created_at) VALUES ('Jiro Sato', '39', '2015-04-28 04:08:08');
INSERT INTO user (user_name, user_age, created_at) VALUES ('Jiro Suzuki', '60', '2015-06-25 23:37:52');
INSERT INTO user (user_name, user_age, created_at) VALUES ('Saburo Tanaka', '40', '2015-04-20 04:56:53');
・
・
・

Execution of the generated script file

To register the test data, let's execute and check the generated script file as shown above.

Terminal


#Log in to mysql
mysql -u root -p
Enter password: 
・
・
・

#Registration of test data
mysql> SOURCE TestData.sql

Database changed
Query OK, 1 row affected (0.01 sec)

Query OK, 1 row affected (0.00 sec)
・
・
・

#Check if the data has been registered
mysql> select * from user;
+---------+--------------+----------+---------------------+
| user_id | user_name    | user_age | created_at          |
+---------+--------------+----------+---------------------+
|       1 |Jiro Sato|       39 | 2015-04-28 04:08:08 |
|       2 |Jiro Suzuki|       60 | 2015-06-25 23:37:52 |
|       3 |Saburo Tanaka|       40 | 2015-04-20 04:56:53 |
|       4 |Saburo Sato|       77 | 2015-04-08 02:46:19 |
|       5 |Jiro Sato|       57 | 2014-11-17 11:21:06 |
|       6 |Taro Takahashi|       83 | 2014-09-20 07:57:11 |
|       7 |Hanako Tanaka|       24 | 2014-11-17 00:51:38 |
|       8 |Jiro Takahashi|       89 | 2014-11-29 20:22:24 |
|       9 |Taro Tanaka|       38 | 2015-02-26 18:04:42 |
|      10 |Jiro Sato|       13 | 2014-09-26 13:14:35 |
+---------+--------------+----------+---------------------+
10 rows in set (0.00 sec)

It was confirmed that 10 sample data were successfully registered.

Summary

I was able to register the data by generating a script file to create test data and executing the script file. As the number of tables and columns increases, you can register test data for any table or column by extending the program accordingly.

Let's Enjoy Development Life!

Recommended Posts

How to create a large amount of test data in MySQL? ??
[TensorFlow 2.x compatible version] How to train a large amount of data using TFRecord & DataSet in TensorFlow (Keras)
How to create a JSON file in Python
How to create data to put in CNN (Chainer)
How to create a Rest Api in Django
How to send a visualization image of data created in Python to Typetalk
Example of how to aggregate a large amount of time series data using Python at a reasonable speed in a small memory environment
One-liner to create a large number of test files at once on Linux
How to develop in a virtual environment of Python [Memo]
How to get a list of built-in exceptions in python
How to get an overview of your data in Pandas
How to get a quadratic array of squares in a spiral!
How to create a Conda package
How to create a Dockerfile (basic)
How to create a config file
How to create an instance of a particular class from dict using __new__ () in python
How to plot the distribution of bacterial composition from Qiime2 analysis data in a box plot
How to display a specified column of files in Linux (awk)
Overview of how to create a server socket and how to establish a client socket
How to check the memory size of a variable in Python
How to create a heatmap with an arbitrary domain in Python
How to check the memory size of a dictionary in Python
<Pandas> How to handle time series data in a pivot table
How to get the vertex coordinates of a feature in ArcPy
Create a function to get the contents of the database in Go
How to create a clone from Github
How to create a git clone folder
How to use Google Test in C
How to get a stacktrace in python
How to handle consecutive values in MySQL
How to create a repository from media
How to test on a Django-authenticated page
How to test each version of IE using Selenium in modan.IE (VM)
Test & Debug Tips: Create a file of the specified size in Python
How to create a wrapper that preserves the signature of the function to wrap
[Development environment] How to create a data set close to the production DB
How to format a list of dictionaries (or instances) well in Python
Code reading of faker, a library that generates test data in Python
[Python] [Word] [python-docx] Try to create a template of a word sentence in Python using python-docx
How to calculate the volatility of a brand
How to create sample CSV data with hypothesis
Create a Vim + Python test environment in 1 minute
How to create large files at high speed
A simple example of how to use ArgumentParser
How to create a Python virtual environment (venv)
Upload a large number of images to Wordpress
How to clear tuples in a list (Python)
How to keep track of work in Powershell
How to embed a variable in a python string
How to create a function object from a string
Summary of how to import files in Python 3
I want to create a window in Python
Randomly sample MNIST data to create a dataset
How to implement a gradient picker in Houdini
How to notify a Discord channel in Python
[Note] How to create a Ruby development environment
How to create a Kivy 1-line input box
How to create a multi-platform app with kivy
How to read time series data in PyTorch
[Python] How to draw a histogram in Matplotlib
How to write a named tuple document in 2020