[PYTHON] How to handle consecutive values in MySQL

Introduction

Recently, I have more and more opportunities to use MySQL. Among them, there were occasional occasions when I wanted a serial number, such as generating test data or generating consecutive dates.

The serial number has the generate_series () function for PostgreSQL and the LEVEL pseudo column for Oracle. However, MySQL doesn't have the right one, so I had to devise various things.

So, this time, I would like to summarize the "ingenuity" of generating serial numbers with MySQL.

Add serial number to existing table

For example, suppose you have a table with the following fruits and their prices:

Fruit table


SELECT * FROM sample;

+-----------+-------+
| name      | price |
+-----------+-------+
|Apple|   100 |
|Mandarin orange|    80 |
|banana|   120 |
+-----------+-------+

To number and arrange them in ascending order of price, do as follows.

Give serial numbers to fruit tables


SET @num=0;
SELECT
  @num:=@num+1 AS num,
  name,
  price
FROM sample
ORDER BY price;

+------+-----------+-------+
| num  | name      | price |
+------+-----------+-------+
|    1 |Mandarin orange|    80 |
|    2 |Apple|   100 |
|    3 |banana|   120 |
+------+-----------+-------+

By defining the " @ </ span> num" variable, you can easily assign serial numbers as described above. Be sure to remember "SET @ </ span> num = 0;" here. Since the " @ </ span> num" variable continues to fluctuate, executing the above SELECT statement again produces the following result.

@Re-execute the SELECT statement without initializing num


SELECT
  @num:=@num+1 AS num,
  name,
  price
FROM sample
ORDER BY price;

+------+-----------+-------+
| num  | name      | price |
+------+-----------+-------+
|    4 |Mandarin orange|    80 |
|    5 |Apple|   100 |
|    6 |banana|   120 |
+------+-----------+-------+

Since " @ </ span> num" was 3 last time, it has increased further. Therefore, initialization by "SET @ </ span> num = 0;" is required.

Serial number generation for test data

The nice thing about the DB serial number generation function is that you can easily generate test data. However, MySQL does not have the right function, so some ingenuity is required.

Simple way

As explained earlier, you can easily assign serial numbers to existing tables. Therefore, if MySQL has a table that is easily accessible by any user, it is possible to create a table with only serial numbers.

Many people seem to use "information_schema.COLUMNS" as such a convenient table. For example, to create a test table with a serial number of 10 rows:

10 lines of test data generation


SET @num=0;
SELECT
  @num:=@num+1 AS num,
  MD5(@num) AS txt
FROM information_schema.COLUMNS
LIMIT 10;

+------+----------------------------------+
| num  | txt                              |
+------+----------------------------------+
|    1 | c4ca4238a0b923820dcc509a6f75849b |
|    2 | c81e728d9d4c2f636f067f89cc14862c |
|    3 | eccbc87e4b5ce2fe28308fd9f2a7baf3 |
|    4 | a87ff679a2f3e71d9181a67b7542122c |
|    5 | e4da3b7fbbce2345d7772b0674a318d5 |
|    6 | 1679091c5a880faf6fb5e6087eb1b2dc |
|    7 | 8f14e45fceea167a5a36dedd4bea2543 |
|    8 | c9f0f895fb98ab9159f51fd0297e236d |
|    9 | 45c48cce2e2d7fbdea1afc51c7c6ad26 |
|   10 | d3d9446802a44259755d38e6d163e820 |
+------+----------------------------------+

However, the number of rows in the table that can be generated by this method depends on the number of rows in "information_schema.COLUMNS". To generate test data beyond this, you need to use UNION well or create your own table with a very large number of rows.

Programmatic method

By concatenating the tables with UNION, you can generate a table with any number of rows as shown below.

However, it is not practical to manually generate thousands of lines of this. It's easy to connect to MySQL in some programming language and loop through it.

This time I will try connecting from python using mysql-connector-python.

mysql_test.py


# coding:utf-8
import mysql.connector

#Connection information dictionary
config = {
    'user': 'root',
    'password': 'password',
    'host': '123.45.67.890',
    'database': 'test'
}

#Connect
conn = mysql.connector.connect(**config)

# ------------------------------
#Test data SELECT statement generation
# ------------------------------
start_num = 0
end_num = 10
query_list = ["(SELECT {start_num} AS s".format(start_num=start_num)]
# 「UNION ALL SELECT {num}"Add
for i in range(start_num + 1, end_num):
    query_list.append(" UNION ALL SELECT {num}".format(num=i))
query_list.append(")")
query = "\n".join(query_list)
# ------------------------------

#Execution of SELECT statement
cursor = conn.cursor()
cursor.execute(query)

#Get SELECT result
for get_row in cursor:
    print(get_row[0])

#End of connection
conn.close()

As a result, the following SELECT statement is generated, and the table with the serial number can be obtained.

Generated SQL


(SELECT 0 AS s
 UNION ALL SELECT 1
 UNION ALL SELECT 2
 UNION ALL SELECT 3
 UNION ALL SELECT 4
 UNION ALL SELECT 5
 UNION ALL SELECT 6
 UNION ALL SELECT 7
 UNION ALL SELECT 8
 UNION ALL SELECT 9
)

You can refer to the serial number by making the serial number part a subquery as follows.

mysql_test2.py


# coding:utf-8
import mysql.connector

#Connection information dictionary
config = {
    'user': 'root',
    'password': 'password',
    'host': '123.45.67.890',
    'database': 'test'
}

#Connect
conn = mysql.connector.connect(**config)

# ------------------------------
#Test data SELECT statement generation
# ------------------------------
start_num = 0
end_num = 10
query_list = ["SELECT serial_num.s, md5(serial_num.s) FROM", "(SELECT {start_num} AS s".format(start_num=start_num)]
# 「UNION ALL SELECT {num}"Add
for i in range(start_num + 1, end_num):
    query_list.append(" UNION ALL SELECT {num}".format(num=i))
query_list.append(") AS serial_num")
query = "\n".join(query_list)
# ------------------------------

#Execution of SELECT statement
cursor = conn.cursor()
cursor.execute(query)

#Get SELECT result
for get_row in cursor:
    print(get_row[0], get_row[1])

#End of connection
conn.close()

Execution result


0 cfcd208495d565ef66e7dff9f98764da
1 c4ca4238a0b923820dcc509a6f75849b
2 c81e728d9d4c2f636f067f89cc14862c
3 eccbc87e4b5ce2fe28308fd9f2a7baf3
4 a87ff679a2f3e71d9181a67b7542122c
5 e4da3b7fbbce2345d7772b0674a318d5
6 1679091c5a880faf6fb5e6087eb1b2dc
7 8f14e45fceea167a5a36dedd4bea2543
8 c9f0f895fb98ab9159f51fd0297e236d
9 45c48cce2e2d7fbdea1afc51c7c6ad26

Generated SQL


SELECT serial_num.s, md5(serial_num.s) FROM
(SELECT 0 AS s
 UNION ALL SELECT 1
 UNION ALL SELECT 2
 UNION ALL SELECT 3
 UNION ALL SELECT 4
 UNION ALL SELECT 5
 UNION ALL SELECT 6
 UNION ALL SELECT 7
 UNION ALL SELECT 8
 UNION ALL SELECT 9
) AS serial_num

Create a procedure for test data generation

It is possible to generate a huge table only with MySQL by using a procedure without using a program. The following is the procedure to generate the generate_series_tbl table with the number rows passed as an argument. Since DROP TABLE is done at the beginning, a new table is created each time it is executed.

Test data generation procedure


DELIMITER //
CREATE PROCEDURE generate_series (IN max INT)
BEGIN
  DECLARE i INT;
  SET i = 0;
  DROP TABLE IF EXISTS generate_series_tbl;
  CREATE TABLE generate_series_tbl (num INT);
  WHILE i < max DO
    INSERT INTO generate_series_tbl VALUES (i);
    SET i = i + 1;
  END WHILE;
END
//
DELIMITER ;

Execution example


CALL generate_series(100);

Query OK, 1 row affected (1.61 sec)


SELECT count(*) FROM generate_series_tbl;

+----------+
| count(*) |
+----------+
|      100 |
+----------+


SELECT * FROM generate_series_tbl;
+------+
| num  |
+------+
|    0 |
|    1 |
|    2 |
|    3 |
…(Omission)…
|   97 |
|   98 |
|   99 |
+------+

As mentioned above, a table with serial numbers from 0 to 99 was generated. Of course, since we are inserting the actual data, the larger the table, the longer it will take to generate it. Also, since the huge table will be left unattended, it may be good to generate it as a temporary table.

Reference link

About serial number generation

About mysql-connector-python

About the procedure

Recommended Posts

How to handle consecutive values in MySQL
How to handle session in SQLAlchemy
How to use Mysql in python
How to handle Japanese in Python
How to handle datetime type in python sqlite3
How to develop in Python
How to handle data frames
[Python] How to output the list values in order
How to handle JSON in Ruby, Python, JavaScript, PHP
[Django] How to give input values in advance with ModelForm
How to get all the keys and values in the dictionary
[Blender] How to handle mouse and keyboard events in Blender scripts
How to use discrete values as variables in Scipy optimize
How to search by string to use mysql json_contains in SQLAlchemy
[Python] How to do PCA in Python
How to use classes in Theano
How to write soberly in pandas
How to collect images in Python
How to update Spyder in Anaconda
How to reflect CSS in Django
How to kill processes in bulk
[Python] How to swap array values
How to wrap C in Python
How to use ChemSpider in Python
How to use PubChem in Python
How to run TensorFlow 1.0 code in 2.0
How to log in to Docker + NGINX
How to call PyTorch in Julia
How to handle multiple versions of CUDA in the same environment
How to get all the possible values in a regular expression
<Pandas> How to handle time series data in a pivot table
How to create a large amount of test data in MySQL? ??
How to use calculated columns in CASTable
[Introduction to Python] How to use class in Python?
How to suppress display error in matplotlib
How to access environment variables in Python
How to dynamically define variables in Python
How to do R chartr () in Python
Put together consecutive values in the list
How to delete expired sessions in Django
[Itertools.permutations] How to put permutations in Python
How to use Google Test in C
How to implement nested serializer in drf-flex-fields
How to execute commands in jupyter notebook
How to do'git fetch --tags' in GitPython
How to get a stacktrace in python
How to display multiplication table in python
How to extract polygon area in Python
How to reassign index in pandas dataframe
How to check opencv version in python
How to enable SSL (TLS) in Apache
How to use Anaconda interpreter in PyCharm
How to specify non-check target in Flake8
How to switch python versions in cloud9
How to adjust image contrast in Python
How to use __slots__ in Python class
How to dynamically zero pad in Python
How to do Server-Sent Events in Django
How to use regular expressions in Python
How to implement Scroll View in pythonista 1
How to convert DateTimeField format in Django