[PYTHON] Script to create FlatBuffers binaries from SQL database

As a serializer, there is FlatBuffers developed by Google. Since deserialization is explosive, it seems that it can be used in games etc., but it seems that it is not popular. I thought that one of the reasons was that it was troublesome to create the data, so I created a tool that serializes the contents of the SQL database with FlatBuffers as it is. As a result, the structure definition of the data can be done in the SQL table.

In the case of social games, the definition of master data is usually done by the server engineer, and the client seems to receive the contents as a JSON or SQLite DB file, but the client side is waiting until the conversion support is completed. However, if you use this tool, you can supply data to the client side when the SQL data table is created, and you can respond immediately to changes in the table.

Preparation


Prepare a database for master data

The tool assumes MySQL. Create the required table. If you want a sample, use the official MySQL sakila-db.

Install the FlatBuffers tools

There is a build tool called flatc that makes it available. For mac, brew install flat buffers is OK. Super easy. Just in case, type flatc on the console to make sure it's in your path.

install python

I will use the popular Python for conversion, so install it.

Create a working folder appropriately

Then put the following files.

makefbs.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import os
import os.path
import re
import shutil
import time
import sqlite3
import pymysql.cursors

_isSqliteDb = False

def GetConvName(_str):
    _convName = ''
    _names = re.split('[ _]', _str)
    for _name in _names:
        _convName = _convName + _name.capitalize()
    return _convName


class SchemaInfo:
    def __init__(self, _keyName, _keyType, _fbsKeyType):
        self.keyName = _keyName
        self.keyType = _keyType
        self.fbskeyName = GetConvName(_keyName)
        self.fbsKeyType = _fbsKeyType

#Get column name and type
def GetShemaInfo(_tableName, _conn):
    _keys = []
    _cur = _conn.cursor()
    if _isSqliteDb == True:
        # sqlite
        _order = "select sql from sqlite_master where type = 'table' and name = '" + _tableName + "'"
        _cur.execute(_order)
        _schema = _cur.fetchone()[0]
        _startPos = _schema.find(_tableName) + len(_tableName) + 1
        _schema = _schema[_startPos:-5]
        for _item in _schema.split(','):
            _itemTmp = _item.split(' ')
            if _itemTmp[1] == 'integer':
                _fbsKeyType = "int"
            elif _itemTmp[1] == 'real':
                _fbsKeyType = "float"
            elif _itemTmp[1] == 'numeric':
                _fbsKeyType = "int"
            else:
                _fbsKeyType = "string"
            _keys.append(SchemaInfo(_itemTmp[0], _itemTmp[1], _fbsKeyType))
    else:
        # mysql
        _order = "SHOW COLUMNS FROM " + _tableName 
        _cur.execute(_order)
        _schema = _cur.fetchall()
        for _item in _schema:
            _type = _item['Type']
            _isUnsigned = _type.find('unsigned') >= 0
            if _type.find('tinyint') >= 0:
                _fbsKeyType = "ubyte" if _isUnsigned else 'byte'
            elif _type.find('smallint') >= 0:
                _fbsKeyType = "ushort" if _isUnsigned else 'short'
            elif _type.find('bigint') >= 0:
                _fbsKeyType = "ulong" if _isUnsigned else 'long'
            elif _type.find('mediumint') >= 0 or _type.find('int') >= 0:
                _fbsKeyType = "uint" if _isUnsigned else 'int'
            elif _type.find('float') >= 0:
                _fbsKeyType = "float"
            elif _type.find('double') >= 0:
                _fbsKeyType = "double"
            else:
                _fbsKeyType = "string"
            _keys.append(SchemaInfo(_item['Field'], _item['Type'], _fbsKeyType))
    return _keys

def GetShemaInfoFromFbsKeyName(_keys, _name):
    for _item in _keys:
        if _item.fbskeyName == _name:
            return _item
    return None


def Create_Fbs(_conn, _list, _dbMasterName):
    _fd = open('param.fbs', 'w')
    _fd.write("// IDL file for our master's schema.\n")
    _fd.write("\n")
    _fd.write("namespace param;\n")
    _fd.write("\n")

    for _name in _list:
        Create_Master(_name, _fd, _conn)

    _fd.write("table " + _dbMasterName + "\n")
    _fd.write("{\n")
    for _name in _list:
        _fd.write(' ' + _name[1] + 'Master:' + _name[1] + "Master;\n")
    _fd.write("}\n")
    _fd.write("\n")
    _fd.write("root_type " + _dbMasterName + ";\n")
    _fd.close()


def Create_Master(_name, _fd, _conn):
    print("create " + _name[0])

    _keys = GetShemaInfo(_name[0], _conn)

    _fd.write("table " + _name[1] + "Element\n")
    _fd.write("{\n")
    for _key in _keys:
        _fd.write(" " + _key.fbskeyName + ":" + _key.fbsKeyType + ";\n")
    _fd.write("}\n\n")
    _fd.write("table " + _name[1] + "Master\n")
    _fd.write("{\n")
    _fd.write(" data:[" + _name[1] + "Element];\n")
    _fd.write("}\n\n\n")


def Create_Conv(_conn, _list):
    for _name in _list:
        _elementName = _name[1] + 'Element'
        _masterName = _name[1] + 'Master'
        _keyInfos = GetShemaInfo(_name[0], _conn)
        _startString = _elementName + "." + _elementName + "Start"
        _endString = _elementName + "." + _elementName + "End"

        _fd1 = open('conv' + _masterName + '.py', 'w')
        _fd1.write("import utility\n")
        if _isSqliteDb:
            _fd1.write("import sqlite3\n")
        else :
            _fd1.write("import pymysql.cursors\n")
        _fd1.write("from param import " + _elementName + "\n")
        _fd1.write("from param import " + _masterName + "\n")
        _fd1.write("\n")
        _fd1.write("class conv" + _masterName + "(object):\n")
        _fd1.write("\n")
        _fd1.write("    @classmethod\n")
        _fd1.write("    def Create(self, _conn, _builder):\n")
        _fd1.write("        _cmd = '" + _name[0] + "'\n")
        _fd1.write("        print(\"convert \" + _cmd)\n")
        _fd1.write("\n")
        _fd1.write("        _cur = _conn.cursor()\n")
        _fd1.write("        _cur.execute(\"SELECT * FROM \" + _cmd)\n")
        _fd1.write("        _list = _cur.fetchall()\n")
        _fd1.write("\n")
        _fd1.write("        _elements = []\n")
        _fd1.write("        for _row in _list:\n")

        _fd2 = open('param/' + _elementName + '.py', 'r')
        _strline = _fd2.readline()
        while _strline and _strline.find(_elementName + "Start(") < 0:
            _strline = _fd2.readline()

        _strline = _fd2.readline()
        _params = []
        while _strline and _strline.find(_elementName + "End(") < 0:
            _endPos = _strline.find("):")
            _strline = _elementName + '.' +_strline[4:(_endPos + 1)]
            _strline = _strline.replace('(builder', '(_builder')
            _params.append(_strline)
            _strline = _fd2.readline()

        for _param in _params:
            _fbsArgName = _param[(_param.rfind(', ') + 2):-1]
            _keyInfo = GetShemaInfoFromFbsKeyName(_keyInfos, _fbsArgName)
            _argName = "_" + _fbsArgName
            _writeText = "            " + _argName + " = "
            if _keyInfo.fbsKeyType == 'int' or _keyInfo.fbsKeyType == 'uint' or \
                _keyInfo.fbsKeyType == 'short' or _keyInfo.fbsKeyType == 'ushort' or \
                _keyInfo.fbsKeyType == 'long' or _keyInfo.fbsKeyType == 'ulong' or \
                _keyInfo.fbsKeyType == 'byte' or _keyInfo.fbsKeyType == 'ubyte':
                _writeText = _writeText + "utility.GetInt(_row['" + _keyInfo.keyName + "'])\n"
            elif _keyInfo.fbsKeyType == 'float':
                _writeText = _writeText + "utility.GetFloat(_row['" + _keyInfo.keyName + "'])\n"
            elif _keyInfo.fbsKeyType == 'double':
                _writeText = _writeText + "utility.GetDouble(_row['" + _keyInfo.keyName + "'])\n"
            else:
                _writeText = _writeText + "_builder.CreateString(utility.GetStr(_row['" + _keyInfo.keyName + "']))\n"
            _fd1.write(_writeText)

        _fd1.write("            " + _startString + "(_builder)\n")

        for _param in _params:
            _startPos = _param.rfind(', ') + 2
            _param = _param[:_startPos] + '_' + _param[_startPos:]
            _fd1.write("            " + _param + "\n")

        _fd1.write("            _elements.append(" + _endString + "(_builder))\n")
        _fd1.write("\n")
        _fd1.write("        " + _masterName + "." + _masterName + "StartDataVector(_builder, len(_elements))\n")
        _fd1.write("        for i in range(len(_elements)):\n")
        _fd1.write("            _builder.PrependUOffsetTRelative(_elements[i])\n")
        _fd1.write("        _data = _builder.EndVector(len(_elements))\n")
        _fd1.write("\n")
        _fd1.write("        " + _masterName + "." + _masterName + "Start(_builder)\n")
        _fd1.write("        " + _masterName + "." + _masterName + "AddData(_builder, _data)\n")
        _fd1.write("        return " + _masterName + "." + _masterName + "End(_builder)\n")

        _fd1.close()
        _fd2.close()


def Create_ConvBaseFile(_list, _dbParam, _outName, _dbMasterName):
    _fd1 = open('convAll.py', 'w')
    _fd1.write("#!/usr/bin/env python\n")
    _fd1.write("# -*- coding: utf-8 -*-\n")
    _fd1.write("import sys\n")
    _fd1.write("import os\n")
    _fd1.write("sys.path.append('../flatbuffers')\n")
    _fd1.write("import flatbuffers\n")
    if _isSqliteDb:
        _fd1.write("import sqlite3\n")
    else :
        _fd1.write("import pymysql.cursors\n")
    for _name in _list:
        _masterName = _name[1] + "Master"
        _fd1.write("import conv" + _masterName + "\n")
    _fd1.write("from param import DbMaster\n")
    _fd1.write("\n")
    _fd1.write("def Convert(_conn, _outName):\n")
    _fd1.write("    _builder = flatbuffers.Builder(0)\n")
    _fd1.write("\n")
    for _name in _list:
        _masterName = _name[1] + "Master"
        _fd1.write("    _" + _masterName + " = conv" + _masterName + ".conv" + _masterName + "().Create(_conn, _builder)\n")
    _fd1.write("\n")
    _DbMasterStr = _dbMasterName + "." + _dbMasterName
    _fd1.write("    " + _DbMasterStr + "Start(_builder)\n")
    for _name in _list:
        _masterName = _name[1] + "Master"
        _fd1.write("    " + _DbMasterStr + "Add" + _masterName + "(_builder, _" + _masterName + ")\n")
    _fd1.write("    _totalData = " + _DbMasterStr + "End(_builder)\n")
    _fd1.write("    _builder.Finish(_totalData)\n")
    _fd1.write("\n")
    _fd1.write("    _final_flatbuffer = _builder.Output()\n")
    _fd1.write("    _fd = open(_outName, 'wb')\n")
    _fd1.write("    _fd.write(_final_flatbuffer)\n")
    _fd1.write("    _fd.close()\n")
    _fd1.write("\n")
    _fd1.write("if __name__ == \"__main__\":\n")
    if _isSqliteDb:
        _fd1.write("    _conn = sqlite3.connect('app.db')\n")
        _fd1.write("    _conn.row_factory = sqlite3.Row\n")
    else:
        _fd1.write("    _conn = pymysql.connect(host='" + _dbParam[1] + "', user='" + _dbParam[2] + "', password='" + _dbParam[3] + "', db='" + _dbParam[4] + "', charset='utf8', cursorclass=pymysql.cursors.DictCursor)\n")
    _fd1.write("    Convert(_conn, '" + _outName + "')\n")
    _fd1.write("    _conn.close()\n")
    _fd1.close()


def Create_UtilityFile():
    _fd1 = open('utility.py', 'w')
    _fd1.write(\
    "def GetInt(_str):\n"\
    "    return 0 if _str is None or _str == 'None' else int(_str)\n"\
    "\n"\
    "def GetFloat(_str):\n"\
    "    return 0 if _str is None or _str == 'None' else float(_str)\n"\
    "\n"\
    "def GetDouble(_str):\n"\
    "    return 0 if _str is None or _str == 'None' else double(_str)\n"\
    "\n"\
    "def GetStr(_str):\n"\
    "    if isinstance(_str, unicode):\n"\
    "        return _str.encode('utf-8')\n"\
    "    if isinstance(_str, str):\n"\
    "        return _str\n"\
    "    return \"\" if _str is None else str(_str)\n"\
    )
    _fd1.close()


if __name__ == "__main__":
    _dbParam = []
    _isSqliteDb = False
    _settingName = 'Setting.txt'
    _inName = 'TableName.txt'
    _outName = 'db_master.dat'
    _dbMasterName = 'DbMaster'
    _flatparam = '-c'

    #for argv in sys.argv:

    if os.path.isfile(_settingName):
        _fd1 = open(_settingName, 'r')
        _dbParam = _fd1.read().splitlines()
        _fd1.close()

    _isSqliteDb = _dbParam[0] == "sqlite"

    _list = []
    _fd1 = open(_inName, 'r')
    _line = _fd1.readline().rstrip("\n")
    while _line:
        _list.append(_line.split(','))
        _line = _fd1.readline().rstrip("\n")

    if _isSqliteDb:
        _conn = sqlite3.connect(_dbParam[1])
        _conn.row_factory = sqlite3.Row
    else:
        _conn = pymysql.connect(host=_dbParam[1], user=_dbParam[2], password=_dbParam[3], db=_dbParam[4], charset='utf8', cursorclass=pymysql.cursors.DictCursor)

    # Create working directory
    shutil.rmtree("conv", True)
    time.sleep(0.1)
    os.makedirs("conv")
    os.chdir('conv')
    _fd = open('__init__.py', 'w')
    _fd.close()
    Create_UtilityFile()

    Create_Fbs(_conn, _list, _dbMasterName)
    os.system('flatc ' + _flatparam + ' -p param.fbs')

    Create_Conv(_conn, _list)
    Create_ConvBaseFile(_list, _dbParam, _outName, _dbMasterName)
    _conn.close()

    os.chdir('..')
    os.system('python conv/convAll.py')

Create a definition file for conversion

Create a file that describes the connection information to SQL and the definition of which table to refer to. Put it in the same folder as makefbs.py.

Setting.txt


mysql          (SQL selection)
localhost      (Connection destination address)
root           (username)
Pass-123       (password)
sakila         (DB name)

TableName.txt


actor_info,SampleActorInfo         (DB table name,Output class name)
customer_list,SampleCustomerList
film_list,SampleFilmList

Copy the FlatBuffers file for conversion

Create a folder called flatbuffers and copy the officially downloaded source for python.

Convert

Run python ./makefbs.py in the console. It connects to MySQL, parses the specified table structure, and automatically generates the FlatBuffers schema. Then run flatc to create the final binary data. A data file called db_master.dat and a working folder called conv are created. This tool is designed to output header files for C language. Other languages can be supported by rewriting the _flatparam variable in makefbs.py. There is a file called param_generated.h in the conv folder, which will be used for deserialization.


Try using the created data

python


#include <iostream>
#include <fstream>
#include <sys/stat.h>
#include "param_generated.h"

int main(int argc, const char * argv[])
{
	struct stat results;
	
	if (stat(argv[1], &results) != 0)
	{
		std::cout << "file not found\n";
		return -1;
	}

	char* buff = new char[results.st_size];
	std::ifstream file;
	file.open(argv[1], std::ios::in | std::ios::binary);
	if (file)
	{
		file.read(buff, results.st_size);
		auto data = param::GetDbMaster(buff);
		auto actorMaster = data->SampleActorInfoMaster()->data();
		auto num = actorMaster->size();
		for (int i = 0; i < num; i++)
		{
			auto data = actorMaster->Get(i);
			std::cout << data->ActorId() << " " << data->FirstName()->c_str() << " " << data->LastName()->c_str() << "\n";
		}
	}
	delete[] buff;

    return 0;
}

In the sample, a table called actor_info is grouped into a std :: vector class called SampleActorInfoMaster. The data in each row is defined as the SampleActorInfoElement class and can be obtained with Get (). And it feels like param :: GetDbMaster has multiple table information.

The acquisition itself is organized by std :: vector, so I think it's easy. It takes time to update the header file every time the table structure changes, but I think that it is convenient to save the time and effort to convert the SQL table structure into data as it is. Since it is made on the premise of custom, all the columns of the table are serialized at present, but have you play with makefbs.py according to each project, such as deciding the naming convention so that you can make columns that are not serialized I think it's good.

Recommended Posts

Script to create FlatBuffers binaries from SQL database
Python script to create a JSON file from a CSV file
Create folders from '01' to '12' with python
Connect to utf8mb4 database from python
How to create a clone from Github
Script to generate directory from json file
Execute a script from Jupyter to process
How to create a repository from media
Script to create a Mac dictionary file
Edit Excel from Python to create a PivotTable
Easy script migration from CentOS to Oracle Linux
How to create a function object from a string
Create a New Todoist Task from Python Script
[python] Create table from pandas DataFrame to postgres
Sum from 1 to 10
sql from python
How to use NUITKA-Utilities hinted-compilation to easily create an executable file from a Python script
I wrote Python code to create a table (view) dependency diagram (PlantUML) from SQL
How to create an article from the command line
[Python3] Connection to Oracle Database and SQL execution [cx_Oracle]
Speeding up when connecting from cx_Oracle to Autonomous Database
How to create a simple TCP server / client script
Automatic conversion from MySQL Workbench mwb file to sql file
How to create a kubernetes pod from python code
ODBC access to SQL Server from Linux with Python