I tried to implement ModanShogi with Kinx

Introduction

Happy New Year. We look forward to working with you this year.

By the way, the scripting language Kinx delivered in ** "looks like JavaScript, brain (contents) is Ruby, (stability is AC/DC)" **, but please continue to favor this year as well. I'm happy.

This time is the first series of making language processing systems.

I made a parser combinator and a JIT library, so I want to make a language processing system. This may be the first and last time. I made a simple calculator before, but this is a series to try something a little more advanced.

I've hidden it until now, but I'm actually a shogi fan. Hanyu is a god. I also went to listen to the lecture.

Make ModanShogi

A long time ago, Modan Shogi became popular. By the way, the term "modern" here does not mean "modern", but it is ** modern-yaki **, so be careful. Modern-yaki modern means ** "a lot" **. The spelling is also different.

Source code

Actually, I made it a long time ago and put it in the examples folder. You can see the entire source below.

And, as I wrote below, I noticed in this article that I forgot to support real numbers in the compiled version. Maybe I'll fix it soon. Please note.

This time, I will explain this from the top, and I also use it as a memorandum of my own.

By the way, in order to achieve this, it is a secret that the JIT library is crafted so that putc and putn can be called as described below. In the future, I'd like to make it general-purpose so that C functions can be called, but that's the connection up to that point.

specification

See here for specifications.

I will quote the sample from the above. In addition, I will explain the whole article in case the original article is gone.

Hello World by ModanShogi.

▲ 9 eight silver △ 9 two balls ▲ 6 four steps △ 6 six silver ▲ 6 steps △ same ball
▲ 6 seven steps △ 6 one ball ▲ 6 eight balls △ 6 three steps ▲ 6 nine balls △ 7 seven silver
▲ 7 Gokin △ 7 One ball ▲ 4 Eight silver △ 4 Two balls ▲ 9 Six and △ 9 Eight steps
▲ Same ball △ 6 Seven balls ▲ 9 Five gold △ 9 One ball ▲ 6 Three gold △ Same ball
▲ 6 eight gold △ 6 two balls ▲ 4 steps △ 4 three balls ▲ 5 five steps △ 5 three balls

grammar

Operating environment

The ModanShogi program runs on a virtual machine like this:

As an instruction, there are the following.

PIECE Also known as COL ROW meaning
When mov X Y X = Y
Ayumu add X Y X += Y
Money sub X Y X -= Y
Silver mul X Y X *= Y
Katsura div X Y X /= Y
Incense mod X Y X %= Y
Dragon push X _ Push X
Horse pop X _ Pop to X
ball putc X _ Output character with character code X
king putn X _ Output X as a number
Fly jump_if X Y If X is non-zero, jump to the label indicated by register Y
Horn jump_ifp X Y If X is 0 or more, jump to the label indicated by register Y

Implementation

Well, here is the implementation. In general, the operation of the language processing system is as follows.

It's rough. In detail, there are a lot of keywords such as lexical analysis, semantic analysis, optimization, etc., but this time it is not necessary, so I will follow the above flow.

Make a parser

Now, let's first make a parser to meet the above specifications. Basically, I will explain in the order of ** Source **.

Loading the library

First, load the required libraries. There are two:

using Parsek;
using Jit;

constant

Next, let's define constants such as each instruction of ModanShogi. It is also defined as a character string for code output (dump) (although it is not used).

enum {
    label = 0,
    mov,
    add,
    sub,
    mul,
    div,
    mod,
    push,
    pop,
    putc,
    putn,
    jump_if,
    jump_ifp,
}

var opname = [
    "label",
    "mov",
    "add",
    "sub",
    "mul",
    "div",
    "mod",
    "push",
    "pop",
    "putc",
    "putn",
    "jump_if",
    "jump_ifp",
];

Converter to internal representation

Next is the converter. What is converted is that the value obtained by grammatical interpretation is converted into an internal representation. Make a combination of opcode and arguments.

class Converter {
    var map_ = [
        { '▲': 0, '△': 1, '☗': 0, '☖': 1 },
        { '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9, 'same': 0 },
        { 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'Five': 5, 'Six': 6, 'Seven': 7, 'Eight': 8, 'Nine': 9, ' ': 0 },
        { 'When': mov, 'Ayumu': add, 'Money': sub, 'Silver': mul, 'Katsura': div, 'Incense': mod,
          'Dragon': push, 'Horse': pop, 'ball': putc, 'king': putn, 'Fly': jump_if, 'Horn': jump_ifp },
    ];
    var prev_;
    public convert(stmt) {
        # System.println(stmt);
        var r = stmt.map { => map_[_2][_1] };
        if (r[1] == 0) r[1] = prev_[1];
        if (r[2] == 0) r[2] = prev_[2];
        prev_ = r;
        return {
            r1: r[1],
            r2: r[2],
            op: r[3],
        };
    }
}

In order to handle same, the previous state (prev_) is retained.

Grammar definition

Well, this is the long-awaited grammar definition. The parser combinator is easy because it includes the lexical analysis part in a sense.

Only two things are important, the others are skipped.

First, from the point of skipping. Basically, the elements up to the first character are skipped, so you can skip anything other than "* ▲ △ ☗☖ ".

var conv = new Converter();
var $ = new Parsek();
var ignore = $.noneOf("*▲△☗☖").many();
var lexeme = &(p) => p.skip(ignore);

Next, let's write the syntax that represents the operation. It's like this. Although it was not written in the explanation, it seems that it is necessary to be able to accept expressions such as "same money left" and "upper right left straight" (it was written somewhere, but I could not find it ...).

#   [PLAYER][COL][ROW][PIECE]
#   (Example: ▲ 2 Shikoku)
var player = $.oneOf('▲△☗☖');
var col = $.oneOf('123456789 Same as above');
var row = $.oneOf('one two three four five six seven eight nine');
var piece = $.oneOf('Ayaka Katsura Silver Gold Ball King Hikaku Ryoma');
var addc = $.oneOf("Top right left straight");

Next is the label. The label has label as the opcode and keeps the specified number as address.

It's hard to understand now, but address is the label number.

var number = $.regex(/[1-9][0-9]*|[0-9]/).map(Integer.parseInt);
var labelp = lexeme($.seq($.string('*'), number)).map { => { op: label, address: _1[1] } };

Since it is only necessary that either the operation or the label appears, if it is a program, the program can be expressed as follows.

var stmt = lexeme($.seq(player, col, row, piece)).map { => conv.convert(_1) };
var program = $.alt(stmt, labelp).many();

The above program expresses the following.

Make an interpreter

Now, let's create an interpreter. Represent the VM according to the previous rule. However, only the jump destination address is resolved first. It is as follows.

class Interpreter(opts_) {
    private setJumpAddress(code) {
        var address = [];
        for (var i = 0, l = code.length(); i < l; ++i) {
            if (code[i].op == label) {
                address[code[i].address] = i;
            }
        }
        return address;
    }
    public eval(code) {
        var stack = [];
        var reg = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
        var address = setJumpAddress(code);
        for (var i = 0, l = code.length(); i < l; ++i) {
            var { op, r1, r2 } = code[i];
            # System.println([opname[op], r1, r2]);
            switch (op) {
            case label: # nop
                break;
            case mov:
                reg[r1] = reg[r2];
                break;
            case add:
                reg[r1] += reg[r2];
                break;
            case sub:
                reg[r1] -= reg[r2];
                break;
            case mul:
                reg[r1] *= reg[r2];
                break;
            case div:
                reg[r1] /= reg[r2];
                break;
            case mod:
                reg[r1] %= reg[r2];
                break;
            case push:
                stack.push(reg[r1]);
                break;
            case pop:
                reg[r1] = stack.pop();
                break;
            case putc:
                if (!opts_.disableDisplay) {
                    System.print(*Integer.parseInt(reg[r1]));
                }
                break;
            case putn:
                if (!opts_.disableDisplay) {
                    System.print(reg[r1]);
                }
                break;
            case jump_if:
                if (reg[r1]) {
                    i = address[reg[r2]];
                }
                break;
            case jump_ifp:
                # System.println("check = ", reg[r1]);
                if (reg[r1] >= 0) {
                    i = address[reg[r2]];
                }
                break;
            }
        }
    }
}

I'll repost the rules I mentioned earlier, so if you check that they match your implementation, you'll understand (although PIECE has already disappeared on the interpreter).

PIECE Also known as COL ROW meaning
When mov X Y X = Y
Ayumu add X Y X += Y
Money sub X Y X -= Y
Silver mul X Y X *= Y
Katsura div X Y X /= Y
Incense mod X Y X %= Y
Dragon push X _ Push X
Horse pop X _ Pop to X
ball putc X _ Output character with character code X
king putn X _ Output X as a number
Fly jump_if X Y If X is non-zero, jump to the label indicated by register Y
Horn jump_ifp X Y If X is 0 or more, jump to the label indicated by register Y

Make a compiler

Next is the implementation of the compiler.

As I wrote this article, I noticed that it doesn't support real numbers. Let's fix it later ...

The compiler compiles to native code in the Jit library. In the same way, the address resolution of the jump destination is done first. It's more complicated than an interpreter. Especially the jump address solution is a bit tricky. This is because the jump destination is not known at the time of compilation because it is an indirect jump based on the register value according to the language specifications. The jump destination according to the label number is stored in advance in the form of a table, and the table is referenced and jumped at runtime.

class Compiler(opts_) {
    private setJumpAddress(code, c) {
        var address = [], max = 0;
        for (var i = 0, l = code.length(); i < l; ++i) {
            if (code[i].op == label) {
                address[i] = c.makeConst(Jit.VAR(code[i].address));
                max = code[i].address if (max < code[i].address);
            }
        }
        return [address, max];
    }
    public compile(code) {
        var c = new Jit.Compiler();
        c.enter();

        var jt;
        var labelTarget = [];
        var jumpTarget = [];
        var [address, max] = setJumpAddress(code, c);
        var reg = [-1, Jit.VAR(max+1), Jit.VAR(max+2), Jit.VAR(max+3), Jit.VAR(max+4), Jit.S0, Jit.S1, Jit.S2, Jit.S3, Jit.S4];
        for (var i = 1, l = reg.length(); i < l; ++i) {
            c.mov(reg[i], Jit.IMM(i));
        }
        c.mov(Jit.S5, Jit.IMM(max+5));
        c.mov(Jit.VAR(max+1005), Jit.IMM(0));
        for (var i = 0, l = code.length(); i < l; ++i) {
            var { op, r1, r2 } = code[i];
            # System.println([opname[op], r1, r2]);
            switch (op) {
            case label:
                labelTarget[i] = c.label();
                break;
            case mov:
                c.mov(reg[r1], reg[r2]);
                break;
            case add:
                c.add(reg[r1], reg[r1], reg[r2]);
                break;
            case sub:
                c.sub(reg[r1], reg[r1], reg[r2]);
                break;
            case mul:
                c.mul(reg[r1], reg[r1], reg[r2]);
                break;
            case div:
                c.mov(Jit.R0, reg[r1]);
                c.mov(Jit.R1, reg[r2]);
                c.div();
                c.mov(reg[r1], Jit.R0);
                break;
            case mod:
                c.mov(Jit.R0, reg[r1]);
                c.mov(Jit.R1, reg[r2]);
                c.divmod();
                c.mov(reg[r1], Jit.R1);
                break;
            case push:
                c.mov(Jit.R1, Jit.S5);
                c.localp(Jit.R0, Jit.R1);
                c.mov(Jit.MEM1(Jit.R0), reg[r1]);
                c.add(Jit.S5, Jit.S5, Jit.IMM(1));
                break;
            case pop:
                c.sub(Jit.S5, Jit.S5, Jit.IMM(1));
                c.mov(Jit.R1, Jit.S5);
                c.localp(Jit.R0, Jit.R1);
                c.mov(reg[r1], Jit.MEM1(Jit.R0));
                break;
            case putc:
                if (!opts_.disableDisplay) {
                    c.mov(Jit.R0, reg[r1]);
                    c.icall(Jit.IMM(Jit.Clib.putc));
                }
                break;
            case putn:
                if (!opts_.disableDisplay) {
                    c.mov(Jit.R0, reg[r1]);
                    c.icall(Jit.IMM(Jit.Clib.putn));
                }
                break;
            case jump_if:
                jt = c.eq(reg[r1], Jit.IMM(0));
                c.mov(Jit.R1, reg[r2]);
                c.localp(Jit.R0, Jit.R1);
                c.mov(Jit.R0, Jit.MEM1(Jit.R0));
                c.ijmp(Jit.R0);
                jt.setLabel(c.label());
                break;
            case jump_ifp:
                jt = c.slt(reg[r1], Jit.IMM(0));
                c.mov(Jit.R1, reg[r2]);
                c.localp(Jit.R0, Jit.R1);
                c.mov(Jit.R0, Jit.MEM1(Jit.R0));
                c.ijmp(Jit.R0);
                jt.setLabel(c.label());
                break;
            }
        }

        c.ret(Jit.R0);
        var code = c.generate();
        address.each {
            if (_1) _1.setLabel(labelTarget[_2]);
        };
        return code;
    }
}

Execution part

Now that we have the interpreter and compiler, let's write the part that actually parses and executes the source code. Ignore the comment out of the sample and load the following game record.

var kifu =
# Hello, world!
%{
▲ 9 eight silver △ 9 two balls ▲ 6 four steps △ 6 six silver ▲ 6 steps △ same ball
▲ 6 seven steps △ 6 one ball ▲ 6 eight balls △ 6 three steps ▲ 6 nine balls △ 7 seven silver
▲ 7 Gokin △ 7 One ball ▲ 4 Eight silver △ 4 Two balls ▲ 9 Six and △ 9 Eight steps
▲ Same ball △ 6 Seven balls ▲ 9 Five gold △ 9 One ball ▲ 6 Three gold △ Same ball
▲ 6 eight gold △ 6 two balls ▲ 4 steps △ 4 three balls ▲ 5 five steps △ 5 three balls
};

Below is the measurement of the execution time using the timer and the display of the result.

var t, tmr = new SystemTimer();

First, parsing.

# Parsing the code
tmr.restart();
var r = ignore.then(program).parseAll(kifu);
t.parse = tmr.elapsed();

Then run it in the interpreter.

# Interpret it
var interp = new Interpreter();
tmr.restart();
interp.eval(r.value);
t.interpret.total = tmr.elapsed();

Finally run with the compiler. It is divided into a compile phase and an execution phase.

# Compile it
var compiler = new Compiler();
tmr.restart();
var code = compiler.compile(r.value);
t.jit.compile = tmr.elapsed();

# Execute it
tmr.restart();
code.run();
t.jit.execute = tmr.elapsed();
t.jit.total = t.jit.compile + t.jit.execute;

And the result display.

System.println("Result = ", t.toJsonString(true));

benchmark

Benchmark results. For now, it's Linux.

Hello, world!
Hello, world!
Result = {
    "interpret": {
        "total": 0.001682
    },
    "jit": {
        "compile": 0.000799,
        "execute": 0.00014,
        "total": 0.000939
    },
    "parse": 0.008555
}

You've got Hello, world! Twice!

What the JSON result represents is as follows.

In other words, since the parsing process is common, when comparing the interpreter and the compiler, the compiler has faster execution time as shown below!

Execution form Parsing process Run Overall
Interpreter 8.5 ms 1.7 ms 10.2 ms
compiler 8.5 ms (0.8+0.1) 0.9 ms 9.4 ms

in conclusion

The subject is that, but as I wrote earlier

I think we can somehow grasp the flow.

Let's enjoy the fun language implementation life!

I need to be able to handle real numbers in the compiler.

see you.

Recommended Posts

I tried to implement ModanShogi with Kinx
I tried to implement file upload with Spring MVC
I tried to implement TCP / IP + BIO with JAVA
I tried to implement Stalin sort with Java Collector
I tried to interact with Java
[Rails] I tried to implement batch processing with Rake task
I tried to get started with WebAssembly
I tried to implement the Iterator pattern
I tried to implement flexible OR mapping with MyBatis Dynamic SQL
I tried to verify AdoptOpenJDK 11 (11.0.2) with Docker image
I tried to implement polymorphic related in Nogizaka.
I tried to manage struts configuration with Coggle
I tried to manage login information with JMX
I tried to implement deep learning in Java
I tried to implement a server using Netty
I tried to break a block with java (1)
I tried what I wanted to try with Stream softly.
I tried DI with Ruby
I tried to read and output CSV with Outsystems
I tried to implement Firebase push notification in Java
I tried to implement a function equivalent to Felica Lite with HCE-F of Android
[Java 11] I tried to execute Java without compiling with javac
I started MySQL 5.7 with docker-compose and tried to connect
I tried to get started with Spring Data JPA
I tried to draw animation with Blazor + canvas API
[Java] I tried to implement Yahoo API product search
I tried UPSERT with PostgreSQL.
I tried BIND with Docker
I tried to verify yum-cron
I tried to implement the Euclidean algorithm in Java
roman numerals (I tried to simplify it with hash)
[Swift] I tried to implement Instagram profile-like UI with UICollectionView only with code without storyboard
I tried to implement the like function by asynchronous communication
I tried to make an introduction to PHP + MySQL with Docker
I tried to create a java8 development environment with Chocolatey
I tried to modernize a Java EE application with OpenShift.
I tried to increase the processing speed with spiritual engineering
[Rails] I tried to create a mini app with FullCalendar
I tried to link chat with Minecraft server with Discord API
I tried to implement a buggy web application in Kotlin
[Rails] I tried to implement "Like function" using rails and js
I tried to create a padrino development environment with Docker
I tried to get started with Swagger using Spring Boot
I tried upgrading from CentOS 6.5 to CentOS 7 with the upgrade tool
I tried to be able to pass multiple objects with Ractor
I tried to chew C # (indexer)
I tried using JOOQ with Gradle
I tried morphological analysis with MeCab
I tried to summarize iOS 14 support
I tried UDP communication with Java
I tried to explain the method
I tried GraphQL with Spring Boot
I tried Flyway with Spring Boot
I tried to summarize Java 8 now
I tried to chew C # (polymorphism: polymorphism)
I tried customizing slim with Scaffold
I tried to explain Active Hash
I tried to solve the problem of "multi-stage selection" with Ruby
I tried to implement Ajax processing of like function in Rails
I tried to build the environment of PlantUML Server with Docker
I tried to build an http2 development environment with Eclipse + Tomcat