[PYTHON] Let's recommend Malbolge to those who say "If you know one programming language, you can handle the rest"

On Twitter, I saw someone reply "Can you say the same about Malbolge?" to the claim that "if you know one programming language, you can handle the others". In my ignorance I wondered "What is Malbolge?", and it turned out to be a language with a truly formidable specification.

I have muttered that kind of "if you know one language" line myself, and I am sorry: I was a frog in a well.

Looking at the language specification, I had no desire to write Malbolge code myself, but I realized that an interpreter would not take much code to implement, so I made one.

I checked it in the environment below. I don't know whether it works elsewhere ... sorry ...

About Malbolge

It is described on Wikipedia; for those who read English, the specification is also described at the following.

In addition, the following article by Robert (id: Robe) was very helpful. Thank you very much.

https://robe.hatenablog.jp/entry/20060824/1156353550

As you can see from the above, it is extremely hard to write ... is there anyone who can write it with pencil and paper? The Hello World posted on Wikipedia (below) gives a sense of the difficulty.

(=<`:9876Z4321UT.-Q+*)M'&%$H"!~}|Bzy?=|{z]KwZY44Eq0/{mlk**
hKs_dG5[m_BA{?-Y;;Vb'rR5431M}/.zHGwEDCBA@98\6543W10/.R,+O<

Personally, it struck me as a virtual CPU rather than a language. Roughly, these are the features as I understood them.

In short, it is like writing obfuscated code in assembler or machine language.
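
To make "obfuscated machine language" a little more concrete, here is a minimal sketch that decodes the first few characters of the Hello World above into instructions. It assumes the xlat1 table and the rule (character code - 33 + position) mod 94 used by the reference interpreter; the names XLAT1, MNEMONICS and decode_instructions are made up for this illustration.

```python
# Sketch: decode Malbolge source characters into instruction characters.
# XLAT1 is the conversion table from the reference interpreter;
# decode_instructions is a hypothetical helper name used only here.
XLAT1 = (
    "+b(29e*j1VMEKLyC})8&m#~W>qxdRp0wkrUo[D7,XTcA\"lI"
    ".v%{gJh4G\\-=O@5`_3i<?Z';FNQuY]szf$!BS/|t:Pn6^Ha"
)

# Instruction characters and a short description of each.
MNEMONICS = {
    "j": "D = [D]",
    "i": "C = [D]; jmp [D]",
    "*": "rotate_right [D]",
    "p": "crazy",
    "<": "print A",
    "/": "input A",
    "v": "end",
    "o": "nop",
}


def decode_instructions(source):
    """Return the instruction character for each source character.

    The result depends on both the character and its position:
    index = (ord(ch) - 33 + position) % 94.
    """
    return "".join(
        XLAT1[(ord(ch) - 33 + pos) % 94] for pos, ch in enumerate(source)
    )


if __name__ == "__main__":
    head = "(=<`:"  # first five characters of the Hello World program
    decoded = decode_instructions(head)
    for pos, (ch, op) in enumerate(zip(head, decoded)):
        print(pos, repr(ch), "->", op, MNEMONICS.get(op, "illegal"))
```

The five characters decode to j, p, p, <, p. Note that "=" at position 1 and "<" at position 2 both become the crazy instruction, because the position is part of the decoding.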

Try running the interpreter

If you are on a Mac, the following page by Takaaki Yamamoto | Kazuaki Yamamoto is very helpful. Thank you very much.

https://blog.monophile.net/posts/20150118_malbolge_hello.html

The source of the interpreter (C language) is as follows.

http://www.lscheffer.com/malbolge_interp.html

The steps for running it on a Mac, as a bulleted list:

I tried making an interpreter

Writing an interpreter is far easier than writing a Malbolge program, because there are only a few instructions and the memory size is modest.

So I roughly put one together in Python.

Of course, it does not fully follow the specification, so please be careful. I expect there are bugs somewhere.


#!/usr/bin/env python
# coding: utf-8

#
# Written as malbolge_test.py
#

import sys


class MalbolgeInterpreterModoki:
    """
    Malbolge interpreter (modoki: a rough imitation)
    -> Please forgive me: I have not verified the detailed behavior!
    """

    def __init__(self, debug_mode):
        """
        Initialization
        """
        self.debug_mode = debug_mode

        self.OP_INDEX_CODE = 0
        self.OP_INDEX_FUNC = 1

        # Instruction codes (as converted by xlat1) and their handlers
        self.op_table = [
            [106,   self.op_mod_d],
            [105,   self.op_jmp],
            [42,    self.op_rotate],
            [112,   self.op_crz],
            [60,    self.op_print],
            [47,    self.op_input],
            [118,   self.op_end],
            [111,   self.op_nop]
        ]

        # Judging from the original code, one word is 10 ternary digits
        self.BASE3_DIGITS = 10

        # Memory size (in words, i.e. unsigned short, not bytes)
        self.MEMORY_SIZE = 59049

        # xlat1 is the conversion table used when fetching an instruction code from [C].
        self.xlat1 = "+b(29e*j1VMEKLyC})8&m#~W>qxdRp0wkrUo[D7,XTcA\"lI"
        self.xlat1 += ".v%{gJh4G\\-=O@5`_3i<?Z';FNQuY]szf$!BS/|t:Pn6^Ha"

        # xlat2 is the conversion table used to rewrite the contents of [C] after execution.
        self.xlat2 = "5z]&gqtyfr$(we4{WP)H-Zn,[%\\3dL+Q;>U!pJS72FhOA1C"
        self.xlat2 += "B6v^=I_0/8|jsb9m<.TVac`uY*MK'X~xDl}REokN:#?G\"i@"

        self.cpu_reset()

    def cpu_reset(self):
        """
        Initialize registers and memory (and related variables).
        It is like clearing BSS, but it means little because memory is filled right away ...
        """
        self.memory = [0] * self.MEMORY_SIZE
        self.A = 0
        self.C = 0
        self.D = 0
        self.seq = 0
        self.end = False
        self.console = ""

    def show_regs(self):
        """
        Display the registers. Use it for debugging if needed.
        """
        print (self.seq, ": A=", self.A, "C=", self.C, "D=", self.D)

    def dump_memory(self, max):
        """
        Memory dump. Use it for debugging if needed.
        # It may be tough to read without addresses (;^ω^)
        """
        print ("memory: size=", len(self.memory), " max=", max)
        text = "["
        i = 0
        for data in self.memory:
            text += hex(data) + " "
            i += 1
            if i == max:
                break
        text += "]"
        print(text)

    def to_base3(self, data):
        """
        Malbolge operates on ternary numbers, so ternary digits are needed.
        This converts a decimal number into a list of ternary digits
        (least significant digit first). It is the crudest possible method (sweat)
        """

        # First, extract each digit from the original data.
        digits = []
        while True:
            digits.append(data % 3)
            if data <= 2:
                break
            data = data // 3

        # The number of digits is fixed in Malbolge, so pad the rest with 0
        remain = self.BASE3_DIGITS - len(digits)
        for i in range(0, remain):
            digits.append(0)

        return digits

    def from_base3(self, digits):
        """
        Malbolge operates on ternary numbers, so the reverse is needed too.
        This converts a list of ternary digits (least significant digit first)
        back into a decimal number. It is the crudest possible method (sweat)
        """

        # Example: digits [2, 1, 0] -> 2 * 1 + 1 * 3 + 0 * 9 = 5
        data = 0
        base = 1
        for digit in digits:
            data += digit * base
            base *= 3
        return data


    def crazy(self, x, y):
        """
        Malbolge's arithmetic operation. Combines two ternary numbers digit by
        digit and returns the result as a list of ternary digits.
        The table below is applied to each digit. The number of digits is
        fixed, so a simple loop over them is enough.

        x/y   0   1   2
        ----------------
        0     1   0   0
        1     1   0   2
        2     2   2   1

        Example: crazy(x=93, y=0) gives 1111111111 in ternary (29524 in decimal)
        """
        crazy_table = [
            [1, 0, 0],
            [1, 0, 2],
            [2, 2, 1]
        ]
        digits_x = self.to_base3(x)
        digits_y = self.to_base3(y)
        result = []
        data = 0

        for i in range(0, self.BASE3_DIGITS):
            data = crazy_table[digits_x[i]][digits_y[i]]
            result.append(data)

        return result

    def op_mod_d(self):
        """
        MOV instruction: D = [D];
        """
        ref_D = self.memory[self.D]
        if self.debug_mode:
            print (self.seq, "C=", self.C, ":106: D = [D]; # D=", self.D, "[D]=" , ref_D)
        self.D = ref_D

    def op_jmp(self):
        """
        JUMP instruction: C = [D]; jmp [D];
        """
        ref_D = self.memory[self.D]
        if self.debug_mode:
            print (self.seq, "C=", self.C, ":105: C = *D; jmp [D]; # D=", self.D, "[D]=", ref_D)
        self.C = ref_D

    def op_rotate(self):
        """
        Right rotate instruction: rotate_right [D]; A = [D];
        """
        ref_D = self.memory[self.D]

        # A right shift drops one digit (like 123 / 10 -> 12 in decimal)
        result = ref_D // 3

        # Move the old least significant digit to the most significant position
        # (continuing the example above, 123 becomes 312).
        # 19683 is 1000000000 in ternary.
        result = result + (ref_D % 3) * 19683

        if self.debug_mode:
            print (self.seq, "C=", self.C, ": 42: rotate_right [D]; A=[D]; # D=", self.D, "[D]=", ref_D, "result=", result)
        self.memory[self.D] = result
        self.A = result

    def op_crz(self):
        """
        Arithmetic instruction: [D] = crazy(A, [D]); A = [D];
        """
        ref_D = self.memory[self.D]
        result_digits = self.crazy(ref_D, self.A)
        result = self.from_base3(result_digits)

        if self.debug_mode:
            print (self.seq, "C=", self.C, ":112: [D] = crazy(A, [D]); A=[D]; # D=", self.D, "[D]=", ref_D, "A=", self.A, "result=", result)

        self.memory[self.D] = result
        self.A = result

    def op_print(self):
        """
        Single character output: print A;
        Sorry, handling it properly is a hassle, so each character goes on its own line m(__)m
        """

        # The original code simply casts the value and prints it,
        # so that seems to be acceptable
        # (is that something to exploit when building strings?)
        ascii = chr(self.A % 256)

        if self.debug_mode:
            print (self.seq, "C=", self.C, ": 60: print A; # put(\"",  ascii ,"\") A=", self.A, ",", hex(self.A))
        else:
            print("put(\"",  ascii ,"\")=", self.A)
        self.console += ascii

    def op_input(self):
        """
        Single character input: input A
        Sorry, handling it properly is a hassle, so <enter> is required m(__)m
        @ : newline
        # : exit the program (handled as an internal special command)
        """
        print ("(m_o_m) Sorry, please input 1 char(@=cr,#=exit) and <enter>:")
        text = input()
        if text == "" or text[0] == "@":
            # An empty line is also treated as a newline, to avoid an IndexError
            data = 0x0a
        elif text[0] == "#":
            print("exit this program.")
            sys.exit(0)
        else:
            data = ord(text[0])
        if self.debug_mode:
            print (self.seq, "C=", self.C, ": 47: input A; # getc=", data, hex(data))
        self.A = data

    def op_end(self):
        """
        End instruction: end;
        The program ends here.
        """
        if self.debug_mode:
            print (self.seq, "C=", self.C, ":118: end;")
        print ("end of program.")
        print("console:")
        print(self.console)
        print("--------completed.")
        sys.exit(0)

    def op_nop(self):
        """
        NOP nop
        """
        if self.debug_mode:
            print (self.seq, "C=", self.C, ":111: nop;")
        return

    def execute_opcode(self, op_code):
        """
        Execute the handler for the given instruction code.
        Instruction codes and their handlers are listed in self.op_table,
        so all this has to do is compare and dispatch.
        """
        for op in self.op_table:
            if op[self.OP_INDEX_CODE] is None or op[self.OP_INDEX_CODE] == op_code:
                op[self.OP_INDEX_FUNC]()
                return
        if self.debug_mode:
            print ("illegal op code ", op_code, " : -> nop.")

    def inc_regs(self):
        """
        In Malbolge, C and D are each incremented by 1 after every instruction.
        When they reach the end of memory, they wrap around to the beginning.
        """
        if self.C == self.MEMORY_SIZE-1:
            self.C = 0
        else:
            self.C += 1

        if self.D == self.MEMORY_SIZE-1:
            self.D = 0
        else:
            self.D += 1

        self.seq += 1

    def execute_1step(self):
        """
        Execute a single instruction.
        This may come in handy if you want to do something
        clever with step execution.
        """

        # Malbolge does not execute the code in memory as-is.
        # It executes the code obtained by converting it
        # through a table, as follows.
        # In other words, you apparently have to write
        # your instructions with this conversion in mind.
        ref_C = self.memory[self.C]
        t_index = (ref_C - 33 + self.C) % 94
        op_code = ord(self.xlat1[t_index])

        #Execute the instruction
        self.execute_opcode(op_code)

        # After an instruction is executed, its memory cell is rewritten.
        # You have to keep even this in mind when writing memory operations.
        # I don't think anyone should have to.
        ref_C = self.memory[self.C]
        self.memory[self.C] = ord(self.xlat2[ref_C - 33])

        self.inc_regs()

    def execute(self):
        """
        Run the program.
        It stops either on an end instruction or by dying from a bug
        (oh, and you can also exit with # at the input prompt).
        """
        while True:
            self.execute_1step()

    def is_this_valid_code(self, i, b):
        """
        Simple syntax check.
        Checks whether the character decodes to something other than the
        instructions in valid_list. If it does, the program exits.
        """
        # Anything other than the following instructions is rejected.
        valid_list = "ji*p</vo"
        # ( x - 33 + i ) % 94
        t_index = (b - 33 + i) % 94
        c = self.xlat1[t_index]
        if c in valid_list:
            return
        print("Illegal opcode= \"" + c + "\"(org:"+ str(b) + ")")
        sys.exit(1)

    def load_code(self, path):
        """
        Load the program into memory.
        """

        #Initialization
        self.cpu_reset()
        cnt = 0

        #Code loading
        with open(path, "rb") as f:
            data = f.read()
            for b in data:

                # Ignore spaces and line breaks
                if b == 0x20 or b == 0x0d or b == 0x0a:
                    continue

                # Simple syntax check. Exits if it fails
                self.is_this_valid_code(cnt, b)

                # Write the program into memory.
                #
                # Malbolge's memory model is unsigned short,
                # while the program is unsigned char.
                # In terms of bytes, that means:
                # memory(0) <- code(0)
                # memory(1) <- skipped
                # memory(2) <- code(1)
                # memory(3) <- skipped
                # Packing the bytes together apparently does not work.
                # Since this is Python, I just put each value in a list ...
                self.memory[cnt] = b
                cnt += 1

        # In C you would expect the rest of memory to be filled with 0,
        # but Malbolge does not allow that: each remaining cell has to be
        # computed with crazy from the two cells before it, like this.
        while cnt < self.MEMORY_SIZE:
            x = self.memory[cnt - 2]
            y = self.memory[cnt - 1]
            self.memory[cnt] = self.from_base3(self.crazy(x,y))
            cnt += 1

if __name__ == "__main__":

    #Argument check
    if len(sys.argv) < 2 or (len(sys.argv) == 3 and sys.argv[1] != "debug"):
        print ("python malbolge_test.py (debug) <program file>")
        sys.exit(1)

    #Set the presence / absence of debug mode and the program file from the argument.
    debug = False    
    path = sys.argv[1]
    if len(sys.argv) == 3:
        debug = True
        path = sys.argv[2]

    # Run Malbolge
    m = MalbolgeInterpreterModoki(debug)
    m.load_code(path)   # Load the program into memory
    m.execute()         # Run the program

In the rest of this article, the code above is referred to as malbolge_test.py.

How to use

There are two ways to use it.

Normal mode

This mode runs the program normally. However, because of my lack of skill and some corner-cutting, it differs in the following points (sweat).

If any of these bother you, please feel free to fix the source (sweat)

An execution example follows (hello_ma is the Malbolge Hello World program)

 python ./malbolge_test.py hello_ma <enter>

Debug mode

In addition to the above, this mode lets you trace the execution of each instruction. Note, however, that it prints relentlessly, so the amount of output grows considerably.

Here is an execution example as well. Add debug as the first argument.

 python ./malbolge_test.py debug hello_ma <enter>

Operation check status

I have confirmed operation with the following. Put the other way around, this is all I have tested ...

About the code

Thank you to those generous enough to look at even a crappy interpreter (sort of) made by someone as hopeless as me. If you read from main downward, I think you can follow the flow.

By the way, there were some things I could not tell from the specification and only learned from the interpreter code.

How Hello World is processed

Actually, this is what I most wanted to do. I tried to visualize how that complicated, bizarre source is actually processed.

Showing everything would be too long, so I ran it in debug mode and reproduce the log partway through (up to "Hell").


0 C= 0 :106: D = [D]; # D= 0 [D]= 40
1 C= 1 :112: [D] = crazy(A, [D]); A=[D]; # D= 41 [D]= 93 A= 0 result= 29524
2 C= 2 :112: [D] = crazy(A, [D]); A=[D]; # D= 42 [D]= 75 A= 29524 result= 72
3 C= 3 : 60: print A; # put(" H ") A= 72 , 0x48
4 C= 4 :112: [D] = crazy(A, [D]); A=[D]; # D= 44 [D]= 90 A= 72 result= 29506
5 C= 5 :112: [D] = crazy(A, [D]); A=[D]; # D= 45 [D]= 89 A= 29506 result= 35
6 C= 6 :112: [D] = crazy(A, [D]); A=[D]; # D= 46 [D]= 52 A= 35 result= 29507
7 C= 7 :112: [D] = crazy(A, [D]); A=[D]; # D= 47 [D]= 52 A= 29507 result= 44
8 C= 8 :112: [D] = crazy(A, [D]); A=[D]; # D= 48 [D]= 69 A= 44 result= 29541
9 C= 9 : 60: print A; # put(" e ") A= 29541 , 0x7365
10 C= 10 :112: [D] = crazy(A, [D]); A=[D]; # D= 50 [D]= 48 A= 29541 result= 73
11 C= 11 :112: [D] = crazy(A, [D]); A=[D]; # D= 51 [D]= 47 A= 73 result= 29552
12 C= 12 :112: [D] = crazy(A, [D]); A=[D]; # D= 52 [D]= 123 A= 29552 result= 60
13 C= 13 :112: [D] = crazy(A, [D]); A=[D]; # D= 53 [D]= 109 A= 60 result= 29548
14 C= 14 : 60: print A; # put(" l ") A= 29548 , 0x736c
15 C= 15 : 60: print A; # put(" l ") A= 29548 , 0x736c

Up to this point, the program appears to be working hard with crazy operations to produce the desired characters. Further on, it seems to use the jump instruction as well.
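
The crazy steps in the log can be checked by hand. The following is a minimal sketch, assuming the same per-digit table as the interpreter above and 10 ternary digits per word; crazy_word is a hypothetical name, and unlike the interpreter's crazy it works directly on integers and returns an integer.

```python
# Sketch: the crazy operation on two 10-trit words, checked against the log.
# CRAZY_TABLE[d_trit][a_trit] follows the table used in the interpreter;
# the function name crazy_word is hypothetical.
CRAZY_TABLE = [
    [1, 0, 0],
    [1, 0, 2],
    [2, 2, 1],
]
TRITS = 10  # one Malbolge word is 10 ternary digits


def crazy_word(mem_d, a):
    """Apply the crazy table trit by trit and return the result as an integer."""
    result = 0
    weight = 1
    for _ in range(TRITS):
        result += CRAZY_TABLE[mem_d % 3][a % 3] * weight
        mem_d //= 3
        a //= 3
        weight *= 3
    return result


if __name__ == "__main__":
    a = crazy_word(93, 0)   # step 1 of the log: [D]=93, A=0
    print(a)
    h = crazy_word(75, a)   # step 2 of the log: [D]=75, A=29524
    print(h, chr(h))
```

With the inputs from steps 1 and 2 of the log this gives 29524 and then 72, which is the character code of "H" printed in step 3.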

The rotate instruction was not used here. Incidentally, 99 Bottles of Beer does use rotate. How can anyone write programs like this? As someone at a fledgling level, I have nothing but respect.

Bonus: a case I got stuck on

Actually, there was a case where I got stuck while debugging. It is "copy of input (author: Lou Scheffer)" on Robe's site. I reproduce the code here.
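
For reference, here is a small sketch of what the rotate instruction does to a word, assuming the same formula as op_rotate in the interpreter above. The helper names rotate_right and to_trits are made up for this example.

```python
# Sketch: rotate a 10-trit word right by one trit.
# The formula matches op_rotate in the interpreter; rotate_right and
# to_trits are hypothetical helper names.
TRITS = 10
TOP_WEIGHT = 3 ** (TRITS - 1)  # 19683, i.e. 1000000000 in ternary


def rotate_right(word):
    """Move the lowest trit to the top and shift the remaining trits down."""
    return word // 3 + (word % 3) * TOP_WEIGHT


def to_trits(word):
    """Format a word as a 10-digit ternary string, most significant digit first."""
    digits = []
    for _ in range(TRITS):
        digits.append(str(word % 3))
        word //= 3
    return "".join(reversed(digits))


if __name__ == "__main__":
    word = 5  # 0000000012 in ternary
    rotated = rotate_right(word)
    print(to_trits(word), "->", to_trits(rotated), "=", rotated)
```

Rotating 5 (0000000012 in ternary) gives 2000000001 in ternary, which is 39367 in decimal. Rotating any word ten times brings it back to its original value.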


D'BA@?>=<;:9876543210/.-,+*)('&%$#"!~}|{zyxwvutsrqponmlkji
hgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;:9876543210/
.-,+*)('&%$#"!~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTS
RQPONMLKJIHGFEDC&_ (followed by a long run of ｽ characters, i.e. bytes outside the printable ASCII range)

(Reprinted from Robe's site (URL below))

This works with the original interpreter, but with my interpreter I get the following error:

Illegal opcode= "~"(org:239)

I figured that tinkering with my own interpreter at my fledgling level would only end in pain, but it turns out that even the original interpreter behaves unexpectedly here.

I modified the original as follows, adding a range check to the exec function.


#if 1 /*Additional part: Try adding a range check*/
    if (mem[c] -33 >= sizeof(xlat2))
    {
      printf("!!!!!!!!!!!!!!! %d >= %lu \n", mem[c] -33, sizeof(xlat2));
      return;
    }
#endif
    mem[c] = xlat2[mem[c] - 33];

    if ( c == 59048 ) c = 0; else c++;
    if ( d == 59048 ) d = 0; else d++;

Running it with this change gives the following.

!!!!!!!!!!!!!!! 156 >= 95

In other words, the program apparently only works because it reads data outside the bounds of the xlat2 conversion table. I am not sure what is supposed to happen in this case.

I am a beginner in Malbolge and have no real expertise here. Please also forgive me if some of the above turns out to be a misunderstanding, which is quite possible.
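
The same kind of range check can be sketched in Python. This is only an illustration: XLAT2 is the table from the interpreter above, encrypt_cell is a hypothetical helper, and returning None for an out-of-range value is an assumption of this sketch, not behavior defined by the specification. Note that the C check compares against sizeof(xlat2), which is 95 because it counts the terminating NUL, while the Python string has 94 characters.

```python
# Sketch: the post-execution rewrite step with an explicit range check.
# XLAT2 is the table from the interpreter above; encrypt_cell is a
# hypothetical helper, and returning None for out-of-range values is an
# assumption of this sketch, not something the specification defines.
XLAT2 = (
    "5z]&gqtyfr$(we4{WP)H-Zn,[%\\3dL+Q;>U!pJS72FhOA1C"
    "B6v^=I_0/8|jsb9m<.TVac`uY*MK'X~xDl}REokN:#?G\"i@"
)


def encrypt_cell(value):
    """Return the rewritten cell value, or None if value is outside 33..126."""
    index = value - 33
    if 0 <= index < len(XLAT2):
        return ord(XLAT2[index])
    return None


if __name__ == "__main__":
    print(encrypt_cell(40))   # "(" is in range, so it is rewritten
    print(encrypt_cell(189))  # 189 - 33 = 156, outside the table
```

A cell value of 189 corresponds to the index 156 reported in the output above, so this sketch returns None for it instead of reading past the table.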

Finally

So, the next time someone pulls rank on you with "if you know one programming language, you can manage the rest", let's bring up Malbolge (wry smile).

Oh, I'm joking, of course.
