Introduction

In the previous article, I implemented the interpreter (http://qiita.com/quwahara/items/30e93dfd2690913d66c0). The interpreter can now perform four arithmetic operations, but the calculation results cannot be displayed. Then I'm lonely, so I'll add println to display the calculation result.

What you want to do by adding println

When implementing, confirm what you want to do concretely. For example, if you have a program like the one below, we aim to output 1 and line breaks to the standard output.

println(1)

Also, if a variable to which 2 is assigned is specified as an argument ofprintln (), we aim to output the contents 2 as a line feed to the standard output.

a = 2
println(a)

How to implement

So far, we have implemented lexical analysis (Lexer), parser (Parser), and interpreter (Interpreter). We will consider how to implement each in turn.

How to implement in lexical analysis (Lexer)

The implemented lexical analysis does not have the ability to parse parentheses, ( and ). First add it. Also, in the lexical analysis specifications, the alphabetical sequence was defined as a variable name (variable). However, println, which is one of the alphabetical sequences, is a function name, not a variable name. The fact that it is a sequence of alphabets is no longer enough to determine whether it is a variable name or a function name. Therefore, change the meaning of the alphabet to ident.

How to implement parsing

To think about how to implement it, first look closely at the token sequence of the println (1) call. It is decomposed into the following four tokens.

println and (and 1 and )

This sequence is similar to the sequence of ʻa + 1. I will look at it. ʻA + 1 also tries to arrange tokens in the same way.

ʻA, +and1`

Let's compare them.

println → ʻa` → Both are identifiers
( → + → Both symbols
1 → 1 → Both numbers
) → No support

The parsing of ʻa + 1puts the tokens on the left and right with+in the middle. Observing the contrast,println (1)also has a composition in which the tokens that come to the left and right come with(in the middle. From that observation, to parseprintln (1), it seems that you can parse it by parsing ʻa + 1 and confirming that the ) token comes at the end. .. In other words, it seems that it can be implemented by slightly modifying the parsing to operator tokens. It also supports the change from variable to ident, which was mentioned in the previous implementation of lexical analysis.

How to implement the Interpreter

The problem with implementing the println (1) call was that the interpreter implicitly represented the variable as a String. That was fine because there was no possibility that there was a function name as it was, but now we have to distinguish between the variable name and the function name. Therefore, we will introduce a class that represents the variable name and the function name for each. Also, as we observed in the parsing implementation, ( is the key token for the println (1) call. Invite them to call a method that handles the ( when they encounter a token. Finally, here as well, it corresponds to the change from the variable name (variable) to the identifier (ident) mentioned in the previous method of implementing lexical analysis.

Try to implement in Java

Move on to implementation.

Again, lexical analysis (Lexer), parser (Parser), interpreter (Interpreter), For each, let's take a look at the changes and additions in order.

Lexer.java

An implementation of Lexer.java. First, add the function to parse parentheses, ( and ).

`Lexer.java`


    
    private boolean isParenStart(char c) {
        return c == '(' || c == ')';
    }

    private Token paren() throws Exception {
        Token t = new Token();
        t.kind = "paren";
        t.value = Character.toString(next());
        return t;
    }

Next is the response to changes from variable names to identifiers.

`Lexer.java`


    
    private boolean isIdentStart(char c) throws Exception {
        return Character.isAlphabetic(c);
    }

    private Token ident() throws Exception {
        StringBuilder b = new StringBuilder();
        b.append(next());
        while (!isEOT() && (Character.isAlphabetic(c()) || Character.isDigit(c()))) {
            b.append(next());
        }
        Token t = new Token();
        t.kind = "ident";
        t.value = b.toString();
        return t;
    }

Finally, if you add the parsing call part of the parentheses and modify the call part of the change to ident, The implementation of Lexer.java is complete.

`Lexer.java`


    public Token nextToken() throws Exception {
        skipSpace();
        if (isEOT()) {
            return null;
        } else if (isSignStart(c())) {
            return sign();
        } else if (isDigitStart(c())) {
            return digit();
        } else if (isIdentStart(c())) {
            return ident();
        } else if (isParenStart(c())) {
            return paren();
        } else {
            throw new Exception("Not a character for tokens");
        }
    }

Parser.java

An implementation of Parser.java. First, change the definition of how it works with respect to the meaning of the token. Add ( to degrees, which defines the degree of operator order at <-Add. ( Is also a correspondence to treat like an operator. println () wants to combine with priority over other operators, so set the degree to 80. Corresponds to the change to ident at <-Update.

`Parser.java`


    public Parser() {
        degrees = new HashMap<>();
        degrees.put("(", 80);   // <-- Add
        degrees.put("*", 60);
        degrees.put("/", 60);
        degrees.put("+", 50);
        degrees.put("-", 50);
        degrees.put("=", 10);
        factorKinds = Arrays.asList(new String[] { "digit", "ident" }); // <-- Update
        binaryKinds = Arrays.asList(new String[] { "sign" });
        rightAssocs = Arrays.asList(new String[] { "=" });
    }

It is a change of the part to be analyzed. The processing to perform analysis was added to the part of the if statement with <-Add. Assigning tokens to ʻoperator.left and ʻoperator.right is almost the same as parsing the operator of the if statement above. ʻOperator.right = expression (0); The argument of ʻexpression () is 0 unlike the above if statement. This is because one expression is taken independently as an argument of the function, and there is no need to compare the precedence with the preceding and following operators. consume () confirms that the next token is a closing brace and advances the token of interest in the analysis to the next. The implementation of Parser.java is complete.

`Parser.java`


    private Token bind(Token left, Token operator) throws Exception {
        if (binaryKinds.contains(operator.kind)) {
            operator.left = left;
            int leftDegree = degree(operator);
            if (rightAssocs.contains(operator.value)) {
                leftDegree -= 1;
            }
            operator.right = expression(leftDegree);
            return operator;
        } else if(operator.kind.equals("paren") && operator.value.equals("(")) {    // <-- Add
            operator.left = left;
            operator.right = expression(0);
            consume(")");
            return operator;
        } else {
            throw new Exception("The token cannot place there.");
        }
    }

    private Token consume(String expectedValue) throws Exception {
        if (!expectedValue.equals(token().value)) {
            throw new Exception("Not expected value");
        }
        return next();
    }

Interpreter.java

An implementation of Interpreter.java. First, in order to distinguish between variable names and function names, we will introduce a class that represents each. The function is an abstract class for future expansion. ʻInvoke ()` method implements what the function actually does.

`Interpreter.java`


    public static class Variable {
        public String name;
        public Integer value;

        @Override
        public String toString() {
            return name + " " + value;
        }
    }

    public static abstract class Func {
        public String name;

        abstract public Object invoke(Object arg) throws Exception;
    }

    public static class Println extends Func {
        public Println() {
            name = "println";
        }

        @Override
        public Object invoke(Object arg) throws Exception {
            System.out.println(arg);
            return null;
        }
    }

It is a change of the initialization part. Added a Map to hold the function name and its value to accommodate the function. Add the entity of println () there.

`Interpreter.java`


public class Interpreter {

    public Map<String, Func> functions;
    public Map<String, Variable> variables;
    List<Token> body;

    public Interpreter init(List<Token> body) {
        functions = new HashMap<>();
        Func f = new Println();
        functions.put(f.name, f);
        variables = new HashMap<>();
        this.body = body;
        return this;
    }

It is a change of the part that calls the method that processes each token according to the meaning of the token. Added a function call where there is a <-Add. The change to ident is where the <-Update is.

`Interpreter.java`


    public Object expression(Token expr) throws Exception {
        if (expr.kind.equals("digit")) {
            return digit(expr);
        } else if (expr.kind.equals("ident")) { // <-- Update
            return ident(expr);
        } else if (expr.kind.equals("paren")) { // <-- Add
            return invoke(expr);
        } else if (expr.kind.equals("sign") && expr.value.equals("=")) {
            return assign(expr);
        } else if (expr.kind.equals("sign")) {
            return calc(expr);
        } else {
            throw new Exception("Expression error");
        }
    }

Changed var () to ʻident ()`. If the identifier is a function name, it resolves to a function. Otherwise it resolves to a variable.

`Interpreter.java`


    public Object ident(Token token) {
        String name = token.value;
        if (functions.containsKey(name)) {
            return functions.get(name);
        }
        if (variables.containsKey(name)) {
            return variables.get(name);
        } else {
            Variable v = new Variable();
            v.name = name;
            v.value = 0;
            variables.put(name, v);
            return v;
        }
    }

The following med corresponds to the Variable class introduced to represent variable names.

`Interpreter.java`


    public Variable assign(Token expr) throws Exception {
        Variable variable = variable(expression(expr.left));
        Integer value = value(expression(expr.right));
        variable.value = value;
        return variable;
    }

    public Variable variable(Object value) throws Exception {
        if (value instanceof Variable) {
            return (Variable) value;
        } else {
            throw new Exception("left value error");
        }
    }

    public Integer value(Object value) throws Exception {
        if (value instanceof Integer) {
            return (Integer) value;
        } else if (value instanceof Variable) {
            Variable v = (Variable) value;
            return v.value;
        }
        throw new Exception("right value error");
    }

ʻInvoke ()makes a function call. That's what I wanted to do in this article. func () is the process of ʻinvoke () and makes sure that the result of ʻexpr.left` is the function name.

`Interpreter.java`


    private Object invoke(Token expr) throws Exception {
        Func f = func(expression(expr.left));
        Integer value = value(expression(expr.right));
        return f.invoke(value);
    }

    public Func func(Object value) throws Exception {
        if (value instanceof Func) {
            return (Func) value;
        } else {
            throw new Exception("Not a function");
        }
    }

The string in the program below using the above implementation

a = 3 + 4 * 5
println(a)

Is calculated and the value assigned to the variable is printed to the standard output.

`Interpreter.java`


    public static void main(String[] args) throws Exception {
        String text = "a = 3 + 4 * 5";
        text += "println(a)";
        List<Token> tokens = new Lexer().init(text).tokenize();
        List<Token> blk = new Parser().init(tokens).block();
        new Interpreter().init(blk).run();
        // --> 23
    }
}

That's all for the implementation. Thank you very much.

in conclusion

The full source is available here.

Calc https://github.com/quwahara/Calc/tree/article-4-interpreter/Calc/src/main/java

There is a continuation article.

** Corresponds to prioritized parentheses ** http://qiita.com/quwahara/items/b76c6e438aeb32450391

4 Add println to the interpreter

Introduction

What you want to do by adding println

How to implement

How to implement in lexical analysis (Lexer)

How to implement parsing

How to implement the Interpreter

Try to implement in Java

Lexer.java

Lexer.java

Lexer.java

Parser.java

Parser.java

Interpreter.java

Interpreter.java

Interpreter.java

Interpreter.java

Interpreter.java

Interpreter.java

Interpreter.java

in conclusion

`Lexer.java`

`Lexer.java`

`Lexer.java`

`Parser.java`

`Parser.java`

`Interpreter.java`

`Interpreter.java`

`Interpreter.java`

`Interpreter.java`

`Interpreter.java`

`Interpreter.java`

`Interpreter.java`