In the previous article, I implemented the interpreter (http://qiita.com/quwahara/items/30e93dfd2690913d66c0). The interpreter can now perform four arithmetic operations, but the calculation results cannot be displayed. Then I'm lonely, so I'll add println to display the calculation result.
When implementing, confirm what you want to do concretely.
For example, if you have a program like the one below, we aim to output 1
and line breaks to the standard output.
println(1)
Also, if a variable to which 2
is assigned is specified as an argument ofprintln ()
, we aim to output the contents 2
as a line feed to the standard output.
a = 2
println(a)
So far, we have implemented lexical analysis (Lexer), parser (Parser), and interpreter (Interpreter). We will consider how to implement each in turn.
The implemented lexical analysis does not have the ability to parse parentheses, (
and )
.
First add it.
Also, in the lexical analysis specifications, the alphabetical sequence was defined as a variable name (variable).
However, println
, which is one of the alphabetical sequences, is a function name, not a variable name.
The fact that it is a sequence of alphabets is no longer enough to determine whether it is a variable name or a function name.
Therefore, change the meaning of the alphabet to ident.
To think about how to implement it, first look closely at the token sequence of the println (1)
call.
It is decomposed into the following four tokens.
println
and (
and 1
and )
This sequence is similar to the sequence of ʻa + 1. I will look at it. ʻA + 1
also tries to arrange tokens in the same way.
,
+and
1`Let's compare them.
println
→ ʻa` → Both are identifiers(
→ +
→ Both symbols1
→ 1
→ Both numbers)
→ No supportThe parsing of ʻa + 1puts the tokens on the left and right with
+in the middle. Observing the contrast,
println (1)also has a composition in which the tokens that come to the left and right come with
(in the middle. From that observation, to parse
println (1), it seems that you can parse it by parsing ʻa + 1
and confirming that the )
token comes at the end. ..
In other words, it seems that it can be implemented by slightly modifying the parsing to operator tokens.
It also supports the change from variable to ident, which was mentioned in the previous implementation of lexical analysis.
The problem with implementing the println (1)
call was that the interpreter implicitly represented the variable as a String.
That was fine because there was no possibility that there was a function name as it was, but now we have to distinguish between the variable name and the function name.
Therefore, we will introduce a class that represents the variable name and the function name for each.
Also, as we observed in the parsing implementation, (
is the key token for the println (1)
call.
Invite them to call a method that handles the (
when they encounter a token.
Finally, here as well, it corresponds to the change from the variable name (variable) to the identifier (ident) mentioned in the previous method of implementing lexical analysis.
Move on to implementation.
Again, lexical analysis (Lexer), parser (Parser), interpreter (Interpreter), For each, let's take a look at the changes and additions in order.
Lexer.java
An implementation of Lexer.java.
First, add the function to parse parentheses, (
and )
.
Lexer.java
private boolean isParenStart(char c) {
return c == '(' || c == ')';
}
private Token paren() throws Exception {
Token t = new Token();
t.kind = "paren";
t.value = Character.toString(next());
return t;
}
Next is the response to changes from variable names to identifiers.
Lexer.java
private boolean isIdentStart(char c) throws Exception {
return Character.isAlphabetic(c);
}
private Token ident() throws Exception {
StringBuilder b = new StringBuilder();
b.append(next());
while (!isEOT() && (Character.isAlphabetic(c()) || Character.isDigit(c()))) {
b.append(next());
}
Token t = new Token();
t.kind = "ident";
t.value = b.toString();
return t;
}
Finally, if you add the parsing call part of the parentheses and modify the call part of the change to ident, The implementation of Lexer.java is complete.
Lexer.java
public Token nextToken() throws Exception {
skipSpace();
if (isEOT()) {
return null;
} else if (isSignStart(c())) {
return sign();
} else if (isDigitStart(c())) {
return digit();
} else if (isIdentStart(c())) {
return ident();
} else if (isParenStart(c())) {
return paren();
} else {
throw new Exception("Not a character for tokens");
}
}
Parser.java
An implementation of Parser.java.
First, change the definition of how it works with respect to the meaning of the token.
Add (
to degrees
, which defines the degree of operator order at <-Add
.
(
Is also a correspondence to treat like an operator.
println ()
wants to combine with priority over other operators, so set the degree to 80
.
Corresponds to the change to ident at <-Update
.
Parser.java
public Parser() {
degrees = new HashMap<>();
degrees.put("(", 80); // <-- Add
degrees.put("*", 60);
degrees.put("/", 60);
degrees.put("+", 50);
degrees.put("-", 50);
degrees.put("=", 10);
factorKinds = Arrays.asList(new String[] { "digit", "ident" }); // <-- Update
binaryKinds = Arrays.asList(new String[] { "sign" });
rightAssocs = Arrays.asList(new String[] { "=" });
}
It is a change of the part to be analyzed.
The processing to perform analysis was added to the part of the if statement with <-Add
.
Assigning tokens to ʻoperator.left and ʻoperator.right
is almost the same as parsing the operator of the if statement above.
ʻOperator.right = expression (0); The argument of ʻexpression ()
is 0
unlike the above if statement.
This is because one expression is taken independently as an argument of the function, and there is no need to compare the precedence with the preceding and following operators.
consume ()
confirms that the next token is a closing brace and advances the token of interest in the analysis to the next.
The implementation of Parser.java is complete.
Parser.java
private Token bind(Token left, Token operator) throws Exception {
if (binaryKinds.contains(operator.kind)) {
operator.left = left;
int leftDegree = degree(operator);
if (rightAssocs.contains(operator.value)) {
leftDegree -= 1;
}
operator.right = expression(leftDegree);
return operator;
} else if(operator.kind.equals("paren") && operator.value.equals("(")) { // <-- Add
operator.left = left;
operator.right = expression(0);
consume(")");
return operator;
} else {
throw new Exception("The token cannot place there.");
}
}
private Token consume(String expectedValue) throws Exception {
if (!expectedValue.equals(token().value)) {
throw new Exception("Not expected value");
}
return next();
}
Interpreter.java
An implementation of Interpreter.java. First, in order to distinguish between variable names and function names, we will introduce a class that represents each. The function is an abstract class for future expansion. ʻInvoke ()` method implements what the function actually does.
Interpreter.java
public static class Variable {
public String name;
public Integer value;
@Override
public String toString() {
return name + " " + value;
}
}
public static abstract class Func {
public String name;
abstract public Object invoke(Object arg) throws Exception;
}
public static class Println extends Func {
public Println() {
name = "println";
}
@Override
public Object invoke(Object arg) throws Exception {
System.out.println(arg);
return null;
}
}
It is a change of the initialization part.
Added a Map to hold the function name and its value to accommodate the function.
Add the entity of println ()
there.
Interpreter.java
public class Interpreter {
public Map<String, Func> functions;
public Map<String, Variable> variables;
List<Token> body;
public Interpreter init(List<Token> body) {
functions = new HashMap<>();
Func f = new Println();
functions.put(f.name, f);
variables = new HashMap<>();
this.body = body;
return this;
}
It is a change of the part that calls the method that processes each token according to the meaning of the token.
Added a function call where there is a <-Add
.
The change to ident is where the <-Update
is.
Interpreter.java
public Object expression(Token expr) throws Exception {
if (expr.kind.equals("digit")) {
return digit(expr);
} else if (expr.kind.equals("ident")) { // <-- Update
return ident(expr);
} else if (expr.kind.equals("paren")) { // <-- Add
return invoke(expr);
} else if (expr.kind.equals("sign") && expr.value.equals("=")) {
return assign(expr);
} else if (expr.kind.equals("sign")) {
return calc(expr);
} else {
throw new Exception("Expression error");
}
}
Changed var ()
to ʻident ()`.
If the identifier is a function name, it resolves to a function.
Otherwise it resolves to a variable.
Interpreter.java
public Object ident(Token token) {
String name = token.value;
if (functions.containsKey(name)) {
return functions.get(name);
}
if (variables.containsKey(name)) {
return variables.get(name);
} else {
Variable v = new Variable();
v.name = name;
v.value = 0;
variables.put(name, v);
return v;
}
}
The following med corresponds to the Variable class introduced to represent variable names.
Interpreter.java
public Variable assign(Token expr) throws Exception {
Variable variable = variable(expression(expr.left));
Integer value = value(expression(expr.right));
variable.value = value;
return variable;
}
public Variable variable(Object value) throws Exception {
if (value instanceof Variable) {
return (Variable) value;
} else {
throw new Exception("left value error");
}
}
public Integer value(Object value) throws Exception {
if (value instanceof Integer) {
return (Integer) value;
} else if (value instanceof Variable) {
Variable v = (Variable) value;
return v.value;
}
throw new Exception("right value error");
}
ʻInvoke ()makes a function call. That's what I wanted to do in this article.
func () is the process of ʻinvoke ()
and makes sure that the result of ʻexpr.left` is the function name.
Interpreter.java
private Object invoke(Token expr) throws Exception {
Func f = func(expression(expr.left));
Integer value = value(expression(expr.right));
return f.invoke(value);
}
public Func func(Object value) throws Exception {
if (value instanceof Func) {
return (Func) value;
} else {
throw new Exception("Not a function");
}
}
The string in the program below using the above implementation
a = 3 + 4 * 5
println(a)
Is calculated and the value assigned to the variable is printed to the standard output.
Interpreter.java
public static void main(String[] args) throws Exception {
String text = "a = 3 + 4 * 5";
text += "println(a)";
List<Token> tokens = new Lexer().init(text).tokenize();
List<Token> blk = new Parser().init(tokens).block();
new Interpreter().init(blk).run();
// --> 23
}
}
That's all for the implementation. Thank you very much.
The full source is available here.
Calc https://github.com/quwahara/Calc/tree/article-4-interpreter/Calc/src/main/java
There is a continuation article.
** Corresponds to prioritized parentheses ** http://qiita.com/quwahara/items/b76c6e438aeb32450391
Recommended Posts