Make a language! (Making a simple calculator ①)

Last time, Make a language! (JavaCC environment construction) has prepared the JavaCC development environment. This time, let's create a simple calculator and learn how to use JavaCC.

How JavaCC works

image.png

When you create a JJT file, JavaCC will generate a Java Parser for you. A JJT file is a file that defines lexical analysis and parsing.

Interpreter and compiler processing flow

  1. Preprocessing (replace source files at once, complete omitted descriptions, etc.)
  2. Lexical analysis (separate the source code into appropriate lexicals.)
  3. Parsing (generate a syntax tree based on lexical)
  4. Semantic analysis (interprets the meaning of the syntax tree. It is executed as it is depending on the interpreter.)
  5. Generation of machine language and bytecode
  6. Machine or VM executes code

image.png

This time, we will focus on lexical analysis, parsing, and semantic analysis (execution of syntactic trees).

Formal representation of language

First of all, before using JavaCC, we will prepare for the language. Think of ways to describe what the language is and analyze it properly.

BNF BNF is a language for defining the grammar of a language. It is described in a format in which the left side is defined by the right side. There are various ways to describe BNF, but this time we will describe BNF in a regular expression style according to JavaCC. The simplicity of the description seems to be about the extended BNF (extendes BNF).

The main symbols used are as follows.

symbol meaning Example
::= Define the left side with the right side ::= <加算formula>
() ()Grouping inside. ("+" | "-")
<xxx> Non-terminal symbol. Abstract expressions and sentences are non-terminal.
"xxx" Terminal symbol. When it comes to specific letters and numbers, it ends. "1"
* Repeat immediately before 0 times or more ("1" | "2")*
+ Repeat immediately before once or more ("1" | "2")+
| Represents or "1" | "2"

From now on, we will consider the simple formula below.

BNF with a simple formula


<formula> ::= <加算formula>
<Addition formula> ::= <Multiplication formula> ( <Addition operator> <Multiplication formula> )*
<Multiplication formula> ::= <Monomial> ( <Multiplication operator> <Monomial> )*
<Monomial> ::= <Open brackets> <formula> <Closing brackets> | <Decimal representation>

<Addition operator> ::= "+" | "-"
<Multiplication operator> ::= "*" | "/" | "%"
<Open brackets> ::= "("
<Closing brackets> ::= ")"
<Decimal representation> ::= (<Numbers>)+ ("." (<Numbers>)+)?
<Numbers> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0"

Let's check this BNF. For example, the formula 10 * 20 * (30 + 40) is

<formula>
→ <Addition formula>
→ <Multiplication formula> ( <Addition operator> <Multiplication formula> )*
→ <Multiplication formula>
→ <Monomial> ( <Multiplication operator> <Monomial> )*
→ <Monomial> <Multiplication operator> <Monomial> <Multiplication operator> <Monomial>
→ <Decimal representation> <Multiplication operator> <Monomial> <Multiplication operator> <Monomial>
→ (<Numbers>)+ ("." (<Numbers>)+)? <Multiplication operator> <Monomial> <Multiplication operator> <Monomial>
→ <Numbers> <Numbers> <Multiplication operator> <Monomial> <Multiplication operator> <Monomial>
→ 10 <Multiplication operator> <Monomial> <Multiplication operator> <Monomial>
→ 10 * <Monomial> <Multiplication operator> <Monomial>
→ ...
→ 10 * 20 * <Monomial>
→ 10 * 20 * <Open brackets> <formula> <Closing brackets>
→ 10 * 20 * ( <formula> <Closing brackets>
→ 10 * 20 * ( <Addition formula> <Closing brackets>
→ 10 * 20 * ( <Multiplication formula> ( <Addition operator> <Multiplication formula> )* <Closing brackets>
→ 10 * 20 * ( <Multiplication formula> <Addition operator> <Multiplication formula> <Closing brackets>
→ ...
→ 10 * 20 * ( <Monomial> <Addition operator> <Multiplication formula> <Closing brackets>
→ ...
→ 10 * 20 * ( 30 <Addition operator> <Multiplication formula> <Closing brackets>
→ ...
→ 10 * 20 * ( 30 + 40 )

You can see that it can be derived as follows.

Lexical analysis

Lexical analysis is the process of separating a list of characters with appropriate delimiters. BNF will analyze the part closer to the terminal symbol.

BNF lexical analysis part of a simple formula


<Addition operator> ::= "+" | "-"
<Multiplication operator> ::= "*" | "/" | "%"
<Open brackets> ::= "("
<Closing brackets> ::= ")"
<Decimal representation> ::= (<Numbers>)+ ("." (<Numbers>)+)?
<Numbers> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0"

At this time, numbers, decimal expressions, and addition operators are considered to be lexical types. As an image, it is the work of appropriately separating a series of character strings.

Parsing

Parsing will handle the most meaningful parts of BNF (the parts that make sense to create a syntax tree).

BNF parsing part of a simple formula


<formula> ::= <加算formula>
<Addition formula> ::= <Multiplication formula> ( <Addition operator> <Multiplication formula> )*
<Multiplication formula> ::= <Monomial> ( <Multiplication operator> <Monomial> )*
<Monomial> ::= <Open brackets> <formula> <Closing brackets> | <Decimal representation>

It is a task to assemble a tree with the lexicals that are appropriately separated by lexical analysis as nodes. There are various ways to assemble this tree, such as upward parsing and downward parsing. This time JavaCC uses recursive descent parsing.

Role of lexical analysis and parsing

In fact, it is difficult to draw a clear line between parsing and lexical analysis. For example, when you can add + or - to the beginning of a monomial, It depends on how you make it, whether you use + 10 to make one word or treat it as two words, - and 10. (However, in this case, it is better to treat + and - as different lexical terms and let them judge at the stage of parsing because it depends on the context. It may be easy to handle.)

next time

Let's put this BNF into a JavaCC jjt file. Make a language! (Making a simple calculator ②).

Related article

Make a language! I tried using bison and flex Make a language! (JavaCC environment construction) Make a language! (Making a simple calculator ②)

Recommended Posts

Make a language! (Making a simple calculator ②)
Make a language! (Making a simple calculator ①)
Try to make a simple callback
Make a language! (JavaCC environment construction)
[docker] [nginx] Make a simple ALB with nginx
Try making a calculator app in Java
Making Frontale a microservice
Practice making a simple chat app with Docker + Sinatra
[Personal memo] Make a simple deep copy in Java
Let's make a robot! "A simple demo of Java AWT Robot"
Make a reflection utility ②
Make a reflection utility ③
Make a reflection utility ①
Let's make a calculator application in Java ~ Display the application window
[Beginner] Try to make a simple RPG game with Java ①
Make a simple CRUD with SpringBoot + JPA + Thymeleaf ① ~ Hello World ~
Let's make a simple API with EC2 + RDS + Spring boot ①
Make a simple CRUD with SpringBoot + JPA + Thymeleaf ⑤ ~ Template standardization ~
Ruby Learning # 11 Building a Calculator
[Java] Make it a constant
[Java] Draw a simple pattern
[Rails] Make a breadcrumb trail
Making a minute repeater part.2
Making a minute repeater part.1
Make a rhombus using Java
A story about making a calculator to calculate the shell mound rate