0x000 overview
Recently I have been studying the fundamentals of computing: computer organization, networking, compiler principles and so on. Right now I am working through compiler principles and have become interested in the subject. Studying pure theory is rather dull, though, so I decided to change approach: practice first, try to solve real problems, and let practice drive the theory. I originally intended to write a Chinese JS parser, but that turned out to be difficult and will have to be built slowly, so I looked for something simpler: parsing the four arithmetic operations. That is how this project came about. To be clear: this is a very simple project, this is a very simple project, this is a very simple project. The lexical analysis, the syntax analysis and the automaton are all implemented in the simplest possible way. After all, I am still a beginner.
0x001 effect
 Source address: github

Implemented features:
 The four operations (+, -, *, /) on positive integers, in any order
 Support for ()
 Works on both the front end and the back end
 Provides a function that computes the result directly
 Provides a function that converts a four-operation expression into a reverse Polish AST
 Provides a syntax-checking function (for now it only checks each token against the one next to it)
0x002 implementation
Because the project is simple, both the theory and the implementation are simple. Three problems have to be solved to get the effect above:
 How to handle operator priority: * / and () bind more tightly than + and -.
 How to split the input string, that is, how to recognize numbers, symbols and invalid characters. This is tokenization (lexical analysis).
 How to check the grammar, that is, how to make sure the expression follows the rules, for example that + is followed by a number or by ( (here - is treated as an operator, not as a sign).
0x003 Solution 1: How to Implement Priority Operations
1. Ignore priority temporarily
Without priority, a calculator is very easy to implement. The code below performs simple addition, subtraction, multiplication and division strictly from left to right. It only handles single-digit numbers; multi-digit numbers run into problem 2, which is deliberately avoided here:
let calc = (input) => {
  let calMap = {
    '+': (num1, num2) => num1 + num2,
    '-': (num1, num2) => num1 - num2,
    '*': (num1, num2) => num1 * num2,
    '/': (num1, num2) => num1 / num2,
  }
  input = [...input].reverse()
  while (input.length >= 2) {
    let num1 = +input.pop()
    let op = input.pop()
    let num2 = +input.pop()
    input.push(calMap[op](num1, num2))
  }
  return input[0]
}
expect(calc('1+2+3+4+5-1')).toEqual(14)
expect(calc('1*2*3/3')).toEqual(2)
Algorithmic steps:

The input is split into characters and reversed so that it can be used as a stack. Because every number is below 10, each number is exactly one character:
input = [...input].reverse()

Three elements are popped each time. If the input is well formed, they are the first number, the operator and the second number:
let num1 = +input.pop()
let op = input.pop()
let num2 = +input.pop()

The result of applying the operator is pushed back onto the stack, and the process repeats. Each round removes three elements and adds one, so for well-formed input the stack ends up with a single element, which is the final result. The loop therefore runs only while at least two elements remain:
while (input.length >= 2) {
  // ...
  input.push(calMap[op](num1, num2))
}
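The limitation of this left-to-right approach can be seen in a small experiment. The sketch below repeats the steps above under a hypothetical name, `evalLeftToRight`, and assumes single-digit, well-formed input:

```javascript
// Hypothetical helper: evaluates single-digit expressions strictly left to right.
const evalLeftToRight = (input) => {
  const ops = {
    '+': (a, b) => a + b,
    '-': (a, b) => a - b,
    '*': (a, b) => a * b,
    '/': (a, b) => a / b,
  }
  // Reverse the characters so that pop() returns them in reading order.
  const stack = [...input].reverse()
  while (stack.length >= 2) {
    const left = +stack.pop()
    const op = stack.pop()
    const right = +stack.pop()
    stack.push(ops[op](left, right))
  }
  return stack[0]
}

console.log(evalLeftToRight('1+2+3')) // 6
console.log(evalLeftToRight('1+2*3')) // 9, although the correct answer is 7
```

The second call shows why priority cannot be ignored: 1+2 is computed first, and only then is the result multiplied by 3.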
2. Consider priorities
Now priority has to be taken into account: * and / bind more tightly than + and -, and () has the highest priority of all. This problem already has a well-known solution. I use suffix expressions, also known as reverse Polish (inverse Polish) notation.
 Suffix expression:
The so-called suffix expression puts the operator after its operands, so 1 + 1 is written as 1 1 +.
 Infix expression:
The so-called infix expression is simply the way we normally write expressions, so it is not covered in depth here.
 Prefix expression:
The so-called prefix expression puts the operator in front of its operands, so 1 + 1 is written as + 1 1. It is not covered in depth here either.
For more on reverse Polish notation, refer to the following articles
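To make the three notations concrete, here is a minimal sketch that prints one expression tree in each of them. The tree shape (a leaf is a number, a node is `{ op, left, right }`) and the function names are assumptions made for this illustration, not part of the project:

```javascript
// Hypothetical tree shape: a leaf is a number, a node is { op, left, right }.
const isLeaf = (node) => typeof node === 'number'

// Prefix: operator first, then the two operands.
const toPrefix = (node) =>
  isLeaf(node) ? String(node) : `${node.op} ${toPrefix(node.left)} ${toPrefix(node.right)}`

// Infix: operator between the operands; brackets are needed to keep the structure.
const toInfix = (node) =>
  isLeaf(node) ? String(node) : `(${toInfix(node.left)} ${node.op} ${toInfix(node.right)})`

// Suffix: the two operands first, then the operator.
const toSuffix = (node) =>
  isLeaf(node) ? String(node) : `${toSuffix(node.left)} ${toSuffix(node.right)} ${node.op}`

// The tree for 1 + 2 * 3
const tree = { op: '+', left: 1, right: { op: '*', left: 2, right: 3 } }

console.log(toPrefix(tree)) // + 1 * 2 3
console.log(toInfix(tree))  // (1 + (2 * 3))
console.log(toSuffix(tree)) // 1 2 3 * +
```

Only the infix form needs brackets; in the prefix and suffix forms the position of each operator already encodes the structure.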
3. Inverse Polish Priority Solution
In reverse Polish notation, 1+1*2 is converted to 112*+.
Code Demonstration:
let calc = (input) => {
  let calMap = {
    '+': (num1, num2) => num1 + num2,
    '-': (num1, num2) => num1 - num2,
    '*': (num1, num2) => num1 * num2,
    '/': (num1, num2) => num1 / num2,
  }
  input = [...input].reverse()
  let resultStack = []
  while (input.length) {
    let token = input.pop()
    if (/[0-9]/.test(token)) {
      resultStack.push(token)
      continue
    }
    if (/[+\-*/]/.test(token)) {
      // The right-hand operand is on top of the stack, so it is popped first
      let num2 = +resultStack.pop()
      let num1 = +resultStack.pop()
      resultStack.push(calMap[token](num1, num2))
      continue
    }
  }
  return resultStack[0]
}
expect(calc('123*+')).toEqual(7)
After the conversion, the calculation proceeds as follows:

Initialize a stack
let resultStack = []

Take one token from the expression at a time.
let token = input.pop()

If it is a number, push it onto the stack
if (/[0-9]/.test(token)) {
  resultStack.push(token)
  continue
}

If it is an operator, pop two numbers from the stack, apply the operation, and push the result back onto the stack. The first number popped is the right-hand operand, which matters for - and /.
if (/[+\-*/]/.test(token)) {
  let num2 = +resultStack.pop()
  let num1 = +resultStack.pop()
  resultStack.push(calMap[token](num1, num2))
  continue
}

If the expression is not empty, go back to step 2. If the expression is empty, the number left on the stack is the final result and the calculation is complete.
while (input.length) {
  // ...
}
return resultStack[0]
Converting to reverse Polish notation has two advantages:
 Operator priority no longer has to be considered during evaluation
 Parentheses disappear: (1+2)*(3+4), for example, becomes 12+34+*, and the calculation can be completed with the reverse Polish evaluation method described above.
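Both advantages can be checked with a short sketch. The evaluator below is a variant of the steps above that takes an array of tokens instead of a string, so that multi-digit numbers work too; the name `evalSuffix` and the token-array input are assumptions for this example:

```javascript
// Hypothetical evaluator for a suffix expression given as an array of tokens.
const suffixOps = new Map([
  ['+', (a, b) => a + b],
  ['-', (a, b) => a - b],
  ['*', (a, b) => a * b],
  ['/', (a, b) => a / b],
])

const evalSuffix = (tokens) => {
  const stack = []
  for (const token of tokens) {
    if (suffixOps.has(token)) {
      // The right-hand operand was pushed last, so it is popped first.
      const right = stack.pop()
      const left = stack.pop()
      stack.push(suffixOps.get(token)(left, right))
    } else {
      stack.push(Number(token))
    }
  }
  return stack[0]
}

// (1+2)*(3+4) written as 12+34+* : no brackets and no priority rules are needed
console.log(evalSuffix(['1', '2', '+', '3', '4', '+', '*'])) // 21
// Operand order matters for - and / : 10 4 - means 10 - 4
console.log(evalSuffix(['10', '4', '-'])) // 6
```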
4. Infix to suffix
This is the last piece of problem 1: converting an infix expression to a suffix expression. A first attempt looks like this:
let parse = (input) => {
  input = [...input].reverse()
  let resultStack = [], opStack = []
  while (input.length) {
    let token = input.pop()
    if (/[0-9]/.test(token)) {
      resultStack.push(token)
      continue
    }
    if (/[+\-*/]/.test(token)) {
      // Move the pending operator to the result first so operators keep their left-to-right order
      while (opStack.length) {
        resultStack.push(opStack.pop())
      }
      opStack.push(token)
      continue
    }
  }
  return [...resultStack, ...opStack.reverse()].join('')
}
expect(parse(`1+2-3+4-5`)).toEqual('12+3-4+5-')
Two stacks are prepared, one for the result and one for operators. Whenever a new operator arrives, the pending operator is moved to the result, and at the end the two stacks are spliced together. This converts 1+2-3+4-5 to 12+3-4+5-. As soon as priority is involved, however, this approach breaks down. For example, the following expectation fails:
expect(parse(`1+2*3`)).toEqual('123*+')
The conversion of 1+2*3 should be 123*+, but the actual result is 12+3*, which evaluates as (1+2)*3. Because * and / have higher priority than + and -, the following changes are needed.
let parse = (input) => {
  input = [...input].reverse()
  let resultStack = [], opStack = []
  while (input.length) {
    let token = input.pop()
    if (/[0-9]/.test(token)) {
      resultStack.push(token)
      continue
    }
    // if (/[+\-*/]/.test(token)) {
    //   opStack.push(token)
    //   continue
    // }
    if (/[*/]/.test(token)) {
      while (opStack.length) {
        let preOp = opStack.pop()
        if (/[+\-]/.test(preOp)) {
          opStack.push(preOp)
          opStack.push(token)
          token = null
          break
        } else {
          resultStack.push(preOp)
          continue
        }
      }
      token && opStack.push(token)
      continue
    }
    if (/[+\-]/.test(token)) {
      while (opStack.length) {
        resultStack.push(opStack.pop())
      }
      opStack.push(token)
      continue
    }
  }
  return [...resultStack, ...opStack.reverse()].join('')
}
expect(parse(`1+2`)).toEqual('12+')
expect(parse(`1+2*3`)).toEqual('123*+')
 When the operator is * or /, pop the top of opStack and check whether it has lower priority than * and /. If it does, push it back, push the new operator onto opStack, and stop. Otherwise move the popped operator to resultStack and keep popping.
if (/[+\-]/.test(preOp)) {
  opStack.push(preOp) // preOp was popped only to inspect it, so it has to be put back
  opStack.push(token)
  token = null
  break
} else {
  resultStack.push(preOp)
  continue
}
 Also note the case where opStack is empty (or has just been emptied): the operator then goes straight onto the stack.
token && opStack.push(token)
continue
 When the operator is + or -, it already has the lowest priority, so every operator on opStack can simply be popped into resultStack.
if (/[+\-]/.test(token)) {
  while (opStack.length) {
    resultStack.push(opStack.pop())
  }
  opStack.push(token)
  continue
}
The priority of + - * / is now handled; only () remains. It has the highest priority, so the following changes are made:
if (/[+\-]/.test(token)) {
  while (opStack.length) {
    let op = opStack.pop()
    if (/\(/.test(op)) {
      opStack.push(op)
      break
    }
    resultStack.push(op)
  }
  opStack.push(token)
  continue
}
if (/\(/.test(token)) {
  opStack.push(token)
  continue
}
if (/\)/.test(token)) {
  let preOp = opStack.pop()
  while (preOp !== '(' && opStack.length) {
    resultStack.push(preOp)
    preOp = opStack.pop()
  }
  continue
}
 When the operator is + or -, operators are no longer popped blindly: popping stops at (, which stays on the stack
while (opStack.length) {
  let op = opStack.pop()
  if (/\(/.test(op)) {
    opStack.push(op)
    break
  }
  resultStack.push(op)
}
opStack.push(token)
 When the token is (, push it onto opStack
if (/\(/.test(token)) {
  opStack.push(token)
  continue
}
 When the token is ), keep popping opStack into resultStack until ( is reached (the ( itself is not pushed into resultStack)
if (/\)/.test(token)) {
  let preOp = opStack.pop()
  while (preOp !== '(' && opStack.length) {
    resultStack.push(preOp)
    preOp = opStack.pop()
  }
  continue
}
Complete code:
let parse = (input) => {
  input = [...input].reverse()
  let resultStack = [], opStack = []
  while (input.length) {
    let token = input.pop()
    if (/[0-9]/.test(token)) {
      resultStack.push(token)
      continue
    }
    if (/[*/]/.test(token)) {
      while (opStack.length) {
        let preOp = opStack.pop()
        // Stop at + and -, and also at ( so that a bracket is never moved to the result
        if (/[+\-(]/.test(preOp)) {
          opStack.push(preOp)
          opStack.push(token)
          token = null
          break
        } else {
          resultStack.push(preOp)
          continue
        }
      }
      token && opStack.push(token)
      continue
    }
    if (/[+\-]/.test(token)) {
      while (opStack.length) {
        let op = opStack.pop()
        if (/\(/.test(op)) {
          opStack.push(op)
          break
        }
        resultStack.push(op)
      }
      opStack.push(token)
      continue
    }
    if (/\(/.test(token)) {
      opStack.push(token)
      continue
    }
    if (/\)/.test(token)) {
      let preOp = opStack.pop()
      while (preOp !== '(' && opStack.length) {
        resultStack.push(preOp)
        preOp = opStack.pop()
      }
      continue
    }
  }
  return [...resultStack, ...opStack.reverse()].join('')
}
This completes the infix-to-suffix conversion and with it all of problem 1. The whole infix => suffix => calculation pipeline can now be run with calc(parse(input)).
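The branch-per-operator code can also be written with a priority table, an arrangement commonly known as the shunting-yard algorithm. The sketch below is an alternative formulation, not the project's code; `infixToSuffix` and `priority` are hypothetical names, and the input is assumed to be a well-formed, single-digit expression:

```javascript
// Hypothetical priority table: a higher number binds more tightly.
const priority = new Map([['+', 1], ['-', 1], ['*', 2], ['/', 2]])

const infixToSuffix = (input) => {
  const output = []
  const opStack = []
  const top = () => opStack[opStack.length - 1]
  for (const token of input) {
    if (/[0-9]/.test(token)) {
      output.push(token)
    } else if (token === '(') {
      opStack.push(token)
    } else if (token === ')') {
      // Move operators to the output until the matching "(" is on top.
      while (opStack.length && top() !== '(') {
        output.push(opStack.pop())
      }
      opStack.pop() // discard the "(" itself
    } else if (priority.has(token)) {
      // Pop every operator of equal or higher priority; "(" has no priority
      // entry, so the comparison is false and the loop stops there.
      while (opStack.length && priority.get(top()) >= priority.get(token)) {
        output.push(opStack.pop())
      }
      opStack.push(token)
    }
  }
  while (opStack.length) {
    output.push(opStack.pop())
  }
  return output.join('')
}

console.log(infixToSuffix('1+2*3'))       // 123*+
console.log(infixToSuffix('(1+2)*(3+4)')) // 12+34+*
console.log(infixToSuffix('(2*3)'))       // 23*
```

Adding a new operator then only requires a new entry in the table rather than a new branch.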
0x004 Solution 2: Segmenting Strings
Although the big problem of infix => suffix => calculation is solved, the most basic problem is not: the input. While solving problem 1, the input was simply split into single characters, which limits numbers to values below 10. The next problem is how to split the input properly.

Solution 1: Regular expressions. A regular expression can do the splitting, as shown below, and it is enough for a simple demo. It does not help much with the later grammar detection, though, so I gave up on this method.
`(1+22)*(333+4444)`.match(/([0-9]+)|([+\-*/])|(\()|(\))/g)
// output
// (11) ["(", "1", "+", "22", ")", "*", "(", "333", "+", "4444", ")"]
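One observable drawback of the regular-expression approach is that characters matching none of the alternatives are silently skipped, so invalid input goes unnoticed. A minimal sketch, using a hypothetical `regexTokenize` helper with an equivalent pattern:

```javascript
// Hypothetical helper: splits the input with a single global regular expression.
const tokenPattern = /[0-9]+|[+\-*/()]/g
const regexTokenize = (input) => input.match(tokenPattern) || []

console.log(regexTokenize('(1+22)*(333+4444)'))
// [ '(', '1', '+', '22', ')', '*', '(', '333', '+', '4444', ')' ]

// The invalid character "a" simply disappears instead of causing an error.
console.log(regexTokenize('1+a2')) // [ '1', '+', '2' ]
```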

Solution 2: Character-by-character analysis. The rough process is:
while (input.length) {
  let token = input.pop()
  if (/[0-9]/.test(token)) { /* enter number analysis */ }
  if (/[+\-*/()]/.test(token)) { /* enter symbol analysis */ }
}
Next, solution 2 is used to solve the problem:
1 Define node structure
When splitting, each token is saved not as a bare value but as a node with the following structure, represented as an object:
{ type: '', value: '' }
Here type is the node type. All the types that can occur in the four operations are summarized below:
const type = {
  TYPE_NUMBER: 'TYPE_NUMBER', // number
  TYPE_LEFT_BRACKET: 'TYPE_LEFT_BRACKET', // (
  TYPE_RIGHT_BRACKET: 'TYPE_RIGHT_BRACKET', // )
  TYPE_OPERATION_ADD: 'TYPE_OPERATION_ADD', // +
  TYPE_OPERATION_SUB: 'TYPE_OPERATION_SUB', // -
  TYPE_OPERATION_MUL: 'TYPE_OPERATION_MUL', // *
  TYPE_OPERATION_DIV: 'TYPE_OPERATION_DIV', // /
}
value is the actual value of the token, such as 123, +, -, *, /.
2 Digital Processing
If the character is a digit, keep reading until a non-digit is found. Everything read is accumulated into value, and the node is then appended to the result.
if (token.match(/[0-9]/)) {
  let next = tokens.pop()
  while (next !== undefined) {
    if (!next.match(/[0-9]/)) break
    token += next
    next = tokens.pop()
  }
  result.push({ type: type.TYPE_NUMBER, value: +token })
  // Continue with the first character that was not a digit
  token = next
}
3 Symbol Processing
First, define a lookup table from symbol to type. If the character is not in the table, the input is invalid and an exception is thrown. If it is in the table, the input is valid and the node is appended to the result.
const opMap = {
  '(': type.TYPE_LEFT_BRACKET,
  ')': type.TYPE_RIGHT_BRACKET,
  '+': type.TYPE_OPERATION_ADD,
  '-': type.TYPE_OPERATION_SUB,
  '*': type.TYPE_OPERATION_MUL,
  '/': type.TYPE_OPERATION_DIV
}
// Named tokenType so that it does not shadow the type table used above
let tokenType = opMap[token]
if (!tokenType) throw `error input: ${token}`
result.push({
  type: tokenType,
  value: token,
})
4 Summary
This completes the input processing. The other functions need to be adapted as well, because the input is no longer a string but the token sequence produced by tokenize. Once that is done, the whole slick pipeline can be run as calc(parse(tokenize(input))).
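Put together, the three steps above might look like the sketch below. It reuses the `type` and `opMap` tables from the text, but the `tokenize` body is an illustrative reconstruction that scans by index instead of popping from a reversed array, and it assumes the input contains no whitespace:

```javascript
const type = {
  TYPE_NUMBER: 'TYPE_NUMBER',
  TYPE_LEFT_BRACKET: 'TYPE_LEFT_BRACKET',
  TYPE_RIGHT_BRACKET: 'TYPE_RIGHT_BRACKET',
  TYPE_OPERATION_ADD: 'TYPE_OPERATION_ADD',
  TYPE_OPERATION_SUB: 'TYPE_OPERATION_SUB',
  TYPE_OPERATION_MUL: 'TYPE_OPERATION_MUL',
  TYPE_OPERATION_DIV: 'TYPE_OPERATION_DIV',
}

const opMap = {
  '(': type.TYPE_LEFT_BRACKET,
  ')': type.TYPE_RIGHT_BRACKET,
  '+': type.TYPE_OPERATION_ADD,
  '-': type.TYPE_OPERATION_SUB,
  '*': type.TYPE_OPERATION_MUL,
  '/': type.TYPE_OPERATION_DIV,
}

// Sketch of a tokenizer: turns a string into a list of { type, value } nodes.
const tokenize = (input) => {
  const result = []
  let i = 0
  while (i < input.length) {
    const ch = input[i]
    if (/[0-9]/.test(ch)) {
      // Number: keep reading while the next character is still a digit.
      let digits = ''
      while (i < input.length && /[0-9]/.test(input[i])) {
        digits += input[i]
        i++
      }
      result.push({ type: type.TYPE_NUMBER, value: +digits })
      continue
    }
    // Symbol: anything missing from the table is invalid input.
    const tokenType = opMap[ch]
    if (!tokenType) throw new Error(`error input: ${ch}`)
    result.push({ type: tokenType, value: ch })
    i++
  }
  return result
}

console.log(tokenize('12+3'))
// [
//   { type: 'TYPE_NUMBER', value: 12 },
//   { type: 'TYPE_OPERATION_ADD', value: '+' },
//   { type: 'TYPE_NUMBER', value: 3 }
// ]
```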
0x005 Solution 3: Grammar Detection
Grammar detection has to judge whether the input is correct, that is, whether it follows the rules of the four operations. The idea of a state machine is used here, but in an extremely simple form that only looks one step ahead.
Define a syntax table. The table lists the node types that may appear directly after each node type; for example, + may only be followed by a number or by (.
let syntax = {
  [type.TYPE_NUMBER]: [
    type.TYPE_OPERATION_ADD,
    type.TYPE_OPERATION_SUB,
    type.TYPE_OPERATION_MUL,
    type.TYPE_OPERATION_DIV,
    type.TYPE_RIGHT_BRACKET
  ],
  [type.TYPE_OPERATION_ADD]: [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET],
  [type.TYPE_OPERATION_SUB]: [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET],
  [type.TYPE_OPERATION_MUL]: [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET],
  [type.TYPE_OPERATION_DIV]: [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET],
  [type.TYPE_LEFT_BRACKET]: [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET],
  [type.TYPE_RIGHT_BRACKET]: [
    type.TYPE_OPERATION_ADD,
    type.TYPE_OPERATION_SUB,
    type.TYPE_OPERATION_MUL,
    type.TYPE_OPERATION_DIV,
    type.TYPE_RIGHT_BRACKET
  ]
}
With this table, the grammar can be checked in a simple way:
while (tokens.length) {
  // ...
  let next = tokens.pop()
  if (!syntax[token.type].includes(next.type)) throw `syntax error: ${token.value} -> ${next.value}`
  // ...
}
For (), a counter is used: ( adds 1 to the count and ) subtracts 1. A count below 0 means there are too many ), and at the end of detection the count must be 0.
// ...
if (token.type === type.TYPE_LEFT_BRACKET) {
  bracketCount++
}
// ...
if (next.type === type.TYPE_RIGHT_BRACKET) {
  bracketCount--
}
// ...
if (bracketCount < 0) {
  throw `syntax error: toooooo much ) -> )`
}
// ...
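The table lookup and the bracket counter can be combined into one function. The sketch below is an illustrative reconstruction, not the project's code: `checkSyntax` is a hypothetical name, it iterates by index instead of popping, and the final "unclosed (" check is an assumption about what the end-of-detection count check looks like:

```javascript
const type = {
  TYPE_NUMBER: 'TYPE_NUMBER',
  TYPE_LEFT_BRACKET: 'TYPE_LEFT_BRACKET',
  TYPE_RIGHT_BRACKET: 'TYPE_RIGHT_BRACKET',
  TYPE_OPERATION_ADD: 'TYPE_OPERATION_ADD',
  TYPE_OPERATION_SUB: 'TYPE_OPERATION_SUB',
  TYPE_OPERATION_MUL: 'TYPE_OPERATION_MUL',
  TYPE_OPERATION_DIV: 'TYPE_OPERATION_DIV',
}

// Same rules as the syntax table in the text, written more compactly.
const afterValue = [
  type.TYPE_OPERATION_ADD, type.TYPE_OPERATION_SUB,
  type.TYPE_OPERATION_MUL, type.TYPE_OPERATION_DIV,
  type.TYPE_RIGHT_BRACKET,
]
const beforeValue = [type.TYPE_NUMBER, type.TYPE_LEFT_BRACKET]
const syntax = {
  [type.TYPE_NUMBER]: afterValue,
  [type.TYPE_RIGHT_BRACKET]: afterValue,
  [type.TYPE_LEFT_BRACKET]: beforeValue,
  [type.TYPE_OPERATION_ADD]: beforeValue,
  [type.TYPE_OPERATION_SUB]: beforeValue,
  [type.TYPE_OPERATION_MUL]: beforeValue,
  [type.TYPE_OPERATION_DIV]: beforeValue,
}

// Sketch: throws on the first violation, returns true otherwise.
const checkSyntax = (tokens) => {
  let bracketCount = 0
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i]
    if (token.type === type.TYPE_LEFT_BRACKET) bracketCount++
    if (token.type === type.TYPE_RIGHT_BRACKET) bracketCount--
    if (bracketCount < 0) throw new Error('syntax error: too many )')
    // One-step lookahead: is the next token allowed after this one?
    const next = tokens[i + 1]
    if (next && !syntax[token.type].includes(next.type)) {
      throw new Error(`syntax error: ${token.value} -> ${next.value}`)
    }
  }
  // Assumed end-of-detection check: every "(" must have been closed.
  if (bracketCount !== 0) throw new Error('syntax error: unclosed (')
  return true
}

const num = (value) => ({ type: type.TYPE_NUMBER, value })
const add = { type: type.TYPE_OPERATION_ADD, value: '+' }
console.log(checkSyntax([num(1), add, num(2)])) // true
```

Because the table only constrains pairs of adjacent tokens, it says nothing about the first and last token on their own, so an input that ends with an operator still passes this sketch.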
0x006 summary

This article has some problems:
 I cannot derive why reverse Polish notation is the right tool. I only know that such a solution exists and that I can use it; I did not arrive at it by reasoning from the problem.
 My writing is not polished enough, and the article is not as engaging as it could be.

The implementation also has some problems:
 It does not apply the ideas of compiler principles. I tried to work out the solutions on my own, practicing first and understanding the problem afterwards.
 It does not refer to many other implementations, so it feels like building a car behind closed doors.

Reflection:
 Processing of () could be done recursively: after entering (, a new round of expression parsing can be started.
 I have not thought things through enough, unit test coverage is not sufficient, and I do not know where many of the pitfalls are.
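The recursive idea from the reflection can be sketched as follows. This is a different technique from the stack-based conversion used in the article (it is usually called recursive descent), and every name in it is hypothetical. It evaluates the string directly and restarts expression parsing whenever it meets (:

```javascript
// Sketch of a recursive evaluator; assumes the input contains no whitespace.
const evaluate = (input) => {
  let pos = 0

  // A number, or a bracketed sub-expression handled by recursion.
  const parseNumberOrGroup = () => {
    if (input[pos] === '(') {
      pos++ // skip "("
      const value = parseSum() // restart expression parsing inside the brackets
      if (input[pos] !== ')') throw new Error('syntax error: missing )')
      pos++ // skip ")"
      return value
    }
    let digits = ''
    while (pos < input.length && /[0-9]/.test(input[pos])) {
      digits += input[pos]
      pos++
    }
    if (digits === '') throw new Error(`syntax error at position ${pos}`)
    return +digits
  }

  // * and / bind more tightly, so they are handled one level below + and -.
  const parseProduct = () => {
    let value = parseNumberOrGroup()
    while (input[pos] === '*' || input[pos] === '/') {
      const op = input[pos]
      pos++
      const right = parseNumberOrGroup()
      value = op === '*' ? value * right : value / right
    }
    return value
  }

  const parseSum = () => {
    let value = parseProduct()
    while (input[pos] === '+' || input[pos] === '-') {
      const op = input[pos]
      pos++
      const right = parseProduct()
      value = op === '+' ? value + right : value - right
    }
    return value
  }

  const result = parseSum()
  if (pos !== input.length) throw new Error(`syntax error at position ${pos}`)
  return result
}

console.log(evaluate('(1+2)*(3+4)')) // 21
console.log(evaluate('2*(3+(4-1))')) // 12
```

Here priority is encoded in which function calls which, so no operator stack or explicit priority table is needed.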
In short: there are still many places that are not detailed enough. Please bear with me; I hope we can exchange ideas and grow together.