Emerge
What
Emerge is a command-line tool for automatically generating lexers and parsers in Go from an EBNF description of an arbitrary grammar (language).
Since Emerge is agnostic about the input grammar (language), building the other modules of a compiler's front-end (semantic analyzer, intermediate code generator, and machine-independent code optimizer) is beyond the scope of this project.
To build the back-end of a compiler, you can tap into off-the-shelf projects such as LLVM.
Why
Currently, Go does not provide first-class support for building compilers written entirely in Go.
When it comes to lexical analysis, you can either use the built-in text/scanner package for simple use cases or build the scanner by hand for more sophisticated ones.
For the syntactic analysis part, you can use goyacc to auto-generate a parser from a Yacc file.
Since goyacc is just a port of yacc for Go, it does not play very well with the language. In particular,
- The Yacc domain-specific language (DSL) is more than 50 years old, and as a result it is hard to understand and hard to maintain.
- You need to remember many conventions to make it work with Go.
- When things go wrong, debugging the Yacc language and the generated parser takes intuition and ad-hoc solutions.
- For larger projects and projects that require collaboration, Yacc does not offer an easy modular approach.
How
Emerge responds to these challenges by reimplementing, in Go, the underlying concepts and algorithms that have long been established in the field of compiler design, with today's requirements and best practices in mind.
Similar Projects
Here is a summary of some similar projects.
- Lex is a program written in C that generates lexical analyzers (a.k.a. scanners).
- Flex is the fast lexical analyzer generator. It is a free and open-source alternative to lex also written in C.
- Yacc is a program written in C that generates a parser from a notation similar to BNF.
- Bison is the GNU implementation of Yacc. It takes an annotated context-free grammar and generates a deterministic parser.
- Goyacc is a version of Yacc for Go. It is written in Go and generates parsers in Go.
- ANTLR generates a parser that can build and walk parse trees from a grammar. It is used for reading, processing, executing, or translating structured text or binary files.
- Participle is a simple Go library for building parsers from grammars. A grammar is written as annotated Go structs that define both the grammar and the abstract syntax tree.