WACC - Compiler

January 2017

An implementation of the front end of a compiler for the WACC language: a lexer, a parser and a semantic analyser for WACC programs, which together generate an AST as the internal representation of a program's structure.

This post is quite light on details as WACC is still used as a teaching language at Imperial College London. I will be happy to answer any questions about the project, however (send me an email!).

ANTLR

ANTLR is a top-notch parser generator for reading, processing and executing structured text. It strips away much of the difficulty of writing a compiler from scratch, a difficulty I have also experienced first-hand with Flint.

Lexer

lexer grammar WACCLexer;

LineComment: '#' ~[\r\n]* -> channel(HIDDEN);

...

QUOTE_OPEN: '"' -> pushMode(String);

...

SKIP_: 'skip';
READ: 'read';
FREE: 'free';

...

mode String;
QUOTE_CLOSE: '"' -> popMode;
STRING_CHARS
	:	STRING_CHAR+
	;

...
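
The `pushMode(String)` and `popMode` commands in the excerpt above mean that text between quotes is lexed under a separate set of rules, so a `skip` or a `#` inside a string literal is neither a keyword nor a comment. A minimal hand-written sketch of that idea in Kotlin follows. It is not the lexer ANTLR generates: `Mode`, `Token`, `tokenize` and the `WORD` token type are names made up for illustration, and escape sequences are ignored.

```kotlin
// Illustrative only: a hand-rolled mode stack, not the lexer ANTLR generates.
// Mode, Token, tokenize and the WORD token type are hypothetical names.
enum class Mode { DEFAULT, STRING }

data class Token(val type: String, val text: String)

// Keyword token names taken from the grammar excerpt.
val keywords = mapOf("skip" to "SKIP_", "read" to "READ", "free" to "FREE")

fun tokenize(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    val modes = mutableListOf(Mode.DEFAULT) // the last element is the top of the stack
    var i = 0
    while (i < source.length) {
        val c = source[i]
        if (modes.last() == Mode.STRING) {
            if (c == '"') {
                tokens += Token("QUOTE_CLOSE", "\"")
                modes.removeAt(modes.lastIndex) // popMode
                i++
            } else {
                // Everything up to the closing quote is string content (escapes ignored here).
                val start = i
                while (i < source.length && source[i] != '"') i++
                tokens += Token("STRING_CHARS", source.substring(start, i))
            }
        } else if (c == '"') {
            tokens += Token("QUOTE_OPEN", "\"")
            modes.add(Mode.STRING) // pushMode(String)
            i++
        } else if (c == '#') {
            // Line comment: consumed up to the end of the line, no token emitted.
            while (i < source.length && source[i] != '\n' && source[i] != '\r') i++
        } else if (c.isWhitespace()) {
            i++
        } else {
            val start = i
            while (i < source.length && !source[i].isWhitespace() && source[i] !in "\"#") i++
            val text = source.substring(start, i)
            tokens += Token(keywords[text] ?: "WORD", text)
        }
    }
    return tokens
}
```

Calling `tokenize` on `read "skip # x" # trailing` yields a `READ` token, then `QUOTE_OPEN`, a single `STRING_CHARS` token containing `skip # x`, and `QUOTE_CLOSE`; the trailing comment produces no token.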

Parser

parser grammar WACCParser;

options {
  tokenVocab=WACCLexer;
}

program: BEGIN function* statement END EOF;

// Functions
function
  : type IDENTIFIER LPAREN parameterList? RPAREN IS statement END
  ;

...

Kotlin Compiler

I wrote the compiler in Kotlin, a JVM language that is very similar to Java. I was going through a bit of a functional-programming craze at the time and missed the terseness of Haskell. Kotlin gave me some of that back, and it would be my tool of choice on the JVM.
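
To give a feel for the Haskell-like terseness, here is a small sketch of how sealed classes and an exhaustive `when` can express a type check over expression nodes. The class names echo file names in the project (`IntLiteral`, `BoolLiteral`, `BinaryExpression`), but their fields, the `Operator` and `Type` enums, `SemanticError` and `typeOf` are assumptions made for illustration, not the project's actual code.

```kotlin
// Hypothetical shapes: only the class names echo files in the project.
enum class Type { INT, BOOL }
enum class Operator { PLUS, AND, LESS_THAN }

sealed class Expression
data class IntLiteral(val value: Int) : Expression()
data class BoolLiteral(val value: Boolean) : Expression()
data class BinaryExpression(
    val lhs: Expression,
    val op: Operator,
    val rhs: Expression
) : Expression()

class SemanticError(message: String) : Exception(message)

// Computes the type of an expression, throwing on a mismatch.
// The compiler checks that both `when`s cover every case.
fun typeOf(e: Expression): Type = when (e) {
    is IntLiteral -> Type.INT
    is BoolLiteral -> Type.BOOL
    is BinaryExpression -> {
        // Each operator maps to (expected operand type, result type).
        val (operand, result) = when (e.op) {
            Operator.PLUS -> Type.INT to Type.INT
            Operator.AND -> Type.BOOL to Type.BOOL
            Operator.LESS_THAN -> Type.INT to Type.BOOL
        }
        if (typeOf(e.lhs) != operand || typeOf(e.rhs) != operand) {
            throw SemanticError("operator ${e.op} expects $operand operands")
        }
        result
    }
}
```

With these definitions, `typeOf(BinaryExpression(IntLiteral(1), Operator.LESS_THAN, IntLiteral(2)))` evaluates to `Type.BOOL`, while applying `AND` to an `IntLiteral` throws a `SemanticError`.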

.
├── asm
│   ├── ASMContext.kt
│   ├── Generatable.kt
│   ├── RegisterSet.kt
│   ├── Stack.kt
│   └── value
│       ├── ASMValue.kt
│       ├── MemoryValue.kt
│       └── StackValue.kt
├── ast
│   ├── ASTVisitor.kt
│   ├── expression
│   │   ├── BinaryExpression.kt
│   │   ├── CallExpression.kt
│   │   ├── Expression.kt
│   │   ├── ExpressionVisitor.kt
│   │   ├── NewPairExpression.kt
│   │   ├── TypedOperator.kt
│   │   └── UnaryExpression.kt
│   ├── function
│   │   ├── Function.kt
│   │   └── FunctionVisitor.kt
│   ├── identifier
│   │   ├── Identifier.kt
│   │   └── IdentifierVisitor.kt
│   ├── literal
│   │   ├── ArrayLiteral.kt
│   │   ├── BoolLiteral.kt
│   │   ├── CharLiteral.kt
│   │   ├── IntLiteral.kt
│   │   ├── Literal.kt
│   │   ├── LiteralVisitor.kt
│   │   ├── PairLiteral.kt
│   │   ├── SpecialChar.kt
│   │   └── StringLiteral.kt
│   ├── Node.kt
│   ├── Program.kt
│   ├── reference
│   │   ├── ArrayReference.kt
│   │   ├── PairReference.kt
│   │   ├── Reference.kt
│   │   └── ReferenceVisitor.kt
│   ├── statement
│   │   ├── AssignmentStatement.kt
│   │   ├── BeginStatement.kt
│   │   ├── BlockStatement.kt
│   │   ├── DeclarationStatement.kt
│   │   ├── ExitStatement.kt
│   │   ├── FreeStatement.kt
│   │   ├── IfStatement.kt
│   │   ├── PrintStatement.kt
│   │   ├── ReadStatement.kt
│   │   ├── ReturnStatement.kt
│   │   ├── SkipStatement.kt
│   │   ├── Statement.kt
│   │   ├── StatementVisitor.kt
│   │   └── WhileStatement.kt
│   ├── tostringutilities
│   │   ├── Indentable.kt
│   │   └── ToStringUtilities.kt
│   └── type
│       ├── ArrayType.kt
│       ├── BaseType.kt
│       ├── GenericArrayType.kt
│       ├── GenericPairType.kt
│       ├── NestedPairType.kt
│       ├── PairType.kt
│       ├── TypeComparable.kt
│       └── Type.kt
├── ASTBuilder.kt
├── CompilerArguments.kt
├── Compiler.kt
├── diagnostic
│   ├── ColourUtilities.kt
│   ├── Diagnostic.kt
│   ├── Locatable.kt
│   └── SourceLocation.kt
├── errorlistener
│   ├── DiagnosticErrorListener.kt
│   └── ErrorListener.kt
├── ir
│   ├── builder
│   │   ├── BlockBuilder.kt
│   │   ├── FunctionBuilder.kt
│   │   └── ModuleBuilder.kt
│   ├── global
│   │   ├── IRGlobal.kt
│   │   ├── StringGlobal.kt
│   │   ├── TypeGlobal.kt
│   │   └── VectorGlobal.kt
│   ├── instruction
│   │   ├── AllocaInstruction.kt
│   │   ├── BinaryInstruction.kt
│   │   ├── BranchInstruction.kt
│   │   ├── CallInstruction.kt
│   │   ├── GetElementPointerInstruction.kt
│   │   ├── IRInstruction.kt
│   │   ├── IROperator.kt
│   │   ├── LoadInstruction.kt
│   │   ├── ReturnInstruction.kt
│   │   ├── StoreInstruction.kt
│   │   ├── TemporaryValueInstruction.kt
│   │   ├── UnconditionalBranchInstruction.kt
│   │   └── UnreachableInstruction.kt
│   ├── IRBlock.kt
│   ├── IRFunction.kt
│   ├── IRModule.kt
│   ├── type
│   │   ├── BasicType.kt
│   │   ├── IRType.kt
│   │   ├── PointerType.kt
│   │   ├── RegisterType.kt
│   │   ├── StructType.kt
│   │   └── VectorType.kt
│   └── value
│       ├── ConstantIntegerValue.kt
│       ├── GlobalRefValue.kt
│       ├── IRValue.kt
│       ├── LocalRefValue.kt
│       ├── NullValue.kt
│       └── RegisterContext.kt
├── irgen
│   ├── IRContext.kt
│   ├── IRExpressionGenerator.kt
│   ├── IRFunctionGenerator.kt
│   ├── IRGenerator.kt
│   ├── IRIdentifierGenerator.kt
│   ├── IRLiteralGenerator.kt
│   ├── IRReferenceGenerator.kt
│   └── IRStatementGenerator.kt
├── scope
│   ├── GenericScopeTree.kt
│   ├── ProgramScopeTree.kt
│   └── ScopeTree.kt
└── semantic
    ├── ExpressionSemantic.kt
    ├── FunctionSemantic.kt
    ├── IdentifierSemantic.kt
    ├── LiteralSemantic.kt
    ├── ReferenceSemantic.kt
    ├── SemanticChecker.kt
    ├── SemanticContext.kt
    └── StatementSemantic.kt

22 directories, 121 files
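
The `scope` package in the listing suggests a tree of nested scopes that the semantic checks consult. The project's `ScopeTree.kt` is not shown here, so the following is only a guess at the general technique: each scope holds its own symbols and falls back to its parent on lookup. The `declare`, `lookup` and `child` members are hypothetical.

```kotlin
// Hypothetical sketch of a scope tree; not the project's ScopeTree.kt.
class ScopeTree<T : Any>(private val parent: ScopeTree<T>? = null) {
    private val symbols = mutableMapOf<String, T>()

    // Declares `name` in this scope. Returns false if it already exists here,
    // which a semantic checker could report as a redeclaration.
    fun declare(name: String, value: T): Boolean {
        if (name in symbols) return false
        symbols[name] = value
        return true
    }

    // Looks `name` up in this scope first, then in each enclosing scope.
    fun lookup(name: String): T? = symbols[name] ?: parent?.lookup(name)

    // Opens a nested scope, for example for the body of a `while` or `if`.
    fun child(): ScopeTree<T> = ScopeTree(this)
}
```

In this sketch an inner scope may shadow a name from an outer one: after declaring `x` in both, `lookup("x")` on the inner scope returns the inner value, and the outer scope is unaffected.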