This program is written for Linux systems. The easiest way to compile it is to run make with the included makefiles.
To run any program, use ./c(pp)lox path/to/program_name
To start the built-in REPL, run ./c(pp)lox with no arguments.
Lox is a language invented by Robert Nystrom for his book. It is mostly an imperative language with first-class functions (they can be assigned to variables dynamically). I have also added simple lists and mapping over them, inspired by my recent fascination with functional programming as seen in Haskell.
Cpplox is a port of jlox from the book - a tree-walk interpreter that evaluates the abstract syntax tree directly. While slower than a bytecode virtual machine, I find it a beautiful way to illustrate the process of translating a source file into a running program.
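To illustrate the tree-walk idea, here is a minimal sketch of an evaluator over a hand-built syntax tree. It assumes a std::variant-based Value like the one mentioned in the feature list; the names Literal, Binary, Evaluator, literal, binary and evaluate are hypothetical and are not taken from the cpplox sources.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical runtime value: nil, boolean, number or string.
using Value = std::variant<std::monostate, bool, double, std::string>;

struct Expr;  // forward declaration so nodes can own child expressions
using ExprPtr = std::unique_ptr<Expr>;

struct Literal { Value value; };
struct Binary  { char op; ExprPtr left; ExprPtr right; };

// An expression is exactly one of the node kinds above.
struct Expr { std::variant<Literal, Binary> node; };

// Helpers to build a tree by hand (a parser would normally do this).
ExprPtr literal(Value v) {
    return std::make_unique<Expr>(Expr{Literal{std::move(v)}});
}
ExprPtr binary(char op, ExprPtr l, ExprPtr r) {
    return std::make_unique<Expr>(Expr{Binary{op, std::move(l), std::move(r)}});
}

Value evaluate(const Expr& expr);

// The "visitor": one overload per node type, dispatched by std::visit.
struct Evaluator {
    Value operator()(const Literal& lit) const { return lit.value; }

    Value operator()(const Binary& bin) const {
        // Walk the tree: evaluate both children first, then combine.
        Value lhs = evaluate(*bin.left);
        Value rhs = evaluate(*bin.right);
        const double* a = std::get_if<double>(&lhs);
        const double* b = std::get_if<double>(&rhs);
        if (!a || !b) throw std::runtime_error("Operands must be numbers.");
        switch (bin.op) {
            case '+': return Value(*a + *b);
            case '-': return Value(*a - *b);
            case '*': return Value(*a * *b);
            case '/': return Value(*a / *b);
        }
        throw std::runtime_error("Unknown operator.");
    }
};

Value evaluate(const Expr& expr) {
    return std::visit(Evaluator{}, expr.node);
}
```

Evaluating the tree built by binary('*', binary('+', literal(1.0), literal(2.0)), literal(3.0)) yields a Value holding the number 9. Each recursive evaluate call re-dispatches on the node type, which is the main reason this style tends to be slower than running bytecode.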
Clox follows the second part of the book - a bytecode virtual machine interpreter, written in C. While harder to understand from reading the source code alone, this approach leaves much more room for optimization. I found the NaN boxing technique particularly fascinating, as it fits the whole Value representation inside a single double. Storing other value types in the unused mantissa bits of NaN representations is an enlightening idea.
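As a rough illustration of NaN boxing, the sketch below packs numbers, nil, booleans and pointers into one 64-bit word. The bit layout follows the one described in Crafting Interpreters; the function names are hypothetical, the constants used in this repository may differ, and the pointer encoding assumes that addresses fit in the low 50 bits, which holds on typical x86-64 Linux systems.

```cpp
#include <cstdint>
#include <cstring>

using Value = std::uint64_t;
static_assert(sizeof(Value) == sizeof(double), "a double must fit in a Value");

// Exponent bits all set plus the two top mantissa bits: a quiet NaN
// pattern that ordinary floating-point arithmetic does not produce.
constexpr Value QNAN     = 0x7ffc000000000000ULL;
constexpr Value SIGN_BIT = 0x8000000000000000ULL;

// Small tags stored in the low mantissa bits of the NaN.
constexpr Value NIL_VAL   = QNAN | 1;
constexpr Value FALSE_VAL = QNAN | 2;
constexpr Value TRUE_VAL  = QNAN | 3;

// Numbers are stored as their raw IEEE 754 bits.
inline Value number_to_value(double d) {
    Value v;
    std::memcpy(&v, &d, sizeof v);
    return v;
}
inline double value_to_number(Value v) {
    double d;
    std::memcpy(&d, &v, sizeof d);
    return d;
}

// Anything that does not carry all QNAN bits is an ordinary double.
inline bool is_number(Value v) { return (v & QNAN) != QNAN; }
inline bool is_nil(Value v)    { return v == NIL_VAL; }
// FALSE_VAL and TRUE_VAL differ only in the lowest bit.
inline bool is_bool(Value v)   { return (v | 1) == TRUE_VAL; }

inline Value bool_to_value(bool b) { return b ? TRUE_VAL : FALSE_VAL; }
inline bool  value_to_bool(Value v) { return v == TRUE_VAL; }

// Pointers: sign bit + QNAN mark the value as an object, and the
// address lives in the remaining low bits (assumption: it fits there).
inline bool is_obj(Value v) {
    return (v & (QNAN | SIGN_BIT)) == (QNAN | SIGN_BIT);
}
inline Value obj_to_value(const void* p) {
    return SIGN_BIT | QNAN |
           static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p));
}
inline void* value_to_obj(Value v) {
    return reinterpret_cast<void*>(
        static_cast<std::uintptr_t>(v & ~(SIGN_BIT | QNAN)));
}
```

Type checks become a mask and a compare, and numbers need no unwrapping beyond a bit copy, which is where the speed and size benefit comes from.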
Besides following the book, I have expanded the project with the following features:
- translation from Java to C++, using modern features such as std::variant to represent a generic Value, as well as a custom Visitor pattern implementation to mimic the Java Abstract Syntax Tree flow.
- modulo and bitwise operations: % (mod), & (AND), | (OR), ^ (XOR)
- C-like ternary operator: Expression ? Statement (True branch) : Statement (Otherwise branch)
- C-style increment (++) and decrement (--): both the prefix and postfix variety (prefix updates the variable and returns the new value; postfix returns the old value, then updates the variable)
- C-like compound assignment operations: +=, -=, *=, %= for ease of use
- native printf: variadic, taking any number of arguments, using the C-style %(type) format syntax
- lists, typical of functional languages:
list = [a,b,c,..] - allowing for ease of multiple data assignment
- fmap: applies a given function to all elements of a list, with the syntax: fmap(list, function);
- subscripts:
list[index] - both read and overwrite operation
string[index] - read only (strings are literals due to memory optimization)
- constant folding (peephole optimization): precomputing const expressions at compile time to shorten runtime.
- IO operations: open and write to a file
- modules: allow for multi-file programs using the export and import keywords.
- hash maps:
allowing for easy data storage and retrieval like:
var hero = { "name": "Conan", "hp": 100, 1: "The number one" };
var health = hero["hp"];
- simple debugging:
triggered through native code function debugger(), stops execution and allows for user actions:
[s]tep, [c]ontinue, [v]iew stack, [q]uit
- a small test suite: showcasing some of the basic functionalities of this language
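The constant folding item in the list above can be sketched as a peephole step in a single-pass bytecode compiler: when a binary operator is about to be emitted and the two most recent instructions are constant loads, the compiler replaces them with one load of the precomputed result. This is a minimal sketch under assumptions; Chunk, emit_constant, emit_binary and the opcode names are hypothetical and only loosely mirror clox, and constant indices are assumed to fit in one byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum OpCode : std::uint8_t { OP_CONSTANT, OP_ADD, OP_MULTIPLY };

struct Chunk {
    std::vector<std::uint8_t> code;   // opcode bytes followed by operands
    std::vector<double> constants;    // constant pool
    std::vector<std::size_t> starts;  // offset where each instruction begins
};

// Emits OP_CONSTANT followed by a one-byte constant pool index.
void emit_constant(Chunk& c, double value) {
    c.starts.push_back(c.code.size());
    c.constants.push_back(value);
    c.code.push_back(OP_CONSTANT);
    c.code.push_back(static_cast<std::uint8_t>(c.constants.size() - 1));
}

// If the instruction starting at `start` is a constant load,
// stores the loaded value in `out` and returns true.
static bool constant_at(const Chunk& c, std::size_t start, double& out) {
    if (c.code[start] != OP_CONSTANT) return false;
    out = c.constants[c.code[start + 1]];
    return true;
}

// Emits a binary operator, folding it when both operands are constants.
void emit_binary(Chunk& c, OpCode op) {
    const std::size_t n = c.starts.size();
    double a = 0.0, b = 0.0;
    if (n >= 2 && constant_at(c, c.starts[n - 2], a) &&
                  constant_at(c, c.starts[n - 1], b)) {
        // Drop both loads and emit the precomputed result instead.
        // The two old pool entries are simply left unused in this sketch.
        c.code.resize(c.starts[n - 2]);
        c.starts.resize(n - 2);
        emit_constant(c, op == OP_ADD ? a + b : a * b);
        return;
    }
    c.starts.push_back(c.code.size());
    c.code.push_back(op);
}
```

For the expression 1 + 2 * 3, a compiler would emit three constant loads, then OP_MULTIPLY, then OP_ADD; with this step the chunk ends up holding a single load of 7. Folding like this is only safe when no jump can target the instructions being removed, which holds for operands inside a single expression.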
While writing a language can be an endless task, beyond adding new syntax I have a few ideas for further areas to explore:
- Alternative bytecodes (using 3-4 bytes per instruction instead of the current 1-2) - see Lua: https://www.lua.org/doc/jucs05.pdf
- Other parsers besides the current Pratt parser implementation
- Further optimization (like using SIMD)
- More advanced garbage collection techniques
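For the alternative bytecode idea in the list above, a wider instruction could pack an opcode and three operands into one 32-bit word. The sketch below is loosely modelled on the format described in the linked Lua paper; the exact field widths and positions (6-bit opcode, 8-bit A, 9-bit C, 9-bit B), as well as every name here, are assumptions for illustration and not an existing part of clox.

```cpp
#include <cstdint>

using Instruction = std::uint32_t;

// Assumed layout, from the lowest bit: opcode (6), A (8), C (9), B (9).
constexpr Instruction encode(unsigned op, unsigned a, unsigned b, unsigned c) {
    return (op & 0x3Fu) | ((a & 0xFFu) << 6) |
           ((c & 0x1FFu) << 14) | ((b & 0x1FFu) << 23);
}

constexpr unsigned opcode(Instruction i) { return i & 0x3Fu; }
constexpr unsigned arg_a(Instruction i)  { return (i >> 6) & 0xFFu; }
constexpr unsigned arg_c(Instruction i)  { return (i >> 14) & 0x1FFu; }
constexpr unsigned arg_b(Instruction i)  { return (i >> 23) & 0x1FFu; }

enum Op : unsigned { OP_ADD = 0 };

// Register-style execution: OP_ADD computes regs[A] = regs[B] + regs[C]
// in one instruction, with no separate push and pop of operands.
inline void execute(Instruction i, double* regs) {
    if (opcode(i) == OP_ADD) {
        regs[arg_a(i)] = regs[arg_b(i)] + regs[arg_c(i)];
    }
}
```

The trade-off is fewer, larger instructions: decoding costs a few shifts and masks, but one instruction can replace several stack operations.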
- https://craftinginterpreters.com/ - the aforementioned book
- https://gameprogrammingpatterns.com/ - another book by Nystrom, with useful patterns explained through the lens of game dev