The code to read from the GOAL REPL is in `Compiler.cpp`, in `Compiler::execute_repl`. Compiling an `asm-file` form will call `Compiler::compile_asm_file` in `CompilerControl.cpp`, which is where we'll start.
I've divided the process into these steps:
1.__Read__: Convert from text to a representation of Lisp syntax.
2.__IR-Pass__: Convert the S-Expressions to an intermediate representation (IR).
3.__Register Allocation__: Map variables in the IR (`IRegister`s) to real hardware registers
4.__Code Generation__: Generate x86-64 instructions from the IR
5.__Object File Generation__: Put the instructions and static data in an object file and generate linking data
6.__Sending__: Send the code to the runtime
7.__Linking__: The runtime links the code so it can be run, and runs it.
8.__Result__: The code prints the message which is sent back to the REPL.
The reader converts GOAL/GOOS source into a `goos::Object`. One of the core ideas of lisp is that "code is data", so GOAL code is represented as GOOS data. This makes it easy for GOOS macros to operate on GOAL code. A GOOS object can represent a number, string, pair, etc. This strips out comments/whitespace.
(top-level (defun factorial-iterative ((x integer)) (let ((result 1)) (while (!= x 1) (set! result (* result x)) (set! x (- x 1))) result)) (define-extern _format function) (define format _format) (let ((x 10)) (format #t "The value of ~D factorial is ~D~%" x (factorial-iterative x))))
```
There are a few details worth mentioning about this process:
- The reader will expand `'my-symbol` to `(quote my-symbol)`
- Using `read_from_file` adds information about where each thing came from to a map stored in the reader. This map is used to determine the source file/line for compiler errors.
This pass converts code (represented as a `goos::Object`) into intermediate representation. This is stored in an `Env*`, a tree structure. At the top is a `GlobalEnv*`, then an `FileEnv` for each file compiled, then a `FunctionEnv` for each function in the file. There are environments within `FunctionEnv` that are used for lexical scoping and the other types of GOAL scopes. The Intermediate Representation (IR) is a list per function that's built up in order as the compiler goes through the function. Note that the IR is a list of instructions and doesn't have a tree or more complicated structure. Here's an example of the IR for the example function:
The `function-top-level` is the "top level" function, which is everything not in a function. In the example, this is just defining the function, defining `format`, and calling `format`.
You'll notice that there are a ton of `mov`s between `igpr`. The compiler inserts tons of moves. Because this is all done in a single pass, there's a lot of cases where the compiler can't know if a move is needed or not. But the register allocator can figure it out and will remove most unneeded moves. Adding moves can also prevent stack spills. For example, consider the case where you want to get the return value of function `a`, then call function `b`, then call function `c` with the return value of function `a`. If there are a lot of moves, the register allocator can figure out a way to temporarily stash the value in a saved register instead of spilling to the stack.
Another thing to notice is that GOAL nested function calls suck. Example:
requires loading `format`, `#t`, the string, and `x` into registers, then calling `(factorial-iterative x)`, then calling `format`. This has to be done this way, just in case the `factorial-iterative` call modifies the value of `format`, `#t`, or `x`.
An important type in the compiler is `Val`, which is a specification on how to get a value. A `Val` has an associated GOAL type (`TypeSpec`) and the IR Pass should take care of all type checking. A `Val` can represent a constant, a memory location (relative to pointer, or static data, etc), a spot in an array, a register etc. A `Val` representing a register is a `RegVal`, which contains an `IRegister`. An `IRegister` is a register that's not yet mapped to the hardware, and instead has a unique integer to identify itself. The IR assumes there are infinitely many `IRegister`s, and a later stage maps `IRegister`s to real hardware registers.
The general process starts with a `compile` function, which dispatches other `compile_<thing>` functions as needed. These generally take in `goos::Object` as a code input, emit IR into an `Env`/or modify things in the `Env`, and return a `Val*` describing the result.
This shouldn't return a `Val` for a register containing the value of `my_object->my_field`, but should instead return something that represents "the memory location of my_field in my_object". This way you can do
If the compiler actually needs the value of something, and wants to be sure its a value in a register, it will use the `to_reg` method of `Val`. This will emit IR into the current function to get the value in a register, then return a `Val` that represents this register. Example `to_reg` implementation for integer constants:
Where `x` should be the old value of `my-field`. The `Val` for `x` needs to be `to_reg`ed _before_ getting inside the `let`. There's also some potential confusion around the order that you compile and `to_gpr` things. In a case where you need a bunch of values in gprs, you should do the `to_gpr` immediately after compiling to match the exact behavior of the original GOAL. For example
When we `compile` the `(-> my-array (* x y))`, it will emit code to calculate the `(*x y)`, but won't actually do the memory access until we call `to_reg` on the result. This memory access should happen __before__`some-function` is called.
In general, each time you `compile` something, you should immediately `to_gpr` it, _before_`compile`ing the next thing. Many places will only accept a `RegVal` as an input to help with this. Also, the result for almost all compilation functions should be `to_reg`ed. The only exceptions are forms which deal with memory references (address of operator, dereference operator) or math.
Another important thing is that compilation functions should _never_ modify any existing `IRegister`s or `Val`s, unless that function is `set!`, which handles the situation correctly. Instead, create new `IRegister`s and move into those. I am planning to implement a `settable` flag to help reduce errors.
auto result = compile_error_guard(code, fe.get());
```
The `compile_error_guard` function takes in code (as a `goos::Object`) and a `Env*`, and returns a `Val` representing the return value of the code. It calls the `compile` function, but wraps it in a `try catch` block to catch any compilation errors and print an error message. In the case where there's no error, it just does:
In our case, the code starts with `(defun..`, which is actually a GOOS macro. It throws away the docstring, creates a lambda, then stores the function in a symbol:
As an example, I'm going to look at `compile_add`, which handles the `+` form, and is representative of typical compiler code for this part. We start by checking that the arguments look valid:
In the integer case, we first create a new variable in the IR called an `IRegister` that must be in a GPR (as opposed to an XMM floating point register), and then emit an IR instruction that sets this result register to the first argument.
which emits an IR to add the value to the sum. The `to_math_type` will emit any code needed to convert this to the correct numeric type (returns either a numeric constant or a `RegVal` containing the value).
An important detail is that we create a new register which will hold the result. This may seem inefficient in cases, but a later compile pass will try to make this new register be the same register as `first_val` if possible, and will eliminate the `IR_RegSet`.
This step figures out how to match up `IRegister`s to real hardware registers. In the case where there aren't enough hardware registers, it figures out how to "spill" variables onto the stack. The current implementation is a very greedy one, so it doesn't always succeed at doing things perfectly. The stack spilling is also not handled very efficiently, but the hope is that most functions won't require stack spilling.
This step is run from `compile_asm_function` on the line:
The actual algorithm is too complicated to describe here, but it figures out a mapping from `IRegister`s to hardware registers. It also figures out how much space on the stack is needed for any stack spills, which saved registers will be used, and deals with aligning the stack.
This part actually generates the static data and x86 instructions and stores them in an `ObjectGenerator`. See `CodeGenerator::do_function`. It emits the function prologue and epilogue, as well as any extra loads/stores from the stack that the register allocator added. Each `IR` gets to emit instructions with:
Each IR has its own `do_codegen` that emits the right instruction, and also any linking data that's needed. For example, instructions that access the symbol table are patched by the runtime to directly access the correct slot of the hash table, so the `do_codegen` also lets the `ObjectGenerator` know about this link:
here `0xbadbeef` is used as a placeholder offset - the runtime should patch this to the actual offset of the symbol.
There's a ton of book-keeping to figure out the correct offsets for `rip`-relative addressing, or how to deal with jumps to/from IR which become multiple (or zero!) x86-64 instructions. It should all be handled by `ObjectFileGenerator`, and not in `do_codegen` or `CodeGenerator`.
This actually lays out everything in memory. It takes a few passes because x86 instructions are variable length (may even change based on which registers are used!), so it's a little bit tricky to figure out offsets between different instructions or instructions and data. Finally it generates link data tables, which efficiently pack together links to the same symbols into a single entry, to avoid duplicated symbol names. The link table also contains information about linking references in between different segments, as different parts of the object file may be loaded into different spots in memory, and will need to reference each other.
which adds a message header then just sends the code over the socket.
The receive process is complicated, so this is just a quick summary
-`Deci2Server` receives it
- calls `GoalProtoHandler` in chunks, which stores it in a buffer (`MessBuffArea`)
- Once fully received, `WaitForMessageAndAck` will return a pointer to the message. The name of this function is totally wrong, it doesn't wait for a message, and it doesn't ack the message.
- The main loop in `KernelCheckAndDispatch` will see this message and run `ProcessListenerMessage`
-`ProcessListenerMessage` sees that it has code, copies the message to the debug heap, and links it.
- The `link_and_exec` function doesn't actually execute anything because it doesn't have the `LINK_FLAG_EXECUTE` set, it just links things. It moves the top level function and linking data to the top of the heap (temporary storage for the kernel) and keep both the main segment and debug segment of the code on the debug heap. It'll move them together and eliminate gaps before linking. After linking, the `ListenerFunction->value` will contain a pointer to the top level function, which is stored in the top temp area of the heap. This `ListenerFunction` is the GOAL `*listener-function*` symbol.
- The next time the GOAL kernel runs, it will notice that `*listener-function*` is set, then call this function, then set it to `#f` to indicate it called the function.
- After this, `ClearPending()` is called, which sends all of the `print` messages with the `Deci2Server` back to the compiler.
- Because the GOAL kernel changed `ListenerFunction` to `#f`, it does a `SendAck()` to send a special `ACK` message to the compiler, saying "I got the function, ran it, and didn't crash. Now I'm ready for more messages."