V2 Compiler Passes
- Pass 0: Lexical analysis
- Pass 1: Local variable identification and macroexpansion
- Pass 2: TBD
Pass 0 is done by the Lisp reader. 'Nuff said.
For Pass 1, it is hard to go further without knowing the code to be compiled, which involves macroexpansion to all levels. Examining a form `(op arg1 arg2 ...)`, there are several possibilities:
- `op` is a special form
- `op` signifies a macro
- `op` is of the form `.name`
- `op` is a symbol of the form `ns/name`, where `ns` is an existing `Namespace` or an alias for one in the `CurrentNamespace`, and `ns` names a type
- `op` is of the form `name.`
- otherwise
The result of macroexpansion is, respectively:
- the original form itself (no macroexpansion)
- the result of calling the `Var` identified by `op` on the entire form, the local variable environment, and the `next` of the form
- `(. arg1 name arg2 ...)` (host expression, either a static or a virtual method call, depending on whether `arg1` names a type)
- `(. ns name arg1 arg2 ...)` (static method)
- `(new name arg1 arg2 ...)` (new expression)
- the original form itself (no macroexpansion)
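The case analysis above can be sketched as a single expansion step. This is only an illustration, not the compiler's code: forms are modeled as Python tuples, symbols as plain strings, and the three hooks (`is_special`, `lookup_macro`, `names_static_member`) are hypothetical callbacks supplied by the caller. The `.name` case assumes Clojure's usual convention that the first argument becomes the target of the host expression.

```python
def macroexpand_1(form, env, *, is_special, lookup_macro, names_static_member):
    """Perform one expansion step on a form (op arg1 arg2 ...).

    Hypothetical hooks, all supplied by the caller:
      is_special(op)          -> bool
      lookup_macro(op, env)   -> callable or None
      names_static_member(ns) -> bool
    """
    if not isinstance(form, tuple) or not form:
        return form                        # atoms and empty forms never expand
    op, args = form[0], form[1:]

    if is_special(op):
        return form                        # special form: no macroexpansion

    macro = lookup_macro(op, env)
    if macro is not None:
        # Macro: call it on the whole form, the local environment,
        # and the rest of the form spread as arguments.
        return macro(form, env, *args)

    if isinstance(op, str):
        if op.startswith(".") and len(op) > 1:
            if not args:
                raise ValueError("(.name) needs a target argument")
            # (.name target a b) becomes (. target name a b)
            return (".", args[0], op[1:], *args[1:])
        ns, sep, name = op.partition("/")
        if sep and ns and name and names_static_member(ns):
            # (ns/name a b) becomes (. ns name a b)
            return (".", ns, name, *args)
        if op.endswith(".") and len(op) > 1:
            # (name. a b) becomes (new name a b)
            return ("new", op[:-1], *args)

    return form                            # otherwise: unchanged
```

Returning the original form unchanged in the first and last cases lets a caller detect "nothing left to expand" with a simple equality check.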
Determining whether `op` is a macro:
- `op` is a `Symbol` and not a local variable
- `op` is a `Var` or a `Symbol` naming a `Var`, the `Var` is marked as a macro and is not marked as private
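A minimal sketch of that check follows, under stated assumptions: symbols are modeled as strings, `Var` here is a hypothetical stand-in dataclass (not the runtime's `Var`), and `resolve` is a hypothetical lookup from a symbol to a `Var` or `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Var:
    """Hypothetical stand-in for a Var, carrying only the flags the check needs."""
    name: str
    fn: Optional[Callable] = None
    is_macro: bool = False
    is_private: bool = False


def macro_var(op, local_names, resolve):
    """Return the Var that makes op a macro call, or None if op is not a macro.

    local_names: collection of symbols bound as locals at this point.
    resolve:     hypothetical function mapping a symbol to a Var or None.
    """
    if isinstance(op, str):
        if op in local_names:
            return None          # a local variable shadows any macro of that name
        var = resolve(op)
    elif isinstance(op, Var):
        var = op
    else:
        return None              # neither a symbol nor a Var
    if var is not None and var.is_macro and not var.is_private:
        return var
    return None
```

Checking the local environment first is what ties macro detection to scope tracking: the same symbol can be a macro call in one place and an ordinary invocation of a local in another.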
By inspection, macroexpansion and the identification of local variable scopes are intertwined. This pass must walk the form being compiled, keeping track of the local variable environment and macroexpanding along the way.
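To make the interleaving concrete, here is a rough sketch of such a walk. It assumes forms are tuples and symbols are strings, that `expand_1` is a hypothetical one-step expander taking the form and the current set of local names, and that only `quote`, `let*`, and single-arity `fn*` need special treatment. A real pass would have to handle every special form.

```python
def expand_all(form, local_names, expand_1):
    """Fully expand form while tracking which symbols are local at each point.

    expand_1(form, local_names) is a hypothetical one-step expander that
    returns its input unchanged when there is nothing to expand.
    """
    if not isinstance(form, tuple) or not form:
        return form
    # Expand the head position until it stops changing.
    while True:
        expanded = expand_1(form, local_names)
        if expanded is form or expanded == form:
            break
        form = expanded
        if not isinstance(form, tuple) or not form:
            return form

    op = form[0]
    if op == "quote":
        return form                                  # quoted data is not code
    if op == "let*":
        # (let* (name1 init1 name2 init2 ...) body ...)
        bindings, body = form[1], form[2:]
        scope = set(local_names)
        new_bindings = []
        for name, init in zip(bindings[0::2], bindings[1::2]):
            # Each init sees only the names bound before it.
            new_bindings += [name, expand_all(init, frozenset(scope), expand_1)]
            scope.add(name)
        scope = frozenset(scope)
        return ("let*", tuple(new_bindings),
                *(expand_all(b, scope, expand_1) for b in body))
    if op == "fn*":
        # (fn* (param ...) body ...), single arity only in this sketch
        params, body = form[1], form[2:]
        scope = frozenset(local_names) | frozenset(params)
        return ("fn*", params, *(expand_all(b, scope, expand_1) for b in body))
    # Any other form: expand each subform in the current scope.
    return tuple(expand_all(f, local_names, expand_1) for f in form)
```

The scope is passed down as an immutable set, so sibling forms cannot leak bindings into each other.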
Output TBD: could be a simple structure with local variable introduction nodes and expression nodes and leave it at that. Or, one could go to the gross level of analysis done by `Compiler.analyze` and bottom out with `SymbolExpr`, `KeywordExpr`, etc.
With the code expanded, types can be chased throughout the tree: user type tags, type info on `Var`'d `IFn`s, and flow through interop calls. Likely this will include identifying all known flows of value-type values. We should add a boxing node type to the AST to mark explicitly where value types get boxed.
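As one possible shape for the boxing idea, the sketch below inserts an explicit `Box` node wherever a value-typed argument flows into a parameter typed `object`. Everything here is hypothetical: the node classes, the string type names, and the `VALUE_TYPES` set are illustrative stand-ins, not a proposed AST.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical set of type names treated as value types in this sketch.
VALUE_TYPES = {"int64", "double", "bool"}


@dataclass(frozen=True)
class Const:
    value: object
    type: str


@dataclass(frozen=True)
class Box:
    """Marks the point where a value type is boxed to object."""
    expr: object
    type: str = "object"


@dataclass(frozen=True)
class Call:
    target: str
    param_types: Tuple[str, ...]
    args: tuple
    type: str


def insert_boxes(node):
    """Return a copy of the tree with Box nodes at value-type-to-object boundaries."""
    if isinstance(node, Call):
        new_args = []
        for arg, wanted in zip(node.args, node.param_types):
            arg = insert_boxes(arg)                  # process nested calls first
            if arg.type in VALUE_TYPES and wanted == "object":
                arg = Box(arg)                       # make the boxing explicit
            new_args.append(arg)
        return Call(node.target, node.param_types, tuple(new_args), node.type)
    return node                                      # leaves are returned as-is
```

With boxing explicit in the tree, a later pass can count or eliminate `Box` nodes instead of rediscovering them during code generation.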
For the remaining (non-interop) nodes `(fn arg1 arg2 ...)`, identify the invocation type: regular, static, prim, etc. This might need to be combined with Pass 2 above.
- Identification of constants to compile in (symbols, keywords, maps/lists/sets, etc.)
- Adornment of sequence points and other IL debug information
Optimizations could come in two flavors: optimizations on the AST nodes, or optimizations on the (abstract, pseudo) IL. It is not clear what is possible or needed here until we see where the above gets us.
A question to be resolved is whether there is an intermediate IL (a la Swift IL, for example) that sits between the AST representation and the final IL. We definitely want an explicit IL representation tied to MSIL that allows inspection and manipulation prior to going to `ILGen`.