V2 Compiler Passes

Pass 0: Lexical analysis
Pass 1: Local variable identification and macroexpansion
Pass 2: Type inference & interop call resolution
Pass 3: invocation resolution

Pass 0: Lexical analysis

Done by the Lisp reader. 'Nuff said.

Pass 1: Local variable identification and macroexpansion

Hard to go further without knowing the code to be compiled. This involves macroexpansion to all levels. Examining a form (op arg1 arg2 ... ), there are several possibilities:

op is a special form
op signifies a macro
op is of the form .name
op is a symbol of the form ns/name where ns is not an existing Namespace or an alias for one in the CurrentNamespace, and ns names a type
op is of the form name.
otherwise

Result of macroexpansion is, respectively,

the original form itself (no macroexpansion)
the result of calling the Var identified by op on the entire form, the local variable environment, and the next of the form (the remaining arguments)
(. arg1 name arg2 ... ) (host expression, either a static or a virtual method call, depending on whether arg1 names a type)
(. ns name arg1 arg2 ...) (static method)
(new name arg1 arg2 ... ) (new expression)
the original form itself (no macroexpansion)
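
The six cases above can be sketched as a single-step expander. This is a minimal Python sketch, not the compiler's code. Assumptions made here: forms are tuples, symbols are plain strings, macros are Python callables in a dict, the special form set is an illustrative subset, and the known namespaces and types are passed in as sets. It follows the usual Clojure conventions that (.name target args ...) becomes (. target name args ...) and that ns/name is treated as a static call only when ns does not resolve to a namespace but does name a type.

```python
SPECIAL_FORMS = {"def", "if", "do", "let*", "fn*", "quote", ".", "new"}  # illustrative subset


def macroexpand_1(form, local_env, macros, namespaces, types):
    """Expand `form` by one step, following the six cases in order.

    form       -- tuple (op, arg1, arg2, ...); symbols are plain strings
    local_env  -- set of local variable names in scope
    macros     -- dict: macro name -> callable(form, env, *args)   (hypothetical)
    namespaces -- set of known namespace names and aliases         (hypothetical)
    types      -- set of known type names                          (hypothetical)
    """
    if not (isinstance(form, tuple) and form and isinstance(form[0], str)):
        return form
    op, args = form[0], form[1:]
    if op in SPECIAL_FORMS:                       # 1. special form: unchanged
        return form
    if op not in local_env and op in macros:      # 2. macro: call it
        return macros[op](form, local_env, *args)
    if len(op) > 1 and op.startswith("."):        # 3. .name
        if not args:
            raise SyntaxError(f"{op} needs a target expression")
        return (".", args[0], op[1:]) + args[1:]
    ns, slash, name = op.partition("/")
    if slash and name and ns not in namespaces and ns in types:
        return (".", ns, name) + args             # 4. ns/name, ns names a type
    if len(op) > 1 and op.endswith("."):          # 5. name.
        return ("new", op[:-1]) + args
    return form                                   # 6. otherwise: unchanged
```

For example, macroexpand_1((".Substring", "s", 2), set(), {}, set(), set()) gives (".", "s", "Substring", 2). This is one step only; expanding "to all levels" means repeating it on the result and on every subform.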

Determining whether op is a macro (both conditions must hold):

if op is a Symbol, it does not name a local variable
op is a Var or a Symbol naming a Var, and that Var is marked as a macro and is not marked as private
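
A rough Python sketch of this check, under stated assumptions: Var here is a stand-in dataclass with only the two flags that matter, symbols are plain strings, and vars_ is a hypothetical dict from symbol name to Var. None of these names come from the existing compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Stand-in for a Var; only the flags the macro check needs."""
    name: str
    is_macro: bool = False
    is_private: bool = False


def macro_var(op, local_env, vars_):
    """Return the Var to invoke as a macro for `op`, or None if op is not a macro."""
    if isinstance(op, str):
        if op in local_env:          # a local shadows any macro of the same name
            return None
        var = vars_.get(op)          # Symbol naming a Var (or nothing)
    elif isinstance(op, Var):
        var = op
    else:
        return None
    if var is not None and var.is_macro and not var.is_private:
        return var
    return None
```

The local variable test comes first, which is why this check cannot be made without knowing the local environment at the point of the form.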

By inspection, macroexpansion and the identification of local variable scopes are intertwined. This pass must walk the form being compiled, keeping track of the local variable environment and macroexpanding along the way.
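
To make the intertwining concrete, here is a minimal Python sketch of such a walk. It is an illustration only. Assumptions: forms are tuples, symbols are strings, macros is a hypothetical dict of callables taking (form, env, *args), and only let* and fn* introduce locals, written as (let* (name init ...) body ...) and (fn* (param ...) body ...). The other expansion cases (.name, ns/name, name.) are left out to keep the sketch short.

```python
def expand_all(form, local_env, macros):
    """Fully macroexpand `form`, tracking which names are locals along the way."""
    if not isinstance(form, tuple) or not form:
        return form
    op = form[0]
    # Expand at this level until op is no longer a (non-shadowed) macro.
    while isinstance(op, str) and op not in local_env and op in macros:
        form = macros[op](form, local_env, *form[1:])
        if not isinstance(form, tuple) or not form:
            return form
        op = form[0]
    if op == "quote":                       # quoted data is never expanded
        return form
    if op == "let*":
        bindings, body = form[1], form[2:]
        scope = set(local_env)
        new_bindings = []
        for i in range(0, len(bindings), 2):
            name, init = bindings[i], bindings[i + 1]
            # The init is expanded before its own name comes into scope.
            new_bindings += [name, expand_all(init, scope, macros)]
            scope.add(name)
        return ("let*", tuple(new_bindings)) + tuple(
            expand_all(b, scope, macros) for b in body)
    if op == "fn*":
        params, body = form[1], form[2:]
        scope = set(local_env) | set(params)
        return ("fn*", params) + tuple(expand_all(b, scope, macros) for b in body)
    return tuple(expand_all(sub, local_env, macros) for sub in form)
```

With a hypothetical macro unless defined as lambda form, env, test, body: ("if", test, None, body), the form ("unless", "a", ("unless", "b", "c")) expands at both levels, while ("let*", ("unless", "f"), ("unless", "a", "b")) comes back unchanged because the local named unless shadows the macro inside the body.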

Output TBD. It could be a simple structure with local-variable-introduction nodes and expression nodes, and leave it at that. Or it could go to the gross level of analysis done by Compiler.analyze and bottom out with SymbolExpr, KeywordExpr, etc.

Pass 2: Type inference & interop call resolution

With the code expanded, types can be chased throughout the tree. Type information comes from user type tags, type info on Var'd IFns, and flow through interop calls. This pass will likely identify every place where a value of a value type is known to flow. We should add a boxing node type to the AST to mark explicitly where value types get boxed.
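
A small Python sketch of what an explicit boxing node could look like. Everything here is hypothetical: the node classes (Const, Local, Call, Box), the VALUE_TYPES subset, and the idea that an interop call has already been resolved to a signature with parameter and return types. It only shows the shape of the idea: infer a type per node, then wrap an argument in Box wherever a value type flows into an Object parameter.

```python
from dataclasses import dataclass

VALUE_TYPES = {"Int64", "Double", "Boolean"}   # illustrative subset


@dataclass
class Const:
    value: object


@dataclass
class Local:
    name: str
    tag: str = "Object"      # user type tag, if one was given


@dataclass
class Call:
    method: str              # already-resolved interop method (label only)
    param_types: list
    return_type: str
    args: list


@dataclass
class Box:
    expr: object
    from_type: str           # the value type being boxed


def type_of(node):
    """Infer the static type of a node from tags, constants and signatures."""
    if isinstance(node, Const):
        if isinstance(node.value, bool):       # bool before int: bool is an int in Python
            return "Boolean"
        if isinstance(node.value, int):
            return "Int64"
        if isinstance(node.value, float):
            return "Double"
        return "Object"
    if isinstance(node, Local):
        return node.tag
    if isinstance(node, Call):
        return node.return_type
    if isinstance(node, Box):
        return "Object"
    raise TypeError(f"unknown node: {node!r}")


def add_boxing(node):
    """Return a tree with a Box node wherever a value type meets an Object parameter."""
    if not isinstance(node, Call):
        return node
    new_args = []
    for arg, param in zip(node.args, node.param_types):
        arg = add_boxing(arg)
        arg_type = type_of(arg)
        if arg_type in VALUE_TYPES and param == "Object":
            arg = Box(arg, arg_type)
        new_args.append(arg)
    return Call(node.method, node.param_types, node.return_type, new_args)
```

With boxing explicit in the tree, later passes can count, move or remove Box nodes instead of rediscovering them during code generation.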

Pass 3: invocation resolution

For the remaining (non-interop) nodes (fn arg1 arg2 ...), identify the invocation type: regular, static, prim, etc. This might need to be combined with Pass 2 above.
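
One possible shape for this classification, as a Python sketch. The criteria below are assumptions made for illustration and are not settled by these notes: a call is "static" when the matching arity is flagged static, "prim" when its signature mentions a primitive type (taken here to be Int64 and Double), and "regular" otherwise, including when nothing is known about the target. Signature and InvokeKind are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum

PRIMITIVES = {"Int64", "Double"}   # assumption: the types eligible for prim invocation


class InvokeKind(Enum):
    REGULAR = "regular"
    STATIC = "static"
    PRIM = "prim"


@dataclass(frozen=True)
class Signature:
    """Hypothetical per-arity type info recorded on a Var'd IFn."""
    param_types: tuple
    return_type: str = "Object"
    is_static: bool = False


def classify_invocation(signatures, argc):
    """Pick the invocation kind for a call with `argc` arguments.

    signatures -- dict: arity -> Signature, or None when the target is unknown
    """
    sig = signatures.get(argc) if signatures else None
    if sig is None:                  # unknown target or no matching arity
        return InvokeKind.REGULAR
    if sig.is_static:
        return InvokeKind.STATIC
    if sig.return_type in PRIMITIVES or any(t in PRIMITIVES for t in sig.param_types):
        return InvokeKind.PRIM
    return InvokeKind.REGULAR
```

The prim test depends on the types found in Pass 2, which is one reason the two passes may have to be combined.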

Other passes in no particular order

Identification of constants to compile in (symbols, keywords, maps/lists/sets, etc.)
Adornment of sequence points and other IL debug information
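
As an illustration of the constant identification pass, a short Python sketch. Assumptions: forms are tuples, symbols are strings, keywords are strings starting with ":", and quoted data is written (quote x). It collects each distinct constant once, in first-seen order, so that code generation could refer to constants by index.

```python
def collect_constants(form, table):
    """Append each distinct constant found in `form` to `table` and return it.

    Constants here are keywords and quoted data; other symbols are references,
    not constants, so they are skipped.
    """
    if isinstance(form, str):
        if form.startswith(":") and form not in table:
            table.append(form)
    elif isinstance(form, tuple) and form:
        if form[0] == "quote":
            if form[1] not in table:
                table.append(form[1])      # quoted data is kept whole
        else:
            for sub in form:
                collect_constants(sub, table)
    return table
```

For example, collect_constants(("if", (":ready", "m"), ("quote", ("a", "b")), (":ready", "n")), []) returns [":ready", ("a", "b")].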

Pass: Optimizations

Could come in two flavors: optimizations on the AST nodes, or optimizations on the (abstract, pseudo) IL. I won't know what is possible or needed here until we see where the above gets us.

Pass: IL Generation

A question to be resolved is whether there should be an intermediate IL (a la the Swift Intermediate Language, SIL) that sits between the AST representation and the final IL. We definitely want an explicit IL representation tied to MSIL that allows inspection and manipulation prior to going to ILGen.
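
A minimal Python sketch of what "explicit and inspectable" could mean in practice. The Instr class and the pass are hypothetical; the opcode names (ldloc, box, unbox.any, ret) are real MSIL opcodes, but the representation is just a flat list with no labels or branches. The example manipulation removes a box that is immediately undone by an unbox.any of the same type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One instruction: an MSIL opcode name plus an optional operand."""
    opcode: str
    operand: object = None


def remove_box_unbox_pairs(code):
    """Drop `box T` immediately followed by `unbox.any T`.

    Assumes straight-line code: no branch may target the unbox.any.
    """
    out = []
    for ins in code:
        if (ins.opcode == "unbox.any" and out
                and out[-1].opcode == "box"
                and out[-1].operand == ins.operand):
            out.pop()                 # cancel the pair
        else:
            out.append(ins)
    return out
```

Applied to [Instr("ldloc", 0), Instr("box", "System.Int64"), Instr("unbox.any", "System.Int64"), Instr("ret")], it returns [Instr("ldloc", 0), Instr("ret")]. A pass like this is only possible if the IL exists as data before it is handed to ILGen.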
