Inside PowerShell 3: The New Parser and Compiler

If I’m not mistaken, Jeffrey Snover was quoted during Microsoft's Build conference as saying that PowerShell 3.0 scripts run up to six times faster than the same scripts on 2.0. This is a dramatic improvement! One of the reasons the new version is so much faster is that PSObject now plugs into the Dynamic Language Runtime (it implements IDynamicMetaObjectProvider), which allows for all kinds of optimization. I’ve pointed to Joel Bennett’s post about this before. After I wrote my last post about the internals of the parser in PowerShell 2.0, Jason Shirk reached out to point out that in PowerShell 3.0 the parser has been completely rebuilt and is now public. This is awesome news! It means that the hack I performed in the last post can now be done with a public API that is new and optimized!

Accessing the new Parser 

The full type name for the parser is System.Management.Automation.Language.Parser. It exposes a static ParseInput method that accepts a string containing the contents of a script, much like the previous parser. Also like the previous version, the method returns the root node of a tree. Rather than the ParseTreeNode class that we saw before, we now get an instance of the ScriptBlockAst class. This class derives from the Ast class, which is the base for all the nodes within the tree. I can only imagine AST refers to an abstract syntax tree. An AST is slightly different from a parse tree: a parse tree accounts for every element within the script, while an AST only accounts for the syntactically significant items and implies the others, such as grouping tokens.

Passing a simple Get-Process call into the parser and examining the members of the AST returns this.

$tokens = $null
$parseErrors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput("Get-Process", [ref]$tokens, [ref]$parseErrors)

$ast | Get-Member

   TypeName: System.Management.Automation.Language.ScriptBlockAst

Name              MemberType Definition
----              ---------- ----------
Equals            Method     bool Equals(System.Object obj)
Find              Method     System.Management.Automation.Language.Ast Find(System.Func[System.Management.Automation.Language.Ast,bool] pr...
FindAll           Method     System.Collections.Generic.IEnumerable[System.Management.Automation.Language.Ast] FindAll(System.Func[System....
GetHashCode       Method     int GetHashCode()
GetHelpContent    Method     System.Management.Automation.Language.CommentHelpInfo GetHelpContent()
GetScriptBlock    Method     scriptblock GetScriptBlock()
GetType           Method     type GetType()
ToString          Method     string ToString()
Visit             Method     System.Object Visit(System.Management.Automation.Language.ICustomAstVisitor astVisitor), System.Void Visit(Sy...
BeginBlock        Property   System.Management.Automation.Language.NamedBlockAst BeginBlock {get;}
DynamicParamBlock Property   System.Management.Automation.Language.NamedBlockAst DynamicParamBlock {get;}
EndBlock          Property   System.Management.Automation.Language.NamedBlockAst EndBlock {get;}
Extent            Property   System.Management.Automation.Language.IScriptExtent Extent {get;}
ParamBlock        Property   System.Management.Automation.Language.ParamBlockAst ParamBlock {get;}
Parent            Property   System.Management.Automation.Language.Ast Parent {get;}
ProcessBlock      Property   System.Management.Automation.Language.NamedBlockAst ProcessBlock {get;}
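
The FindAll method in the listing above takes a predicate plus a flag that controls whether nested script blocks are searched. As a minimal sketch (the sample script text and the variable names are illustrative, not from the original experiment), the following walks the tree of a slightly larger script and prints each node's type next to the source text it covers. PowerShell converts the script block predicate to the required Func[Ast,bool] delegate.

```powershell
# Parse a small illustrative script (single quotes keep $_ from being expanded).
$tokens = $null
$parseErrors = $null
$source = 'Get-Process | Where-Object { $_.Id -gt 100 }'
$ast = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref]$tokens, [ref]$parseErrors)

# A predicate that accepts every node; the second argument also searches nested script blocks.
$allNodes = $ast.FindAll({ param($node) $true }, $true)

# Show the node type and the source text each node spans.
$allNodes | ForEach-Object {
    '{0,-30} {1}' -f $_.GetType().Name, $_.Extent.Text
}

# Narrow the search to command invocations only.
$commands = $ast.FindAll({ param($node) $node -is [System.Management.Automation.Language.CommandAst] }, $true)
$commandNames = @($commands | ForEach-Object { $_.GetCommandName() })
$commandNames
```

Notice that the grouping tokens (the pipe and the braces) never show up as nodes of their own; they are only visible through the extents of the nodes that contain them.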

There are some pretty interesting members to examine in the class. For instance, if we call the GetScriptBlock() method and then call Invoke() on the returned block, we will see the Get-Process cmdlet run.

$ast.GetScriptBlock().Invoke()
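
ParseInput reports problems through its two [ref] arguments rather than by throwing, so it is worth checking them before invoking anything. A small sketch, using a deliberately broken script string chosen purely for illustration:

```powershell
$tokens = $null
$parseErrors = $null

# The closing brace is missing on purpose so the parser reports an error.
$broken = 'if ($true) { Get-Process'
$ast = [System.Management.Automation.Language.Parser]::ParseInput($broken, [ref]$tokens, [ref]$parseErrors)

# Each token records its kind and the text it covers.
$tokens | ForEach-Object { '{0,-12} {1}' -f $_.Kind, $_.Text }

if ($parseErrors.Count -gt 0) {
    # Each ParseError exposes a message and the extent where the problem was found.
    $parseErrors | ForEach-Object {
        'Line {0}: {1}' -f $_.Extent.StartLineNumber, $_.Message
    }
}
else {
    # Only build and run the script block when the input parsed cleanly.
    $ast.GetScriptBlock().Invoke()
}
```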

There are also properties for each of the begin, process and end blocks found in advanced functions. The pre-release documentation for this entire namespace is already available on MSDN. One extremely interesting member is the Visit method. This method accepts an ICustomAstVisitor which, according to the documentation, allows for custom traversal of the AST. This could lead to some interesting implementations. I'm excited to see what developers and third parties do with this kind of access to the PowerShell interpreter.
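
To give a feel for the Visit method, here is a rough sketch of a custom visitor. Two assumptions to flag: it derives from the AstVisitor base class (the other Visit overload shown in the member listing) instead of implementing every member of ICustomAstVisitor, and it uses the class keyword, which needs PowerShell 5.0 or later. CommandCollector is a made-up name for this example.

```powershell
# Requires PowerShell 5.0 or later for the class keyword.
# CommandCollector is a hypothetical name used only for this sketch.
class CommandCollector : System.Management.Automation.Language.AstVisitor {
    [System.Collections.Generic.List[string]] $Names = [System.Collections.Generic.List[string]]::new()

    # Called once for every command invocation found in the tree.
    [System.Management.Automation.Language.AstVisitAction] VisitCommand([System.Management.Automation.Language.CommandAst] $commandAst) {
        $this.Names.Add($commandAst.GetCommandName())
        # Continue tells the traversal to keep descending into child nodes.
        return [System.Management.Automation.Language.AstVisitAction]::Continue
    }
}

$tokens = $null
$parseErrors = $null
$source = 'Get-Process | Sort-Object CPU | Select-Object -First 5'
$ast = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref]$tokens, [ref]$parseErrors)

# The AST drives the traversal and calls back into the visitor for each node.
$collector = [CommandCollector]::new()
$ast.Visit($collector)
$collector.Names
```

Overriding a single Visit method keeps the default traversal for every other node type, which is usually enough for analysis tasks like this one.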

Under the hood

Now that we have seen how the public API has been exposed, let's take a look at some of the guts within the Language namespace. Just what makes PowerShell so much faster now?

While browsing through the classes in the Language namespace, one of them caught my attention: the Compiler class. The Compiler class is an ICustomAstVisitor. This means that the compiler tells the AST how it should be traversed.

After perusing the members of this class, I ran into the Compile method. The method accepts an AST and outputs an Expression. This means that the compiler visits each node in the AST and compiles it into a LINQ expression tree. Pretty wicked stuff!

This expression can then be compiled and invoked. When expressions are compiled, they are converted into MSIL, just like C# or VB.NET code during .NET compilation, and stored as a DynamicMethod within the current process. Since the result is compiled, it won't need to be reinterpreted in the future. There is also an ExpressionCache class that I first assumed handled caching of pre-compiled expressions to make scripts run faster, but that turned out not to be what I thought. We can see that the lambda is compiled in the CompileTree method.
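
To make the expression-tree step concrete, here is a hand-built tree for 1 + 2 that is compiled to a delegate and invoked. This is not the PowerShell compiler's own output, just a minimal stand-in that uses the same System.Linq.Expressions API; the variable names are illustrative.

```powershell
$expr = [System.Linq.Expressions.Expression]

# Build the tree by hand: two constants joined by an Add node.
$one = $expr::Constant(1)
$two = $expr::Constant(2)
$sum = $expr::Add($one, $two)

# Wrap the body in a parameterless lambda that returns an int.
$noParams = [System.Linq.Expressions.ParameterExpression[]]@()
$lambda = $expr::Lambda([Func[int]], $sum, $noParams)

# Compile() emits IL for the tree and hands back a delegate that can be called repeatedly.
$func = $lambda.Compile()
$func.Invoke()
```

Once Compile() has run, calling the delegate again does not revisit the tree, which is the property that lets a compiled script block skip reinterpretation.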

I hope you found this exercise as interesting as I did! Thanks to Jason Shirk for pointing this out and nice work on the new version!

Edit:

One thing I'm curious about: could you emit this compiled expression tree to a DLL? If so, could it be invoked easily enough?

Compiling to a DLL

You cannot simply compile the lambdas that come out of the PowerShell compiler into DLLs. This was the result of my tests.

$da = [System.AppDomain]::CurrentDomain.DefineDynamicAssembly((New-Object  System.Reflection.AssemblyName("dyn")), [System.Reflection.Emit.AssemblyBuilderAccess]::Save)

$dm = $da.DefineDynamicModule("dyn_mod", "dyn.dll")
$dt = $dm.DefineType("dyn_type")
$method = $dt.DefineMethod("Foo",  [System.Reflection.MethodAttributes]::Public -bor [System.Reflection.MethodAttributes]::Static)

$tokens = $null
$parseErrors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput("$a = 1 + 2", [ref]$tokens, [ref]$parseErrors)

$compiledBlock = $ast.GetScriptBlock() 

$CompiledScriptBlockType = [System.Reflection.Assembly]::LoadWithPartialName("System.Management.Automation").GetType("System.Management.Automation.CompiledScriptBlock")
$CompiledScriptBlockDataType = [System.Reflection.Assembly]::LoadWithPartialName("System.Management.Automation").GetType("System.Management.Automation.CompiledScriptBlockData")
$BindingFlags = [System.Reflection.BindingFlags]::Instance -bor [System.Reflection.BindingFlags]::NonPublic

$CompiledScriptBlockType.GetMethod("Compile", $BindingFlags).Invoke($compiledBlock, @($true))
$lambda = $CompiledScriptBlockType.GetProperty("EndBlockTree", $BindingFlags).GetValue($compiledBlock, @())

$lambda.CompileToMethod($method)
$dt.CreateType()
$da.Save("dyn.dll")

Exception calling "CompileToMethod" with "1" argument(s): "CompileToMethod cannot compile constant '' because it is a non-trivial value, such as a live object. 
Instead, create an expression tree that can construct this value."
At C:\Users\Administrator\Documents\test.ps1:28 char:1
+ $lambda.CompileToMethod($method)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : InvalidOperationException

According to the comments on the CompileToMethod documentation, the method is very limited in what it can actually accomplish.
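
My reading of the error message, offered as a sketch and not a confirmed diagnosis of what the PowerShell compiler emits: a tree that embeds a live object as a constant compiles fine in memory, because the resulting delegate can hold a reference to that object, but there is no way to write such a reference into a static method saved to disk. The example below shows only the in-memory half with a hand-built tree. CompileToMethod itself is left out because it is only available on .NET Framework, and the hashtable is just an illustrative live object.

```powershell
$expr = [System.Linq.Expressions.Expression]
$noParams = [System.Linq.Expressions.ParameterExpression[]]@()

# A live object that exists only in this process.
$table = @{ Answer = 42 }

# Embedding it as a constant ties the tree to that specific instance.
$body = $expr::Constant($table, [System.Collections.Hashtable])
$lambda = $expr::Lambda([Func[System.Collections.Hashtable]], $body, $noParams)

# Compile() succeeds: the in-memory delegate keeps a reference to the object.
$func = $lambda.Compile()
$result = $func.Invoke()

# The delegate returns the very same instance, not a copy.
$sameInstance = [object]::ReferenceEquals($result, $table)
$sameInstance
```

A tree meant for CompileToMethod would instead have to contain expressions that construct the value, as the error text suggests.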

 


7 Responses to “Inside PowerShell 3: The New Parser and Compiler”

  1. James Tryand says:

    Nice work Adam. Clear, straightforward and simple article on a topic that could easily have been convoluted. :)
    It’s just crazy what an impressive leap this is for PowerShell, especially as its prototyping capabilities were already so good.

  2. [...] Inside PowerShell 3: The New Parser and Compiler (Adam Driscoll) [...]

  3. Rob Campbell says:

    I noticed something had changed with script blocks in V3. The type is now CompiledScriptblock, instead of just ScriptBlock. Also, you cannot set the .isfilter property on a script block in V3. In V2 you could create an anonymous filter by just creating a script block and setting its .isfilter to $true.

    This seems to explain what’s happening there.

  4. ingted says:

    Hi,

    I compiled it successfully!

    I thought something needed to be modified, and the code below is what I changed from your code:

    ============================================================================================
    $da = [System.AppDomain]::CurrentDomain.DefineDynamicAssembly((New-Object System.Reflection.AssemblyName("dyn")), [System.Reflection.Emit.AssemblyBuilderAccess]::Save)

    $dm = $da.DefineDynamicModule("dyn_mod", "dyn.dll")
    $dt = $dm.DefineType("dyn_type")
    $method = $dt.DefineMethod("Foo", [System.Reflection.MethodAttributes]::Public -bor [System.Reflection.MethodAttributes]::Static)

    $tokens = $null
    $parseErrors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput('$a = 1 + 2', [ref]$tokens, [ref]$parseErrors)

    $compiledBlock = $ast.GetScriptBlock()

    $SMA = [System.Reflection.Assembly]::LoadWithPartialName("System.Management.Automation")
    $CompiledScriptBlockType = $SMA.GetType("System.Management.Automation.ScriptBlock")
    $CompiledScriptBlockDataType = $SMA.GetType("System.Management.Automation.CompiledScriptBlockData")
    $BindingFlags = [System.Reflection.BindingFlags]::Instance -bor [System.Reflection.BindingFlags]::NonPublic

    $CompiledScriptBlockType.GetMethod("Compile", $BindingFlags).Invoke($compiledBlock, @($true))
    $blocktree = $CompiledScriptBlockType.GetProperty("EndBlockTree", $BindingFlags)
    $lambda = $blocktree.GetValue($compiledBlock, @())

    $lambda.CompileToMethod($method)
    $dt.CreateType()
    $da.Save("dyn.dll")

  5. ingted says:

    PS. I found that this script has to be executed step by step.
    If I run it all at once by pressing "F5", it fails with "InvokeMethodOnNull".

  6. Patrick says:

    Hi, I want to get the last line of a specific function. I already accomplished this in PowerShell v3, simply with ..Language.Parser => $function.Body.Extent.EndLineNumber. But this script should also run on v2. I tried ..Language.PSParser and ScriptBlock, but without any success. Does anyone know how I could get the last line (as an int) of a specific function name in a .ps1 file?
