Keith Hill's Blog: Windows PowerShell MVP
May 11

Effective PowerShell Item 13: Comparing Arrays in Windows PowerShell

PowerShell has a lot of useful operators such as -contains, which tests whether an array contains a particular element. But as far as I can tell PowerShell doesn't "seem" to provide an easy way to test whether the contents of two arrays are equal. This is often quite handy and I was a bit surprised by this apparent omission. I came upon this need to compare arrays while answering a question on the microsoft.public.windows.powershell newsgroup. The poster wanted to find UTF8 encoded files by inspecting their BOM or byte order mark. One relatively straightforward approach to this is:
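The listing that followed here did not survive on this page. As a minimal sketch of the idea (not the original code), assuming a scratch file created purely for illustration and the made-up variable names $path, $preamble and $header, reading the first three bytes of a file next to the UTF-8 preamble might look like this:

```powershell
# Create a scratch file that starts with the UTF-8 BOM (illustration only).
$path = [System.IO.Path]::GetTempFileName()
$utf8WithBom = New-Object System.Text.UTF8Encoding $true
[System.IO.File]::WriteAllText($path, 'hello', $utf8WithBom)

# The UTF-8 preamble (BOM) is the byte sequence EF BB BF.
$preamble = [System.Text.Encoding]::UTF8.GetPreamble()

# Read just the first three bytes of the file.
$header = New-Object byte[] 3
$stream = [System.IO.File]::OpenRead($path)
try     { [void]$stream.Read($header, 0, 3) }
finally { $stream.Dispose() }

# Show both arrays as hex for visual inspection.
'preamble: ' + (($preamble | ForEach-Object { '{0:X2}' -f $_ }) -join ' ')
'header:   ' + (($header   | ForEach-Object { '{0:X2}' -f $_ }) -join ' ')

Remove-Item $path
```

The sketch uses .NET file APIs rather than Get-Content so that it does not depend on how a particular PowerShell version exposes byte reads.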
While it is easy enough to visually inspect this and see we have a match, visual inspection doesn't work in a script. :-) You could also test each individual element, which isn't bad for a three element array, but when you hit say 10 elements that approach might start looking tedious. You might think that we could just compare these two arrays directly like so:
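A small sketch of that direct comparison, using two hand-built byte arrays (the names are only for illustration), shows the filtering behavior of -eq when the left hand side is an array:

```powershell
$preamble = [byte[]](0xef, 0xbb, 0xbf)
$header   = [byte[]](0xef, 0xbb, 0xbf)

# Array on the left, array on the right: -eq acts as a filter over the
# left-hand elements, none of which equals "an array", so nothing comes out.
$arrayResult = @($preamble -eq $header)
$arrayResult.Length      # 0

# Array on the left, scalar on the right: the matching elements are returned.
$scalarResult = $preamble -eq 0xbb
$scalarResult            # 187
```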
But comparing arrays via the -eq operator doesn't actually compare the contents of two arrays. As you can see above, this results in no output. When the left hand side of the -eq operator is an array, PowerShell returns the elements of the array that match the value specified on the right hand side (shown above where I test for -eq to 0xbb). OK so it looks like we need to roll our own mechanism to compare arrays. Here is one way:

function AreArraysEqual($a1, $a2) {
    while ($enum1.MoveNext() -and $enum2.MoveNext()) {

And it works as expected:
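Only a fragment of the function is preserved on this page, so the following is a reconstruction sketch rather than the original: it keeps the AreArraysEqual name and the paired-enumerator loop, and the type and length checks around it are my assumptions.

```powershell
function AreArraysEqual($a1, $a2) {
    # Assumption: anything that is not an array is simply "not equal".
    if (($a1 -isnot [array]) -or ($a2 -isnot [array])) { return $false }
    if ($a1.Length -ne $a2.Length) { return $false }

    # Walk both arrays in lock step and stop at the first mismatch.
    $enum1 = $a1.GetEnumerator()
    $enum2 = $a2.GetEnumerator()
    while ($enum1.MoveNext() -and $enum2.MoveNext()) {
        if ($enum1.Current -ne $enum2.Current) { return $false }
    }
    return $true
}

$bom = [byte[]](0xef, 0xbb, 0xbf)
AreArraysEqual $bom ([byte[]](0xef, 0xbb, 0xbf))   # True
AreArraysEqual $bom ([byte[]](0xef, 0xbb, 0xbe))   # False
```

The element test uses -ne, so it is intended for arrays of scalar values; nested arrays would need a recursive comparison.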
However there turns out to be a way to do this within PowerShell but it isn't exactly obvious. At least it wasn't to me - at first.
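The built-in approach, sketched here with small made-up arrays, relies on Compare-Object emitting nothing when it finds no differences:

```powershell
$a1 = 0xef, 0xbb, 0xbf
$a2 = 0xef, 0xbb, 0xbf
$a3 = 0xef, 0xbb, 0xbe

# No differences means no output, so the result compares equal to $null.
$same      = $null -eq (Compare-Object $a1 $a2)
$different = $null -eq (Compare-Object $a1 $a3)

$same        # True
$different   # False
```

One thing to keep in mind: Compare-Object looks for matches within a sync window rather than strictly position by position, so if element order matters to you it is worth checking how it treats reordered arrays (the -SyncWindow parameter influences this).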
Good old Compare-Object will compare the arrays and if there are no differences it won't output anything. A simple compare to $null will confirm that. How about performance of these two approaches:
Compare-Object beats out my PowerShell function by a good margin which isn't too surprising[1]. After all, one is compiled code and the other is interpreted script. So there you have it. If you need a quick way to compare two arrays, just remember that arrays are objects too and that is what Compare-Object does best - compare two objects.

[1] - Except for comparing against the same array - seems the Compare-Object cmdlet could benefit from a System.Object.ReferenceEquals check. Admittedly this is a bit of a corner case scenario.

May 09

Effective PowerShell Item 12: Understanding ByValue Pipeline Bound Parameters

In item 11, I covered ByPropertyName pipeline bound parameters. In this post, I'll cover the other variety of pipeline binding - ByValue. ByValue binding takes the input object itself and attempts to bind it by type, using type coercion if possible, to parameters decorated as ByValue. For example, most of the *-Object utility cmdlets operate on whatever object is presented to them. The help on Where-Object shows this:

-inputObject <psobject>
Required? false

It turns out that ByValue isn't nearly as popular as ByPropertyName. How can I make such a statement you ask? Well this is one of the things that I love about PowerShell. It provides so much metadata about itself. It is very "self describing". You can easily walk every parameter on every cmdlet that is currently loaded into PowerShell. First let's see what information is available for a parameter:
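As a rough sketch of that kind of inspection (the exact command used in the original is not preserved here), the parameter metadata can be reached through the ParameterSets property of what Get-Command returns:

```powershell
# Collect every parameter of Where-Object across its parameter sets.
$allParams = (Get-Command Where-Object).ParameterSets |
    ForEach-Object { $_.Parameters }

# Pick the InputObject parameter and show its binding-related metadata.
$p = $allParams | Where-Object { $_.Name -eq 'InputObject' } | Select-Object -First 1
$p | Format-List Name, ParameterType, IsMandatory, ValueFromPipeline, ValueFromPipelineByPropertyName
```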
The interesting properties for us here are the Name and ValueFromPipeline* properties. Given this information it is easy to figure out how many of each type there are:
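A hedged sketch of such a count follows. The numbers it prints depend entirely on the PowerShell version and on which cmdlets are loaded, so they will not match the figures quoted in this post:

```powershell
# Every parameter of every cmdlet, one entry per parameter set it appears in.
$params = Get-Command -CommandType Cmdlet |
    ForEach-Object { $_.ParameterSets } |
    ForEach-Object { $_.Parameters }

$byValue        = @($params | Where-Object { $_.ValueFromPipeline })
$byPropertyName = @($params | Where-Object { $_.ValueFromPipelineByPropertyName })

'ByValue:        {0}' -f $byValue.Length
'ByPropertyName: {0}' -f $byPropertyName.Length
```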
So from here we can see the following:
So indeed binding by property name is much more common. Binding by value from the pipeline is primarily for cmdlets that manipulate objects. In the query below we can see that the InputObject parameter is by far the most common "ByValue" pipeline bound parameter:
A little further digging reveals the cmdlets that use the ByValue bound InputObject parameters as shown below. Note that a single parameter can appear in more than one parameter set on a cmdlet, which explains why there are only 36 cmdlets that account for the 40 instances of InputObject.
As you can see most of these cmdlets are designed to deal with objects in general. Note to cmdlet developers - pipeline bound parameters are how your cmdlets receive pipeline objects. When writing a cmdlet there is no $_. If your cmdlet wants to "participate" in the pipeline it must set the ParameterAttribute property ValueFromPipeline and/or ValueFromPipelineByPropertyName to true on at least one of its parameters. As mentioned above most ByValue parameters are of the InputObject (type psobject or psobject[]) variety so they pretty much accept anything. However not all cmdlets work that way. The -Id parameter (type [long[]]) on Get-History is pipeline bound ByValue. The following Trace-Command output shows how PowerShell works hard when necessary to convert the input object's type to the expected type. In this case a scalar string value of '1' to an array of Int64:
Note that on the first attempt, PowerShell tries to convert the string to an array of Int64 and fails. Then it tries again by treating the input as psobject. It throws that psobject at an internal helper class method, LanguagePrimitives.ConvertTo(), that successfully converts the string '1' to an Int64[] containing the value 1. When a parameter is both ByValue and ByPropertyName bound, PowerShell attempts to bind in this order:
There is more to the parameter binding algorithm like finding the best match amongst different parameter sets. BTW one last tidbit related to parameters. The PowerShell help topics aren't completely automatically generated and as a result they aren't always correct. For instance, look up the parameters on Get-Content and see if you find a -Wait parameter - you won't. :-) However the metadata is always complete and correct e.g.:
Hopefully this post has given you more knowledge about ByValue parameters and how to explore and get more information on cmdlet parameters in general. In summary, there actually isn't much you need to know about ByValue pipeline bound parameters because in most cases they just work intuitively. Just be sure to keep your eye out for those parameters that bind ByPropertyName. They are the ones whose pipeline bound usage isn't as obvious.

May 03

Windows PowerShell V2 CTP2 Is Available

The PowerShell team just posted the announcement late last night. There is a download link in the announcement. This drop has lots of new features like Module support for organizing and loading related functionality as well as transaction support for registry operations using the Registry provider. There have also been some changes since the first CTP so take a look at the release notes. Let me reiterate the caution that these are "preview" (the P in CTP) bits. They aren't even considered beta quality, although I would qualify that this caution mostly applies to the new functionality. That said, I would not put these bits on a production machine. In fact, there have been reports about incompatibility between the latest drop of System Center Virtual Machine Manager and the V2 CTP bits. If you don't have a spare/test PC on which to play with these bits then go grab Virtual PC (free), Virtual Server or VMWare and create a sandbox VM with which to play with this CTP. You will want to create that image based on either Vista SP1 or Windows Server 2008 if you want to take the remote management features for a spin. One last thing. The Graphical PowerShell is new in this release and is in need of feedback. Folks, now is the time to influence the direction of this feature. If you wait too long the team won't have time to make anything but very minor changes. So please use it for a while and send your feedback to gPSFback@microsoft.com.
April 25

PowerShell Makes It Into Top 50 Programming Languages!

At #46 in the TIOBE Programming Community Index for April 2008, PowerShell sits above such languages as:
Now what can we do to bump it up the list a few spots past REXX (44), Tcl/Tk (39) and Bash (37)? Hmm....

April 06

Effective PowerShell Item 11: Understanding ByPropertyName Pipeline Bound Parameters

Everybody likes to be efficient, right? I mean we all generally like to solve a problem in an efficient way. In PowerShell that usually culminates in a "one-liner". Honestly for pedagogical purposes I find it much better to expand these terse, almost 'Obfuscated C' style commands into multiple lines. However there is no denying that when you want to bang out something quick at the console - given PowerShell's current line editing features - a one-liner helps stave off repetitive stress injuries. It's not PowerShell's fault. They're just using the antiquated console subsystem in Windows that hasn't changed much since NT shipped in 1993. One trick to less typing is to take advantage of pipeline bound parameters. Quite often I see folks write a command like:
That works but the use of the Foreach-Object cmdlet is technically unnecessary. Many PowerShell cmdlets bind their "primary" parameter to the pipeline. This is indicated in the help file for Get-Content as shown below:

-path <string[]>
Required? true
<snip>
-literalPath <string[]>
Required? true
Note that there are actually four parameters on Get-Content that accept pipeline input ByPropertyName, two of which are shown above. The other two are ReadCount and TotalCount. The qualifier ByPropertyName simply means that if the incoming object has a property of that name it is available to be "bound" as input to that parameter. That is, if a type match can be found or coerced. For instance, we could simplify the command above by eliminating the Foreach-Object cmdlet altogether:
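The two commands being contrasted are not preserved on this page. A self-contained sketch of the long form and the short form, using a scratch directory and file names invented for the example, could be:

```powershell
# Scratch directory with two small text files (illustration only).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'alpha'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'beta'
$pattern = Join-Path $dir '*.txt'

# Long form: hand each file to Get-Content explicitly.
$viaForeach = Get-ChildItem $pattern | ForEach-Object { Get-Content $_.FullName }

# Short form: let pipeline parameter binding do the work.
$viaPipeline = Get-ChildItem $pattern | Get-Content

Remove-Item $dir -Recurse
```

Both variables end up holding the same lines; the second form simply leaves it to PowerShell to work out which Get-Content parameter receives each file.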
While it is intuitive that Get-Content should be able to handle the System.IO.FileInfo objects that Get-ChildItem outputs, it isn't obvious based on the ByPropertyName rule I just mentioned. Why? Well the FileInfo objects output by Get-ChildItem don't have either a Path property or a LiteralPath property even accounting for the extended properties like PSPath. So how the heck does Get-Content determine the path of a file in this pipeline scenario? There are at least two ways to find this out. The first is the easier approach. It uses a PowerShell cmdlet called Trace-Command that shows you how PowerShell binds parameters. The second approach involves spelunking in the PowerShell assemblies using Lutz Roeder's .NET Reflector. Let's tackle this problem initially using Trace-Command. Trace-Command is a built-in tracing facility that shows a lot of the inner workings of PowerShell. I will warn you that it tends to be prolific with its output. One particularly useful area you can trace is parameter binding. Here's how we would do this for the command above:
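A minimal sketch of such a trace, run against a single scratch file so the output stays manageable (the file and variable names are invented for the example):

```powershell
$file = [System.IO.Path]::GetTempFileName()
Set-Content -Path $file -Value 'alpha'

# -Name ParameterBinding selects the parameter binder's trace source and
# -PSHost sends the trace text to the console; the expression's own output
# still flows down the pipeline as usual.
$content = Trace-Command -Name ParameterBinding -PSHost -Expression {
    Get-ChildItem $file | Get-Content
}

Remove-Item $file
```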
This outputs a lot of text and unfortunately it is "Debug" stream text that isn't easily searchable or redirectable to a file. Oh well. The interesting output from this command is the following:

BIND PIPELINE object to parameters: [Get-Content]

This output has been simplified a bit to be more readable in this post. I also changed the initial command to output just a single FileInfo object to reduce the amount of output. The information we get from Trace-Command shows us that PowerShell tries to bind the FileInfo object to the Get-Content parameters and fails (NO COERCION) on all except for the LiteralPath parameter. OK well that tells us definitively how Get-Content is getting the path but it doesn't make sense. There is no LiteralPath property on a FileInfo object and there is no extended property called LiteralPath either. This is where the second technique of using .NET Reflector (download here) can be used to see a reverse compiled version of the PowerShell source. After starting .NET Reflector and loading the Microsoft.PowerShell.Commands.Management.dll assembly, we find the GetContentCommand and inspect the LiteralPath parameter shown below:

[Alias(new string[] { "PSPath" }),
 Parameter(Position = 0, ParameterSetName = "LiteralPath", Mandatory = true, ValueFromPipeline = false, ValueFromPipelineByPropertyName = true)]
public string[] LiteralPath { }

Note the Alias attribute on this parameter. It creates another valid name for the LiteralPath parameter - PSPath, which corresponds to the extended PSPath property on all FileInfo objects. That is what allows the ByPropertyName pipeline input binding to succeed. The property named PSPath matches the parameter name, albeit via an alias. Where does that leave us? There are a number of cases where we can pipe an object directly to a cmdlet in the next stage of the pipeline because of pipeline input binding, where PowerShell searches for the most appropriate parameter to bind that object to.
Here is another example of piping directly to another cmdlet without resorting to the use of the Foreach-Object cmdlet:
You also now have a way to determine how PowerShell binds pipeline input to a parameter of a cmdlet. And thanks to Reflector we know that some parameters have aliases like PSPath to assist in this binding process. That's it for ByPropertyName pipeline input binding. There is another type of pipeline input binding called ByValue that I'll cover in a future post.

March 02

Nothing's Perfect Including PowerShell

Today I needed to count the number of errors in a log file. Pretty straightforward stuff that I would typically accomplish like so:
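The exact one-liner is lost here. One plausible shape for it, assuming a log whose error lines contain the word "error" (the sample log below is invented), is a simple pipeline that reads, filters and counts:

```powershell
$log = [System.IO.Path]::GetTempFileName()
Set-Content -Path $log -Value 'ok', 'error: disk full', 'ok', 'ERROR: timeout'

# Read every line, keep the ones that mention an error, count them.
# -match is case-insensitive by default.
$errorCount = @(Get-Content $log | Where-Object { $_ -match 'error' }).Length
$errorCount   # 2

Remove-Item $log
```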
And that normally works well for me - except for today. It turns out that this log file is big, really big - as in 600MB worth of log file! The command above runs for quite some time and then fails ignominiously with a System.OutOfMemoryException. Sure enough, a quick execution of "gps -id $pid" revealed that the PowerShell process was consuming 1.7 GB of private memory. No wonder we hit an OOM exception. So back to the drawing board on how to accomplish this in PowerShell. But first I had to do something about the memory footprint of my current PowerShell session. In PowerShell Community Extensions we have a Collect function (which just calls [System.GC]::Collect()). This brought the private memory footprint back down to ~76MB which tells me that PowerShell's pipeline or one of the cmdlets above is hoarding memory. No matter. One of the best things about PowerShell is this awesome escape hatch it provides - direct access to the .NET Framework. Fortunately there is a simple class in the .NET Framework called System.IO.StreamReader that allows you to read text files a line at a time, which is important when you're dealing with huge log files. Here is the resulting solution I came up with:
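The script itself is missing from this page, so here is a sketch of the StreamReader technique rather than the original. It assumes the same invented sample log and "error" pattern, and it uses try/finally, which needs PowerShell 2.0 or later:

```powershell
$log = [System.IO.Path]::GetTempFileName()
Set-Content -Path $log -Value 'ok', 'error: disk full', 'ok', 'ERROR: timeout'

$count  = 0
$reader = New-Object System.IO.StreamReader $log
try {
    # ReadLine returns $null at end of file, so only one line is held
    # in memory at a time no matter how large the file is.
    while (($line = $reader.ReadLine()) -ne $null) {
        if ($line -match 'error') { $count++ }
    }
}
finally {
    $reader.Dispose()
}
$count   # 2

Remove-Item $log
```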
I monitored the private memory usage of the PowerShell process during the execution of this script. The private memory usage increased about 200K and then didn't budge until the script was finished. No doubt this contributed to the script finishing much faster as compared to the time it took my first attempt to finish, err, run out of memory - 1 min 43 secs versus 7 min 16 secs respectively. When it comes to reading files, another useful .NET Framework method is the static method [System.IO.File]::ReadAllText(string path), which returns a single string containing the file's entire contents. If you ever need to load the entire contents of a file into a variable for manipulation (say you need to execute a regex over an entire file's contents - not just line-by-line) this method is a good way to go. I find the ReadAllText() method a bit easier to use in this case than Get-Content piped to Out-String. The other benefit of ReadAllText() is that it doesn't add an extra line terminator to the end of the string, which is something Out-String will do. It seems like Get-Content should have a parameter to indicate that it should read the entire file into a single string and output that.

February 20

PowerShell Community Extensions - It's the Little Things

Today I was posting about an issue I was having with Team Foundation Server 2008 HTML alert emails that get generated for check-ins. The second column in the details area of the email is way too narrow. I was able to copy/paste the HTML into Expression Web so I could get to the HTML. I wanted to post that HTML into the MSDN Forum post editor but, well, it doesn't like you putting in non-escaped HTML. Fortunately PowerShell and PSCX are at my beck and call. Here is all I had to do to escape my HTML so I could paste it into the HtmlView of the MSDN forum editor:
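The PSCX command used here is not preserved, and I would rather not guess at its name. As a stand-in that needs no extra modules, the .NET method System.Net.WebUtility.HtmlEncode (available in .NET 4.0 and later) performs the same kind of escaping; the sample markup is invented:

```powershell
$html    = '<b>a & b</b>'
$escaped = [System.Net.WebUtility]::HtmlEncode($html)
$escaped   # &lt;b&gt;a &amp; b&lt;/b&gt;
```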
I just love tools that save me time!

January 07

Microsoft Dev Days in Denver

If you want to get some free exposure to the VS 2008 and Office technologies, sign up for Dev Days. It is coming to my local MS office (Denver) - details listed below.

When: Thursday, January 31, 2008 8:30 AM - 5:00 PM

January 02

XPath Expressions and PSCX's Select-Xml

MoW wrote up a nice post on invoking XPATH expressions. Check it out here. Just wanted to let the PSCX users out there know that the equivalent of MoW's

PS I:\PowerShell> Function invoke-XpathExpression ([xml]$xml,$expression) {
>> $xn = $xml.PSBase.CreateNavigator()
>> $xn.Evaluate($expression)
>> }
>>
PS I:\PowerShell> # Example using the function
PS I:\PowerShell>
PS I:\PowerShell> invoke-XpathExpression -xml (type gl.xml) -exp "sum(GroceryList/Item/Price)"
29.15

in PSCX would be:
Note that the PSCX cmdlet Select-Xml is oriented towards "selecting" node-sets, hence the name. Unfortunately xpath expressions that don't result in node-sets will error. No worries though because PowerShell's Measure-Object cmdlet (measure alias provided by PSCX) can compute the sum easily. OK so that was a bit shorter but what's the big deal? Here's the deal. If your XML uses XML namespaces then this all gets a good bit harder to deal with yourself. Not impossible mind you. I have written a number of posts on handling XML that uses XML namespaces but with Select-Xml it is pretty simple. For instance, let's tweak the XML ever so slightly:

<GroceryList xmlns="tempuri.org">

Note the default namespace declaration on the root element. Now the previous XPath expressions won't work but here is all we need to do with Select-Xml to make this work:
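The PSCX command line is not preserved here. As a sketch of the same idea using the Select-Xml cmdlet that is built into later PowerShell versions (not the PSCX one), and with grocery data invented for the example, the namespace is mapped to a temporary prefix and that prefix is used in the XPath query. Note that the built-in cmdlet takes -Namespace as a hashtable:

```powershell
[xml]$xml = @'
<GroceryList xmlns="tempuri.org">
  <Item><Name>Milk</Name><Price>3.50</Price></Item>
  <Item><Name>Bread</Name><Price>2.25</Price></Item>
</GroceryList>
'@

# Map the default namespace to the temporary prefix "ns", select the
# Price nodes, then let Measure-Object compute the sum.
$sum = Select-Xml -Xml $xml -XPath '//ns:Item/ns:Price' -Namespace @{ ns = 'tempuri.org' } |
    ForEach-Object { [double]$_.Node.InnerText } |
    Measure-Object -Sum

$sum.Sum   # 5.75
```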
All we needed to do was provide the namespace and a temp prefix (ns) to use in the xpath query. Note that the -Namespace parameter will take an array of strings that match this format: "<prefix>=<namespace>".

Renewed as Windows PowerShell MVP for 2008

Woohoo! Just got word earlier today. I look forward to another awesome year of PowerShell adoption and just maybe v 2.0 - hopefully.

December 18

Windows PowerShell Help Available Online

Just got word that the PowerShell help is now available online on the TechNet site. This is very nice because we will now be able to reference PowerShell help topics by URL like say the help topic for the Certificate provider.

December 02

Visual Studio 2008 Training Kit Requires PowerShell

I just downloaded this training kit and took a look at its requirements and guess what - PowerShell is required! Cool.
After looking through the installed files, it appears that PowerShell is being used primarily for testing code via the verify.ps1 and testLib.ps1 scripts. Here are all the PowerShell script files used by the training kit:
It is really great to see the devdiv folks adopting PowerShell. The requirement to have PowerShell installed will help adoption in the developer space which I think is on the verge of exploding anyway.

November 24

Effective PowerShell Item 10: Understanding PowerShell Parsing Modes

The way PowerShell parses commands can be surprising especially to those that are used to shells with more simplistic parsing like CMD.EXE. Parsing in PowerShell is a bit different because PowerShell needs to work well as both an interactive command line shell and a scripting language. This need is driven by use cases such as:
Part and parcel with providing a powerful scripting language is to support more types than just the string type. In fact, PowerShell supports most .NET types including String, SByte, Int16, Int32, Decimal, Single, Double, Boolean, Array, ArrayList, StringBuilder among many, many other .NET types. That's very nice you say but what's this got to do with parsing modes? Think about this. How would you expect a language to represent a string literal? Well most folks would probably expect this representation: "Hello World". And in fact, that is recognized by PowerShell as a string e.g.:
And if you type a string at the prompt and hit the Enter key, PowerShell, being a very nice REPL environment, echoes the string back to the console as shown above. However what if I had to specify filenames using quotes as shown below?
That would immediately "feel" different than any other command line shell out there. Even worse, typing all those quotes would get really annoying, really fast. What to do, what to do? Well my guess is that the team, pretty early on, decided that they were going to need two different ways to parse. First they would need to parse like a traditional shell where strings (filenames, dir names, process names, etc) do not need to be quoted. Second they would need to be able to parse like a traditional language where strings are quoted and expressions feel like those you would find in a programming language. In PowerShell, the former is called Command parsing mode and the latter is called Expression parsing mode. It is important to understand which mode you are in and more importantly, how you can manipulate the parsing mode. Let's look at an example. Obviously we would prefer to type the following to delete files:
That's better. No bloody quotes required on the filenames. PowerShell treats these filenames as strings even without the quotes in command parsing mode. But what happens if my path has a space in it? You would naturally try:
And that works as you would expect. OK now what if I want to execute a program with a space in its path:
That didn't work because as far as PowerShell is concerned we gave it a string, so it just echoes it back to the screen. It did this because it parsed this line in expression mode. We need to tell PowerShell to parse the line in command mode. To do that we use the call operator '&' like so:
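A small sketch of the difference, using a cmdlet name in place of a real program path so that it runs anywhere (with an EXE the shape would be the same, e.g. & followed by a quoted path such as a hypothetical 'C:\Program Files\Some App\app.exe'):

```powershell
# Starts with a quote, so the line is parsed in expression mode:
# PowerShell just echoes the string back.
'Get-Date'

# Starts with the call operator, so the quoted string is treated as a
# command name and executed.
& 'Get-Date'

# The command name can also come from a variable.
$name   = 'Write-Output'
$result = & $name 'it ran'
$result   # it ran
```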
Tip: Help prevent repetitive stress injuries to your wrists and use tab (and shift+tab) completion for auto-completing the parts of a path. If the resulting path contains a space PowerShell will insert the call operator for you as well as surround the path with quotes. What's going on with this example is that PowerShell looks at the first non-whitespace character of a line to determine which mode to start parsing in. If it sees [_aA-zZ] or & or . or \ then PowerShell parses in Command mode. One exception to these rules happens when the line starts with a name that corresponds to a PowerShell language keyword like "if", "do", "while", etc. In this case, PowerShell uses expression parsing mode and expects you to provide the rest of the syntax associated with that keyword. The benefits of Command mode are:
So why do we need expression parsing mode? Well as I mentioned before it sure would be nice to be able to evaluate expressions like so:
It isn't a stretch to see how some shells might interpret this example as trying to invoke a command named '64-2'. So how does PowerShell determine if the line should be parsed in expression mode? If the line starts with a number [0-9] or one of these characters: @, $, (, ' or " then the line is evaluated in expression mode. The benefits of expression mode are:
One consequence of the rules for expression parsing mode is that if you want to execute an EXE or script whose name starts with a number you have to quote the name and use the call operator e.g.:
If you were to attempt to execute "64E1" without using the call operator, PowerShell can't tell if you want to interpret that as the number 64E1 (640) or execute the exe named 64E1.exe or the script named 64E1.ps1. It is up to you to make sure you have placed PowerShell in the correct parsing mode to get the behavior you want, which in this case means putting PowerShell into command parsing mode by using the call operator. Note I have observed that if you specify the full command name e.g. 64E1.ps1 or 64E1.exe, it isn't necessary to quote the command. Now what if you want to mix and match parsing modes on the same line? Easy. Just use either a grouping expression (), a subexpression $() or an array subexpression @(). This will cause the parser to re-evaluate the parsing mode based on the first non-whitespace character inside the parens.

Sidebar: What's the difference between grouping expressions (), subexpressions $() and array subexpressions @()? A grouping expression can contain just a single statement. A subexpression can contain multiple semicolon separated statements. The output of each statement contributes to the output of the subexpression. An array subexpression behaves just like a subexpression except that it guarantees that the output will be an array. The two cases where this makes a difference are 1) there is no output at all, so the result will be an empty array, and 2) the result is a scalar value, so the result will be a single element array containing the scalar value. If the output is already an array then the use of an array subexpression will have no effect on the output i.e. array subexpressions do not wrap arrays inside of another array.

In the following example I have embedded a command "Get-ChildItem C:\Windows" into a line that started out parsing in expression mode.
When it encounters the grouping expression (get-childitem c:\windows), it begins parsing mode re-evaluation, finds the character 'g' and kicks into command mode parsing for the remainder of the text inside the grouping expression. Note that ".Length" is parsed using expression mode because it is outside the grouping expression, so PowerShell reverts back to the previous parsing mode. ".Length" instructs PowerShell to get the Length property of the object output by what was evaluated inside the grouping expression. In this case, it is an array of FileInfo and DirectoryInfo objects. The Length property tells us how many items are in that array.
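The example line is not preserved here. A self-contained sketch of the same pattern, pointed at a scratch directory with two invented files instead of C:\Windows, together with the sidebar's three constructs:

```powershell
# Scratch directory containing two files (illustration only).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'alpha'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'beta'

# The line starts with a number (expression mode); the parens switch to
# command mode for Get-ChildItem, then .Length is back in expression mode.
$total = 10 + (Get-ChildItem $dir).Length
$total                          # 12

# Sidebar behavior:
$twoStatements = $(1; 2)        # subexpression: two statements, two outputs
$twoStatements.Length           # 2
@().Length                      # 0  (no output gives an empty array)
@(Write-Output 42).Length       # 1  (scalar becomes a one-element array)
@(1, 2, 3).Length               # 3  (an existing array is not wrapped again)

Remove-Item $dir -Recurse
```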
We can do the opposite. That is, put expressions in lines that started out parsing in command mode. In the example below (admittedly lame) we use an expression to calculate the number of objects to select from the sequence of objects.
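A sketch of that pattern with a simple number range standing in for whatever sequence the original used:

```powershell
# The line starts in command mode; the parenthesized argument is
# re-evaluated and parsed in expression mode, yielding 3.
$firstFew = 1..10 | Select-Object -First (1 + 2)
$firstFew   # 1, 2, 3
```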
Using the ability to start new parsing modes, we can nest commands within commands. This is a powerful feature and one I recommend mastering. In the example below PowerShell is happily parsing the command line in command mode when it encounters '@(' a.k.a. the start of an array subexpression. This causes PowerShell to re-evaluate the parsing mode but in this case it finds a nested command. One that grabs the new filename from the first line of the file to be renamed. I used the array subexpression syntax in this case because it guarantees that we will get an array of lines even if there is just one line. If you use a grouping expression instead and the file happens to contain only a single line then PowerShell will interpret the [0] to be "get me the first character in the string" which is "f" in the example below.
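The original command is missing from this page. The sketch below reproduces the behavior described, using a scratch file whose single line is the invented name final.txt; the extra parentheses around the -NewName argument are my addition to keep the indexing unambiguous:

```powershell
# Scratch directory with one file whose only line is the desired new name.
$dir  = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$file = Join-Path $dir 'draft.txt'
Set-Content -Path $file -Value 'final.txt'

# Grouping expression: a one-line file comes back as a single string,
# so [0] picks out its first character.
$wrong = (Get-Content $file)[0]     # f

# Array subexpression: always an array of lines, so [0] is the first line.
$right = @(Get-Content $file)[0]    # final.txt

# Nested command supplying the new name.
Rename-Item -Path $file -NewName (@(Get-Content $file)[0])
$renamed = Test-Path (Join-Path $dir 'final.txt')

Remove-Item $dir -Recurse
```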
There is one final subtlety that I would like to point out and that is the difference between using the call operator (&) to invoke commands and "dotting" commands. Consider invoking a simple script that sets the variable $foo = 'PowerShell Rocks!'. Let's execute this script using the call operator and observe the impact on the global session:
Note that using the call operator invokes the command in a child scope that gets thrown away when the command (script, function, etc) exits. That is, the script didn't impact the value of $foo in the global scope. Now let's try this again by dotting the script:
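Both invocation styles can be sketched without a script file by using a script block in place of script.ps1 (that substitution is mine; the scoping behavior is the same):

```powershell
$foo = 'original'

# Call operator: the block runs in a child scope that is discarded,
# so the assignment inside never reaches the caller's $foo.
& { $foo = 'PowerShell Rocks!' }
$afterCall = $foo     # original

# Dot sourcing: the block runs in the current scope,
# so the assignment changes the caller's $foo.
. { $foo = 'PowerShell Rocks!' }
$afterDot = $foo      # PowerShell Rocks!
```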
When dotting a script, the script executes in the current scope. As a result, the variable $foo in script.ps1 effectively becomes a reference to the global $foo when the script is dotted from the command line resulting in changing the global $foo variable's value. This shouldn't be too surprising since "dot sourcing", as it's also known, is common in other shells. Note that these rules also apply to function invocation. However for external EXEs it doesn't matter whether you dot source or use the call operator since EXEs execute in a separate process and can't impact the current scope. Here's a handy reference to help you remember the rules for how PowerShell determines the parsing mode.
Once you learn the subtleties of these two parsing modes you will be able to quickly get past those initial surprises, like how to execute EXEs at paths containing spaces, and move on to putting these parsing modes to work for you.