Keith Hill's Blog
September 29

Effective PowerShell Item 9: Regular Expressions - One of the Power Tools in PowerShell

Windows PowerShell is based on the .NET Framework. That is, it is built using the .NET Framework and it exposes the .NET Framework to the user. One very nice feature of the .NET Framework is the Regex class in the System.Text.RegularExpressions namespace. It is a very capable regular expression engine. PowerShell uses this regular expression engine in a number of scenarios:
Obviously to get the most out of these operators and the Select-String cmdlet it helps to have a good grasp of regular expressions. PowerShell provides a help topic named "about_Regular_Expression" that you can view like so:

PS C:\> help about_reg*

This topic provides a nice quick reference on the various metacharacters in a regular expression but you are not going to learn a great deal about creating powerful regular expressions. To learn how to get the most out of regular expressions and hence PowerShell, I highly recommend Jeffrey Friedl's book Mastering Regular Expressions. Right now on the Amazon site it has 117 reviews and its rating is 4 1/2 stars out of 5.

There is a shortcoming in PowerShell's support for regular expressions that you need to know about. Most other script languages support regular expression syntaxes where you can find all matches in a string. For example in Perl I could do this:

$_ = "paul xjohny xgeorgey xringoy stu pete brian"; # PERL script

Unfortunately the Select-String cmdlet doesn't have this feature - yet. So for now you can work around this limitation by using the System.Text.RegularExpressions.Regex class directly. Fortunately you don't have to type that long class name because PowerShell has a type alias: [regex]. Very convenient!

PS C:\> $str = "paul xjohny xgeorgey xringoy stu pete brian"

One thing to watch out for is when your regular expression is written to search across line boundaries. For instance, if you use Get-Content to grab the contents of a file to apply the regular expression against, keep in mind that Get-Content streams the file one line at a time. For regular expressions that operate across lines you will need to apply the regex to the file contents represented as a single string.
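The find-all workaround described above, calling the [regex] type directly, can be sketched as follows. The pattern 'x(\w+?)y' is an assumption made for illustration (the Perl pattern from the post is not shown here), and the variable names are hypothetical.

```powershell
# Sample string from the post
$str = "paul xjohny xgeorgey xringoy stu pete brian"

# Assumed pattern: capture the word characters between a leading 'x' and a trailing 'y'
$pattern = 'x(\w+?)y'

# [regex]::Matches returns every match, not just the first one.
# Each Match object exposes its capture groups; group 1 is the text between x and y.
$found = [regex]::Matches($str, $pattern) | ForEach-Object { $_.Groups[1].Value }

$found
```

With this pattern the pipeline emits one string per match. Later versions of PowerShell added an -AllMatches switch to Select-String, so on those versions the direct [regex] call is a choice rather than a necessity.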
When the regex has to span lines, I would do this:

PS C:\> $regex = '(?<CMultilineComment>/\*[^*]*\*+(?:[^/*][^*]*\*+)*/)'

Note the use of the PowerShell Community Extensions cmdlet "Join-String" which takes the individual strings output by Get-Content and creates a single string separated by newline characters. Also note that this example shows the usage of a named capture: CMultilineComment. Now it would be even better if Select-String supported a "MatchAll" parameter that found all string matches in the specified file or string. That said, this example does show that when PowerShell is missing a feature, the access that it provides to the .NET Framework is a great escape hatch!

If I have one beef with regular expressions it is that there are a number of engines and their support for various features and metacharacters varies. I'm especially annoyed that Visual Studio's regular expression find & replace doesn't use the .NET regular expression engine. I constantly have to switch mental contexts when moving between the two. Oh well, as long as you stay within PowerShell I think you will find that a good grasp of regular expressions will help you be more productive.

September 24

Effective PowerShell Item 8: Output Cardinality - Scalars, Collections and Empty Sets - Oh My!

In the last post, Effective PowerShell Item 7: Understanding "Output", we covered a lot about PowerShell output. However there is a bit more you need to understand to use PowerShell effectively. This post concerns the cardinality of PowerShell output. That is, when does PowerShell output a scalar versus a collection (or array) versus no output (empty set). In this post I use the term collection in a broad manner for various types of collections including arrays.

Working with Scalars

PS C:\> $num = 1

However you may be dealing with scalars when you think you are working with collections.
For instance, when you send a collection down the pipe, PowerShell will automatically "flatten" the collection, meaning that each individual element of the collection is sent down the pipe, one after the other. For example:

PS C:\> filter GetTypeName {$_.GetType().Fullname}

So in fact, the downstream pipeline stages do *not* operate on the original collection as a whole. The vast majority of the time, PowerShell's collection flattening behavior within the pipe is what you want. Otherwise, you would wind up with code like this to manually flatten the collection:

PS C:\> foreach($item in $array){$item} | GetTypeName

Note that this would require us to manually flatten every collection with the insertion of an extra foreach statement in the pipe. Since pipes are typically used to operate on the elements of a sequence and not the sequence as a whole, it is very sensible that PowerShell does this flattening automatically. However there may be times when you need to defeat the flattening. There's good news and bad news on this topic. First the bad news. Technically you can't defeat this behavior. PowerShell always flattens collections. The good news is that we can work around PowerShell's flattening behavior by creating a new collection that contains just one element - our original collection. This sounds like it would be a real pain to do but fortunately PowerShell provides us with a nice shortcut. For example this is how I would modify the previous example to send an array intact down the pipe and not each element:

PS C:\> ,$array | GetTypeName

The change is subtle. Notice the comma just before $array? That is the unary comma operator and it instructs PowerShell to wrap the object following it, whatever that object is, in a new array that contains a single element - the original object. So PowerShell is still doing its flattening work, we just introduced another collection to get the result that we wanted.
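The flattening behavior and the unary comma workaround described above can be compared side by side. This is a minimal sketch; the filter name Get-TypeNameSketch and the sample array are hypothetical and chosen only for illustration.

```powershell
# Hypothetical filter: emits the full type name of each object it receives
filter Get-TypeNameSketch { $_.GetType().FullName }

$array = 1, 'two', 3.5

# The pipeline flattens $array, so the filter runs once per element
$flattened = @($array | Get-TypeNameSketch)

# The unary comma wraps $array in a one-element array; flattening removes only
# that outer wrapper, so the filter sees the original array as a single object
$intact = @(,$array | Get-TypeNameSketch)

"Flattened: $($flattened -join ', ')"
"Intact:    $($intact -join ', ')"
```

The first pipeline yields one type name per element, while the second yields a single entry for the array itself.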
Another feature of PowerShell that is somewhat unique with respect to scalar handling is how the foreach statement handles scalars. For example, the following script might surprise some C# developers:

PS C:\> $vars = 1

This is because in languages like C#, the variable $vars would have to represent a collection (IEnumerable) or you would get a compiler error. This isn't a problem in PowerShell because if $vars is a scalar, PowerShell will treat $vars as if it were a collection containing just that one scalar value. Again this is a good thing in PowerShell, otherwise if we wrote code like this:

PS C:\> $files = Get-ChildItem *.sys

we would need to modify it to handle the case where Get-ChildItem finds only one .SYS file. Our script code does not have to suffer the "line noise" necessary to do the check between scalar versus collection data shapes. Now the astute reader may ask - What if Get-ChildItem doesn't find *any* .SYS files? Hold that thought for a bit.

Working with Collections

PS C:\> $nums = 1,2,3+7..20

Sometimes you may always want to treat the result of some command as a collection even if it may return a single (scalar) value. PowerShell provides a convenient operator to ensure this - the array subexpression operator. Let's look at our Get-ChildItem command again. This time we will force the result to be a collection:

PS C:\> $files = @(Get-ChildItem *.sys)

In this case, only one file was found. It is important for you to know when you are dealing with a scalar versus a collection because both collections and FileInfo objects have a Length property. I have seen this trip up more than a few people. Given that the unary comma operator always wraps the original object in a new array, what does the array subexpression operator do when it operates on an array? Let's see:

PS C:\> $array = @(1,2,3,4)

As we can see, in this case the array subexpression operator has no effect.
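The difference between the array subexpression operator and the unary comma operator can be checked by counting elements. This is a small sketch with made-up values; a string stands in for FileInfo here because it also has a Length property, which is an assumption chosen to keep the example self-contained.

```powershell
# @() wraps a scalar in a one-element array...
$one = @(42)

# ...but leaves something that is already an array as it is
$same = @(1,2,3,4)

# The unary comma always wraps, even when the operand is already an array
$wrapped = ,(1,2,3,4)

# The Length pitfall: on a scalar string Length is the character count,
# on an array it is the element count
$scalarLength = 'hello'.Length
$arrayLength  = @('hello').Length

"one: $($one.Count), same: $($same.Count), wrapped: $($wrapped.Count)"
"scalar Length: $scalarLength, array Length: $arrayLength"
```

The same ambiguity applies to a single FileInfo versus an array of them, which is why forcing the result into an array with @() makes the meaning of Length predictable.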
Again the astute reader should be asking - what about the case where Get-ChildItem returns nothing?

Working with Empty Sets
Seems simple right? Well these rules combine in somewhat surprising ways that can cause problems in your scripts. Here's an example:

PS C:\> function GetSysFiles { }

So far so good. GetSysFiles has no output so the foreach statement had nothing to iterate over. Let's try a variation. Let's say for the sake of argument that our function took a long argument list and we wanted to put the function invocation on its own line:

PS C:\> $files = GetSysFiles SomeReallyLongSetOfArguments

Hmm, now we got output and all we did was introduce an intermediate variable to contain the output of the function. Honestly this violates the Principle of Least Surprise in my opinion. Let me explain what is happening. By using the temp variable we have invoked rule #2 - assigning to a variable results in our empty set being represented by $null in $files. Seems reasonable so far. Unfortunately our foreach statement abides by rule #3 so it iterates over the scalar value $null. Now PowerShell handles references to $null quite nicely. Notice that our string substitution above in the foreach statement didn't error when it encountered the $null. It just didn't print anything for $null. However, .NET Framework methods aren't nearly as forgiving:

PS C:\> foreach ($file in $files) { "Basename: $($file.Substring(2))" }

Bummer. That means that you really need to be careful when using foreach to iterate over the results of something where you aren't sure whether the results could be an empty set and your script won't tolerate iterating over $null. Note that using the array subexpression operator can help here but it is crucial to use it in the correct place - again an issue with the language that shouldn't exist IMO. For example, the following placement does *not* work:

PS C:\> foreach ($file in @($files)) { "Basename: $($file.Substring(2))" }

Since $files was already set to $null, the array subexpression operator just creates an array with a single element - $null - which foreach happily iterates over.
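Where the array subexpression operator is placed makes a measurable difference, which the following sketch shows by counting elements. The function name Get-Nothing is hypothetical. The sketch inspects Count only, since how foreach treats a bare $null may differ between PowerShell versions.

```powershell
# Hypothetical function with no output
function Get-Nothing { }

# Assigning "no output" to a variable stores $null
$files = Get-Nothing

# Wrapping the variable afterwards yields an array holding one element: $null
$late = @($files)

# Wrapping the call itself yields a genuinely empty array
$early = @(Get-Nothing)

"files is null: $($null -eq $files)"
"late count:    $($late.Count)"
"early count:   $($early.Count)"
```

The empty array from the second form gives a loop nothing to iterate over, while the first form still carries a $null element.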
What I recommend is to put the function call entirely within the foreach statement if the function call is terse. The foreach statement obviously knows what to do when the function has no output. If the function call is lengthy, then I recommend that you do it this way:

PS C:\> $files = @(GetSysFiles SomeReallyLongSetOfArguments)

When you apply the array subexpression operator directly to a function that has no output, you will get an empty array and not an array with a $null in it. If you find this situation as confusing and error prone as I do, please feel free to vote on the following defect submission:

Foreach should not execute the loop body for a scalar value of $null

function ReturnArrayAlways {
$result = @()
# Do something here that may add 0, 1 or more elements to array $result
# $result = 1
# or
# $result = 1,2
,$result
}
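A caller can confirm that a function written in this style hands back an array for zero, one or many items. The sketch below is a parameterized variant of the function above; the name Get-ArrayAlways and the -Items parameter are hypothetical additions for illustration.

```powershell
# Hypothetical variant of ReturnArrayAlways that takes the items as a parameter
function Get-ArrayAlways {
    param([object[]]$Items = @())

    $result = @()
    foreach ($item in $Items) { $result += $item }

    # The unary comma wraps $result so the array survives output flattening
    ,$result
}

$none   = Get-ArrayAlways
$single = Get-ArrayAlways -Items 1
$pair   = Get-ArrayAlways -Items 1,2

"none:   array=$($none -is [array]) count=$($none.Count)"
"single: array=$($single -is [array]) count=$($single.Count)"
"pair:   array=$($pair -is [array]) count=$($pair.Count)"
```

Because the caller always receives an array, it can loop over the result without first checking for $null or a scalar.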
In summary, watch out for how the foreach statement deals with the scalar value $null which can get synthesized automatically by PowerShell when a function has no output.

September 16

Effective PowerShell Item 7: Understanding "Output"

In shells that you may have used in the past, everything that appears on the stdout and stderr streams is considered "the output". In these other shells you can typically redirect stdout to a file using the redirect operator '>'. And in some shells like Korn shell, you can capture stdout output to a variable like so:

DIRS=$(find . | sed.exe -e 's/\//\\/g')

If you wanted to capture stderr in addition to stdout then you can use the stream redirect operator like so:

DIRS=$(find . | sed.exe -e 's/\//\\/g' 2>&1)

You can do the same in PowerShell:

$Dirs = Get-ChildItem -recurse

Looks about the same in PowerShell so what's the big deal? Well there are a number of differences and subtleties in PowerShell that you need to be aware of.

Output is Always a .NET Object

First, remember that PowerShell output is always a .NET object. That output could be a System.IO.FileInfo object or a System.Diagnostics.Process object or a System.String object. Basically it could be any .NET object whose assembly is loaded into PowerShell, even your own .NET objects. Be sure not to confuse PowerShell output with the text you see rendered to the screen when you don't capture output. In Effective PowerShell Item 3: Know Your Output Formatters we covered this notion that when a .NET object is about to "hit" the host (console) PowerShell uses some fancy formatting technology to try to determine the best "textual" representation for the object. When you capture output to a variable, you are *not* capturing the text that was rendered to the host. You are capturing the .NET object.
Let's look at an example:

PS C:\> Get-Process PowerShell
Handles NPM(K) PM(K) WS(K) VM(M) CPU(s) Id ProcessName

Now let's capture that output and examine its type:

PS C:\> $Proc = Get-Process PowerShell

As you can see, a System.Diagnostics.Process object has been stored in $Proc and not the text that was rendered to the screen. But what if we really wanted to capture the rendered text? In this case, we could use the Out-String cmdlet to render the output as a string which we could then capture in a variable e.g.:

PS C:\> $Proc = Get-Process PowerShell | Out-String
Handles NPM(K) PM(K) WS(K) VM(M) CPU(s) Id ProcessName

Another nice feature of Out-String is that it has a Width parameter that allows you to specify the maximum width of the text that is rendered. This is handy when there is wide output that you don't want wrapped or truncated to the width of your host.

Function Output Consists of Everything That Isn't Captured

I've seen this problem bite folks time and time again on the PowerShell newsgroup. It usually happens to those of us with programming backgrounds who are familiar with C style functions. What you need to be aware of is that in PowerShell, a function is quite a bit different. While a function in PowerShell does provide a separate scope for variables and a convenient way to invoke the same functionality multiple times without breaking the DRY principle, the way it deals with output can be confusing at first. Essentially a function handles output in the same way as any PowerShell script that isn't in a function. What the heck does that mean? Let's look at an example. For instance a programmer might look at this function definition:

function foo { "hi"; read-host "press enter"; "there" }

and expect it to prompt with "press enter" and then display "hi" and "there", but that expectation would be wrong:

PS C:\> foo there

Even though you can think of "hi" and "there" as outputs of the function, those outputs are output "immediately".
They aren't returned from the function and then output to the host or to a capturing variable. This is probably not surprising to those familiar with other shells but if your background is in programming it goes against your preconceived notions of what a function is. PowerShell also allows us to use a C style construct - the return statement - in a way that furthers this incorrect impression that PowerShell functions are like C functions e.g.:

PS C:\> function bar {

That should return us an array of System.Diagnostics.Process objects, right? We told PowerShell to "return $Proc". Let's check the output:

PS C:\> $result | foreach {$_.GetType().Fullname}

Whoa! Why is the first object System.String? Well a quick look at its value and you'll see why:

PS C:\> $result[0]

Notice that the informational message we thought we were displaying to the host actually got returned as part of the output of the function. There are a couple of subtleties to understand here. First, the return keyword allows you to exit the function at any particular point. You may also "optionally" specify an argument to the return statement that will cause the argument to be output just before returning. "return $Proc" does *not* mean that the function's only output is the contents of the $Proc variable. In fact this construct is semantically equivalent to "$Proc; return". The second subtlety to understand is this: The line: is equivalent to this line: That makes it clear that the string is considered part of the "output". Now what if we wanted to make that information available to the end user but not the script consuming the output of the function? Then we could have used Write-Host like so:

PS C:\> function bar {

Write-Host does not contribute to the output of the function. It writes directly to the host. This might all seem obvious now but you have to be diligent when you write a PowerShell function to ensure you get only the output you want.
This usually means redirecting unwanted output to $null (or optionally type casting the expression with the unwanted output to [void]). Here's an example:

PS C:\> function LongNumericString {

Note that we don't *need* to use the return keyword like we do in a C style function. Whatever expressions and statements have output will contribute to the output of our function. In the function above, we obviously want the output of $strBld.ToString() to be the function's output. So what is the output?

PS C:\> LongNumericString
Capacity MaxCapacity Length

Yikes! That is probably more than what you were expecting. The problem is that the StringBuilder.Append() method returns the StringBuilder object so that you can cascade appends. Unfortunately now our function outputs 20 StringBuilder objects and one System.String object. It is simple to fix though, just throw away the unwanted output like so:

PS C:\> function LongNumericString {

Other Types of Output That Can't Be Captured

In the previous section we saw one instance of a particular output type - Write-Host - that doesn't contribute to the stdout output stream. In fact, this type of output can't be captured. The argument to Write-Host's -object parameter is sent directly to the host console bypassing the "stdout" output stream. So unlike stderr output that can be captured as shown below, Write-Host output doesn't use streams and therefore can't be redirected.

PS C:\> $result = remove-item ThisFilenameDoesntExist 2>&1

Write-Host output can only be captured using the big stick - the Start-Transcript cmdlet. Start-Transcript logs everything that happens during a PowerShell session. If you need to create a comprehensive log file that captures everything then Start-Transcript is the only game in town. [Update: 05/25/2008 - Well except for one major hole. Transcripts don't capture any output from EXEs. Thanks to Shay for pointing that out.]
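The function-output rules covered in this post (uncaptured expressions become output, Write-Host text does not, and a [void] cast discards an unwanted return value) can be seen together in one small sketch. The function name Get-NumericString and the loop bounds are hypothetical, not the LongNumericString function from the post.

```powershell
# Hypothetical function illustrating what does and does not become output
function Get-NumericString {
    $sb = New-Object System.Text.StringBuilder

    # Informational text for the user; it is not part of the function's output
    Write-Host "building string"

    foreach ($i in 1..5) {
        # Append returns the StringBuilder itself; the [void] cast discards it
        [void]$sb.Append($i)
    }

    # The only expression left uncaptured, so the only output object
    $sb.ToString()
}

$result = @(Get-NumericString)
"Output objects: $($result.Count)"
"First object:   $($result[0])"
```

Removing the [void] cast would add one StringBuilder object to the output per loop iteration, which is the same surprise the post describes.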
Keep in mind that Start-Transcript is meant more for session logging than individual script logging. For instance, if you normally invoke Start-Transcript in your profile to log your PowerShell session, a script that calls Start-Transcript will generate an error because you can't start another transcript if one has already been started. You have to stop the previous one first. Here is the run down on the forms of output that can't be captured except via Start-Transcript:
If you happen to agree with me that this situation should be better in a future version of PowerShell, please feel free to vote on these two issues:

Capture Warning, Verbose, Debug and Host Output Via Alternate Streams

That's it. Just remember to keep an eye on what statements and expressions are contributing to the output of your PowerShell functions. Testing is always a good way to verify that you are getting the output you expect.

September 04

Effective PowerShell Item 6: Know What Objects Are Flowing Down the Pipe

To use Windows PowerShell pipes effectively, it really helps to know what objects are flowing down the pipe. Sometimes objects get transformed from one type to another. Without the ability to inspect what type is being used at each stage of the pipeline the results you see at the end can be mystifying. For example, the following question came up on the microsoft.public.windows.powershell newsgroup:

Given a set of subdirs in a known directory, I need to cd into each directory and execute a command.

One approach to solving this is:

PS C:\> Get-Item * | where {$_.PSIsContainer} | push-location -passthru |

That worked fine for my du utility because it works off the current directory. However in the spirit of experimentation I thought I would try specifying the full path. I was a bit surprised when it didn't work:

PS C:\> Get-Item * | where {$_.PSIsContainer} | push-location -passthru |
Du v1.31 - report directory disk usage
No matching files were found.

So what is going on here? Let's see how you could find out using our old friend Get-Member:

PS C:\> Get-Item * | where {$_.PSIsContainer} | Get-Member
TypeName: System.IO.DirectoryInfo
Name MemberType Definition

OK that is what I expected so far - DirectoryInfo objects. Let's look further down the pipe:

PS C:\> Get-Item * | where {$_.PSIsContainer} | Set-Location -PassThru | Get-Member
TypeName: System.Management.Automation.PathInfo
Name MemberType Definition

WTF!
Set-Location took our DirectoryInfo objects and turned them into PathInfo objects and passed those down the pipe honoring my -PassThru parameter. However in this case, Set-Location didn't actually "pass through" the object. It gave us an entirely new object! You will notice that the PathInfo object doesn't have a FullName property but it does have several path related properties. I wonder which one we should use? Don't forget Item 3 - know your output formatters (aka format-list * is your friend). Let's try it:

PS C:\> Get-Item * | where {$_.PSIsContainer} | Set-Location -PassThru |
Drive :

Now that we can see the property values it is pretty obvious that the ProviderPath property is the one to use when passing the path to a legacy EXE. It is very doubtful that such an EXE would understand how to interpret the Path property. Note that in this example I also used Select -First 1 to pick off the first directory. This is handy if the command outputs a *lot* of objects. There's no use waiting for potentially thousands of objects to be processed when all you need is to see the property values for one of them.

One thing to note about Get-Member for this scenario is that it outputs a lot of type member information that is just noise when all you want to know is the type names of the objects. Get-Member also only shows you the type information once for each unique type of object. This gives you no sense of how many objects of the various types are passing down the pipe. This information is easy to access via the GetType() method that is available on all .NET objects e.g.:

PS C:\> Get-ChildItem | Foreach {$_.GetType().FullName}

GetType() returns a System.RuntimeType object that has all sorts of interesting information. The property we are interested in is FullName. If I had used Get-Member instead I would have gotten about 125 lines of text surrounding the two lines indicating the type names.
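Listing a type name per object, rather than once per unique type, also makes it easy to count how many objects of each type are in the pipe. The sketch below uses a made-up mix of values instead of Get-ChildItem output so that it runs anywhere; the variable names are hypothetical.

```powershell
# A mixed bag of objects standing in for whatever flows down a real pipeline
$typeNames = 1, 'two', 3.0, (Get-Date) | ForEach-Object { $_.GetType().FullName }

# One line per object, in the order the next pipeline stage would see them
$typeNames

# Group the names to see how many objects of each type went by
$typeNames | Group-Object | Select-Object Name, Count
```

Group-Object here fills the gap noted above, where Get-Member reports each type only once and gives no sense of quantity.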
In fact this sort of filter is so handy that it is worth putting in your profile:

PS C:\> filter gtn { if ($_ -eq $null) {'<null>'} else {$_.GetType().Fullname }}

The PowerShell Community Extensions provides this filter however its implementation is a bit more robust. For instance, there are occasions when it is also important to know that *no* objects were passed down the pipeline. Our simple gtn (short for Get-TypeName) filter isn't so helpful here:

PS C:\> @() | gtn

We get no output which is perhaps a reasonable indication that no objects were output down the pipe. However with the PSCX implementation of this filter, we wanted to provide a bit more guidance in this situation e.g.:

PS C:\> @() | gtn
126> ,@() | gtn -full

In summary, when debugging the flow of objects down the pipe be sure to take advantage of Get-Member to show you what properties and methods are available on those objects. Use Format-List * to show you all the property values on those objects. And use our handy little filter gtn (aka Get-TypeName) to see the type names of each and every individual object passed down the pipe in the order that the next cmdlet will see them.

Interface Design: Pragmatism vs Dogmatism

Someone pointed out this blog post to me: http://kirillosenkov.blogspot.com/2007/08/choosing-interface-vs-abstract-class.html

The following quote concerns me: "An interface should define at most one member." I cringe at hard-and-fast rules like this. I understand the author's sentiment but I guess I would have phrased it like so:

Interfaces should be designed to provide a minimal surface area to accomplish a specific task. If that can be done with a single member - great! Now if that surface area grows too large, consider factoring the one interface into multiple interfaces if the tasks can be cleanly separated.

What constitutes too "large"? That's where software developers get paid the big bucks to use good engineering judgment.
:-)

Effective PowerShell Item 5: Use Set-PSDebug -Strict In Your Scripts - Religiously

Windows PowerShell is like most dynamic languages in that it allows you to use a variable without declaring its type and without having assigned to it. This is handy for interactive use; you can do stuff like this:

PS C:\Temp> Get-ChildItem | Foreach -Process {$sum += $_.Name.Length} -End {$sum}

Here $sum isn't a defined variable and yet we are adding a value to it and assigning to it. PowerShell just assumes a value of $null and coerces that to 0 in the case above. Try this at the prompt:

PS C:\Temp> $xyzzy -eq $null

It is not likely that this variable is already defined somewhere. Of course we could verify that like so:

PS C:\> Test-Path Variable:\xyzzy

and indeed it isn't defined. So what has this to do with using Set-PSDebug -Strict in scripts - religiously? Well, once you get burned by an unfortunate typo that takes time to notice and time to track down, you will want a way to avoid repeating that mistake. Take this script for example:

$Suceeded = test-path C:\ProjectX\Src\BuiltComponents\Release\app.exe
if ($Succeeded) {

This script has a problem with it that PowerShell won't tell you about. It will happily indicate that every build fails even though that may not be true. This is all because of a minor typo where I misspelled $Succeeded when testing the path. In this snippet, the typo may be obvious to you but when you have a several hundred line script file, typos aren't always so obvious. You can prevent this particular problem from ever happening by placing Set-PSDebug -Strict at the top of your script file just after the param() statement (if any). For example, given this script as FOO.PS1:

Set-PSDebug -Strict
if ($Succeded) {

PS C:\Temp> .\foo.ps1

What would have happened if we had omitted the Set-PSDebug -Strict invocation? This script would have output "doh". NOTE: in some cases we may need to initialize a variable in order to avoid the error above.
This is a small price to pay to avoid this sort of problem. BTW the title of this post was perhaps a bit "over the top". There may very well be times not to use Set-PSDebug -Strict in your scripts. As always, use your judgment. There you have it. A simple way to avoid a major headache with debugging large scripts.

Effective PowerShell Item 4: Commenting Out Lines in a Script File

OK the last couple of items have been long. I promise a short one here. Windows PowerShell doesn't provide multiline comments. Multiline comments come in handy when you need to comment out multiple lines in a script file. However there is a reasonable workaround. Use a here string. A here string allows you to enter multiple lines of text and prevent PowerShell from interpreting commands. However the extent of PowerShell's interpretation depends on which type of here string you use. For instance, in double quoted here strings, PowerShell expands variables and also executes subexpressions. This is an example of a double quoted here string:

PS C:\> @"

However a single quoted here string doesn't do this:

PS C:\> @'

Use the single quoted here string to comment out lines of script since it evaluates nothing in the contained string. Just note, the here string is an expression so if you do nothing more, the whole string will be emitted to the console. Usually you don't want that when you are commenting out code. All you need to do is cast the string to [void] (or redirect the string to $null):

[void]@'

This will effectively comment out those lines of script.

NOTE: There are a couple of gotchas to be aware of with here strings. There can be *NO* whitespace after the initial @' character sequence. If there is even a single space after this sequence you will get this cryptic error: Unrecognized token in source text. The other gotcha is that the closing '@ character sequence has to start in column zero otherwise you get this error message: Encountered end of line while processing a string token.
The final gotcha to watch out for is that you can't nest here strings within another here string of the same ilk (single quoted or double quoted). What this means for our commenting out script scenario is that you won't be able to surround a chunk of script that uses single quoted here strings with another single quoted here string to comment out that code.

Effective PowerShell Item 3: Know Your Output Formatters

I have mentioned previously that Windows PowerShell serves up .NET objects for most everything. Get-ChildItem (alias Dir) outputs a sequence of System.IO.FileInfo and System.IO.DirectoryInfo objects. Get-Date outputs a System.DateTime object. Get-Process outputs System.Diagnostics.Process objects and Get-Content outputs System.String objects (or arrays of them based on how -ReadCount is set). You get the idea - PowerShell's currency is .NET objects. This isn't always obvious because of the way that PowerShell renders these .NET objects to text for display on the host console. Let's imagine for a moment that we had to figure out how to solve this problem ourselves. Our first approach might be to rely on the ToString() method that is available on every .NET object. That would work fine for some .NET objects e.g.:

PS C:\> (get-date).ToString()

But not so well for others:

PS C:\> (Get-Process)[0].ToString()

Hmm, that is certainly less than satisfying. Guess we need to think a little harder. OK let's not strain our brains. :-) Let's just look at how the PowerShell team solved this problem. They invented the notion of "views" for the common .NET types as well as a default view for any particular .NET type they provide a view for. You don't have to use the formatting cmdlets. If you don't specify a formatting cmdlet then PowerShell will choose a formatter based on the default view for a .NET type which could be tabular, list, wide or custom.

Quick definition break: types versus objects. The System.DateTime class is a .NET type, there is only one of these.
The Get-Date cmdlet outputs an instance of this type a.k.a. an object. There can be many DateTime objects based off the one definition of System.DateTime. PowerShell defines a view for the type that gets applied to all instances (objects) of that type.

OK so what if PowerShell doesn't define a view for a .NET type? This is a certainty because the possible set of .NET types is infinite. I could create one right now called Plan9FromOuterSpace, compile it into a .NET assembly and load it into PowerShell. How's PowerShell going to deal with the type it isn't familiar with? Let's see:

@'
PS C:\> csc /t:library Plan9.cs
Director Genre NumStars

It seems that up to a certain number of public properties (IIRC 5), PowerShell will use a tabular view. If you have more than that number of public properties then PowerShell falls back to a list view.

OK back to the topic of views. There can be (and often are) multiple views defined for a single .NET type. These views are defined in XML format files in the PowerShell install directory:

PS C:\> Get-ChildItem $PSHOME\*format*
Directory: Microsoft.PowerShell.Core\FileSystem::C:\Windows\System32\
Mode LastWriteTime Length Name

These views look like this:

<View> <Name>process</Name> <ViewSelectedBy> <TypeName>System.Diagnostics.Process</TypeName> <TypeName>Deserialized.System.Diagnostics.Process</TypeName> </ViewSelectedBy> <TableControl> <TableHeaders> <TableColumnHeader> <Label>Handles</Label> <Width>7</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Label>NPM(K)</Label> <Width>7</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Label>PM(K)</Label> <Width>8</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Label>WS(K)</Label> <Width>10</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Label>VM(M)</Label> <Width>5</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Label>CPU(s)</Label>
<Width>8</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Width>6</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader /> </TableHeaders> <TableRowEntries> <TableRowEntry> <TableColumnItems> <TableColumnItem> <PropertyName>HandleCount</PropertyName> </TableColumnItem> <TableColumnItem> <ScriptBlock>[int]($_.NPM / 1024)</ScriptBlock> </TableColumnItem> <TableColumnItem> <ScriptBlock>[int]($_.PM / 1024)</ScriptBlock> </TableColumnItem> <TableColumnItem> <ScriptBlock>[int]($_.WS / 1024)</ScriptBlock> </TableColumnItem> <TableColumnItem> <ScriptBlock>[int]($_.VM / 1048576)</ScriptBlock> </TableColumnItem> <TableColumnItem> <ScriptBlock> if ($_.CPU -ne $()) { $_.CPU.ToString("N") } </ScriptBlock> </TableColumnItem> <TableColumnItem> <PropertyName>Id</PropertyName> </TableColumnItem> <TableColumnItem> <PropertyName>ProcessName</PropertyName> </TableColumnItem> </TableColumnItems> </TableRowEntry> </TableRowEntries> </TableControl> </View> The XML definition above is of the "table view" for the Process type. It defines the column attributes of the view as well as the data that goes into each column, in some cases massaging the data into a more easily consumable value (KB vs bytes or MB vs bytes). Here is the "wide view" definition for the Process type: <View> <Name>process</Name> <ViewSelectedBy> <TypeName>System.Diagnostics.Process</TypeName> </ViewSelectedBy> <WideControl> <WideEntries> <WideEntry> <WideItem> <PropertyName>ProcessName</PropertyName> </WideItem> </WideEntry> </WideEntries> </WideControl> </View> In this "wide view" the only property that PowerShell will display is the ProcessName. In searching the DotNetTypes.format.ps1xml, we can find more definitions. 
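Since every view definition shares the same shape (a Name, a ViewSelectedBy list of type names, and one control element), listing the views for a type is a small XML query. The following is only a sketch: it parses a trimmed-down, hypothetical sample held in a here string rather than the real DotNetTypes.format.ps1xml, and the variable names ($formatData, $typeName, $views) are made up for illustration.

```powershell
# Hypothetical, trimmed-down sample in the same shape as the view definitions
# shown above. It is NOT the real contents of DotNetTypes.format.ps1xml.
[xml]$formatData = @'
<Configuration>
  <ViewDefinitions>
    <View>
      <Name>process</Name>
      <ViewSelectedBy><TypeName>System.Diagnostics.Process</TypeName></ViewSelectedBy>
      <TableControl />
    </View>
    <View>
      <Name>process</Name>
      <ViewSelectedBy><TypeName>System.Diagnostics.Process</TypeName></ViewSelectedBy>
      <WideControl />
    </View>
    <View>
      <Name>DateTime</Name>
      <ViewSelectedBy><TypeName>System.DateTime</TypeName></ViewSelectedBy>
      <CustomControl />
    </View>
  </ViewDefinitions>
</Configuration>
'@

$typeName = 'System.Diagnostics.Process'

# Select every View whose ViewSelectedBy list mentions the type, then report
# the view name together with the kind of control it uses.
$views = foreach ($view in $formatData.SelectNodes("//View[ViewSelectedBy/TypeName='$typeName']")) {
    $name    = $view.SelectSingleNode('Name').InnerText
    $control = $view.SelectSingleNode("*[contains(local-name(), 'Control')]").LocalName
    '{0} ({1})' -f $name, $control
}

$views
```

The XPath calls are used instead of property-style access because a child element called Name would otherwise compete with the XmlElement.Name property.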
The following StartTime "named view" isn't invoked by default; you have to specify it by name to the Format-Table cmdlet:

<View> <Name>StartTime</Name> <ViewSelectedBy> <TypeName>System.Diagnostics.Process</TypeName> </ViewSelectedBy> <GroupBy> <ScriptBlock>$_.StartTime.ToShortDateString()</ScriptBlock> <Label>StartTime.ToShortDateString()</Label> </GroupBy> <TableControl> <TableHeaders> <TableColumnHeader> <Width>20</Width> </TableColumnHeader> <TableColumnHeader> <Width>10</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Width>13</Width> <Alignment>right</Alignment> </TableColumnHeader> <TableColumnHeader> <Width>12</Width> <Alignment>right</Alignment> </TableColumnHeader> </TableHeaders> <TableRowEntries> <TableRowEntry> <TableColumnItems> <TableColumnItem> <PropertyName>ProcessName</PropertyName> </TableColumnItem> <TableColumnItem> <PropertyName>Id</PropertyName> </TableColumnItem> <TableColumnItem> <PropertyName>HandleCount</PropertyName> </TableColumnItem> <TableColumnItem> <PropertyName>WorkingSet</PropertyName> </TableColumnItem> </TableColumnItems> </TableRowEntry> </TableRowEntries> </TableControl> </View>

Why am I showing you all this? I think it is important to understand the "magic" behind how a .NET object - this binary entity - gets rendered into text on your host console. With this knowledge, you should never forget that you are dealing with .NET objects first and foremost.

You also may be wondering if there is an easier way to figure out what views are available for any particular .NET type. There is if you have the PowerShell Community Extensions installed.
PSCX provides a handy script contributed by Joris van Lier called Get-ViewDefinition and you can use it like so:

PS C:\> get-viewdefinition System.Diagnostics.Process
Name : process
Name : Priority
Name : StartTime
Name : process

From this output you can see that there are quite a few views related to the System.Diagnostics.Process .NET type that Get-Process outputs, some of which you might not have been aware of. Let's check out these alternate views:

PS C:\> Get-Process | Format-Wide
audiodg csrss
ProcessName Id HandleCount WorkingSet PriorityClass: Normal ProcessName Id HandleCount WorkingSet PriorityClass: High ProcessName Id HandleCount WorkingSet
ProcessName Id HandleCount WorkingSet
StartTime.ToShortDateString(): 8/31/2007
ProcessName Id HandleCount WorkingSet
StartTime.ToShortDateString(): 8/29/2007

BTW what if you have forgotten what formatters are available to you in PowerShell? Don't forget that you can use the first of the "big four" cmdlets, Get-Command, like so:

PS C:\> get-command format-*
CommandType Name Definition

You are probably already pretty familiar with Format-Table. It presents data in tabular format. This is the default format for many views including those for System.Diagnostics.Process and . Format-Wide is also pretty straightforward. It displays a single property chosen by PowerShell (i.e. the most interesting one) in multiple columns. Format-Custom is interesting but probably not a formatter that you will use that often - it will be implicitly invoked for those .NET types that have custom views like System.DateTime:

<View> <Name>DateTime</Name> <ViewSelectedBy> <TypeName>System.DateTime</TypeName> </ViewSelectedBy> <CustomControl> <CustomEntries> <CustomEntry> <CustomItem> <ExpressionBinding> <PropertyName>DateTime</PropertyName> </ExpressionBinding> </CustomItem> </CustomEntry> </CustomEntries> </CustomControl> </View>

DateTime is a ScriptProperty that PowerShell defines like so:

PS C:\> get-date | Get-Member -Name DateTime
TypeName: System.DateTime
Name MemberType Definition

This brings me to my favorite formatter that I use when I'm spelunking PowerShell output. Notice that the Definition column above is truncated. Often when I want to see everything I will use the Format-List cmdlet. This formatter outputs the various property values on individual lines so that data is rarely truncated e.g.:

PS C:\> get-date | Get-Member -Name DateTime | Format-List
TypeName : System.DateTime

Now we can see the entire definition of the DateTime ScriptProperty.

NOTE: PowerShell often defines an abbreviated set of these property values to display by default with the Format-List cmdlet.
It doesn't want you to be overwhelmed with information. However, when you're spelunking you want to see all the gory details. All you have to do to get all the property values is execute "format-list *". Check out the default list format for a Process object:

PS C:\> (Get-Process)[0] | Format-List
Id : 1284

versus what you get when you ask Format-List to give you everything:

PS C:\> (Get-Process)[0] | Format-List *
__NounName : Process

See what I mean? Look at how much information you would have missed if you forgot that little ol' "*"!

In summary, if there is one and only one thing you get out of this long post please let it be this: "Format-List *" is the formatter to use when you want to look at all the property values of an object.

September 03

Effective PowerShell Item 2: Use the Objects Luke. Use the Objects!

Using Windows PowerShell requires a shift in your mental model for how command line shells deal with information. In most shells like CSH, Korn shell, BASH, etc. you deal primarily with information in text form. For instance the output of ls or ps is captured into a string variable and cut, prodded and parsed to coax out the required piece of information. As it turns out, PowerShell provides very handy text manipulation functions like:
Note that by default, PowerShell treats all text (actually System.String objects) as case-insensitive when doing things like a comparison or a regular expression search or replace. Because of these handy string manipulation features, it is very easy to "fall back" into the old way of string cutting, parsing and string comparisons. Sometimes this is unavoidable even in PowerShell but many times you can just use the object provided to you. The benefits are often:
Let's look at an example. The following issue came up in the public.microsoft.windows.powershell newsgroup recently. How do you test the output of dir a.k.a. get-childitem to filter out directories, leaving only the files to be operated on further down the pipeline? Here's an approach to this problem that I think of as "falling back" into the old ways:

PS C:\Windows\System32> get-childitem | where-object {$_.mode -ne "d"}

First let me point out that this command doesn't work, but more importantly it relies on string comparisons to determine whether or not an item passing down the pipeline is a folder. If you are bent on doing the filtering the "old way" then the following will work. However, vis-à-vis the previous solution, it illustrates how easy it is to get the string comparison wrong:

PS C:\Windows\System32> get-childitem | where-object {$_.mode -notlike "d*"}

Yet there is a better approach for this type of problem - the PowerShell way. PowerShell decorates every item that is output by the Get-ChildItem and *-Item CMDLETs with additional properties. This is even independent of which provider is being used - file system, registry, function, etc. We can see those extra properties, all of which are prefixed with PS, by using our old friend Get-Member like so:

PS Function:\> new-item -type function "foo" -value {} | gm
TypeName: System.Management.Automation.FunctionInfo
Name MemberType Definition

One of those extra properties is PSIsContainer and this property tells us whether the object is a container object. For the registry, this means RegistryKey and for the file system it means directory (DirectoryInfo object). So this problem can be solved more directly like so:

PS C:\Windows\System32> get-childitem | where-object {!$_.PSIsContainer}

That is a bit less to type and is much less error prone. However what about this performance claim?
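One way to check both the correctness and the speed of the two filters is sketched here. This is an illustration under stated assumptions, not the measurement from this post: the throwaway temp directory, the file and folder names, and the variable names ($root, $byMode, $byProperty and so on) are all made up, and the timings will vary from machine to machine.

```powershell
# Build a small throwaway directory tree so the sketch is self-contained.
# (Hypothetical names; nothing here comes from the original measurement.)
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("psiscontainer-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
1..3 | ForEach-Object { New-Item -ItemType Directory -Path (Join-Path $root "dir$_") | Out-Null }
1..5 | ForEach-Object { New-Item -ItemType File -Path (Join-Path $root "file$_.txt") | Out-Null }

# "Old way": string comparison against the rendered Mode property.
$byMode = @(Get-ChildItem $root | Where-Object { $_.Mode -notlike 'd*' })

# "PowerShell way": test the PSIsContainer property directly.
$byProperty = @(Get-ChildItem $root | Where-Object { -not $_.PSIsContainer })

# Measure-Command returns a TimeSpan for each script block.
$modeTime     = Measure-Command { Get-ChildItem $root | Where-Object { $_.Mode -notlike 'd*' } }
$propertyTime = Measure-Command { Get-ChildItem $root | Where-Object { -not $_.PSIsContainer } }

"Files found by Mode filter:          {0}" -f $byMode.Count
"Files found by PSIsContainer filter: {0}" -f $byProperty.Count
"Mode filter took          {0:N2} ms" -f $modeTime.TotalMilliseconds
"PSIsContainer filter took {0:N2} ms" -f $propertyTime.TotalMilliseconds

# Clean up the throwaway tree.
Remove-Item -Path $root -Recurse -Force
```

On a tree this small the timing difference is mostly noise; pointing $root at a large directory and repeating each measurement several times gives a fairer comparison.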
OK let's try both of these approaches (I'll also throw in the regex-based -notmatch) and measure their performance:

PS C:\Windows\System32>

OK so what are the results:

PS C:\Windows\System32> $oldWay1 | measure-object TotalSeconds -ave
Count : 1
PS C:\Windows\System32> $oldWay2 | measure-object TotalSeconds -ave
Count : 1
PS C:\Windows\System32> $poshWay | measure-object TotalSeconds -ave
Count : 1

So doing a little math - in PowerShell - we get:

PS C:\Windows\System32> "{0:P0}" -f (169.26 / 61.53)

Yikes! The string comparison approach using the Mode property takes about 275% of the time (roughly 2.75 times as long) compared to using the PSIsContainer property. With PowerGadgets we can see this:

PS C:\> $data = @{

PowerGadgets are pretty sweet. I use them a lot in presenting various reports to my management chain. This is off topic but I have one chart that displays the checkin activity per day. It is interesting to see the spike in checkins just before each iteration milestone is reached. :-)

In summary, keep in mind that even though the PowerShell console output gives you the illusion that you are only dealing with text, there are .NET objects behind all that text output! You are often dealing with objects richer in information than System.String and many times those objects have just the information you are looking for in the form of a property. You can then extract that information without resorting to text parsing. For an additional example of operating on object properties vs text, check out my post on Sorting IPAddresses the PowerShell Way.

Effective PowerShell Item 1: The Four Cmdlets That are the Keys to Finding Your Way Around PowerShell

I have been a big fan of the Effective series of books over the years from Effective COM to Effective XML. Without trying to be too presumptuous, I thought I would capture some of the tidbits I've picked up over the last couple of years using Windows PowerShell interactively and writing production build & test scripts.
This first item is pretty "basic" and I debated whether or not it belongs in an "Effective PowerShell" article. In the end, these four cmdlets are so critical to figuring out for yourself how to make PowerShell work for you that I thought it was worth it. Unfortunately this is a long post. I'm going to try hard to keep my future Effective PowerShell posts much shorter than this first one.

The following four CMDLETs are the first four that you should learn backwards and forwards. With these four simple-to-use cmdlets you can get started using PowerShell - effectively.

#1. Get-Command - This CMDLET is the sure cure to the blank PowerShell prompt of death. That is, you just installed PowerShell, fired it up and you're left looking at this:

Now what? Many applications suffer from the "blank screen of death" i.e. you download the app, install it and run it and now you're presented with a blank canvas or an empty document. Often it isn't obvious how to get started using a foreign application. In PowerShell, what you need to get started is Get-Command to find all the commands that are available from PowerShell. This includes all your old console utilities, batch files, VBScript files, Perl files, etc. Basically anything that is executable can be executed from PowerShell. Of course, you didn't download PowerShell just to run these old executables and scripts. You want to see what PowerShell can do. Try this:

PS C:\Users\Keith> get-command
CommandType Name Definition

By default, get-command lists all the CMDLETs that PowerShell provides. Notice that Get-Command is one of those CMDLETs. Get-Command can list more information but how would you figure that out? This brings us to the second command you need to know and will be using frequently in PowerShell.

#2. Get-Help - This CMDLET will provide information on what a CMDLET does, what parameters it takes and usually it will include examples of how to use the command.
It will also provide help on general PowerShell topics like globbing and providers like Registry and FileSystem. Say you want to know what *all* the help topics are in PowerShell - easy - just do this:

PS C:\Users\Keith> get-help *
Name Category Synopsis

And if you only want to see the "about" help topics try this:

PS C:\Users\Keith> get-help about*
Name Category Synopsis

Now, let's try Get-Help on Get-Command and see what else we can do with Get-Command:

PS C:\Users\Keith> get-help get-command -detailed
NAME SYNOPSIS PARAMETERS -verb <string[]> -noun <string[]> -commandType <CommandTypes>

TIP: you will want to use the -Detailed parameter with Get-Help otherwise you get very minimal parameter information. Hopefully in PowerShell v.next they will fix the "default view" of CMDLET help topics to be a bit more informative.

There are a couple of things to learn from the help topic. First, you can pass Get-Command a commandType to list other types of commands. Let's try this to see what PowerShell functions are available by default:

PS C:\Users\Keith> get-command -commandType function
CommandType Name Definition

Excellent. We could do the same for filters, aliases, applications, etc. Also note that Get-Command allows you to search for CMDLETs based on either a Noun or a Verb. There's a more compact form that most of the PowerShell regulars use instead of these parameters though:

PS C:\Users\Keith> get-command write-*
CommandType Name Definition

You can swap the wildcard char to find all verbs associated with a particular noun (usually the more useful search):

PS C:\Users\Keith> get-command *-object
CommandType Name Definition

Finally, we can pass a name to Get-Command to find out if this name will be interpreted as a command and if so, what type of command: cmdlet, function, filter, alias, externalscript, script or application. In this usage, Get-Command is like the Unix 'which' command on steroids.
Let me show you what I mean:

PS C:\Users\Keith> get-command more
CommandType Name Definition

Note that PowerShell tells me not only the location of applications like more.com, it also tells me what type of command each is (function vs application) as well as the function's definition and the priority order of the commands. In this case, PowerShell will execute its 'more' function if you were to use the command 'more'. [Update 11/16/07: The output order does *not* indicate the priority order in which PowerShell will execute commands with the same name. This is disappointing]. If you wanted to use the Windows executable, you would need to use the command 'more.com'. However there is even more information to be found here than meets the eye. This brings us to our third important cmdlet to become familiar with.

#3. Get-Member - The single biggest concept that takes a while to sink in with most people using PowerShell for the first time is that just about *everything* is (or can be) a .NET object. That means when you pipe information from one CMDLET to another it quite often isn't text and if it is, it is still an object. To be specific, it is a System.String object. However, quite often it is some other type of object and being new to PowerShell, quite often you won't know what type of object it is or what you can do with that object. Let's take a further look at what information (i.e. objects) Get-Command outputs. In order to do this, we will use Get-Member like so:

PS C:\Users\Keith> get-command more.com | get-member
TypeName: System.Management.Automation.ApplicationInfo
Name MemberType Definition

Well now, isn't this interesting. Unlike the Unix 'which' command that only gives us the path to the application, here we get a bit more information.
Let's examine the FileVersionInfo on this command:

PS C:\Users\Keith> get-command more.com | foreach {$_.FileVersionInfo}
ProductVersion FileVersion FileName

This is just an inkling of the power of being able to access objects instead of unstructured information like plain text. Get-Member is also very handy for discovering what properties and methods are available on .NET classes.

PS C:\Users\Keith> get-date | get-member
TypeName: System.DateTime
Name MemberType Definition

You can also find out information about static properties and methods like so:

PS C:\Users\Keith> [System.Math] | get-member -static
TypeName: System.Math
Name MemberType Definition

#4. Get-PSDrive - Next to "everything is an object" the next biggest notion to digest in PowerShell is that the file system is just one of many stores that can be manipulated by the same cmdlets you use to manipulate the file system. First, how do you find out which drives are available in PowerShell? Why, use the Get-PSDrive command:

PS C:\> Get-PSDrive
Name Provider Root CurrentLocation

All these drives can be manipulated using the same cmdlets you use to manipulate the file system. What are those? Use Get-Command *-Item* to find out:

PS C:\> Get-Command *-Item*
CommandType Name Definition

There you have it. The four CMDLETS that you *need* to know to really find your way around Windows PowerShell. Get-Command to find out what commands/functionality are available. Get-Help to find out how to use that functionality. Get-Member to figure out what information and functionality is available on those .NET objects you'll be dealing with in PowerShell. And Get-PSDrive to find out which stores you can operate on besides the obvious one (the file system). Comments are welcome.

Update: Added Get-PSDrive based on Jeffrey's comment.
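To recap the discovery cmdlets in one place, here is a short sketch that captures their output in variables instead of just reading it off the console. The variable names are made up for illustration, and Get-Help is left as a comment because it is really meant to be read interactively.

```powershell
# Get-Command: list the formatting cmdlets by verb.
$formatters = Get-Command -Verb Format -CommandType Cmdlet | ForEach-Object { $_.Name }

# Get-Help: best used interactively, so it is only shown here as a comment.
# Get-Help Get-Member -Detailed

# Get-Member: discover the properties on whatever object Get-Date emits.
$dateProperties = Get-Date | Get-Member -MemberType Property | ForEach-Object { $_.Name }

# Get-Member -Static: discover the static methods on a .NET type.
$mathMethods = [System.Math] | Get-Member -Static -MemberType Method | ForEach-Object { $_.Name }

# Get-PSDrive: see which providers expose drives in this session.
$providers = Get-PSDrive | ForEach-Object { $_.Provider.Name } | Sort-Object -Unique

"Formatters:      " + ($formatters -join ', ')
"Date properties: " + ($dateProperties -join ', ')
"Math methods:    " + ($mathMethods -join ', ')
"Providers:       " + ($providers -join ', ')
```

Because each cmdlet hands back objects, the results can be filtered, sorted or joined further without any text parsing.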