Keith Hill's Blog
    May 26

    Poor Man's File/Directory Name Indexer Using Windows PowerShell

    The following is a script that I have set up on my dev PCs to run nightly via a scheduled task:

    ## CatalogFileSystem.ps1
    param([string[]]$paths)

    Set-PSDebug -Strict

    function Main {
        foreach ($path in $Paths) {
            if (!(Test-Path $path -PathType Container)) {
                Write-Error "'$path' doesn't exist or isn't a directory"
                exit 1
            }

            $filelist = Join-Path $path filelist.txt
            $dirlist  = Join-Path $path dirlist.txt

            Remove-Item $filelist -ErrorAction SilentlyContinue
            Remove-Item $dirlist -ErrorAction SilentlyContinue

            CatalogFolder $path
        }
    }

    function CatalogFolder($path) {
        Get-ChildItem $path -ErrorAction SilentlyContinue | sort FullName | foreach {
            if ($_.PSIsContainer) {
                $_.FullName >> $dirlist
                CatalogFolder $_.FullName
            }
            else {
                $_.FullName >> $filelist
            }
        }
    }

    . Main
    ## End of CatalogFileSystem.ps1

    My scheduled task is configured like so (and runs every day at 4 am):

    Program: C:\Windows\System32\WindowsPowerShell\v1.0\PowerShell.exe
    Arguments: -Command C:\Bin\CatalogFileSystem C:, D:

    I also set this task to run with highest privileges on Vista so it can catalog more of the nooks and crannies of my filesystem.  This script creates two files sitting at the root of each path supplied.  In my case:

    C:\dirlist.txt
    C:\filelist.txt
    D:\dirlist.txt
    D:\filelist.txt
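
    For comparison, here is a more compact sketch of the same idea that lets Get-ChildItem -Recurse do the directory walk.  It is not the script shown above: Write-FileCatalog is a hypothetical name, the sketch assumes the whole listing fits in memory, and because it sorts all paths together and writes with Set-Content, the ordering and encoding of the two list files may differ from what CatalogFileSystem.ps1 produces.

```powershell
# Hypothetical helper (not part of CatalogFileSystem.ps1): catalog one folder
# using Get-ChildItem -Recurse instead of a hand-written recursive function.
function Write-FileCatalog {
    param([string]$Path)

    $fileList = Join-Path $Path 'filelist.txt'
    $dirList  = Join-Path $Path 'dirlist.txt'

    # Collect everything once, skipping the two catalog files themselves.
    $items = @(Get-ChildItem $Path -Recurse -ErrorAction SilentlyContinue |
        Where-Object { $_.FullName -ne $fileList -and $_.FullName -ne $dirList } |
        Sort-Object FullName)

    # Directories go to dirlist.txt, everything else to filelist.txt.
    $items | Where-Object { $_.PSIsContainer } |
        ForEach-Object { $_.FullName } | Set-Content $dirList
    $items | Where-Object { -not $_.PSIsContainer } |
        ForEach-Object { $_.FullName } | Set-Content $fileList
}
```

    Calling Write-FileCatalog C:\ would then play the role of one pass through the foreach loop in Main.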

    Searching for files on my filesystem is now very easy:

    PS> Select-String afxwin\.h \filelist.txt | select Line
    Line
    ----
    C:\Program Files\Microsoft Visual Studio 9.0\VC\atlmfc\include\afxwin.h
    C:\Program Files\Microsoft Visual Studio 9.0\VC\ce\atlmfc\include\afxwin.h

    In fact I have created a shortcut function for the above I call Find-File (alias ff):

    Set-Alias ff Find-File
    function Find-File($pattern) {
        $filelist = Join-Path $pwd.drive.root filelist.txt
        if (!(Test-Path $filelist -PathType Leaf)) {
            Write-Error "$filelist doesn't exist or isn't a file"
            return
        }
        Select-String $pattern $filelist | foreach {$_.Line}
    }

    And then this sort of filename search gets even easier:

    PS> ff afxwin\.h
    C:\Program Files\Microsoft Visual Studio 9.0\VC\atlmfc\include\afxwin.h
    C:\Program Files\Microsoft Visual Studio 9.0\VC\ce\atlmfc\include\afxwin.h

    Now you might be wondering why I don't just use the search built into Vista.  First, by default Vista's search doesn't index your whole hard drive, let alone any volumes other than C:.  It can be set up to index more locations, but I worry about the performance hit, which brings up the second issue: the Vista search indexer runs in the background all the time.  The CatalogFileSystem script runs just once a day (or however often you schedule it), when you're typically not using the computer.  Yes, the catalog can be as much as 24 hours out of date, but the vast majority of the time I'm searching for files that have been on my system for a while: C++ header files, SDK header files, C runtime source files, Windows DLLs, etc.

    I'm sure some folks are going to think this is goofy but honestly I do search the filelist.txt file quite often (not so much the dirlist.txt file).  Do you have any similar convenience scripts like this?  If you do, please add a comment and let me know.

    May 12

    Northern Colorado .NET User Group Windows PowerShell Presentation

    Tonight I gave a one-hour introductory talk on Windows PowerShell to my local .NET user group.  The primary focus of this talk was to show why PowerShell should be interesting to .NET developers.  As promised, here is the slide deck and the samples.  If you have any questions, please let me know.

    May 11

    Effective PowerShell Item 13: Comparing Arrays in Windows PowerShell

    PowerShell has a lot of useful operators such as -contains, which tests whether an array contains a particular element.  But as far as I can tell PowerShell doesn't "seem" to provide an easy way to test whether two arrays' contents are equal.  This is often quite handy and I was a bit surprised by this apparent omission.

    I came upon this need to compare arrays while answering a question on the microsoft.public.windows.powershell newsgroup.  The poster wanted to find UTF8 encoded files by inspecting their BOM or byte order mark.  One relatively straightforward approach to this is:

    PS> $preamble = [System.Text.Encoding]::UTF8.GetPreamble()
    PS> $preamble | foreach {"0x{0:X2}" -f $_}
    0xEF
    0xBB
    0xBF
    PS> $fileHeader = Get-Content Utf8File.txt -Enc byte -Total 3
    PS> $fileheader | foreach {"0x{0:X2}" -f $_}
    0xEF
    0xBB
    0xBF

    While it is easy enough to visually inspect this and see we have a match, visual inspection doesn't work in a script.  :-)  You could also test each individual element, which isn't bad for a three element array, but when you hit say 10 elements that approach might start looking tedious.

    You might think that we could just compare these two arrays directly like so:

    PS> $preamble -eq $fileHeader | Get-TypeName # Get-TypeName is from the PowerShell Community Extensions
    WARNING: Get-TypeName did not receive any input. The input may be an empty collection. You can either 
    prepend the collection expression with the comma operator e.g. ",$collection | gtn" or you can pass the
    variable or expression to Get-TypeName as an argument e.g. "gtn $collection".
    PS> $preamble -eq 0xbb
    187

    But comparing arrays via the -eq operator doesn't actually compare the contents of two arrays.  As you can see above, this results in no output.  When the left hand side of the -eq operator is an array, PowerShell returns the elements of the array that match the value specified on the right hand side (shown above where I test for -eq to 0xbb).
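
    To make the filtering behavior concrete, here is a small sketch using a made-up byte array (the values are only for illustration).  Wrapping each result in @() makes it easy to count what came back:

```powershell
# With an array on the left, -eq filters instead of returning a Boolean.
[byte[]]$bytes = 0xEF, 0xBB, 0xBF, 0xBB

$hits = @($bytes -eq 0xBB)        # every element equal to 0xBB
"Matches for 0xBB: $($hits.Count)"

$none = @($bytes -eq 0x00)        # no element matches, so nothing comes back
"Matches for 0x00: $($none.Count)"

# Array against array: each left-hand element is compared with the whole
# right-hand array, nothing matches, and the result is empty.
$both = @($bytes -eq $bytes)
"Array -eq array returned $($both.Count) elements"
```

    The last line is the trap: even an array compared with itself yields an empty result, which evaluates as false in an if statement.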

    OK so it looks like we need to roll our own mechanism to compare arrays.  Here is one way:

    function AreArraysEqual($a1, $a2) {
        if ($a1 -isnot [array] -or $a2 -isnot [array]) {
            throw "Both inputs must be an array"
        }
        # Arrays with a different number of dimensions can't be equal
        if ($a1.Rank -ne $a2.Rank) {
            return $false
        }
        # The same array instance is trivially equal to itself
        if ([System.Object]::ReferenceEquals($a1, $a2)) {
            return $true
        }
        # Every dimension must have the same length
        for ($r = 0; $r -lt $a1.Rank; $r++) {
            if ($a1.GetLength($r) -ne $a2.GetLength($r)) {
                return $false
            }
        }

        # Walk both arrays in lockstep and compare element by element
        $enum1 = $a1.GetEnumerator()
        $enum2 = $a2.GetEnumerator()

        while ($enum1.MoveNext() -and $enum2.MoveNext()) {
            if ($enum1.Current -ne $enum2.Current) {
                return $false
            }
        }
        return $true
    }

    And it works as expected:

    PS> AreArraysEqual $preamble $fileHeader
    True

    However there turns out to be a way to do this within PowerShell but it isn't exactly obvious.  At least it wasn't to me - at first. 

    PS> @(Compare-Object $preamble $fileHeader -sync 0).Length -eq 0
    True

    Good old Compare-Object will compare the arrays and if there are no differences it won't output anything.  If we wrap the output of Compare-Object in an array subexpression @() then we will get an array with either 0 or more elements.  A simple compare of the length to 0 will confirm that there was no output, hence the arrays are equal. 

    [Updated: 5/12/2008 - need to use -SyncWindow 0 to get correct result - thanks Arnoud and Roman]  Let me elaborate more on this updated information.  As Roman points out in the comments on this post, Compare-Object compares two objects to see if they have the same set of elements.  Normally it does not care if the elements are in the same sequence in each object (each array in this case).  For example:

    PS> $a1 = 1,1,2
    PS> $a2 = 1,2,1
    PS> @(Compare-Object $a1 $a2).length -eq 0
    True

    Obviously that isn't what we want when comparing arrays for equality.  Fortunately, as Arnoud points out, we can use the SyncWindow parameter with a value 0 to get Compare-Object to "force sequence equality" as Arnoud succinctly phrases it.
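
    If you use this often, the one-liner can be wrapped in a function.  The following is only a sketch: Test-ArrayEqual is a hypothetical name, and the up-front length checks are an addition of this sketch so that Compare-Object is never handed an empty collection (which, as far as I know, some versions refuse to bind).

```powershell
# Hypothetical wrapper around the Compare-Object -SyncWindow 0 technique.
function Test-ArrayEqual {
    param([object[]]$Left, [object[]]$Right)

    # Different lengths can never be equal; two empty arrays are equal.
    if ($Left.Count -ne $Right.Count) { return $false }
    if ($Left.Count -eq 0) { return $true }

    # -SyncWindow 0 makes the comparison position-sensitive.
    return (@(Compare-Object $Left $Right -SyncWindow 0).Count -eq 0)
}

Test-ArrayEqual (1,1,2) (1,1,2)   # True
Test-ArrayEqual (1,1,2) (1,2,1)   # False
```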

    How about performance of these two approaches:

    PS> $a1 = 1..10000
    PS> $a2 = 1..10000
    PS> (Measure-Command { AreArraysEqual $a1 $a2 }).TotalSeconds
    1.236252
    PS> (Measure-Command { @(Compare-Object $a1 $a2 -sync 0).Length -eq 0 }).TotalSeconds
    0.3259954

    Compare-Object beats out my PowerShell function by a good margin which isn't too surprising[1].  After all, one is compiled code and the other is interpreted script.  So there you have it.  If you need a quick way to compare two arrays, just remember that arrays are objects too and that is what Compare-Object does best - compare two objects.

    [1] - Except for comparing against the same array where my function is two orders of magnitude faster.  It seems that the Compare-Object cmdlet could benefit from a quick System.Object.ReferenceEquals check.  :-)  Admittedly this is a bit of a corner case scenario.
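
    Tying this back to the original newsgroup question, the pieces can be combined into a single BOM test.  This is a sketch rather than the answer I posted: Test-Utf8Bom is a hypothetical name, and it reads the first bytes through System.IO.File instead of Get-Content so that it does not depend on how Get-Content exposes byte reads.

```powershell
# Hypothetical helper: returns $true if the file starts with the UTF-8 BOM.
function Test-Utf8Bom {
    param([string]$Path)

    $preamble = [System.Text.Encoding]::UTF8.GetPreamble()   # EF BB BF
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Read only as many bytes as the preamble is long.
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $header = New-Object byte[] $preamble.Length
        $read = $stream.Read($header, 0, $header.Length)
    }
    finally {
        $stream.Dispose()
    }

    # A file shorter than the preamble can't contain it.
    if ($read -lt $preamble.Length) { return $false }

    # Same position-sensitive comparison as above.
    return (@(Compare-Object $preamble $header -SyncWindow 0).Count -eq 0)
}
```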

    May 09

    Effective PowerShell Item 12: Understanding ByValue Pipeline Bound Parameters

    In item 11, I covered ByPropertyName pipeline bound parameters.  In this post, I'll cover the other variety of pipeline binding - ByValue.  ByValue binding takes the input object itself and attempts to bind it by type using type coercion if possible to parameters decorated as ByValue.  For example, most of the *-Object utility cmdlets operate on whatever object is presented to them.  The help on Where-Object shows this:

    -inputObject <psobject>
        Specifies the objects to be filtered. If you save the output of a command in a variable,
        you can use InputObject to pass the variable to Where-Object. However, typically, the
        InputObject parameter is not typed in the command. Instead, when you pass an object
        through the pipeline, Windows PowerShell associates the passed object with the
        InputObject parameter.

        Required?                    false
        Position?                    named
        Default value
        Accept pipeline input?       true (ByValue)
        Accept wildcard characters?  false

    It turns out that ByValue isn't nearly as popular as ByPropertyName.  How can I make such a statement you ask?  Well this is one of the things that I love about PowerShell.  It provides so much metadata about itself.  It is very "self describing".  You can easily walk every parameter on every cmdlet that is currently loaded into PowerShell.  First let's see what information is available for a parameter:

    PS> Get-Command -CommandType cmdlet | Select -Expand ParameterSets | 
    >> Select -Expand Parameters -First 1 | Get-Member
    >>
       TypeName: System.Management.Automation.CommandParameterInfo

    Name                            MemberType Definition
    ----                            ---------- ----------
    ...
    Aliases                         Property   System.Collections.ObjectModel.ReadOnlyCollection`1[[...
    Attributes                      Property   System.Collections.ObjectModel.ReadOnlyCollection`1[[...
    HelpMessage                     Property   System.String HelpMessage {get;}
    IsDynamic                       Property   System.Boolean IsDynamic {get;}
    IsMandatory                     Property   System.Boolean IsMandatory {get;}
    Name                            Property   System.String Name {get;}
    ParameterType                   Property   System.Type ParameterType {get;}
    Position                        Property   System.Int32 Position {get;}
    ValueFromPipeline               Property   System.Boolean ValueFromPipeline {get;}
    ValueFromPipelineByPropertyName Property   System.Boolean ValueFromPipelineByPropertyName {get;}
    ValueFromRemainingArguments     Property   System.Boolean ValueFromRemainingArguments {get;}

    The interesting properties for us here are the Name and ValueFromPipeline* properties.  Given this information it is easy to figure out how many of each type there are:

    PS> (Get-Command -CommandType cmdlet | Select -Expand ParameterSets | Select -Expand Parameters |
    >> Where {$_.ValueFromPipeline -and !$_.ValueFromPipelineByPropertyName} | Measure-Object).Count
    >>
    55
    PS> (Get-Command -CommandType cmdlet | Select -Expand ParameterSets | Select -Expand Parameters |
    >> Where {!$_.ValueFromPipeline -and $_.ValueFromPipelineByPropertyName} | Measure-Object).Count
    >>
    196
    PS> (Get-Command -CommandType cmdlet | Select -Expand ParameterSets | Select -Expand Parameters |
    >> Where {$_.ValueFromPipeline -and $_.ValueFromPipelineByPropertyName} | Measure-Object).Count
    >>
    66

    So from here we can see the following:

    Type of Pipeline Binding          Count
    ------------------------          -----
    ValueFromPipeline                    55
    ValueFromPipelineByPropertyName     196
    Both                                 66

    So indeed binding by property name is much more common.  Binding by value from the pipeline is primarily for cmdlets that manipulate objects.  In the query below we can see that the InputObject parameter is by far the most common "ByValue" pipeline bound parameter:

    PS> Get-Command -CommandType cmdlet | Select -Expand ParameterSets | Select -Expand Parameters |
    >> Where {$_.ValueFromPipeline -and !$_.ValueFromPipelineByPropertyName} |
    >> Group Name -NoElement | Sort Count -Desc
    >>

    Count Name
    ----- ----
       40 InputObject
        4 Message
        3 String
        2 SecureString
        1 ExecutionPolicy
        1 Object
        1 AclObject
        1 DifferenceObject
        1 Id
        1 Command

    A little further digging reveals the cmdlets that use the ByValue bound InputObject parameters as shown below.  Note that a single parameter can appear in more than one parameter set on a cmdlet, which explains why there are only 36 cmdlets that account for the 40 instances of InputObject.

    PS> $CmdletName = @{Name='CmdletName';Expression={$_.Name}}
    PS> Get-Command -CommandType cmdlet | Select $CmdletName -Expand ParameterSets |
    >> Select CmdletName -Expand Parameters |
    >> Where {$_.ValueFromPipeline -and !$_.ValueFromPipelineByPropertyName} |
    >> Group Name | Sort Count -Desc | Select -First 1 | Foreach {$_.Group} |
    >> Sort CmdletName -Unique | Format-Wide CmdletName -AutoSize
    >>

    Add-History     Add-Member      ConvertTo-Html  Export-Clixml   Export-Csv      ForEach-Object
    Format-Custom   Format-List     Format-Table    Format-Wide     Get-Member      Get-Process
    Get-Service     Get-Unique      Group-Object    Measure-Command Measure-Object  Out-Default
    Out-File        Out-Host        Out-Null        Out-Printer     Out-String      Restart-Service
    Resume-Service  Select-Object   Select-String   Sort-Object     Start-Service   Stop-Process
    Stop-Service    Suspend-Service Tee-Object      Trace-Command   Where-Object    Write-Output
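
    The same metadata can answer the question for a single command.  The helper below is a sketch (Get-PipelineParameter is a hypothetical name); it reuses the CommandParameterInfo properties from the queries above and collapses parameters that appear in more than one parameter set.

```powershell
# Hypothetical helper: list the pipeline-bound parameters of one command.
function Get-PipelineParameter {
    param([string]$CommandName)

    Get-Command $CommandName |
        Select-Object -ExpandProperty ParameterSets |
        Select-Object -ExpandProperty Parameters |
        Where-Object { $_.ValueFromPipeline -or $_.ValueFromPipelineByPropertyName } |
        Select-Object Name, ValueFromPipeline, ValueFromPipelineByPropertyName -Unique |
        Sort-Object Name
}

Get-PipelineParameter Where-Object | Format-Table -AutoSize
```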

    As you can see most of these cmdlets are designed to deal with objects in general.  Note to cmdlet developers - pipeline bound parameters are how your cmdlets receive pipeline objects.  When writing a cmdlet there is no $_.  If your cmdlet wants to "participate" in the pipeline it must set the ParameterAttribute property ValueFromPipeline and/or ValueFromPipelineByPropertyName to true on at least one of its parameters.
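
    For script authors, the closest equivalent is a function that declares the same attribute.  The sketch below assumes PowerShell 2.0 or later, where functions can carry the Parameter attribute; Get-TextLength is a hypothetical name used only for illustration.

```powershell
# Hypothetical function with a ByValue pipeline-bound parameter.
function Get-TextLength {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$InputObject
    )
    process {
        # Runs once per pipeline object; the bound value arrives in $InputObject.
        $InputObject.Length
    }
}

'a', 'bcd' | Get-TextLength     # 1, 3
12345 | Get-TextLength          # 5 (the integer is coerced to the string '12345')
```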

    As mentioned above most ByValue parameters are of the InputObject (type psobject or psobject[]) variety so they pretty much accept anything.  However not all cmdlets work that way.  The -Id parameter (type [long[]]) on Get-History is pipeline bound ByValue.  The following Trace-Command output shows how PowerShell works hard when necessary to convert the input object's type to the expected type.  In this case a scalar string value of '1' to an array of Int64:

    PS> Trace-Command -Name ParameterBinding -PSHost -Expression {'1' | get-history}
    BIND NAMED cmd line args [Get-History]
    BIND POSITIONAL cmd line args [Get-History]
    MANDATORY PARAMETER CHECK on cmdlet [Get-History]
    CALLING BeginProcessing
    BIND PIPELINE object to parameters: [Get-History]
    PIPELINE object TYPE = [System.String]
    RESTORING pipeline parameter's original values
    Parameter [Id] PIPELINE INPUT ValueFromPipeline NO COERCION
    BIND arg [1] to parameter [Id]
    Binding collection parameter Id: argument type [String], parameter type [System.Int64[]], collection type Array, element type [System.Int64], no coerceElementType
    Creating array with element type [System.Int64] and 1 elements
    Argument type String is not IList, treating this as scalar
    BIND arg [1] to param [Id] SKIPPED
    Parameter [Id] PIPELINE INPUT ValueFromPipeline WITH COERCION
    BIND arg [1] to parameter [Id]
    COERCE arg type [System.Management.Automation.PSObject] to [System.Int64[]]
    ENCODING arg into collection
    Binding collection parameter Id: argument type [PSObject], parameter type [System.Int64[]], collection type Array, element type [System.Int64], coerceElementType
    Creating array with element type [System.Int64] and 1 elements
    Argument type PSObject is not IList, treating this as scalar
    COERCE arg type [System.Management.Automation.PSObject] to [System.Int64]
    CONVERT arg type to param type using LanguagePrimitives.ConvertTo
    CONVERT SUCCESSFUL using LanguagePrimitives.ConvertTo: [1]
    Adding scalar element of type Int64 to array position 0
    Executing VALIDATION metadata: [System.Management.Automation.ValidateRangeAttribute]
    BIND arg [System.Int64[]] to param [Id] SUCCESSFUL
    MANDATORY PARAMETER CHECK on cmdlet [Get-History]
    CALLING ProcessRecord
    CALLING EndProcessing

    Note that on the first attempt, PowerShell tries to bind the string to the Int64[] parameter without coercion and skips that binding.  Then it tries again with coercion, treating the input as a psobject.  It throws that psobject at a helper class method, LanguagePrimitives.ConvertTo(), which successfully converts the string '1' to an Int64; that value becomes the single element of the Int64[] bound to Id.

    When a parameter is both ByValue and ByPropertyName bound, PowerShell attempts to bind in this order:

    1. Bind ByValue with no type conversion
    2. Bind ByPropertyName with no type conversion
    3. Bind ByValue with type conversion
    4. Bind ByPropertyName with type conversion
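
    The first three steps of that order can be observed with a small function whose parameter is bound both ways.  This is a sketch assuming PowerShell 2.0 or later; Show-Binding and the sample values are hypothetical.

```powershell
# Hypothetical function: $Name is bound both ByValue and ByPropertyName.
function Show-Binding {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
        [string]$Name
    )
    process { "Bound: $Name" }
}

# A string binds ByValue without conversion (step 1).
'plain-string' | Show-Binding

# Not a string, but it has a string Name property, so it binds
# ByPropertyName without conversion (step 2).
New-Object PSObject -Property @{ Name = 'from-property' } | Show-Binding

# No Name property, so the value itself is converted to a string (step 3).
42 | Show-Binding
```

    Running the same pipelines under Trace-Command -Name ParameterBinding shows which attempt succeeded in each case.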

    There is more to the parameter binding algorithm like finding the best match amongst different parameter sets.  BTW one last tidbit related to parameters.  The PowerShell help topics aren't completely automatically generated and as a result they aren't always correct.  For instance, look up the parameters on Get-Content and see if you find a -Wait parameter - you won't.  :-)  However the metadata is always complete and correct e.g.:

    PS> Get-Command Get-Content -Syntax
    Get-Content [-Path] <String[]> [-ReadCount <Int64>] [-TotalCount <Int64>] [-Filter <String>] 
    [-Include <String[]>] [-Exclude <String[]>] [-Force] [-Credential <PSCredential>] [-Verbose]
    [-Debug] [-ErrorAction <ActionPreference>] [-ErrorVariable <String>] [-OutVariable <String>]
    [-OutBuffer <Int32>] [-Delimiter <String>] [-Wait] [-Encoding <FileSystemCmdletProviderEncoding>]
    Get-Content [-LiteralPath] <String[]> [-ReadCount <Int64>] [-TotalCount <Int64>] [-Filter <String>]
    [-Include <String[]>] [-Exclude <String[]>] [-Force] [-Credential <PSCredential>] [-Verbose]
    [-Debug] [-ErrorAction <ActionPreference>] [-ErrorVariable <String>] [-OutVariable <String>]
    [-OutBuffer <Int32>] [-Delimiter <String>] [-Wait] [-Encoding <FileSystemCmdletProviderEncoding>]
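
    The same check can be scripted against the parameter metadata instead of reading the syntax by eye.  This is a sketch; as far as I know -Wait is contributed by the FileSystem provider, so it assumes the current location is on a file system drive.

```powershell
# Look for a parameter named Wait in the metadata of Get-Content.
$wait = @(Get-Command Get-Content |
    Select-Object -ExpandProperty ParameterSets |
    Select-Object -ExpandProperty Parameters |
    Where-Object { $_.Name -eq 'Wait' } |
    Select-Object Name, ParameterType, IsDynamic -Unique)

$wait | Format-Table -AutoSize
"Metadata lists -Wait: $($wait.Count -gt 0)"
```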

    Hopefully this post has given you more knowledge about ByValue parameters and how to explore and get more information on cmdlet parameters in general.  In summary, there actually isn't much you need to know about ByValue pipeline bound parameters because in most cases they just work intuitively.  Just be sure to keep your eye out for those parameters that bind ByPropertyName. They are the ones whose pipeline bound usage isn't as obvious.

    May 03

    Windows PowerShell V2 CTP2 Is Available

    The PowerShell team just posted the announcement late last night.  There is a download link in the announcement.  This drop has lots of new features like Module support for organizing and loading related functionality as well as transaction support for registry operations using the Registry provider.   There have also been some changes since the first CTP so take a look at the release notes.

    Let me reiterate the caution that these are "preview" (the P in CTP) bits.  They aren't even considered beta quality, although I would say that caution mostly applies to the new functionality.  That said, I would not put these bits on a production machine.  In fact, there have been reports of incompatibility between the latest drop of System Center Virtual Machine Manager and the V2 CTP bits.

    If you don't have a spare/test PC on which to play with these bits then go grab Virtual PC (free), Virtual Server or VMware and create a sandbox VM in which to try this CTP.  You will want to create that image based on either Vista SP1 or Windows Server 2008 if you want to take the remote management features for a spin.

    One last thing.  The Graphical PowerShell is new in this release and is in need of feedback.  Folks, now is the time to influence the direction of this feature.  If you wait too long the team won't have time to make anything but very minor changes.  So please use it for a while and send your feedback to gPSFback@microsoft.com.