Summary: Guest blogger, Bill Stewart, discusses a Windows PowerShell function to determine folder size.
Microsoft Scripting Guy, Ed Wilson, is here. Guest blogger today is Bill Stewart. Bill Stewart is a scripting guru and a moderator for the Official Scripting Guys forum.
Here’s Bill…
You have probably asked this question hundreds of times, “How big is that folder?” Of course, the typical GUI way to find out is to right-click the folder in Windows Explorer and open the folder’s properties. As with all things GUI, this does not scale well. For example, what if you need the size for 100 different folders?
If you have worked with Windows PowerShell for any length of time, you know that it provides a pretty comprehensive set of tools. The trick is learning how to combine the tools to get the results you need. In this case, I know that Windows PowerShell can find files (Get-ChildItem), and I know that it can count objects and sum a property on objects. A simple example would be a command like this:
Get-ChildItem | Measure-Object -Sum Length
Get-ChildItem outputs a list of items in the current location (in files and folders, if your current location is in a file system), and Measure-Object uses this list as input and adds together every input object’s Length property (file size). In other words, this command tells you the count and sum of the sizes of all the files in the current directory.
The Measure-Object cmdlet outputs an object with five properties (including Count, Average, and Sum). However, we only care about the Count and Sum properties, so let us refine our command a bit:
Get-ChildItem | Measure-Object -Sum Length | Select-Object Count, Sum
Now we are using Select-Object to select (hence the name) only the two properties we care about. The end result is a new output object that contains only those two properties.
This is good as far as it goes, but I wanted my output object to include the directory’s name. In addition, while we are at it, let us use the names “Files” and “Size” instead of “Count” and “Sum.” To do this, I am going to output a custom object like this:
$directory = Get-Item .
$directory | Get-ChildItem |
Measure-Object -Sum Length | Select-Object `
@{Name="Path"; Expression={$directory.FullName}},
@{Name="Files"; Expression={$_.Count}},
@{Name="Size"; Expression={$_.Sum}}
I need $directory as a separate variable so I can include it in the output object. In addition, you can see here that I am using Select-Object with a set of hash tables as a shorthand technique for creating a custom output object.
In the following image (Figure 1), you can see the output from all three of the commands.
The output does not look that great, but remember that the presentation is less important than the content: We are outputting objects, not text. Because the output is objects, we can sort, filter, and measure.
To get the output we want (include the path name and change a couple of the property names), the commands can start to get a bit lengthy. So it makes sense to encapsulate the needed code in a script. Get-DirStats.ps1 is the script, and its syntax is as follows:
Get-DirStats [[-Path] <Object>] [-Only] [-Every] [-FormatNumbers] [-Total]
or
Get-DirStats -LiteralPath <String[]> [-Only] [-Every] [-FormatNumbers] [-Total]
As you can see, you can run the script by using two sets of mutually exclusive parameters. Windows PowerShell calls these parameter sets. The parameter sets’ names (Path and LiteralPath) are defined in the statement at the top of the script, and the script’s CmdletBinding attribute specifies that Path is the default parameter set.
The Path parameter supports pipeline input. Also, the Path parameter is defined as being first on the script’s command line, so the Path parameter name itself is optional. The LiteralPath parameter is useful when a directory name contains characters that Windows PowerShell would normally interpret as wildcard characters (the usual culprits are the [ and the ] characters.) The Path and LiteralPath parameters are in different parameter sets, so they’re mutually exclusive.
The Only parameter calculates the directory size only for the named path(s), but not subdirectories (like what is shown in Figure 1). Normally, when we ask about the size of a directory, we’re asking about all of its subdirectories also. If you only care about counting and summing the sizes of the files in a single directory (but not its subdirectories), you can use the Only parameter.
The Every parameter outputs an object for every subdirectory in the path. Without the Every parameter, the script outputs an object for the first level of subdirectories only. The following image shows what I mean.
If we use the following command, the script will output an object for every directory in the left (navigation) pane (if we expand them all, as shown in the previous image).
Get-DirStats -Path C:\Temp -Every
If we omit the Every parameter from this command, the script will only output the directories in the right pane. The script will still get the sizes of subdirectories if you omit the Every parameter; the difference is in the number of output objects.
The FormatNumbers parameter causes the script to output numbers as formatted strings with thousands separators, and the Total parameter outputs a final object after all other output that adds up the total number of files and directories for all output. These parameters are useful when running the script at a Windows PowerShell command prompt; but you shouldn’t use them if you’re going to do something else with the output (such as sorting or filtering) because the numbers will be text (with FormatNumbers) and there will be an extra object (with Total). The following image shows an example command that uses the FormatNumbers and Total parameters with US English thousands separators.
Get-DirStats.ps1 supports pipeline input, so it uses the Begin, Process, and End script blocks. The script uses the following lines of code within the Begin script block to detect the current parameter set and whether input is coming from the pipeline:
$ParamSetName = $PSCmdlet.ParameterSetName
if ( $ParamSetName -eq "Path" ) {
$PipelineInput = ( -not $PSBoundParameters.ContainsKey("Path") ) -and ( -not $Path )
}
elseif ( $ParamSetName -eq "LiteralPath" ) {
$PipelineInput = $false
}
The script uses the $ParamSetName and $PipelineInput variables later in the Process script block. The logic behind the definition of the $PipelineInput variable is thus: “If the Path parameter is not bound (that is, it was not specified on the script’s command line), and the $Path variable is $null, then the input is coming from the pipeline.” Both the $ParamSetName and $PipelineInput variables are used in the script’s Process script block.
The beginning of script’s Process script block has the following code:
if ( $PipelineInput ) {
$item = $_
}
else {
if ( $ParamSetName -eq "Path" ) {
$item = $Path
}
elseif ( $ParamSetName -eq "LiteralPath" ) {
$item = $LiteralPath
}
}
The $item variable will contain the path that the script will process. Thus, if the script’s input is coming from the pipeline, $item will be the current pipeline object ($_); otherwise, $item will be $Path or $LiteralPath (depending on the current parameter set).
Next, Get-DirStats.ps1 uses the Get-Directory function as shown here:
function Get-Directory {
param( $item )
if ( $ParamSetName -eq "Path" ) {
if ( Test-Path -Path $item -PathType Container ) {
$item = Get-Item -Path $item -Force
}
}
elseif ( $ParamSetName -eq "LiteralPath" ) {
if ( Test-Path -LiteralPath $item -PathType Container ) {
$item = Get-Item -LiteralPath $item -Force
}
}
if ( $item -and ($item -is [System.IO.DirectoryInfo]) ) {
return $item
}
}
The Get-Directory function uses Test-Path to determine if its parameter ($item) is a container object and a file system directory (that is, a System.IO.DirectoryInfo object).
If the Get-Directory function returned $null, the script writes an error to the error stream by using the Write-Error cmdlet and exits the Process script block with the return keyword.
After validating that the directory exists in the file system, the script calls the Get-DirectoryStats function, which is really the workhorse function in the script. The Get-DirectoryStats function is basically a fancy version of the commands run in Figure 1. Here it is:
function Get-DirectoryStats {
param( $directory, $recurse, $format )
Write-Progress -Activity "Get-DirStats.ps1" -Status "Reading '$($directory.FullName)'"
$files = $directory | Get-ChildItem -Force -Recurse:$recurse | Where-Object { -not $_.PSIsContainer }
if ( $files ) {
Write-Progress -Activity "Get-DirStats.ps1" -Status "Calculating '$($directory.FullName)'"
$output = $files | Measure-Object -Sum -Property Length | Select-Object `
@{Name="Path"; Expression={$directory.FullName}},
@{Name="Files"; Expression={$_.Count; $script:totalcount += $_.Count}},
@{Name="Size"; Expression={$_.Sum; $script:totalbytes += $_.Sum}}
}
else {
$output = "" | Select-Object `
@{Name="Path"; Expression={$directory.FullName}},
@{Name="Files"; Expression={0}},
@{Name="Size"; Expression={0}}
}
if ( -not $format ) { $output } else { $output | Format-Output }
}
This function uses the Write-Progress cmdlet to inform the user running the script that something’s happening, and it uses a combination of the Get-ChildItem, Where-Object, Measure-Object, and Select-Object cmdlets to output a custom object. Note the use of the scoped variables ($script:totalcount and $script:totalbytes). These are used with the script’s Total parameter, and they are output in the script’s End script block.
Drop this script into a directory in your path, and you can quickly find the sizes for directories in your file system. Remember that it outputs objects, so you can add tasks such as sort and filter, for example:
Get-DirStats -Path C:\Temp | Sort-Object -Property Size
This command outputs the size of directories in C:\Temp, sorted by size.
The entire script can be downloaded from the Script Repository.
~Bill
Thank you, Bill, for writing an interesting and useful blog. Join me tomorrow for more Windows PowerShell cool stuff.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy