Summary: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to compare two files.
Hey, Scripting Guy! I have a script that I wrote to compare two files, but it seems really slow. I am wondering what I can do to speed things up a bit.
—JW
Hello JW,
Microsoft Scripting Guy, Ed Wilson, is here. I looked at the script you supplied, where you use Compare-Object to compare two files. Here is your script:
$fileA = "C:\fso\myfile.txt"
$fileB = "C:\fso\CopyOfmyfile.txt"
$fileC = "C:\fso\changedMyFile.txt"
if(Compare-Object -ReferenceObject $(Get-Content $fileA) -DifferenceObject $(Get-Content $fileB))
{"files are different"}
Else {"Files are the same"}
When I run the script and compare FileA with FileB, the script returns the correct response: Files are the same.
When I change it to use FileC, the script also works, and it reports that the files are different.
The three files are shown here:
So JW, this is a very simple test case. What is really going on when using Compare-Object?
I can use the Windows PowerShell ISE to run a portion of the code and look at it. To do this, I highlight the Compare-Object statement and press F8 to run only the selected code. This is shown here:
PS C:\> Compare-Object -ReferenceObject $(Get-Content $fileA) -DifferenceObject $(Get-Content $fileC)
InputObject       SideIndicator
-----------       -------------
Additional values =>
And when I compare FileA with FileB, the following appears:
PS C:\> Compare-Object -ReferenceObject $(Get-Content $fileA) -DifferenceObject $(Get-Content $fileB)
PS C:\>
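Because Compare-Object returns nothing here, the IF condition evaluates to false, and this triggers the ELSE portion of the code. Although this works, it can be a bit slow, because Get-Content turns every line of both files into an object before Compare-Object begins matching them. On more complex files, it can also be unreliable: by default, Compare-Object does not care about the order of the lines, so two files that contain the same lines in a different order are reported as the same.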
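To see how the IF statement treats the output from Compare-Object, here is a small sketch that uses three arrays of strings in place of files. The sample values (alpha, beta, and so on) are made up for illustration and do not come from JW's files.

```powershell
# Sample data invented for illustration; it stands in for the lines of three files.
$reference = 'alpha', 'beta', 'gamma'
$reordered = 'gamma', 'beta', 'alpha'
$changed   = 'alpha', 'beta', 'delta'

# Same lines in a different order: Compare-Object emits nothing.
$orderResult = Compare-Object -ReferenceObject $reference -DifferenceObject $reordered

# One changed line: Compare-Object emits an object for each unmatched line.
$changeResult = Compare-Object -ReferenceObject $reference -DifferenceObject $changed

# An IF statement performs the same conversion to Boolean that is shown here.
[bool]$orderResult    # no output converts to $false
[bool]$changeResult   # any output converts to $true
```

The reordered array is the interesting case. The content really is different from the reference, but the IF statement would still report that the files are the same.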
So a better way to do this is to use Get-FileHash (available in Windows PowerShell 4.0 and later) and compare the Hash property of the two results. Your revised script is shown here:
$fileA = "C:\fso\myfile.txt"
$fileB = "C:\fso\CopyOfmyfile.txt"
$fileC = "C:\fso\changedMyFile.txt"
if((Get-FileHash $fileA).hash -ne (Get-FileHash $fileC).hash)
{"files are different"}
Else {"Files are the same"}
Now, when I look at the portion of the code that executes, I can see that I am dealing with a real Boolean value, instead of relying on whether Compare-Object happens to emit any output (output that your previous script otherwise ignored).
In the following, I execute only the Get-FileHash portion of the script:
PS C:\> (Get-FileHash $fileA).hash -ne (Get-FileHash $fileC).hash
True
PS C:\> (Get-FileHash $fileA).hash -ne (Get-FileHash $fileB).hash
False
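If you find yourself doing this check often, the comparison could be wrapped in a function that returns the Boolean directly. The following is only a sketch: the function name Test-FileContentMatch and its parameter names are my own invention, and the file length check at the top is an extra shortcut that is not part of the revised script.

```powershell
# Hypothetical helper; the name and parameters are assumptions, not a built-in cmdlet.
function Test-FileContentMatch {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    $reference  = Get-Item -LiteralPath $ReferencePath -ErrorAction Stop
    $difference = Get-Item -LiteralPath $DifferencePath -ErrorAction Stop

    # Files with different lengths cannot be identical, so skip the hashing.
    if ($reference.Length -ne $difference.Length) { return $false }

    # SHA256 is the Get-FileHash default; it is named here only to be explicit.
    $referenceHash  = (Get-FileHash -LiteralPath $reference.FullName -Algorithm SHA256).Hash
    $differenceHash = (Get-FileHash -LiteralPath $difference.FullName -Algorithm SHA256).Hash

    return ($referenceHash -eq $differenceHash)
}
```

Assuming the same sample files, a call such as Test-FileContentMatch -ReferencePath C:\fso\myfile.txt -DifferencePath C:\fso\CopyOfmyfile.txt would return True or False, which can go straight into an IF statement.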
In addition, the Get-FileHash code is rather efficient. Get-FileHash still has to read both files from beginning to end, but it streams the raw bytes through a hash algorithm (SHA256 by default) and hands back one short string per file, so the comparison itself is a single string comparison. Your original script turns every line of both files into an object, and then asks Compare-Object to match those objects against each other, so it is much less efficient.
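One way to check the difference on your own system is to time both approaches with Measure-Command. The sketch below builds two identical temporary files rather than using the files in C:\fso. The 50,000 line count and the variable names are arbitrary choices of mine, and the timings will vary from one computer to another.

```powershell
# Build two identical temporary files to compare (size chosen arbitrarily).
$tempA = [System.IO.Path]::GetTempFileName()
$tempB = [System.IO.Path]::GetTempFileName()
1..50000 | ForEach-Object { "line $_" } | Set-Content -Path $tempA
Copy-Item -Path $tempA -Destination $tempB -Force

# Time the line-by-line approach from the original script.
$compareTime = Measure-Command {
    Compare-Object -ReferenceObject (Get-Content $tempA) -DifferenceObject (Get-Content $tempB)
}

# Time the hash comparison from the revised script.
$hashTime = Measure-Command {
    (Get-FileHash $tempA).Hash -ne (Get-FileHash $tempB).Hash
}

"Compare-Object: {0:N0} ms" -f $compareTime.TotalMilliseconds
"Get-FileHash:   {0:N0} ms" -f $hashTime.TotalMilliseconds

# Clean up the temporary files.
Remove-Item -Path $tempA, $tempB
```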
JW, that is all there is to using Windows PowerShell to compare two files. Troubleshooting Week will continue tomorrow when I will talk about more cool stuff.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy