Summary: Here is the answer to Problem 1 in Windows PowerShell Mini-Scripting Games 2014.
Microsoft Scripting Guy, Ed Wilson, is here. Here is an answer to Problem 1 in Windows PowerShell Mini-Scripting Games 2014. Last week, I posted the complete Problem 1 description and task outline.
Problem one synopsis
You are reading a CSV file that was created from another source. The CSV file has four fields that are to be read and then used to update four attributes in Active Directory. The problem is that the CSV file does not have all four fields filled out. In some cases, there is a space in the field, in others a tab, others are null, and others are empty.
When a field does not have a proper value, the attempt to update the attribute in Active Directory results in an error message. Your boss has decided to make the script your problem.
Solution
You need to write some logic that will identify all sorts of white space, including null fields. Detecting a null, a space, or a whatever seems to beg for a nice regular expression. It is also possible to do this manually by using a series of if statements.
Note For a great reference, see Regular Expression Language - Quick Reference.
For example, if I want to match a space, I can simply supply a space, as shown here:
PS C:\> $a = " "
PS C:\> $a -match " "
True
Is $a null? I can also test that:
PS C:\> $a = $null
PS C:\> $a -match $null
True
I can incorporate this into logic, and use code that looks like the following:
PS C:\> If ($a -match " " -OR $a -match $null) {'$a is either a space or null'}
$a is either a space or null
Match a whitespace character
I can use a character class to try to match the empty field. Whitespace characters are things like a space, a tab, a carriage return, or a line feed. The regular expression character class for whitespace characters is backslash with a lowercase s: ( \s ). Windows PowerShell uses a backtick with a lowercase t for a Tab: ( `t ).
Note Regular expression character classes are case sensitive. There is an excellent description of character classes on MSDN: Character Classes in Regular Expressions.
In my first example, I use the Windows PowerShell `t character to store a Tab character in the variable $a. I then display it to the console, which is really nothing to see.
PS C:\> $a = "`t"
PS C:\> $a
Now, I attempt to match a space, which does not match a Tab. This is shown here:
PS C:\> $a -match " "
False
Now I use the \s whitespace character class from Regular Expressions, and I can see that it does match:
PS C:\> $a -match "\s"
True
Of course, it will also match the Windows PowerShell Tab character, `t:
PS C:\> $a -match "`t"
True
But the whitespace class does not match a null, as shown here:
PS C:\> $a = $null
PS C:\> $a -match "\s"
False
But it will match one or more spaces. This is shown here:
PS C:\> $a = " "
PS C:\> $a -match "\s"
True
PS C:\> $a = " "
PS C:\> $a -match "\s"
True
How about not matching
Rather than trying to match every possible blank or empty sort of thing, how about if I simply don’t match any letter or number? I can easily do this by specifying a couple of character groups. All lower case letters, all upper case letters, and all numbers will pretty much do it. To specify a character group, I use square brackets for all lower case letters [a-z] or for all upper case letters [A-Z]. The asterisk ( * ) means zero or more instances of the characters.
I create my two character groups and store them in a variable I call pattern:
$pattern = "[a-z][A-Z]*"
Now I check to see if it does not match my various test conditions. First I check to see if It does not match a blank space:
PS C:\> $pattern = "[a-z][A-Z]*"
PS C:\> $a = " "
PS C:\> $a -notmatch $pattern
True
Then I check to see if it does not match a null:
PS C:\> $a = $null
PS C:\> $a -notmatch $pattern
True
Now I check to see if it does not match a Tab and a carriage return line feed:
PS C:\> $a = "`t"
PS C:\> $a -notmatch $pattern
True
PS C:\> $a = "`r`n"
PS C:\> $a -notmatch $pattern
True
Application to the problem
So, if I create a simple CSV file that has a few empty fields, it might look like the following:
"f1","f2","f3","f4"
"a","b","c","d"
"","b","c","d"
"a"," ","c","d"
Now, I write a script that reads this CSV file, and contains the pattern I used earlier. To walk through the CSV file, I use the GetEnumerator() method. I then use While and the MoveNext() method to proceed from each record to the next one. I then use an IF statement to see if I find a match. Here is the script:
PROBLEM1SOLUTION.PS1
$data = Import-Csv -Path C:\DataIn\q1.csv
$pattern = "[a-z][A-Z]*"
$e = $data.GetEnumerator()
While ($e.MoveNext())
{
If ($e.current.f1 -notmatch $pattern) {"$($e.current) does not match"}
If ($e.current.f2 -notmatch $pattern) {"$($e.current) does not match"}
If ($e.current.f3 -notmatch $pattern) {"$($e.current) does not match"}
If ($e.current.f4 -notmatch $pattern) {"$($e.current) does not match"}
}
That is one approach to the problem. Mini-Scripting Games Answer Week will continue tomorrow when I will have a solution for Problem 2.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy