Summary: Microsoft PowerShell MVPs, Don Jones and Jeffery Hicks, talk about a fundamental tool design consideration.
Microsoft Scripting Guy, Ed Wilson, is here. This week we will not have our usual PowerTip. Instead we have excerpts from seven books from Manning Press. In addition, each blog will have a special code for 50% off the book being excerpted that day. Remember that the code is valid only for the day the excerpt is posted. The coupon code is also valid for a second book from the Manning collection.
This excerpt is from Learn PowerShell Toolmaking in a Month of Lunches
By Don Jones and Jeffery Hicks
Here’s a basic tenet of good Windows PowerShell tool design: do one thing, and do it well. Broadly speaking, a function should do one, and only one, of these things:
- Retrieve data from someplace
- Process data
- Output data to some place
- Put data into some visual format meant for human consumption
This fits well with the command-naming convention in Windows PowerShell: If your function uses the verb Get, that’s what it should do: get. If it’s outputting data, you name it with a verb like Export, or Out, or something else. If each command (okay, function) worries about just one of those things, then it’ll have the maximum possible flexibility.
For example, let’s say we want to write a tool that will retrieve some key operating system information from multiple computers and then display that information in a nicely formatted onscreen table. It’d be easy to write that tool so that it opened Active Directory, got a bunch of computer names, queried the information from them, and then formatted a nice table as output.
The problem?
Well, what if tomorrow we didn’t want the data on the screen but rather wanted it in a CSV file? What if one time we needed to query a small list of computers rather than a bunch of computers from the directory? Either change would involve coding changes, probably resulting in many different versions of our tool lying around. Had we made it more modular and followed the basic philosophy we just outlined, we wouldn’t have to do that. Instead, we might have designed the following:
- One function that gets computer names from the directory
- One function that accepts computer names, queries those computers, and produces the desired data
- One function that formats data into a nice onscreen table
Suddenly, everything becomes more flexible. That middle function could now work with any source of computer names: the directory, a text file, or whatever. Its data could be sent to any other command to produce output. Maybe we’d pipe it to Export-CSV to make that CSV file or to ConvertTo-HTML to make an HTML page. What about the onscreen table we want right now? We’re betting Format-Table could do the job, meaning we don’t even have to write that third function at all—less work for us!
So let’s talk about function design. We’re going to suggest that there are really three categories of functions (or tools): input, functional, and output.
Input tools
Input tools are the functions that don’t produce anything inherently useful, but are rather meant to feed information to a second tool. So a function that retrieves computer names from a configuration management database is an input tool. You don’t necessarily want the computer names, but there might be an endless variety of other tools that you want to send computer names to—including any number of built-in Windows PowerShell commands.
Here’s a good example of how to draw a line between your functions. Let’s say you’re writing a hunk of commands intended to retrieve computer names from your configuration management database. Your intent today is to query some WMI information from those computers. But aren’t there other tools that need computer names as input? Sure! Restart-Computer accepts computer names. So do Get-EventLog, Get-Process, Invoke-Command, and a dozen more commands. That’s what suggests (to us, at least) that functionality for getting names from the database should be a standalone tool. It could potentially feed a lot more than only today’s needs.
Windows PowerShell already comes with a number of input tools. Sticking with the theme of getting computer names, you might use Import-CSV, Get-Content, or Get-ADComputer to retrieve computer names from various sources. To us, this further emphasizes the fact that the task of getting computer names is a standalone capability, rather than being part of another tool.
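To make the idea concrete, here is a minimal sketch of an input tool. The function name Get-InventoryComputerName, the CSV file, and its ComputerName column are all assumptions for illustration; a CSV export simply stands in for whatever your real configuration management database provides.

```powershell
# Hypothetical input tool: its only job is to produce computer names.
function Get-InventoryComputerName {
    [CmdletBinding()]
    param(
        # Path to a CSV export that has a ComputerName column (assumed layout)
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Read the rows, skip any row with an empty name, and emit plain strings
    Import-Csv -Path $Path |
        Where-Object { $_.ComputerName } |
        Select-Object -ExpandProperty ComputerName
}
```

Because the function emits nothing but names, its output could be piped to any command that accepts computer names from the pipeline, and nothing in it needs to change when the downstream task does.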
Functional tools
This is the kind of tool you’ll be writing most often. The idea is that this kind of tool doesn’t spend time retrieving information that it needs to do its main job. Instead, it accepts that information via a parameter of some kind—that parameter being fed by manually entered data, by another command, and so on.
So if your functional tool is going to query information from remote computers, it doesn’t internally do anything to get those computers’ names; but instead, it accepts them on a parameter. It doesn’t care where the computer names come from—that’s another job.
When it’s been given the information it needs to operate, a functional tool does its job and then outputs objects to the pipeline. Specifically, it outputs a single kind of object, so that all of its output is consistent. This functional tool also doesn’t worry about what you plan to do with that output—it simply puts objects into the pipeline. This kind of tool doesn’t spend a nanosecond worrying about formatting, about output files, or about anything else. It does its job, perhaps produces some objects as output, and that’s it.
Note: Not all functional tools will produce output of any kind. A command that just does something (perhaps reconfiguring a computer) might not produce any output, apart from error messages if something goes wrong. That’s fine.
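A functional tool along these lines might look like the following sketch. The function name Get-OSInfo and the output property names are our own assumptions, not something from the book; Get-CimInstance and the Win32_OperatingSystem class are standard on Windows (PowerShell 3.0 and later), so the query itself will only work there.

```powershell
# Hypothetical functional tool: accepts names, queries, emits one kind of object.
function Get-OSInfo {
    [CmdletBinding()]
    param(
        # Names can come from the pipeline or be typed in; the tool does not care which
        [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [string[]]$ComputerName
    )

    process {
        foreach ($name in $ComputerName) {
            try {
                $os = Get-CimInstance -ClassName Win32_OperatingSystem -ComputerName $name -ErrorAction Stop

                # Always the same shape of object, with no formatting applied
                [pscustomobject]@{
                    ComputerName  = $name
                    OSCaption     = $os.Caption
                    OSVersion     = $os.Version
                    OSBuildNumber = [int]$os.BuildNumber
                }
            }
            catch {
                # Report the failure and carry on with the next computer
                Write-Error "Could not query ${name}: $_"
            }
        }
    }
}
```

Called as 'SERVER1','SERVER2' | Get-OSInfo or as Get-OSInfo -ComputerName SERVER1, it behaves the same way. What happens to the objects afterward is left entirely to whatever comes next in the pipeline.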
Output tools
Output tools are specifically designed to take data (in the form of objects), which has been produced by a functional tool, and then put that data into a final form. Let’s stress that: final form. We looked up final in our dictionary, and it says something like, “pertaining to or coming at the end; last in place, order, or time.” In other words, when you send your data to an output tool, you’re finished with it. You don’t want anything else happening to the data. You want to save it in a file or a database, or display it onscreen, or fax it to someone, or tap it out in Morse code…whatever. Windows PowerShell verbs for this include Export, Out, and ConvertTo, to name a few.
Consider the inverse of this philosophy: If you have a tool that’s putting data into some final form, like a text file or an onscreen display, that tool should be doing nothing else. Why?
Consider a function that we’ve created, named Get-ComputerDetails. This function gets a bunch of information from a bunch of computers. It then produces a pretty, formatted table on the screen. That’s a text-based display. Doing so means we could never do this:
Get-ComputerDetails | Where OSBuildNumber -le 7600 |
Sort ComputerName | ConvertTo-HTML | Out-File computers.html
Why couldn’t we do that? Because, in this example, Get-ComputerDetails is producing text. Where-Object, Sort-Object, and ConvertTo-HTML can’t deal with text—they deal with objects. Get-ComputerDetails has put our data into its final form, meaning—according to the dictionary—that Get-ComputerDetails is “coming at the end” and should be “last in place.” Nothing can come after it—meaning we have less flexibility.
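You can see the difference for yourself with a small experiment. This is only a sketch: the single sample object below is made-up data standing in for whatever Get-ComputerDetails would really return.

```powershell
# Made-up sample data standing in for one computer's details
$details = [pscustomobject]@{ ComputerName = 'SERVER1'; OSBuildNumber = 7600 }

# Final form: the table is rendered to text, so the properties are gone
$asText = $details | Format-Table | Out-String
$asText -is [string]              # True
$null -eq $asText.OSBuildNumber   # True: a string has no OSBuildNumber to filter on

# Object form: the property is still there for downstream commands
$details.OSBuildNumber            # 7600
```

Once the data is text, there is no OSBuildNumber property left for Where-Object to compare and no ComputerName property for Sort-Object to sort on.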
A better design would have Get-ComputerDetails produce only objects, with a second command, perhaps called Format-MyPrettyDisplay, handling the formatting. That way we could get our originally desired output:
Get-ComputerDetails | Format-MyPrettyDisplay
But we could also do this:
Get-ComputerDetails | Where OSBuildNumber -le 7600 |
Sort ComputerName | ConvertTo-HTML | Out-File computers.html
This would allow us to change our minds about using Format-MyPrettyDisplay from time to time, instead sending our data objects on to other commands to produce different displays, filter the data, sort the data, create files, and so on.
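Here is one way that split could be sketched. Treat it as an illustration under stated assumptions rather than the book's implementation: Get-ComputerDetails returns hard-coded sample objects instead of querying real computers, and the body of Format-MyPrettyDisplay is simply our guess at what such a formatter might do.

```powershell
# Stand-in for the real tool: emits objects only (sample data, not a real query)
function Get-ComputerDetails {
    [pscustomobject]@{ ComputerName = 'SERVER2'; OSBuildNumber = 9200 }
    [pscustomobject]@{ ComputerName = 'SERVER3'; OSBuildNumber = 7100 }
    [pscustomobject]@{ ComputerName = 'SERVER1'; OSBuildNumber = 7600 }
}

# Separate output tool: formatting is all it does
function Format-MyPrettyDisplay {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [psobject]$InputObject
    )

    begin   { $items = New-Object System.Collections.ArrayList }
    # Collect everything first so the table columns can be sized once
    process { [void]$items.Add($InputObject) }
    end     { $items | Format-Table -Property ComputerName, OSBuildNumber -AutoSize }
}
```

With the sample data above, both pipelines shown earlier work: piping to Format-MyPrettyDisplay draws the table, while the Where, Sort, and ConvertTo-HTML pipeline still has real objects to filter and sort.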
This blog discussed the basics of good Windows PowerShell tool design. A function should perform only one of the following actions:
- Retrieve data from someplace
- Process data
- Output data to some place
- Put data into a visual format meant for human consumption
We talked about three different categories of functions, or tools: input, functional, and output.
~Don and Jeffery
Here is the code for the discount offer today at www.manning.com: scriptw7
Valid for 50% off Learn PowerShell Toolmaking in a Month of Lunches and SharePoint 2010 Owner's Manual
Offer valid from April 7, 2013 12:01 AM until April 8, midnight (EST)
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy