Summary: Use a super simple, one-line command to remove columns easily from a CSV file using Windows PowerShell.
Hey, Scripting Guy! I have a CSV file from which I need only two of eight columns. Can Windows PowerShell delete certain columns? I have searched the Internet for days with no luck.
—RK
Hello RK,
Microsoft Scripting Guy Ed Wilson here. The Scripting Wife and I are absolutely enjoying Canada. The classes I have been teaching are populated with extremely bright students who have asked engaging questions. I love teaching because I get lots of good questions that force me to spend time learning new things about Windows PowerShell. Great questions are also one of the reasons I like reading the email sent to scripter@microsoft.com.
RK, your question jumped out at me today. The reason is that at first I thought it would be a pain to answer because I would need to parse through the file, and I hate parsing text. But then, I thought about it some more. A comma-separated value (CSV) file is structure text—it is not plain text. In addition, Windows PowerShell has several great cmdlets for working with CSV files. I then wondered if, after I modify the stream, I can pipe it back out to another CSV file?
I have a CSV file with process IDs, process names, and CPU time. I can use the Import-CSV cmdlet to read a CSV file and to display the contents on the Windows PowerShell console. This is shown in the following figure.
The text file itself contains the same information. The file is shown in the following figure.
It is obvious the problem with the data stored in this CSV file is that it is missing the CPU column for about half of the processes. This can be a distraction, so I want to create a new CSV file that contains the process name and the process ID information. To do this manually would be a major pain. To write a VBScript to do this would be an even bigger pain. But doing this with Windows PowerShell is surprisingly easy. The secret does not lie in either the Import-CSV cmdlet that creates an object from the CSV file, or in the Export-CSV cmdlet that writes the information to a CSV file. The secret is the Select-Object cmdlet. The cool thing about the Select-Object cmdlet is that it creates a custom object from the piped input.
Custom objects are cool. In fact, the Import-CSV cmdlet itself creates a custom object if no type information exists in the file. The following output illustrates this fact:
PS C:\> Import-Csv C:\fso\co.csv | Get-Member
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
CPU NoteProperty System.String CPU=
Id NoteProperty System.String Id=2008
ProcessName NoteProperty System.String ProcessName=AEADISRV
When I use the Select-Object cmdlet to choose the processname and the id properties, I still have a custom object, but this time it only contains two properties, as shown in this output:
PS C:\> Import-Csv C:\fso\co.csv | select processname,id | get-member
TypeName: Selected.System.Management.Automation.PSCustomObject
Name MemberType Definition
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Id NoteProperty System.String Id=2008
ProcessName NoteProperty System.String ProcessName=AEADISRV
Because the Export-CSV cmdlet accepts piped input, all I need to do is pipe the custom object from the Select-Object cmdlet (select is an alias) to the Export-CSV cmdlet. The command is shown here:
Import-Csv C:\fso\co.csv | select processname,id | Export-Csv -Path c:\fso\modco.csv –NoTypeInformation
When I open the newly created CSV file in Notepad, I can see that the command worked. I no longer have any CPU time information recorded in the file. The file is shown in the following figure.
RK, that is all there is to using Windows PowerShell to remove columns from a CSV file. The secret lies in the way that Windows PowerShell pipes objects. And when working with objects, the easiest way to create a custom object is to use the Select-Object cmdlet. In fact, if you have been following the 2011 wrap-up articles, you will see that I was recommending using Select-Object to return an object from a function instead of using Format-Table or some other format type of cmdlet.
Join me tomorrow for more cool Windows PowerShell stuff. See you here!
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy