Summary: Learn four easy ways to use Windows PowerShell to import CSV files into SQL Server.
Microsoft Scripting Guy, Ed Wilson, is here. I was chatting this week with Microsoft PowerShell MVP, Chad Miller, about the series of blogs I recently wrote about using CSV files. He thought a helpful addition to the posts would be to talk about importing CSV files into a SQL Server. I most heartily agreed. Welcome to Guest Blogger Week. We will start off the week with a bang-up article by Chad Miller. Chad has previously written guest blogs for the Hey, Scripting Guy! Blog. Here is a little information about Chad:
Chad Miller is a SQL Server database admin and the senior manager of database administration at Raymond James Financial. In his spare time, he is the project coordinator and developer of the CodePlex project SQL Server PowerShell Extensions (SQLPSX). Chad leads the Tampa Windows PowerShell User Group, and he is a frequent speaker at SQL Saturdays and Code Camps.
Contact information:
Blog: Sev17
Twitter: cmille19
Importing CSV files into SQL Server
Windows PowerShell has built in support for creating CSV files by using the Export-CSV cmdlet. However, the creation of a CSV file is usually only a short stop in an overall process that includes loading the file into another system. In this post, we'll look at a few scripted-based approaches to import CSV data into SQL Server. Note: SQL Server includes a component specifically for data migration called SQL Server Integration Services (SSIS), which is beyond the scope of this article.
T-SQL BULK INSERT command
The T-SQL BULK INSERT command is of the easiest ways to import CSV files into SQL Server. The BULK INSERT command requires a few arguments to describe the layout of the CSV file and the location of file. Let's look at an example of creating a CSV file by using Export-CSV, and then importing the information into a SQL Server table by using BULK INSERT.
Requirements
- Sysadmin or insert and bulkadmin to SQL Server
- Local access to SQL Server
Setup
1. Download the following script: Invoke-SqlCmd2.ps1
2. Create a table disk space by copying the following code in SQL Server Management Studio.
Note: The example uses a database named "hsg."
CREATE TABLE dbo.diskspace(
UsageDate datetime,
SystemName varchar(50),
Label varchar(50),
VolumeName varchar(50),
Size decimal(6,2),
Free decimal(6,2),
PrecentFree decimal(5,2)
)
The following image shows the command in SQL Server Management Studio.
3. Save the following script as Get-DiskSpaceUsage.ps1, which will be used as the demonstration script later in this post.
param($ComputerName=".")
Get-WmiObject -computername "$computername" Win32_Volume -filter "DriveType=3" | foreach {
new-object PSObject -property @{
UsageDate = $((Get-Date).ToString("yyyy-MM-dd"))
SystemName = $_.SystemName
Label = $_.Label
VolumeName = $_.Name
Size = $([math]::round(($_.Capacity/1GB),2))
Free = $([math]::round(($_.FreeSpace/1GB),2))
PercentFree = $([math]::round((([float]$_.FreeSpace/[float]$_.Capacity) * 100),2))
}
} | Select UsageDate, SystemName, Label, VolumeName, Size, Free, PercentFree
Now we will use the script Get-DiskSpaceUsage.ps1 that I presented earlier. It lists information about disk space, and it stores the information in a CSV file.
./get-diskusage.ps1 | export-csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation
The generated CSV file shows that Export-CSV includes a text delimiter of double quotes around each field:
"UsageDate","SystemName","Label","VolumeName","Size","Free","PercentFree"
"2011-11-20","WIN7BOOT","RUNCORE SSD","D:\","59.62","31.56","52.93"
"2011-11-20","WIN7BOOT","DATA","E:\","297.99","34.88","11.7"
"2011-11-20","WIN7BOOT",,"C:\","48","6.32","13.17"
"2011-11-20","WIN7BOOT","HP_TOOLS","F:\","0.1","0.09","96.55"
Although many programs handle CSV files with text delimiters (including SSIS, Excel, and Access), BULK INSERT does not. To use BULK INSERT without a lot of work, we'll need to remove the double quotes. We can use a quick and dirty way of simply replacing all the quotes in the CSV file. In the blog post Remove Unwanted Quotation Marks from CSV Files by Using PowerShell, the Scripting Guys explains how to remove double quotes. This method can be used for circumstances where you know it won't cause problems. How do you know? Well, the data being generated from our Get-DiskspaceUsage should never have double quotes or commas in the data. So here's the code to remove the double quotes:
(Get-Content C:\Users\Public\diskspace.csv) | foreach {$_ -replace '"'} | Set-Content C:\Users\Public\diskspace.csv
UsageDate,SystemName,Label,VolumeName,Size,Free,PercentFree
2011-11-20,WIN7BOOT,RUNCORE SSD,D:\,59.62,31.56,52.93
2011-11-20,WIN7BOOT,DATA,E:\,297.99,34.88,11.7
2011-11-20,WIN7BOOT,,C:\,48,6.32,13.17
2011-11-20,WIN7BOOT,HP_TOOLS,F:\,0.1,0.09,96.55
Now we are ready to import the CSV file as follows:
. .\Invoke-SqlCmd2.ps1
$query = @"
BULK INSERT hsg.dbo.diskspace FROM 'C:\Users\Public\diskspace.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')
"@
Invoke-SqlCmd2 -ServerInstance "$env:computername\sql1" -Database hsg -Query $query
The following data shows that our CSV file was successfully imported.
UsageDate
|
System Name
|
Label
|
Volume Name
|
Size
|
Free
|
Percent Free
|
11/20/2011 12:00:00 AM
|
WIN7BOOT
|
RUNCORE SSD
|
D:\
|
59.62
|
31.56
|
52.93
|
11/20/2011 12:00:00 AM
|
WIN7BOOT
|
DATA
|
E:\
|
297.99
|
34.88
|
11.70
|
11/20/2011 12:00:00 AM
|
WIN7BOOT
|
|
C:\
|
48.00
|
6.32
|
13.17
|
11/20/2011 12:00:00 AM
|
WIN7BOOT
|
HP_TOOLS
|
F:\
|
0.10
|
0.09
|
96.55
|
BULK INSERT works reasonably well, and it is very simple. However, there are some drawbacks, including:
- You need elevated permissions on SQL Server.
- BULK INSERT doesn't easily understand text delimiters.
- Using the UNC path to files requires an additional setup, as documented under Permissions on the BULK INSERT site.
For these reasons, let's look at some alternate approaches.
Before there was Windows PowerShell, there was LogParser
LogParser is a command-line tool and scripting component that was originally released by Microsoft in the IIS 6.0 Resource Kit. LogParser provides query access to different text-based files and output capability to various data sources including SQL Server. Even though this little tool hasn't been updated since 2005, it has some nice features for loading CSV files into SQL Server.
Setup
Download and install LogParser 2.2.
LogParser can do a few things that we couldn't easily do by using BULK INSERT, including:
- Automatically create a table based on the CSV layout
- Handle the text delimiter of double quotes
Note: CSV files do not need to be local.
Using LogParser
You can use the LogParser command-line tool or a COM-based scripting interface. Let's look at examples of both.
LogParser command-line tool
LogParser requires some special handling, which is why we use Start-Process. Some switches and arguments are difficult to work with when running directly in Windows PowerShell. Also Windows Powershell_ISE will not display output from LogParser that are run via the command-line tool. Here is the syntax for running a command to generate and load a CSV file:
./get-diskspaceusage.ps1 | export-csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation -Force
#Uncomment/comment set-alias for x86 vs. x64 system
#set-alias logparser "C:\Program Files\Log Parser 2.2\LogParser.exe"
set-alias logparser "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe"
start-process -NoNewWindow -FilePath logparser -ArgumentList @"
"SELECT * INTO diskspaceLP FROM C:\Users\Public\diskspace.csv" -i:CSV -o:SQL -server:"Win7boot\sql1" -database:hsg -driver:"SQL Server" -createTable:ON
"@
Looking at SQL Server, we see that our newly created table contains the CSV file:
Filename
|
Row Number
|
Usage Date
|
System Name
|
Label
|
Volume Name
|
Size
|
Free
|
Percent Free
|
C:\Users\Public\diskspace.csv
|
2
|
2011-11-20
|
WIN7BOOT
|
RUNCORE SSD
|
D:\
|
59.62
|
31.56
|
52.93
|
C:\Users\Public\diskspace.csv
|
3
|
2011-11-20
|
WIN7BOOT
|
DATA
|
E:\
|
297.99
|
34.88
|
11.7
|
C:\Users\Public\diskspace.csv
|
4
|
2011-11-20
|
WIN7BOOT
|
|
C:\
|
48
|
6.32
|
13.16
|
C:\Users\Public\diskspace.csv
|
5
|
2011-11-20
|
WIN7BOOT
|
HP_TOOLS
|
F:\
|
0.1
|
0.09
|
96.55
|
The CreateTable switch will create the table if it does not exist; and if it does exist, it will simply append the rows to the existing table. Also notice that we got two new columns: Filename and Row Number, which could come in handy if we are loading a lot of CSV files. You can eliminate the Filename and Row Number columns by specifying the column list in the Select statement as we'll see in a moment.
LogParser COM scripting
Using the COM-based approach to LogParser is an alternative method to using the command line. Although the COM-based approach is a little more verbose, you don't have to worry about wrapping the execution in the Start-Process cmdlet. The COM-based approach also handles the issue with Windows Powershell ISE. Here is code to work with the COM object:
$logQuery = new-object -ComObject "MSUtil.LogQuery"
$inputFormat = new-object -comobject "MSUtil.LogQuery.CSVInputFormat"
$outputFormat = new-object -comobject "MSUtil.LogQuery.SQLOutputFormat"
$outputFormat.server = "Win7boot\sql1"
$outputFormat.database = "hsg"
$outputFormat.driver = "SQL Server"
$outputFormat.createTable = $true
$query = "SELECT UsageDate, SystemName, Label, VolumeName, Size, Free, PercentFree INTO diskspaceLPCOM FROM C:\Users\Public\diskspace.csv"
$null = $logQuery.ExecuteBatch($query,$inputFormat,$outputFormat)
The main drawback to using LogParser is that it requires, well…installing LogParser. For this reason, let's look at one more approach.
Use Windows PowerShell to collect server data and write to SQL Server
In my previous Hey, Scripting Guy! post, Use PowerShell to Collect Server Data and Write to SQL, I demonstrated some utility functions for loading any Windows PowerShell data into SQL Server. Let's revisit this solution using the CSV file example:
Setup
Download the following scripts:
Run the following code to create a CSV file, convert to a data table, create a table in SQL Server, and load the data:
. .\out-datatable.ps1
. .\Add-SqlTable.ps1
. .\write-datatable.ps1
. .\Invoke-SqlCmd2.ps1
$dt = .\Get-DiskSpaceUsage.ps1 | Out-DataTable
Add-SqlTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName diskspaceFunc -DataTable $dt
Write-DataTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName "diskspaceFunc" -Data $dt
invoke-sqlcmd2 -ServerInstance "Win7boot\Sql1" -Database "hsg" -Query "SELECT * FROM diskspaceFunc" | Out-GridView
The following image shows the resulting table in Grid view.
The observant reader will notice that I didn't write the information to a CSV file. Instead, I created an in-memory data table that is stored in my $dt variable. This is because by using this approach, there was not a need to create a CSV file, but for completeness let's apply the solution to our CSV loading use case:
. .\out-datatable.ps1
. .\Add-SqlTable.ps1
. .\write-datatable.ps1
. .\Invoke-SqlCmd2.ps1
./get-diskspaceusage.ps1 | export-csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation -Force
$dt = Import-Csv -Path "C:\Users\Public\diskspace.csv" | Out-DataTable
Add-SqlTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName diskspaceFunc -DataTable $dt
Write-DataTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName "diskspaceFunc" -Data $dt
This post demonstrated three approaches to loading CSV files into tables in SQL Server by using a scripted approach. The approaches range from using the very simple T-SQL BULK INSERT command, to using LogParser, to using a Windows PowerShell function-based approach.
Thank you, Chad, for sharing this information with us. It looks like your last four scripts have the makings of an awesome NetAdminCSV module.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy