Importing a TextFile into a Database using a Dataset
This question comes up a bit on the forums (twice today already) where the user wants to import data from one Datasource (such as a TextFile) into a database (SQL Server / Access / Whatever). Now in the case of SQL Server my natural reaction would be to suggest DTS for SQL 2000 or check out some of the new cool features of SQL 2005 (which I've yet to experience). However, there are occasions - especially if some processing needs to be performing during the import, where it would be a lot easier if the import was done in .NET.
Now sure, if you want to read in your text file using a text reader - throw it into an array then utilise a DataCommand object and ExecuteNonQuery then go for it. However, if you like your coding a bit easier and find the DataTable a versatile object to store your data and perform various routine upon it before you upload, well, gather closer children...
OK - at my local pub they run a Bonus Ball thing. Basically, everyone has a number and pays one pound per week for privilege of that number. If the said number comes out in the National UK Lotto as the Bonus Ball on a Saturday, then you win the lot (up to £49). Now there's a lot of banter about who keeps winning (or seems to keep winning) and those with sob stories about how they haven't won in 'x' number of years. So on occasion I grab all the results from a website (Lottery Results) throw it into SQL and run a quick GROUP BY query to work out exactly when people have won.
So - for this post I've taken the results of the page and saved it as a text file. Got rid of the 'UK National Lotto Winning Numbers' and the 'All lotteries below have exceeded the 180 days expiry date' strings and second set of field titles just below this (maintaining the first line with the field names) and there's my text file for import.
I place the text file into an appropriate directory, create my table in a test Database, structure as is...
CREATE TABLE [dbo].[tblLottery] ([EntryId] [int] IDENTITY (1, 1) NOT NULL ,
[LotteryId] [int] NOT NULL ,
[BonusBall] [int] NOT NULL ,
[WeekDay] [nvarchar] (3) COLLATE Latin1_General_CI_AS NOT NULL ,
[DrawDay] [int] NOT NULL ,
[DrawMonth] [nvarchar] (10) COLLATE Latin1_General_CI_AS NOT NULL ,
[DrawYear] [int] NOT NULL
) ON [PRIMARY]
GO
and then utilise my code...
' DatasetDim ds As New DataSet
' Text File connection
Dim oleTextConn As New OleDbConnection
' Set the connection string
oleTextConn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\MyImportDirectory\;Extended Properties=""text;HDR=Yes;FMT=Delimited"""
' Configure a dataadapter to load in the text data into my dataset
Dim daTextFile
As New OleDbDataAdapter
' Generate the Select Command
Dim selCommand
As New OleDbCommand
' Configure
With selCommand
.CommandText = "SELECT * FROM [Lottery.txt]"
.CommandType = CommandType.Text
.Connection = oleTextConn
End
With
' Add to the dataadapterdaTextFile.SelectCommand = selCommand
' Fill the dataset
oleTextConn.Open()
daTextFile.Fill(ds)
' Cleanup our text connections
oleTextConn.Close()
daTextFile.Dispose()
daTextFile = Nothing
oleTextConn.Dispose()
oleTextConn = Nothing
' Create the SQL Connection
Dim sqlConn
As New SqlConnection
sqlConn.ConnectionString =
"Data Source=ARNIE;Initial Catalog=MyLotteryDB;Integrated Security=SSPI;"
' Create a new SQL DataAdapter
Dim daSQL
As New SqlDataAdapter
' Generate the Insert Command
Dim insCommand
As New SqlCommand
With insCommand
.CommandText = "INSERT INTO tblLottery ( [LotteryId] , [BonusBall] , [WeekDay] , [DrawDay] , [DrawMonth] , [DrawYear] ) VALUES ( @LotteryId , @BonusBall , @WeekDay, @DrawDay , @DrawMonth, @DrawYear )"
.CommandType = CommandType.Text
.Connection = sqlConn
' Configure the insert parameters
.Parameters.Add(New SqlParameter("@LotteryId", SqlDbType.Int, 0, "No#"))
.Parameters.Add(New SqlParameter("@BonusBall", SqlDbType.Int, 0, "BN"))
.Parameters.Add(New SqlParameter("@WeekDay", SqlDbType.NVarChar, 3, "Day"))
.Parameters.Add(New SqlParameter("@DrawDay", SqlDbType.Int, 0, "DD"))
.Parameters.Add(New SqlParameter("@DrawMonth", SqlDbType.NVarChar, 10, "MMM"))
.Parameters.Add(New SqlParameter("@DrawYear", SqlDbType.Int, 0, "YYYY"))
End With
' Add to the dataadapter
daSQL.InsertCommand = insCommand
' Loop through each row and set its rowstate to added
' This will cause the dataadapter to think every row is new and needs inserting...
For Each dr
As DataRow
In ds.Tables(0).Rows
dr.SetAdded()
Next
' Open the SQL Connection and Update
sqlConn.Open()
daSQL.Update(ds.Tables(0))
' Cleanup
sqlConn.Close()
daSQL.Dispose()
daSQL = Nothing
sqlConn.Dispose()
sqlConn = Nothing
...and job done. Takes around 3 seconds to process on my machine which is pretty slick huh?
Design Notes:
- If you are not familiar with the Text File OleDb Driver, you specify the directory only in the connection string then the actual filename in the SELECT statement. In other words, each text file should be classed as its own data entity.
- When a datarow is loaded into a dataset, it rowstate will be set with the 'Unchanged' flag. If we then run our SQLDataAdapter nothing will happen as the dataset knows that the records have not been altered. In order for them to be inserted we reset this flag to 'Added' that will invoke the 'InsertCommand' when the 'Update' routine is called.
- Note: I use the 'SetAdded' method of the DataRow object to set the DataRow's RowState. This is a Framework 2.0 method only. But this can be worked around for Framework 1.*, e.g. create a duplicate DataTable and 'Add' the records from the sourcetable to new table - this should set the RowState flag.
HTH - M