Categories
Tech

MS SQL Server: Adding Line Number for Imported Flat File in CSV Format

This is the procedure for writing the line number of each record imported into a table from a flat file in csv format. This is for Microsoft SQL Server using SSIS and Visual Studio to create the ETL package. This assumes that you already know how to use these tools, more or less. I’m no expert, but I wanted to pass along what I’d learned because it took so long to piece this together from what was available online. Below is a diagram of my Data Flow for reference.

dataflow

  1. Make sure that a column to store the line numbers has already been created in the SQL Prep task.
  2. Create a variable with Data Flow scope called DummyVar or anything you wish since it’s just a dummy variable to enable the next step.
  3. Create a Derived Column transformation after the Data Conversion transformation. Give the derived column the same name as the column you created to hold the line numbers, such as LINENO and enter @[User::DummyVar] as the expression.
  4. Create a Script Component following the Derived Column transformation. Set the derive column LINENO as its input in read/write mode. Use the following Visual Basic script in the component.

======

‘ Microsoft SQL Server Integration Services user script component
‘ This is your new script component in Microsoft Visual Basic .NET
‘ ScriptMain is the entrypoint class for script components

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
Inherits UserComponent
Private LineNumber As Integer

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

LineNumber += 1
Row.LINENO = LineNumber

End Sub

End Class

=====

Finally just make sure that the LINENO column is in your destination transformation and you’re all set. In my Data Flow I added a viewer just to check that things were working during development. Again, I’m sure that there is some more elegant way of doing this, but this seemed straight forward and simple. And I could not find a single description of doing this very simple thing online anywhere. While there is a Row Count transformation available, it does not allow you to keep a running tally of the record number as they are processed.

This feels a bit like a hack to me, so if anyone out there has a better way of doing this, please feel free to comment!