Friday, May 3, 2013

Loop through excel column names using SSIS

How do I get the field names from a specific excel sheet in an excel file? There was a question asked on SSIS MSDN forum which forced me to blog on this.

Suppose the excel file has more than one sheets and the sheet we are interested in is ImpData which has two columns (Data1 and Data2). The aim is to get these field names and check if these fields are present in a specific sql server table. If fields Data1 and Data2 are not present in the table then add them to the table. Once the columns are added to the table load the excel data to the sql server table.

Step1:Create a table ExcelFields (Field1 Varchar(20))

Step2:Create 4 package scoped string variables SheetName, FieldName, ExcelSheetName and AlterSQL. Set the value for SheetName as ImpData$ as this is the sheet that contains the relevant data to be fetched and loaded to sql server table.

Step3:Create an connection for the excel file. Use a foreach loop to iterate through excel sheets and save each of the sheet name in ExcelSheetName variable. Check my blog on how to create connection manager and save the excel sheet name in a variable.

Step4:Have a foreach loop within the foreach loop used to capture the sheet name (Step3) and click on Collection in the left pane in foreach loop editor. Select the enumerator as "Foreach ADO.NET Schema Rowset Enumerator". Under enumerator configuration select the connection created in Step2. Select "Columns" as Schema from the drop down box.
Click on "SetRestrictions" button and select Table_Catalog, Table_Schema and Table_Name. Highlight Table_Name and Select SheetName variable as shown:

Step5:Click on Variable Mappings in left pane and select the variable FieldName (created in Step1) and set the index as 3.

Following is the control flow so far:

Step6: Put a dummy script task inside the outer foreach loop and hook it to the inner foreach loop container. Add a precedence constraint (Evaluation operation as expression and constraint and value as success)using following expression: @[User::SheetName]==@[User::ExcelSheetName].

Step7: Set the evaulate as expression property for variable AlterSQL as true and set the following expression: "Alter Table ExcelFields Add " +  @[User::FieldName] + "  Varchar (50)"

Step8: Add an execute sql task inside the inner foreach loop and set the connection manager to point to the table that needs to be altered. Use AlterSQL as the sql source in execute sql task.

Step9: An execute sql task can be added out of the both the foreach loops and be used to load the excel data to table using the distributed queries mentioned here.

Note: The expression for AlterSQL can be modified to check and ignore if the fields are already present in the table

Saturday, April 3, 2010

Error handling using script component

This post is just to show how to trigger an event handler for a data flow task if any component inside the same data flow task fails. This approach will work only for those data flow task components which have error output (Merge, Merge Join etc.. has no error op).
Poblem: Suppose I have a lookup component inside Data flow task and any mismatch should be treated as error and an appropriate error message should be sent using Send Mail Task.
Solution: Take a data flow task and configure it using an OLEDB source, lookup component and take the error op of lookup to a script component. Check
here for error handling using script component. The data flow task would be like:
Create a string variable (in this case the name of variable is Error) and use this variable as ReadWriteVariables inside script component editor as shown below:

Use following code inside the script transformation editor (I am using VB.Net code):
Now create an event handler for the data flow task for "On Task Failed" event and add a Send Mail Task. For demo purpose I am using a script task (as I don't have smtp server configured) as shown below:

Value of variable (Error) can be verified by adding following piece of code inside scipt task editor (don't forget to set Error variable as ReadOnlyVariable inside Scrpt task editor):
Public Sub Main()
Dts.TaskResult = ScriptResults.Success

End Sub

Saturday, January 16, 2010

Executing an SSIS package using batch file or stored procedure or xp_cmdshell

Creating a batch file and schedule it trough Task Scheduler or SQL Agent
Following is the script (same as used in executing the package through Dtexec command line utility except the value of the variable Mode) that is used in creating a batch file (crate a text file, paste the script and rename it to .bat):

A batch file can be schedules using a task scheduler or SQL Agent. Create a job and this time the step would be OS type job step. Give the actual path of batch file as shown below:

This batch file can be scheduled using windows task scheduler.

Using sp_start_job stored procedure to execute a sql agent job
Stored procedure sp_start_job can be used to execute the already created job. This stored procedure accepts various parameters, job name or job id being the mandatory parameter. Value of the variable “Mode” cannot be changed using this procedure. To verify the result in table, edit the job and set the value of variable Mode as Procedure. Then execute the procedure as:
EXEC msdb.dbo.sp_start_job N'SSISMode'
The procedure is available in msdb database so use this database while executing the procedure. The name of the job is SSISMode. To verify the result, check the table for an entry with value for column Execution Mode as Procedure

Using xp_cmdshell to execute a package from SSMS
Xp_cmdshell is off by default and needs to be enabled by a sysadmin. Check following links to enable xp_cmdshell in SQL Server 2005 and 2008:
SQL Server 2005 and
SQL Server 2008 Once the xp_cmdshell is enabled, write the command to execute the package. The command would be like: EXEC XP_CMDSHELL 'DTEXEC <Place Holder>' Use DtexecUI to set the value of variable Mode as Xp_cmdshell. Then click on Command Line tab and copy the command and paste it in the place holder above (). After executing the package check and verify the result in the table by selecting a column with Execution Mode as Xp_cmdshell. Using this method to execute a package is not recommended because the use of xp_cmdshell requires sysadmin privilege and thus it opens up a security hole. Note: When pasting the command in management studio, don’t try to split the lengthy command into multiple lines (using enter key) to avoid scrolling left and right. Check this link to know why not to do so


Executing an SSIS package through code or sql agent

Using code to execute the package
Create simple windows application to execute a package. Open Visual Studio and create a Windows Application. Add a reference to Micosoft.SQLServer.ManagedDTS as shown below:

In case SQL Server 2005 is used, then select version 9.0 and in case of SQL Server 2008 use the 10.0 version.
Add a button to the form and double click the button to add the following code in button click event:
String path = @" C:\ExecutionModesofSSIS\Package.dtsx";
Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();
Package pack = app.LoadPackage(path, null);
pack.Variables["Mode"].Value = "CSharp";
DTSExecResult Result = pack.Execute();
Add following using statement as well:
using Microsoft.SqlServer.Dts.Runtime
Build the solution and execute it. Check the result in the table. This time there will be an entry for Execution Mode as CSharp.

Using SQL Agent by creating a job (scheduled or unscheduled)
Create a job to execute the package from file system. Right click the SQL Agent by connecting to SQL Server Management Studio and select New Job. Give the job name as SSISMode and then click on Steps. Click on New to add a job step Execute as shown below:

SQL Agent Service Account is the default account which the SQL Agent will use while executing the package. Now click on Set Values tab shown in the above figure (Green colored box) and set the value of variable Mode as SQLAgent as shown below:

Once the value of variable Mode is set, finish the job by clicking OK. Don’t schedule the job (it can be scheduled by clicking on Schedule tab). Now right click the job and click on "Start Job at step…" as shown below:

Check and verify the result in table. There will be an entry for SQLAgent as Execution Mode inside the table. There is one column “Executing User Name” in the table which has not been considered yet. Check the value for that column. For all the execution modes other than SQLAgent, the executing user name will be same (the domain account like: ABCD\nitesh.rai) but for Execution Mode as SQLAgent, the user name will be NT AUTHORITY\SYSTEM. This is the account, which the SQL Agent is configured to use (SQL Agent Service account) which can be changed if required.


Executing an SSIS package through command line utility and power shell

Using Dtexec command line utility
Open command prompt and use following command to execute the package. Set the value of variable Mode as Dtexec as shown below and execute the command.

Verify the result in the table. There is lot of switches used in the above command and the tricky part here is that all this can be generated without writing even a single line inside the command prompt. Open the Execute package Utility using dtexecUI (as we have seen earlier in executing the package using DtExecUI). After selecting the package source and package name, set the variable’s value as Dtexec. Then click on Command Line tab and copy the command and paste it in command prompt after typing DTEXEC. The command that is to be copied from DtExecUI is shown below:

Using Windows Powershell
Windows powershell can also be used to execute the package in same way we executed the package using command line utility with a very small difference. Open windows powershell and use same command which was used in dtexec command line utility and execute as shown below:

In the above figure there is a message “option set is invalid” and the package does not gets executed (can be verified by checking the result in the table). The only small difference between the commands used in command prompt and powershell is double quote (“) after [User::Mode] as shown in red circle below:

Remove this double quote and add a double quote after the variable value powershell as shown below:
Check the result in the table. A new row with Execution Mode as Powershell is present in the table now.