Scenario:
I have a sample file which looks like this
I have a scenario, where in I need to split the above file into multiple files at every occurrence of the word "Group".
It means, I need to split the above "Test.his" file into 7 files (Why is it 7 ? Answer: Count the occurrences of the word "Group").
It means, I need to split the above "Test.his" file into 7 files (Why is it 7 ? Answer: Count the occurrences of the word "Group").
Solution:
The solution for this gets simpler using an Execute SQL Task , Control Table in the database , For Each Loop Container and Script Task functionality. Lets see how we could achieve this.
Step 1:
Lets create a control table in our database.
CREATE TABLE [dbo].[Test_ControlTable](
[GROUPNAME] [varchar](20) NULL,
[SEARCHINDEX] [varchar](50) NULL
)
GO
Step 2:
Execute the below mentioned script to insert rows into the table.
INSERT INTO [dbo].[Test_ControlTable] VALUES('PATIENT','Group:PATIENT');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('APRENURS','Group:APRENURS');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('PHYS','Group:PHYS');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('PR','Group:PR');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('PN','Group:PN');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('PROCTIM','Group:PROCTIM');
GO
INSERT INTO [dbo].[Test_ControlTable] VALUES('STUDY','Group:STUDY');
GO
Your SELECT * from [dbo].[Test_ControlTable] gives the below results.
Step 3:
Create a new project in Visual Studio(BIDS) with Project Name as Test.
Step 4:
Drag and drop Execute SQL Task into work space of the package.
And configure it as below
ResultSet : Full Result Set
SQLStatement : SELECT GroupName, SearchIndex FROM [dbo].[Test_ControlTable]
In the Result set tab,
Click on Add, and give Result name as 0 (Zero) and map it to Object Variable (create one in variables window as shown in the image )
Step 5:
Drag and drop For Each Loop container and Script Task into the work space as shown below.
This means, the script task runs for every row that is coming from the control table.
Step 6:
Create two variables in the variables window as shown below.
Step 7:
Configure the For Each Loop container as shown
Step 8:
Edit the Script Task with the following script:
using System;
using System.Text;
using System.Linq;
using System.IO;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Collections.Generic;
using System.Text.RegularExpressions;
#endregion
public void Main()
{
// TODO: Add your code here
//Declare and Initialize the variables
String vFileName = "C:\\Users\\tmhsxk35\\Documents\\Test.his" ;
String vGroupName = Dts.Variables["User::vGroupName"].Value.ToString();
String vSearchIndex = Dts.Variables["User::vSearchIndex"].Value.ToString();
String vSplitFilePath = "C:\\Users\\tmhsxk35\\Documents\\SplitFiles\\";
// Reading the file
String s = File.ReadAllText(vFileName, Encoding.Default);
//Finding the location of vSearchIndex
Match match = Regex.Match(s, @"\b" + Regex.Escape(vSearchIndex) + @"\b", RegexOptions.IgnoreCase);
//If the search is successful, then capture the position of the Index into vPos
if (match.Success)
{
int vPos = match.Index;
//Assigning the position of next occurence of "Group" to vNextPos
int vNextPos = s.IndexOf("Group", vPos + 1);
//Capturing the content between the occurences of "Group" into vContent
String vContent = s.Substring(vPos, vNextPos - vPos - 2);
//Writing the content to a file
StreamWriter sw = File.CreateText(vSplitFilePath + vGroupName + ".csv");
sw.Write(vContent);
sw.Flush();
sw.Close();
Dts.TaskResult = (int)ScriptResults.Success;
}
Dts.TaskResult = (int)ScriptResults.Success;
}
Use the script mentioned above to split the files.
Save the package. And execute the package to see the file split to 7 files at the path mentioned above.
Please modify the script as required.
Thanks for taking time to check out the post!
Thanks for taking time to check out the post!
P.S:
I used For Each Loop Container, Control Table, Execute SQL Task because my searchIndex is a variable. If your searchIndex is constant, you dont have to use them. Just initialize everything in variables inside the script task.
No comments:
Post a Comment