mehtat_90
mehtat_90

Reputation: 628

Split flat file into multiple files

I need to create a package which splits the huge flat file into multiple flat files.

I have flat file which has 20 million rows and now I need to split this flat file (Each flat file needs to have 55 k rows)

Example:- if there are 111 rows in total, I would have to create 3 files.

file1.txt will have 1-55 rows file2.txt will have 55-110 rows file3.txt will have 1 rows .

What options do I have?

I am using Visual Studio 2012 for this project.

Upvotes: 0

Views: 1052

Answers (2)

SFrejofsky
SFrejofsky

Reputation: 772

you could try something like this... its pretty rudimentary and I am sure that someone will point out that it is not going to be the most efficient thing but its a solid option. Note that you will need to add some try catch error handling.

            int recper = 0; // this is where you will assign the number of records per file 
            int reccount = 0;
            int filecount = 1;
            string filename = "testfilename";
            string networkDirectory = @"c:\fakepath\";
            string fileToRead = @"c:\fakepath\textfile.txt";

            using (StreamReader reader = new StreamReader(fileToRead,Encoding.Default,true))
            {
                while (reader.Peek() > 0)
                {
                    using (StreamWriter writer = new StreamWriter(Path.Combine(networkDirectory, filename + filecount + ".txt"), true, Encoding.Default))
                    {
                        writer.Write(reader.ReadLine());
                    }
                    reccount++;
                    // checks on each iteration of the while loop to see if the 
                    // current record count matches the number of records per file
                    // if sso reset reccount and change increment filecount to change the file name 
                    if (reccount == recper)
                    {
                        reccount = 0;
                        filecount++;
                    }
                }
            }

Upvotes: 1

Tab Alleman
Tab Alleman

Reputation: 31775

Another way you can do this is in a dataflow:

First use your method of choice to add a "Row Number" column to your data flow (unless there is already one in your flat file output, in which case skip this step and use that):

https://www.google.com/search?sourceid=navclient&aq=&oq=add+rownumber+column+to+ssis+dataflow&ie=UTF-8&rlz=1T4GGNI_enUS551US551&q=add+rownumber+column+to+ssis+dataflow&gs_l=hp....0.0.0.6218...........0._Qm62-0x_YQ

Then add a MultiCast transformation to your dataflow, and use the row number to split the stream and send it to different destinations:

Row 1 - 55k, -> File1

Row 55001 - 110k -> File2

etc.

Upvotes: 1

Related Questions