Vasanthan
Vasanthan

Reputation: 310

Limit Output Flat File Size In SSIS

I am using Flat File Destination to export my data's to a Text file. I need to limit the output File size to 1 MB. Is it possible?

Upvotes: 1

Views: 3951

Answers (3)

Edmund Schweppe
Edmund Schweppe

Reputation: 5132

There isn't anything out-of-the-box that will do that in SSIS; however, it's relatively simple to code a Script Transformation that can act as a data flow destination and that will limit the size of its output file. It's not much harder to add some code to create multiple files (each smaller than the specified size) as needed to store all the resulting data.

For example, assume your source query is

SELECT 
    TABLE_CATALOG,
    TABLE_SCHEMA,
    TABLE_NAME,
    COLUMN_NAME,
    ORDINAL_POSITION,
    COLUMN_DEFAULT,
    IS_NULLABLE,
    DATA_TYPE,
    CHARACTER_MAXIMUM_LENGTH 
FROM INFORMATION_SCHEMA.COLUMNS

and you're trying to write this to one or more CSV files, without exceeding some particular size per file.

Define three package-level variables as follows:

  • User::TargetFolder (String containing the folder name you want to write to)
  • User::TargetFileNamePattern (String with the naming pattern for the output files; e.g. SampleOutput{0}.csv)
  • User::MaxFileLength (Int32 containing the maximum number of characters per file)

Create your data flow like this:

data flow screenshot

And code the script transformation thusly:

/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/

using System;
using System.Data;
using System.IO;
using System.Text;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    private int _fileCounter;
    private int _bytesWritten;
    private TextWriter _tw;

    private TextWriter CurrentWriter
    {
        get
        {
            if (_tw == null)
            {
                string fileName = String.Format(this.Variables.TargetFileNamePattern, _fileCounter);
                string filePath = Path.Combine(this.Variables.TargetFolder, fileName);
                _tw = File.CreateText(filePath);
            }
            return _tw;
        }
    }

    public override void PreExecute()
    {
        base.PreExecute();
        _fileCounter = 1;
        _bytesWritten = 0;
        _tw = null;
    }

    public override void PostExecute()
    {
        base.PostExecute();
        if (_tw != null)
        {
            _tw.Flush();
            _tw.Close();
        }
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        string thisLine = String.Format(
                "{0},{1},{2},{3},{4},{5},{6},{7},{8}",
                Row.TABLECATALOG,
                Row.TABLESCHEMA,
                Row.TABLENAME,
                Row.COLUMNNAME,
                Row.ORDINALPOSITION,
                Row.COLUMNDEFAULT_IsNull ? "NULL" : Row.COLUMNDEFAULT,
                Row.ISNULLABLE,
                Row.DATATYPE,
                Row.CHARACTERMAXIMUMLENGTH_IsNull ? "NULL" : Row.CHARACTERMAXIMUMLENGTH.ToString());
        if (_bytesWritten + thisLine.Length > this.Variables.MaxFileLength) 
        {
            _tw.Flush();
            _tw.Close();
            _tw = null;
            _fileCounter += 1;
            _bytesWritten = 0;
        }
        this.CurrentWriter.WriteLine(thisLine);
        _bytesWritten += thisLine.Length;
    }
}

For each row in the source query, this will build the string to be written and then check to see if adding that string to the current TextWriter would cause the file to be too big. If that's the case, the current file is flushed to disk and closed; the next call to this.CurrentWriter will create a new TextWriter object for the next file in sequence.

Upvotes: 1

Meff
Meff

Reputation: 5999

Some ideas:

Using script transform, you'd have the generated class which has the "get next row" method for your logic - On that class add a counter, and on each row you get work out the size of the data of the row and add it to the counter. If the counter is over 1mb worth of rows, do not put the row into the the output buffer.

Use a script component on the control flow, call .Net IO methods to read 1mb of the file and write only that back (This does mean you're creating a big file then shrinking it).

Really, SSIS isn't going to do either of these well. I'd go with the "SELECT TOP X" approach if it was me..

Upvotes: 0

Ben
Ben

Reputation: 35613

If the data will exceed the file size, you must make a decision as to what data to omit. That is something only you or your customer can decide.

Then write the query to return only the data which is required.

Upvotes: 0

Related Questions