Brian D. Brubaker
Brian D. Brubaker

Reputation: 349

Flat File Import: Remove Data

(Posted a similar question earlier but HR department changed conditions today)

Our HR department has an automated export from our SAP system in the form of a flat file. The information in the flat file looks like so.

G/L Account 4544000 Recruiting/Job Search
Company Code 0020

-------------------------- 
| Posting Date| LC amnt|
|------------------------|
| 01/01/2013 | 406.25 |
| 02/01/2013 | 283.33 |
| 03/21/2013 |1,517.18 |
--------------------------
G/L Account 4544000 Recruiting/Job Search
Company Code 0020

-------------------------- 
| Posting Date| LC amnt|
|------------------------|
| 05/01/2013 | 406.25 |
| 06/01/2013 | 283.33 |
| 07/21/2013 |1,517.18 |
--------------------------

When I look at the data in the SSIS Flat File Source Connection all of the information is in a single column. I have tried to use the Delimiter set to Pipe but it will not separate the data, I assume due to the nonessential information at the top and middle of the file.

I need to remove the data at the top and middle and then have the Date and Total split into two separate columns.

The goal of this is to separate the data so that I can get a single SUM for the running year.

Year    Total
2013    $5123.25

I have tried to do this in SSIS but I cant seem to separate the columns or remove the data. I want to avoid a script task as I am not familiar with the code or operation of that component.

Any assistance would be appreciated.

Upvotes: 0

Views: 997

Answers (3)

EricZ
EricZ

Reputation: 6205

I would create a temp table that can import the whole flat file, after that do filter on SQL level

An example

  1. Create TABLE tmp (txtline VARCHAR(MAX))

  2. BCP or SSIS file into tmp table

  3. Run Query like this to get result ( you may need adjust string length to fit your flat file)

    WITH cte AS ( SELECT CAST(SUBSTRING(txtline,2,10) AS DATE) AS PostingDate, CAST(REPLACE(REPLACE(SUBSTRING(txtline,15,100),'|',''),',','') AS NUMERIC(19,4)) AS LCAmount FROM tmp WHERE ISDATE(SUBSTRING(txtline,2,10)) = 1 ) SELECT YEAR(PostingDate), SUM(LCAmount) FROM cte GROUP BY YEAR(PostingDate)

Upvotes: 1

Edmund Schweppe
Edmund Schweppe

Reputation: 5132

Short of a script task/component (or a full-blown custom SSIS component), I don't think you'll be able to parse that specific format in SSIS. The Flat File Connection Manager does allow you to select how many rows of your text file are headers to be skipped, but the format you're showing has multiple sections (and thus multiple headers). There's also the issue of the horizontal lines, which the Flat File Connection won't be able to properly handle.

I'd first see if there's any way to get a normal CSV file with this data out of SAP. If that turns out to be impossible, then you'll need some sort of custom code to strip out the excess text.

Upvotes: 0

BWS
BWS

Reputation: 3836

maybe you could use MS-Excel to open the flat file, using pipe-character as the delimeter, and then create a CSV from that, if needed.

Upvotes: 0

Related Questions