Vince Ashby-Smith
Vince Ashby-Smith

Reputation: 1192

SSIS read excel spreadsheet and generate number and insert into db

Hi all i've got to tackle an issue of loading an excel spreadsheet using an ssis, interpreting the data, carrying out a number generation exercise and then insert into a sql server database. I can read the excel spreadsheet and get the data out fine. However the issue that i'm facing is the number generation part and then inserting into a database.

My excel spreadsheet looks something like this:

Range       Location    0   1   2   3   4   5   6   7   8   9                   
01132 21    Leeds       Y       Y   Y       Y   Y       Y

So for example, we read that the Leeds range has a 'Y' under the 2 column, this means we need to generate numbers between 01132212000 - 01132212999. I'm a bit ensure how we can read the excel speradsheet, check to see if a number has a 'Y' under it, generate those numbers for that range and then insert all the generated numbers into a database. Any ideas?

Upvotes: 0

Views: 1093

Answers (2)

Todd McDermid
Todd McDermid

Reputation: 1680

Not too difficult, actually - if I can assume that each of these "numbers" is supposed to appear as a row.

You need a Data Flow with an Excel Source in it (which will dictate that you run your package in 32-bit mode). You'd then use the Unpivot component to turn the "number" columns into rows, so each range/location has a row with a Y or N. Use a Conditional Split to filter out the N rows, and you're just left with Ys. Then you'll need a source - either a Script source or a crafty OLE DB Source - to generate 1000 rows numbered from 0 to 999. You'll do a cartesian join using Derived Columns, Sorts, and Merge Join between your Excel rows and the number rows next. Then you can use a Derived Column to generate your "real" number that you want, cast it as a string, and left pad it with zeroes.

To clarify on the "script source or a crafty OLE DB Source" to generate row numbers... Using a script as a source:

  1. Add one column that's just a DT_I4.
  2. Inside the CreateNewOutputRows, use a for loop to iterate 1000 times, inside the loop use Output0Buffer.AddRow, and set your column to the loop value.

Using an OLE DB Source:

  1. Create a CTE or some other T-SQL magic (which I'm not qualified to suggest - I just know it's possible) to create a "numbers table" from 1 to 1000, and SELECT from that.

Upvotes: 1

John N
John N

Reputation: 1815

Personally, I'd solve this in the database. Create a table in your DB to hold the raw data from your excel sheet. You can then solve the issue in SQL like this:

create table #excel (
    range       nvarchar(7),
    location    nvarchar(20),
    col_0       nvarchar(1),
    col_1       nvarchar(1),
    col_2       nvarchar(1),
    col_3       nvarchar(1),
    col_4       nvarchar(1),
    col_5       nvarchar(1),
    col_6       nvarchar(1),
    col_7       nvarchar(1),
    col_8       nvarchar(1),
    col_9       nvarchar(1)
)

/*Use SSIS to load your Excel sheet in, instead of this insert*/
insert into #excel
values ('0113221', 'Leeds', 'Y', NULL, 'Y', 'Y', NULL, 'Y', 'Y', NULL, 'Y', NULL)


;with numbers as
(
    select 0 x
    union all
    select x + 1
    from numbers
    where x < 99
)
select e.location, e.range + '0' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_0 = 'Y'
union all
select e.location, e.range + '1' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_1 = 'Y'
union all
select e.location, e.range + '2' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_2 = 'Y'
union all
select e.location, e.range + '3' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_3 = 'Y'
union all
select e.location, e.range + '4' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_4 = 'Y'
union all
select e.location, e.range + '5' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_5 = 'Y'
union all
select e.location, e.range + '6' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_6 = 'Y'
union all
select e.location, e.range + '7' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_7 = 'Y'
union all
select e.location, e.range + '8' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_8 = 'Y'
union all
select e.location, e.range + '9' + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.col_9 = 'Y'

If you can normalise your data as you load it into the format below (or similar), you can save yourself a lot of untidy code:

create table #excel (
    range       nvarchar(7),
    location    nvarchar(20),
    number      nvarchar(1),
    yes_no      nvarchar(1)
)

insert into #excel
values ('0113221', 'Leeds', '0', 'Y'),
    ('0113221', 'Leeds', '1', NULL),
    ('0113221', 'Leeds', '2', 'Y'),
    ('0113221', 'Leeds', '3', 'Y'),
    ('0113221', 'Leeds', '4', NULL),
    ('0113221', 'Leeds', '5', 'Y'),
    ('0113221', 'Leeds', '6', 'Y'),
    ('0113221', 'Leeds', '7', NULL),
    ('0113221', 'Leeds', '8', 'Y'),
    ('0113221', 'Leeds', '9', NULL)


;with numbers as
(
    select 0 x
    union all
    select x + 1
    from numbers
    where x < 99
)
select e.location, e.range + e.number + RIGHT('00' + CAST(n.x as nvarchar), 3)
from #excel e cross join
    numbers n
where e.yes_no = 'Y'

My SSIS is a little rusty, and I don't have an instance in front of my to play with, so I'm afraid I can't help you with the normalisaition piece.

Upvotes: 0

Related Questions