Reputation: 9
I have a table which is partitioned by acquisition date.
Table structure (table name: Inventory):
Vendor_Name [varchar](80) NULL,
Model_Name [varchar](80) NULL,
AcquisitionDate [datetime] NOT NULL,
Qty [bigint] NOT NULL
Partition function for the Inventory table:
CREATE PARTITION FUNCTION [Inventory_PF_New](datetime) AS RANGE LEFT FOR VALUES
(N'2012-07-01T00:00:00.000', N'2012-08-01T00:00:00.000',
N'2012-09-01T00:00:00.000', N'2012-10-01T00:00:00.000',
N'2012-11-01T00:00:00.000', N'2012-12-01T00:00:00.000',
N'2013-01-01T00:00:00.000', N'2013-02-01T00:00:00.000',
N'2013-03-01T00:00:00.000', N'2013-04-01T00:00:00.000',
N'2013-05-01T00:00:00.000', N'2013-06-01T00:00:00.000',
N'2013-07-01T00:00:00.000', N'2013-08-01T00:00:00.000',
N'2013-09-01T00:00:00.000', N'2013-10-01T00:00:00.000',
N'2013-11-01T00:00:00.000', N'2013-12-01T00:00:00.000',
N'2014-01-01T00:00:00.000', N'2014-02-01T00:00:00.000',
N'2014-03-01T00:00:00.000', N'2014-04-01T00:00:00.000'
)
I receive daily feed flat files whose acquisition dates can be up to three months old. I need to process these files and load the data into the actual partitioned table (Inventory).
Using BULK INSERT or SSIS I can easily load these files into a temporary table, but how do I use ALTER TABLE ... SWITCH to move the daily data into the partitioned table? I know the receiving partition must be empty, and when I load the feed into the temporary table it can span three months of data, so I can't simply switch it in.
Any thoughts on how to implement this? My requirement is to load the daily feed quickly, even though it can contain three months of data.
In short: my table is partitioned by month on acquisition date, and the daily feed can contain 3 to 4 months of data. How do I use ALTER TABLE ... SWITCH to load this data into the partitioned table?
Upvotes: 1
Views: 6321
Reputation: 1065
I guess the question I have for you is: are you sure the table is partitioned? Generally speaking, whether or not you stage your data in a temporary location, inserts into a properly partitioned table will automatically align the data along the partitioning column used in the partition function.
The example below creates a catalog with a simple partition scheme and inserts records into a partitioned table (based loosely on the structure you've divulged; a partitioned table needs its partitioning column to be at least part of the primary key, which was not shown in the original example). The results show the growth of just the expected files (20140101 and 20140102) after a single insert and, finally, the actual allocation of those rows across the partitions. The latter queries may help you determine whether your data is being properly allocated, and if not, you may be able to glean some insight from the rest of the example on how to fix that!
The code below has been modified to show the SWITCH usage as outlined in the comments:
Create a Test catalog:
USE master;
GO
-- Make a data directory;
-- EXECUTE xp_cmdshell 'mkdir D:\MSSQL\DATA\PartitionExample\';
-- Create a catalog for testing;
IF NOT EXISTS ( SELECT 1
FROM sys.databases
WHERE name = 'PartitionExample' )
BEGIN
--DROP DATABASE [PartitionExample];
CREATE DATABASE [PartitionExample]
ON PRIMARY (
NAME = [PartitionExample_dat],
FILENAME = 'D:\MSSQL\DATA\PartitionExample\PartitionExample.mdf',
SIZE = 3MB,
FILEGROWTH = 1MB )
LOG ON (
NAME = [PartitionExample_log],
FILENAME = 'D:\MSSQL\DATA\PartitionExample\PartitionExample.ldf',
SIZE = 512KB,
FILEGROWTH = 512KB );
END;
GO
-- Add filegroups and files for partitions;
IF NOT EXISTS ( SELECT *
FROM PartitionExample.sys.sysfiles
WHERE name LIKE 'fg[_]DateRange[_]________[_]dat' )
BEGIN
-- Filegroups
ALTER DATABASE [PartitionExample]
ADD FILEGROUP [fg_DateRange_20140101];
ALTER DATABASE [PartitionExample]
ADD FILEGROUP [fg_DateRange_20140102];
ALTER DATABASE [PartitionExample]
ADD FILEGROUP [fg_DateRange_20140103];
-- Files
ALTER DATABASE [PartitionExample]
ADD FILE (
NAME = [fg_DateRange_20140101_dat],
FILENAME = 'D:\MSSQL\DATA\PartitionExample\fg_DateRange_20140101.ndf',
SIZE = 512KB,
FILEGROWTH = 512KB )
TO FILEGROUP [fg_DateRange_20140101];
ALTER DATABASE PartitionExample
ADD FILE (
NAME = [fg_DateRange_20140102_dat],
FILENAME = 'D:\MSSQL\DATA\PartitionExample\fg_DateRange_20140102.ndf',
SIZE = 512KB,
FILEGROWTH = 512KB )
TO FILEGROUP [fg_DateRange_20140102];
ALTER DATABASE PartitionExample
ADD FILE (
NAME = [fg_DateRange_20140103_dat],
FILENAME = 'D:\MSSQL\DATA\PartitionExample\fg_DateRange_20140103.ndf',
SIZE = 512KB,
FILEGROWTH = 512KB )
TO FILEGROUP [fg_DateRange_20140103];
END;
GO
Build Partition Scheme / Function
-- Use the newly created catalog;
USE [PartitionExample];
GO
-- Set up partition function and scheme;
IF NOT EXISTS ( SELECT 1
FROM sys.partition_functions
WHERE name = 'pf_DateRange' )
BEGIN
--DROP PARTITION SCHEME [ps_DateRange];
--DROP PARTITION FUNCTION [pf_DateRange];
CREATE PARTITION FUNCTION [pf_DateRange] ( DATETIME )
AS RANGE RIGHT FOR VALUES ( '20140101', '20140102', '20140103' );
CREATE PARTITION SCHEME [ps_DateRange]
AS PARTITION [pf_DateRange] TO ( [PRIMARY],
[fg_DateRange_20140101], [fg_DateRange_20140102], [fg_DateRange_20140103] );
END;
GO
Create Partitioned Table and Populate:
-- Create table;
IF NOT EXISTS ( SELECT 1
FROM sys.objects
WHERE name = 'Inventory'
AND type = 'U' )
BEGIN
--DROP TABLE dbo.Inventory;
CREATE TABLE dbo.Inventory
(
Inventory_PK INTEGER IDENTITY( 1, 1 ) NOT NULL,
AcquisitionDate DATETIME NOT NULL,
PRIMARY KEY ( Inventory_PK, AcquisitionDate ),
Vendor_Name VARCHAR( 80 ) NULL,
Model_Name VARCHAR( 80 ) NULL,
Qty BIGINT NOT NULL
) ON ps_DateRange( AcquisitionDate );
END;
GO
-- "Stage" data, for initial population purposes
SET NOCOUNT ON;
IF NOT EXISTS ( SELECT 1
FROM dbo.Inventory )
BEGIN
DECLARE @i INTEGER;
CREATE TABLE dbo.t_StageInventory
(
Inventory_PK INTEGER IDENTITY( 1, 1 ) NOT NULL,
AcquisitionDate DATETIME NOT NULL,
PRIMARY KEY ( Inventory_PK, AcquisitionDate ),
Vendor_Name VARCHAR( 80 ) NULL,
Model_Name VARCHAR( 80 ) NULL,
Qty BIGINT NOT NULL
) ON ps_DateRange( AcquisitionDate );
SET @i = 0;
WHILE ( @i < 100 )
BEGIN
INSERT INTO dbo.t_StageInventory ( Vendor_Name, Model_Name,
AcquisitionDate, Qty )
VALUES ( 'VendorName', 'ModelName', '20140101', 1 );
SET @i = @i + 1;
END;
SET @i = 0;
WHILE ( @i < 100 )
BEGIN
INSERT INTO dbo.t_StageInventory ( Vendor_Name, Model_Name,
AcquisitionDate, Qty )
VALUES ( 'VendorName', 'ModelName', '20140102', 1 );
SET @i = @i + 1;
END;
-- Insert data into the partitioned table;
INSERT INTO dbo.Inventory ( AcquisitionDate, Vendor_Name,
Model_Name, Qty )
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.t_StageInventory;
DROP TABLE dbo.t_StageInventory;
SET NOCOUNT OFF;
END;
GO
View Distribution of Data:
SELECT ObjectName = OBJECT_NAME( p.object_id ),
PartitionSchemeName = ps.name,
PartitionFunctionName = pf.name,
PartitionNumber = p.partition_number,
FileGroup = fg.name,
Rows = p.rows
FROM sys.partitions p
INNER JOIN sys.indexes i
ON p.object_id = i.object_id
INNER JOIN sys.partition_schemes ps
ON i.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions pf
ON ps.function_id = pf.function_id
INNER JOIN sys.destination_data_spaces dds
ON dds.partition_scheme_id = ps.data_space_id
AND dds.destination_id = p.partition_number
INNER JOIN sys.filegroups fg
ON dds.data_space_id = fg.data_space_id
WHERE p.object_id = OBJECT_ID( 'Inventory' );
Now that the test is set up, we can move on to getting ready for the SWITCH! The staging table is once again populated:
New Stage:
-- "Stage" data from ETL
SET NOCOUNT ON;
DECLARE @i INTEGER;
CREATE TABLE dbo.t_StageInventory
(
Inventory_PK INTEGER IDENTITY( 1, 1 ) NOT NULL,
AcquisitionDate DATETIME NOT NULL,
PRIMARY KEY ( Inventory_PK, AcquisitionDate ),
Vendor_Name VARCHAR( 80 ) NULL,
Model_Name VARCHAR( 80 ) NULL,
Qty BIGINT NOT NULL
) ON ps_DateRange( AcquisitionDate );
SET @i = 0;
WHILE ( @i < 10 )
BEGIN
INSERT INTO dbo.t_StageInventory ( Vendor_Name, Model_Name,
AcquisitionDate, Qty )
VALUES ( 'VendorName', 'ModelName', '20140102', 1 );
SET @i = @i + 1;
END;
SET @i = 0;
WHILE ( @i < 100 )
BEGIN
INSERT INTO dbo.t_StageInventory ( Vendor_Name, Model_Name,
AcquisitionDate, Qty )
VALUES ( 'VendorName', 'ModelName', '20140103', 1 );
SET @i = @i + 1;
END;
With the staging table populated, we need to determine which rows must be transferred from the live Inventory table and move them across.
Re-Staging Data:
-- Re-stage existing partition data;
DECLARE @UpperBound DATETIME,
@LowRange DATETIME,
@HighRange DATETIME;
SET @UpperBound = '99991231';
SELECT @LowRange = MIN( CAST( pprv.value AS DATETIME ) ),
@HighRange = MAX( ISNULL( CAST( prv.value AS DATETIME ), @UpperBound ) )
FROM sys.partitions p
INNER JOIN sys.indexes i
ON p.object_id = i.object_id
INNER JOIN sys.partition_schemes ps
ON i.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions pf
ON ps.function_id = pf.function_id
INNER JOIN sys.destination_data_spaces dds
ON dds.partition_scheme_id = ps.data_space_id
AND dds.destination_id = p.partition_number
INNER JOIN sys.filegroups fg
ON dds.data_space_id = fg.data_space_id
LEFT JOIN sys.partition_range_values prv
ON ps.function_id = prv.function_id
AND p.partition_number = prv.boundary_id
LEFT JOIN sys.partition_range_values pprv
ON ps.function_id = pprv.function_id
AND p.partition_number - 1 = pprv.boundary_id
WHERE p.object_id = OBJECT_ID( 't_StageInventory' )
AND rows <> 0;
INSERT INTO dbo.t_StageInventory( AcquisitionDate, Vendor_Name, Model_Name, Qty )
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.Inventory
WHERE AcquisitionDate >= @LowRange
AND AcquisitionDate < @HighRange;
SWITCH Out, then In
CREATE TABLE dbo.t_SwapInventory
(
Inventory_PK INTEGER IDENTITY( 1, 1 ) NOT NULL,
AcquisitionDate DATETIME NOT NULL,
PRIMARY KEY ( Inventory_PK, AcquisitionDate ),
Vendor_Name VARCHAR( 80 ) NULL,
Model_Name VARCHAR( 80 ) NULL,
Qty BIGINT NOT NULL
) ON ps_DateRange( AcquisitionDate );
-- Dynamic here...
DECLARE @t_Partition TABLE
(
partition_number INTEGER
);
INSERT INTO @t_Partition ( partition_number )
SELECT DISTINCT p.partition_number
FROM sys.partitions p
WHERE p.object_id = OBJECT_ID( 't_StageInventory' )
AND p.rows <> 0;
SET @i = @@ROWCOUNT;
DECLARE @SQL NVARCHAR( MAX ),
@Partition INTEGER;
WHILE ( @i > 0 )
BEGIN
SELECT TOP 1 @Partition = partition_number
FROM @t_Partition;
DELETE @t_Partition
WHERE partition_number = @Partition;
SET @SQL = N'
ALTER TABLE dbo.Inventory
SWITCH PARTITION ' + CAST( @Partition AS NVARCHAR( 10 ) ) + '
TO dbo.t_SwapInventory PARTITION ' + CAST( @Partition AS NVARCHAR( 10 ) ) + ';';
EXECUTE dbo.sp_executesql @statement = @SQL;
SET @SQL = N'
ALTER TABLE dbo.t_StageInventory
SWITCH PARTITION ' + CAST( @Partition AS NVARCHAR( 10 ) ) + '
TO dbo.Inventory PARTITION ' + CAST( @Partition AS NVARCHAR( 10 ) ) + ';';
EXECUTE dbo.sp_executesql @statement = @SQL;
SET @i = @i - 1;
END;
GO
DROP TABLE dbo.t_SwapInventory;
DROP TABLE dbo.t_StageInventory;
SET NOCOUNT OFF;
At this point, the "View Distribution of Data" query above can be run for additional verification.
Upvotes: 1
Reputation: 12940
I'm not sure this question and answer are a great fit for Stack Overflow, because this is getting into a relatively complex architectural discussion, but I'll try to keep the ideas simple and let you run with them.
I'm assuming that the reason you are loading into a temporary table from SSIS is so that you can take advantage of a BULK INSERT. If you're only doing this load once a day, my suggestion may not be worth the effort; just insert into your target table, since the time to complete the insert may be cheaper than maintaining a more complex setup. However, if you need to do this load and have your data immediately available, then I would do it like so:
This should give you the benefits of BULK INSERT and near-immediate access to your data. The trickle load from the Most Recent table to the Final table will slowly empty out the staging side, clearing the way for the ALTER TABLE ... SWITCH maneuver.
If there's a possibility that the trickle load will take longer than a day, you could build a partitioned view dynamically, adding a new partitioned table for each insert date and dropping those tables as they are trickled into the final destination, but that adds a LOT of complexity.
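As a rough sketch of that trickle-load idea (the object names `t_RecentInventory` and `v_Inventory` are invented for illustration, not from the original answer):

```sql
-- Hypothetical sketch only: t_RecentInventory receives the daily BULK INSERT,
-- a view unions it with the main table so readers see all rows immediately,
-- and a background job trickles rows across in small batches.
CREATE VIEW dbo.v_Inventory
AS
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.Inventory
UNION ALL
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.t_RecentInventory;
GO
-- Run repeatedly (e.g. from a SQL Agent job) until t_RecentInventory is empty;
-- DELETE ... OUTPUT INTO moves each batch atomically.
DELETE TOP ( 5000 )
FROM dbo.t_RecentInventory
OUTPUT deleted.AcquisitionDate, deleted.Vendor_Name,
       deleted.Model_Name, deleted.Qty
INTO dbo.Inventory ( AcquisitionDate, Vendor_Name, Model_Name, Qty );
```

Point readers at `v_Inventory` instead of `dbo.Inventory` while the trickle is in flight; once the recent table is empty, the view and the base table return the same rows.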
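A minimal sketch of that partitioned-view shape, assuming one table per load date (all names invented for illustration):

```sql
-- Hypothetical: one table per load date, each with a CHECK constraint on the
-- date range so the optimizer can eliminate branches of the view.
CREATE TABLE dbo.Inventory_20140101
(
    AcquisitionDate DATETIME NOT NULL
        CHECK ( AcquisitionDate >= '20140101' AND AcquisitionDate < '20140102' ),
    Vendor_Name VARCHAR( 80 ) NULL,
    Model_Name VARCHAR( 80 ) NULL,
    Qty BIGINT NOT NULL
);
CREATE TABLE dbo.Inventory_20140102
(
    AcquisitionDate DATETIME NOT NULL
        CHECK ( AcquisitionDate >= '20140102' AND AcquisitionDate < '20140103' ),
    Vendor_Name VARCHAR( 80 ) NULL,
    Model_Name VARCHAR( 80 ) NULL,
    Qty BIGINT NOT NULL
);
GO
-- Rebuild this view each time a per-date table is added or dropped;
CREATE VIEW dbo.v_InventoryFeed
AS
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.Inventory_20140101
UNION ALL
SELECT AcquisitionDate, Vendor_Name, Model_Name, Qty
FROM dbo.Inventory_20140102;
```

The CHECK constraints are what let SQL Server skip branches of the UNION when a query filters on AcquisitionDate; without them every branch is scanned, and the view is just extra maintenance.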
Upvotes: 1