Reputation: 1556
I'm trying to separate this table into 3 partitions and create a column with which partition the row is in. This table keeps historical data about documents by adding new rows and sets IsDeleted = 1
for old rows. You can see that each revision of the document deletes all of the lines of the old version and recreates it with the new line numbers.
I'm not sure where to start as I haven't used the partition clause before and any help is appreciated.
Current Table:
+----+----------------+------------+-----------+-------------------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate |
+----+----------------+------------+-----------+-------------------------+
| 1 | D001 | 1 | 1 | 2017-01-20 14:10:13.533 |
| 2 | D001 | 2 | 1 | 2017-01-20 14:10:13.533 |
| 3 | D001 | 3 | 1 | 2017-01-20 14:10:13.533 |
| 4 | D001 | 4 | 1 | 2017-01-20 14:10:13.533 |
| 5 | D001 | 1 | 1 | 2017-01-21 12:11:14.500 |
| 6 | D001 | 2 | 1 | 2017-01-21 12:11:14.500 |
| 7 | D001 | 1 | 0 | 2017-01-21 15:20:20.222 |
| 8 | D001 | 2 | 0 | 2017-01-21 15:21:21.111 |
+----+----------------+------------+-----------+-------------------------+
Expected Result:
+----+----------------+------------+-----------+-------------------------+-----------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate | PartitionNumber |
+----+----------------+------------+-----------+-------------------------+-----------------+
| 1 | D001 | 1 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 2 | D001 | 2 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 3 | D001 | 3 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 4 | D001 | 4 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 5 | D001 | 1 | 1 | 2017-01-21 12:11:14.500 | 2 |
| 6 | D001 | 2 | 1 | 2017-01-21 12:11:14.500 | 2 |
| 7 | D001 | 1 | 0 | 2017-01-21 15:20:20.222 | 3 |
| 8 | D001 | 2 | 0 | 2017-01-21 15:21:21.111 | 3 |
+----+----------------+------------+-----------+-------------------------+-----------------+
UPDATE:
In addition to Jason's answer, I added a partition by clause in order to reset the ranking for each document in my table. I hope this helps someone in the future.
SELECT ID,
DocumentNumber,
LineNumber,
IsDeleted,
CreatedDate,
SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate)
AS 'PartitionNumber'
FROM CurrentTable
Upvotes: 0
Views: 5906
Reputation: 492
Is the createdDate is same for each partition...as in partition 3 it is different. If it is same then you can use DENSE_Rank()
SELECT *,
DENSE_RANK() OVER(PARTITION BY documentNumber,CreatedDate ORDER BY documentNumber,CreatedDate ) as PartitionNumber
FROM Table
Upvotes: 1
Reputation: 945
I got what your looking for by doing this:
SELECT ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate,
SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END)
OVER (ORDER BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate)
AS 'PartitionNumber'
FROM CurrentTable
GROUP BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate
I used SUM
and CASE
to assign a value of 1 to all line number 1's and a 0 to the others. Then I used a window function to calculate a running total.
Results:
+----+----------------+------------+-----------+-------------------------+----------------+
| ID | DocumentNumber | LineNumber | IsDeleted | CreatedDate | PartitionNumber|
+----+--- ------------+------------+-----------+-------------------------+----------------+
| 1 | D001 | 1 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 2 | D001 | 2 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 3 | D001 | 3 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 4 | D001 | 4 | 1 | 2017-01-20 14:10:13.533 | 1 |
| 5 | D001 | 1 | 1 | 2017-01-21 12:11:14.500 | 2 |
| 6 | D001 | 2 | 1 | 2017-01-21 12:11:14.500 | 2 |
| 7 | D001 | 1 | 0 | 2017-01-21 15:20:20.223 | 3 |
| 8 | D001 | 2 | 0 | 2017-01-21 15:21:21.110 | 3 |
+----+--- ------------+----------------------------------- --------------+----------------+
Upvotes: 1
Reputation: 873
I think I follow you on this. The below gives you what you want but it will go into more partitions than 3 if there are more in the data, I assume that is expected.
if object_id('tempdb.dbo.#test') is not null drop table #test
create table #test
(
id int,
linenumber int,
isdeleted bit,
createddate datetime,
documentnumber varchar(50)
)
insert into #test
select 1 , 1 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 2 , 2 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 3 , 3 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 4 , 4 , 1 , '2017-01-20 14:10:13.533', 'D001'
union all select 5 , 1 , 1 , '2017-01-21 12:11:14.500', 'D001'
union all select 6 , 2 , 1 , '2017-01-21 12:11:14.500', 'D001'
union all select 7 , 1 , 0 , '2017-01-21 15:20:20.222', 'D001'
union all select 8 , 2 , 0 , '2017-01-21 15:21:21.111', 'D001'
select
*,
DENSE_RANK() over (partition by documentNumber order by isdeleted desc, case when isdeleted=0 then getdate() else createddate end) as partitionValues
from #test
Upvotes: 0