Jonathan Porter
Jonathan Porter

Reputation: 1556

How to partition by range

I'm trying to separate this table into 3 partitions and create a column with which partition the row is in. This table keeps historical data about documents by adding new rows and sets IsDeleted = 1 for old rows. You can see that each revision of the document deletes all of the lines of the old version and recreates it with the new line numbers.

I'm not sure where to start as I haven't used the partition clause before and any help is appreciated.

Current Table:

+----+----------------+------------+-----------+-------------------------+
| ID | DocumentNumber | LineNumber | IsDeleted |       CreatedDate       |
+----+----------------+------------+-----------+-------------------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |
+----+----------------+------------+-----------+-------------------------+

Expected Result:

+----+----------------+------------+-----------+-------------------------+-----------------+
| ID | DocumentNumber | LineNumber | IsDeleted |       CreatedDate       | PartitionNumber |
+----+----------------+------------+-----------+-------------------------+-----------------+
|  1 | D001           |          1 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  2 | D001           |          2 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  3 | D001           |          3 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  4 | D001           |          4 |         1 | 2017-01-20 14:10:13.533 |               1 |
|  5 | D001           |          1 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  6 | D001           |          2 |         1 | 2017-01-21 12:11:14.500 |               2 |
|  7 | D001           |          1 |         0 | 2017-01-21 15:20:20.222 |               3 |
|  8 | D001           |          2 |         0 | 2017-01-21 15:21:21.111 |               3 |
+----+----------------+------------+-----------+-------------------------+-----------------+

UPDATE:

In addition to Jason's answer, I added a partition by clause in order to reset the ranking for each document in my table. I hope this helps someone in the future.

SELECT ID,
       DocumentNumber,
       LineNumber,
       IsDeleted,
       CreatedDate,
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END) 
       OVER (PARTITION BY DocumentNumber ORDER BY CreatedDate) 
       AS 'PartitionNumber'
FROM CurrentTable

Upvotes: 0

Views: 5906

Answers (3)

SQL006
SQL006

Reputation: 492

Is the createdDate is same for each partition...as in partition 3 it is different. If it is same then you can use DENSE_Rank()

SELECT *,
DENSE_RANK() OVER(PARTITION BY documentNumber,CreatedDate ORDER BY documentNumber,CreatedDate ) as PartitionNumber
FROM Table

Upvotes: 1

Jason
Jason

Reputation: 945

I got what your looking for by doing this:

SELECT ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate, 
       SUM(CASE WHEN LineNumber = 1 THEN 1 ELSE 0 END) 
       OVER (ORDER BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate) 
       AS 'PartitionNumber'
FROM CurrentTable
GROUP BY ID,DocumentNumber,LineNumber,IsDeleted,CreatedDate

I used SUM and CASE to assign a value of 1 to all line number 1's and a 0 to the others. Then I used a window function to calculate a running total.

Results:

+----+----------------+------------+-----------+-------------------------+----------------+
| ID | DocumentNumber | LineNumber | IsDeleted |     CreatedDate         | PartitionNumber|
+----+--- ------------+------------+-----------+-------------------------+----------------+
| 1  |     D001       |     1      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 2  |     D001       |     2      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 3  |     D001       |     3      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 4  |     D001       |     4      |    1      | 2017-01-20 14:10:13.533 |       1        |
| 5  |     D001       |     1      |    1      | 2017-01-21 12:11:14.500 |       2        |
| 6  |     D001       |     2      |    1      | 2017-01-21 12:11:14.500 |       2        |
| 7  |     D001       |     1      |    0      | 2017-01-21 15:20:20.223 |       3        |
| 8  |     D001       |     2      |    0      | 2017-01-21 15:21:21.110 |       3        |
+----+--- ------------+----------------------------------- --------------+----------------+

Upvotes: 1

Jesse
Jesse

Reputation: 873

I think I follow you on this. The below gives you what you want but it will go into more partitions than 3 if there are more in the data, I assume that is expected.

if object_id('tempdb.dbo.#test') is not null drop table #test
create table #test
(
    id int,
    linenumber int,
    isdeleted bit,
    createddate datetime,
    documentnumber varchar(50)
)

insert into #test
select 1 ,  1 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 2 ,  2 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 3 ,  3 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 4 ,  4 ,         1 , '2017-01-20 14:10:13.533', 'D001'
union all select 5 ,  1 ,         1 , '2017-01-21 12:11:14.500', 'D001'
union all select 6 ,  2 ,         1 , '2017-01-21 12:11:14.500', 'D001'
union all select 7 ,  1 ,         0 , '2017-01-21 15:20:20.222', 'D001'
union all select 8 ,  2 ,         0 , '2017-01-21 15:21:21.111', 'D001'


select 
    *, 
    DENSE_RANK() over (partition by documentNumber order by isdeleted desc, case when isdeleted=0 then getdate() else createddate end) as partitionValues
from #test  

Upvotes: 0

Related Questions