DKL44
DKL44

Reputation: 9

SQLite to Number Rows by Two Criteria

I want to use sqlite to number rows in a csv file by two criteria: Duplicate Id number and Created Date. (I am analyzing duplicate account data generated by DemandTools.) Just to help explain, here is a simplified version of what I want to do:

Want to turn this:

Dupe Id #  |   Account Name   |   Created Date  |
1882       | A1, Inc.         |  03/15/2015     |
1567       | Joe's Plumbing   |  08/01/2019     |
1567       | Joes plubming    |  02/07/2020     |
1882       | A1 Corporation   |  06/20/2019     |
1882       | A1 Incorporated  |  05/16/2016     |

Into this:

Dupe Id #  |   Account Name   |   Created Date  |   Dupe #   |
1567       | Joe's Plumbing   |  08/01/2019     |   0        |
1567       | Joes plubming    |  02/07/2020     |   1        |
1882       | A1, Inc.         |  03/15/2015     |   0        |
1882       | A1 Incorporated  |  05/16/2016     |   1        |
1882       | A1 Corporation   |  06/20/2019     |   2        |

I want the original account to have a value of 0, the first dupe 1, 2nd dupe 2, etc.

I had a way to do this in excel using a multi column sort and a countif function like this =COUNTIF(A$2:A2,A2)-1, but what was working in excel for 100k rows and 10 columns, is not working with 700k rows and 24 columns.

My SQLite knowledge is currently at a beginner level. I understand the basics, but not really sure where to begin with a problem like this. I know how to sort by a single column in SQLite, but I don't know how to deal with the countif part of the problem (and maybe there is a better way with SQLite?).

Any help is much appreciated . . . .

Thanks

Upvotes: 0

Views: 91

Answers (1)

forpas
forpas

Reputation: 164139

First thing you must do is change the format of your dates to YYYY-MM-DD because this is the only valid format for dates in SQLite and it is comparable.
With ROW_NUMBER() window function:

select *,
  row_number() over (partition by Id order by CreatedDate) - 1 Dupe
from tablename 
order by id, Dupe

See the demo.
Results:

| Id   | AccountName     | CreatedDate | Dupe |
| ---- | --------------- | ----------- | ---- |
| 1567 | Joe's Plumbing  | 2019-08-01  | 0    |
| 1567 | Joes plubming   | 2020-02-07  | 1    |
| 1882 | A1, Inc.        | 2015-03-15  | 0    |
| 1882 | A1 Incorporated | 2016-05-16  | 1    |
| 1882 | A1 Corporation  | 2019-06-20  | 2    |

Upvotes: 1

Related Questions