Reputation: 59
I am working with a data frame that looks as follows:
JOB_ID | APPLICANT_ID | DATE | STATUS |
---|---|---|---|
1 | A | 01/01/2020 | Stage 1 |
1 | A | 01/01/2020 | Accepted |
2 | A | 01/01/2020 | Rejected |
1 | B | 02/01/2020 | Stage 1 |
1 | B | 02/02/2020 | Stage 2 |
1 | B | 02/03/2020 | Accepted |
And I want to add a field that counts sequentially the number of rows with the same Job ID and Applicant ID so that the result looks as follows:
JOB_ID | APPLICANT_ID | DATE | STATUS | COUNT |
---|---|---|---|---|
1 | A | 01/01/2020 | Stage 1 | 1 |
1 | A | 01/02/2020 | Accepted | 2 |
2 | A | 01/01/2020 | Rejected | 1 |
1 | B | 02/01/2020 | Stage 1 | 1 |
1 | B | 02/02/2020 | Stage 2 | 2 |
1 | B | 02/03/2020 | Accepted | 3 |
I've tried using:
COUNT(DISTINCT(JOB_ID, APPLICANT_ID))
but that didn't seem to work.
Upvotes: 0
Views: 24
Reputation: 460
The answer may vary slightly based on the SQL dialect you are using, this would be the way to go about getting the desired result:
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY JOB_ID, APPLICANT_ID ORDER BY DATE) AS COUNT
FROM
INPUT_TABLE
Here INPUT_TABLE
is the table you have provided as the input.
Upvotes: 1