Reputation: 87
scenario: I have table A and Table B. Both have the primary key of Rep_time_frame. We replicate data by terms. however, that process at times was repeated.
task: Match the source counts with the destination table counts.
problem: I can match the rows of the source table but not the destination table because the replication for some terms happened twice. So, my counts are off.
SELECT d.REPT_TIME_FRAME,
c.SOURCE_TOTAL_RECORDS,
COUNT(*) TOTAL_RECORDS,
CASE
WHEN COUNT(*) > c.SOURCE_TOTAL_RECORDS THEN c.SOURCE_TOTAL_RECORDS
ELSE COUNT(*)
END FINAL_TOTAL_RECORDS,
***CASE
WHEN EXISTS (SELECT REPT_TIME_FRAME
FROM TableA_Counts
WHERE REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY REPT_TIME_FRAME
HAVING COUNT(*) > 1) THEN 'UPLOADED MORE THAN ONCE'
ELSE'UPLOADED ONCE'
END NUMBER_OF_REPLICATIONS***
FROM TableA d
INNER JOIN TableA_Counts c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
WHERE d.REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY d.REPT_TIME_FRAME, c.SOURCE_TOTAL_RECORDS```
please note that the bolded area, when ran returns for the newly made column Number_of_replications "Uploaded More than once" on all counts. For the purposes of this example. Here are the facts.
201705 was replicated 2X
201801 once and 201708 once.
expected result. I want to have 201705 'Uploaded more than once', and the rest 201801 and 201708 'Uploaded Once'
Upvotes: 0
Views: 48
Reputation: 87
SELECT d.REPT_TIME_FRAME,
c.SOURCE_TOTAL_RECORDS,
COUNT(*) TOTAL_RECORDS,
CASE
WHEN COUNT(*) > c.SOURCE_TOTAL_RECORDS THEN c.SOURCE_TOTAL_RECORDS
ELSE COUNT(*)
END FINAL_TOTAL_RECORDS,
a.count AS NUM_OF_REPLICATIONS
FROM TableA d
INNER JOIN TableACounts c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
INNER JOIN (SELECT REPT_TIME_FRAME ,count(REPT_TIME_FRAME) as count
FROM TableAcounts
WHERE REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY REPT_TIME_FRAME) AS a ON (a.REPT_TIME_FRAME = d.REPT_TIME_FRAME)
WHERE d.REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY d.REPT_TIME_FRAME, c.SOURCE_TOTAL_RECORDS, a.count
Upvotes: 0
Reputation: 115
Does this help? http://sqlfiddle.com/#!18/8692a/1
I always try and get the counts then do a case statement on the final figures. I've based that SqlFiddle on the notion that TableA_Counts is the table with more than one entry for 201705. Is that correct?
CREATE TABLE TableA (REPT_TIME_FRAME int)
INSERT INTO TableA(REPT_TIME_FRAME)VALUES
(201705),
(201708),
(201801),
(201706),
(201710),
(201803 )
CREATE TABLE TableA_Counts (REPT_TIME_FRAME int)
INSERT INTO TableA_Counts(REPT_TIME_FRAME)VALUES
(201705),
(201708),
(201705),
(201801),
(201706),
(201710),
(201803)
SELECT
d.REPT_TIME_FRAME,count(*) as TOTAL_RECORDS,SOURCE_TOTAL_RECORDS,
CASE WHEN SOURCE_TOTAL_RECORDS > 1 then 'more than' else 'just one' end as NUMBER_OF_REPLICATIONS
FROM
TableA d
LEFT JOIN
(SELECT REPT_TIME_FRAME ,count(*) as SOURCE_TOTAL_RECORDS from TableA_Counts group by REPT_TIME_FRAME) c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
group by d.REPT_TIME_FRAME,SOURCE_TOTAL_RECORDS
Upvotes: 1