A Saraf
A Saraf

Reputation: 275

Finding duplicate rows while Inserting data into Oracle table

I have an oracle table dm_djr_bulkjob with below columns and sample data.

+----------+------+-----------+------------------+------+----------------------+
| device_id| cg_id|firmware_id| best_status      |dmc_id|oltp_updated          |
+----------+------+-----------+------------------+------+----------------------+
| 2009160  | 000  |25822      |No Device Response|1736  |27-JUL-17 10:00:00 AM |
| 2009160  | 000  |25822      |401               |1736  |27-JUL-17 14:00:00 PM |
| 2009157  | 000  |25745      |Wifi Deferred     |1736  |27-JUL-17 02:00:00 AM |
| 2009174  | 000  |25861      |Low Memory        |1736  |27-JUL-17 08:00:00 AM |
+----------+------+-----------+------------------+------+----------------------+

I am running a query to insert data into a temp table using below query:

insert into DM_ETLTEMP_BULK_BSTRESULT_SUMM
(
device_id,
cg_id,
firmware_id,
best_status,
dmc_id
)SELECT device_id, cg_id, firmware_id, best_Status, dmc_id
from dm_djr_bulkjob where oltp_updated between '27-JUL-17' and '28-JUL-17'

This query will insert all the records from dm_djr_bulkjob table to my temp table. I want to select one records out of duplicate records(we have duplicate records based on device_id and firmware_id)

Duplicate records for device_id= 2009160 and firmware_id = 25822

I want one record out of two duplicate values which has best_status priority as minimum from the below priority table. For example: above query returned two duplicate entries for device_id=2009160 and firmware_id = 25822 but best_status='No Device Response' among the two records has the minimum priority in the below table.

So my final number of records inserted into temp table should be as below.

+----------+------+-----------+------------------+------+
| device_id| cg_id|firmware_id| best_status      |dmc_id|
+----------+------+-----------+------------------+------+
| 2009160  | 000  |25822      |No Device Response|1736  |
| 2009157  | 000  |25745      |Wifi Deferred     |1736  |
| 2009174  | 000  |25861      |Low Memory        |1736  |
+----------+------+-----------+------------------+------+

Priority table

+-------------------+--------+
| status            |priority|
+-------------------+--------+
|No Device Response |1       |
|401                |2       |
|402                |3       |
|500                |4       |
|Wifi Deferred      |5       |
|Low Memory         |6       |
| No Device Response|7       |
+-------------------+--------+

Please suggest query to solve this requirement.

Thanks in advance!

Upvotes: 0

Views: 897

Answers (2)

Alexandru
Alexandru

Reputation: 14

        INSERT
        INTO DM_ETLTEMP_BULK_BSTRESULT_SUMM
            (
                device_id,
                cg_id,
                firmware_id,
                best_status,
                dmc_id
            )
       SELECT  device_id,
            cg_id,
            firmware_id,
            best_Status,
            dmc_id
        FROM
            (SELECT  device_id,
                    cg_id,
                    firmware_id,
                    best_Status,
                    dmc_id,
                    COUNT(*)
                FROM dm_djr_bulkjob
                WHERE oltp_updated BETWEEN '27-JUL-17' AND '28-JUL-17'
                GROUP BY device_id,
                    cg_id,
                    firmware_id,
                    best_Status,
                    dmc_id
                HAVING COUNT(*)<2
            )
        );

Try this :)

Upvotes: 0

Kirby
Kirby

Reputation: 724

I would add a join to the priority table and add an analytic function to select the preferred row in case of duplicates. The query would look like this:

Select device_id, cg_id, firmware_id, best_Status, dmc_id 
  From (Select a.device_id, a.cg_id, a.firmware_id, a.best_Status, a.dmc_id,
               rank() Over (Partition By a.device_id, a.firmware_id 
                                Order By b.priority) As rnk
          From dm_djr_bulkjob a
          Join priority_table b on b.best_status = a.best_status
         Where a.oltp_updated Between '27-JUL-17' And '28-JUL-17')
 Where rnk = 1;

The rank() function assigns a rank number to each row such that one row per unique (device_id, firmware_id) combination would have rnk = 1: the row with the lowest priority.

Upvotes: 1

Related Questions