Mohammed Ismail
Mohammed Ismail

Reputation: 37

What is the Best way to Perform Bulk insert in oracle ?

With this, I mean Inserting millions of records in tables. I know how to insert data using loops but for inserting millions of data it won't be a good approach.

I have two tables

CREATE TABLE test1
  (
    col1         NUMBER,
    valu         VARCHAR2(30),
    created_Date DATE,
    CONSTRAINT pk_test1 PRIMARY KEY (col1)
  )
/
CREATE TABLE test2
  (
    col2          NUMBER,
    fk_col1       NUMBER,
    valu          VARCHAR2(30),
    modified_Date DATE,
    CONSTRAINT pk_test2 PRIMARY KEY (col2),
    FOREIGN KEY (fk_col1) REFERENCES test1(col1)
  )
/

Please suggest a way to insert some dummy records upto 1 million without loops.

Upvotes: 0

Views: 196

Answers (1)

Alex Poole
Alex Poole

Reputation: 191235

As a fairly simplistic approach, which may be enough for you based on your comments, you can generate dummy data using a hierarchical query. Here I'm using bind variables to control how many are created, and to make some of the logic slightly clearer, but you could use literals instead.

First, parent rows:

var parent_rows number;
var avg_children_per_parent number;
exec :parent_rows := 5;
exec :avg_children_per_parent := 3;

-- create dummy parent rows
insert into test1 (col1, valu, created_date)
select level,
  dbms_random.string('a', dbms_random.value(1, 30)),
  trunc(sysdate) - dbms_random.value(1, 365)
from dual
connect by level <= :parent_rows;

which might generate rows like:

      COL1 VALU                           CREATED_DA
---------- ------------------------------ ----------
         1 rYzJBVI                        2016-11-14
         2 KmSWXfZJ                       2017-01-20
         3 dFSTvVsYrCqVm                  2016-07-19
         4 iaHNv                          2016-11-08
         5 AvAxDiWepPeONGNQYA             2017-01-20

Then child rows, which a random fk_col1 in the range generated for the parent:

-- create dummy child rows
insert into test2 (col2, fk_col1, valu, modified_date)
select level,
  round(dbms_random.value(1, :parent_rows)),
  dbms_random.string('a', dbms_random.value(1, 30)),
  trunc(sysdate) - dbms_random.value(1, 365)
from dual
connect by level <= :parent_rows * :avg_children_per_parent;

which might generate:

select * from test2;

      COL2    FK_COL1 VALU                           MODIFIED_D
---------- ---------- ------------------------------ ----------
         1          2 AqRUtekaopFQdCWBSA             2016-06-30
         2          4 QEczvejfTrwFw                  2016-09-23
         3          4 heWMjFshkPZNyNWVQG             2017-02-19
         4          4 EYybXtlaFHkAYeknhCBTBMusGAkx   2016-03-18
         5          4 ZNdJBQxKKARlnExluZWkHMgoKY     2016-06-21
         6          3 meASktCpcuyi                   2016-10-01
         7          4 FKgmf                          2016-09-13
         8          3 JZhk                           2016-06-01
         9          2 VCcKdlLnchrjctJrMXNb           2016-05-01
        10          5 ddL                            2016-11-27
        11          4 wbX                            2016-04-20
        12          1 bTfa                           2016-06-11
        13          4 QP                             2016-08-25
        14          3 RgmIahPL                       2016-03-04
        15          2 vhinLUmwLwZjczYdrPbQvJxU       2016-12-05

where the number of children varies for each parent:

select fk_col1, count(*) from test2 group by fk_col1 order by fk_col1;

   FK_COL1   COUNT(*)
---------- ----------
         1          1
         2          3
         3          3
         4          7
         5          1

To insert a million rows instead, just change the bind variables.

If you needed more of a relationship between the children and parents, e.g. so the modified date is always after the created date, you can modify the query; for example:

insert into test2 (col2, fk_col1, valu, modified_date)
select *
from (
select level,
  round(dbms_random.value(1, :parent_rows)) as fk_col1,
  dbms_random.string('a', dbms_random.value(1, 30)),
  trunc(sysdate) - dbms_random.value(1, 365) as modified_date
from dual
connect by level <= :parent_rows * :avg_children_per_parent
) t2
where not exists (
  select null from test1 t1
  where t1.col1 = t2.fk_col1 and t1.created_date > t2.modified_date
);

You may also want non-midnight times (I set everything to midnight via the trunc() call, based on the column name being 'date' not 'datetime'), or some column values null; so this might just be a starting point for you.

Upvotes: 2

Related Questions