Reputation: 315
I had to copy data from Oracle tables to files.
I have a join query that fetches 800k records, so I used the ROW_NUMBER() function with an ORDER BY clause to generate 4 files of 200k records each.
Query :
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( order by FILE_KEY desc ) rn,
FILE_KEY, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO
FROM BANK_INTL_ROUT_TBL rout, BANK_INTL_LOC_TBL loc
WHERE loc.CATALOG_NO = rout.FILE_KEY)
WHERE rn BETWEEN start AND end;
Parameters:
For 1st File : start =1 ,end = 200000
For 2nd File : start =200001 ,end = 400000
For 3rd File : start =400001 ,end = 600000
For 4th File : start =600001 ,end = 800000
But when I checked the last 10 rows using this query in the SQL query browser, they were different from the last 10 rows of the file; that is, the row sequence differs between the file and the query browser.
SELECT * FROM (
SELECT ROW_NUMBER() OVER( order by FILE_KEY desc ) rn,
FILE_KEY,ROUTING_NO,INTLROUT_TYPE,ABBR_COUNTRY_CODE_2D,HO_CATALOG_NO
FROM BANK_INTL_ROUT_TBL rout, BANK_INTL_LOC_TBL loc
WHERE loc.CATALOG_NO=rout.FILE_KEY)
WHERE rn BETWEEN 799990 AND 800000;
Upvotes: 2
Views: 1434
Reputation: 16915
In the OVER clause, order by a unique column of BANK_INTL_LOC_TBL:
SELECT * FROM (
SELECT ROW_NUMBER() OVER ( order by loc.LOC_KEY desc ) rn,
FILE_KEY, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO
FROM BANK_INTL_ROUT_TBL rout, BANK_INTL_LOC_TBL loc
WHERE loc.CATALOG_NO = rout.FILE_KEY)
WHERE rn BETWEEN start AND end
ORDER BY rn;
UPDATE: per @Shannon Severance's comment, the outer ORDER BY rn clause was added so the rows come back in a defined order.
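This fix can be sketched in SQLite (simplified, hypothetical table and column names; SQLite 3.25+ is needed for window functions): when the window orders by a unique key, the ROW_NUMBER() assignment is deterministic, so the BETWEEN slices are disjoint and together cover every row, however many times they run.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE loc (loc_key INTEGER PRIMARY KEY, file_key INTEGER)")
# Many rows share the same file_key, mimicking the ties in the question.
con.executemany("INSERT INTO loc VALUES (?, ?)",
                [(i, i // 10) for i in range(1, 1001)])

def fetch_chunk(start, end):
    # Page through the rows exactly as the answer suggests: number them
    # over a UNIQUE key, then slice by rn.
    return con.execute("""
        SELECT loc_key FROM (
            SELECT ROW_NUMBER() OVER (ORDER BY loc_key DESC) rn, loc_key
            FROM loc)
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn""", (start, end)).fetchall()

# 4 chunks of 250 rows each, standing in for the 4 files of 200k.
chunks = [fetch_chunk(i * 250 + 1, (i + 1) * 250) for i in range(4)]
all_keys = [k for chunk in chunks for (k,) in chunk]
assert sorted(all_keys) == list(range(1, 1001))  # disjoint and complete
```

With a non-unique ORDER BY key the same row may land in two chunks (or in none), which is exactly the symptom in the question.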
Upvotes: 1
Reputation: 1216
If you have disk space to spare on your Oracle installation (which you should!), then instead of running the inner query 4 times it may be faster to do the following:
CREATE TABLE bank_data
NOLOGGING
PARALLEL 4
AS SELECT ROW_NUMBER() OVER ( order by FILE_KEY desc ) rn,
FILE_KEY, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO
FROM BANK_INTL_ROUT_TBL rout, BANK_INTL_LOC_TBL loc
WHERE loc.CATALOG_NO = rout.FILE_KEY;
The degree of parallelism to use (the number 4 in my example) will depend on how much concurrent work your database can handle, which mostly comes down to the number of CPUs.
After that has finished (which should take noticeably less than 5 hours!), you can run simple selects on the bank_data
table to pull the records you want:
SELECT *
FROM bank_data
WHERE rn <= 200000
for your first data set, for example.
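The staging-table idea can be sketched in SQLite (NOLOGGING and PARALLEL are Oracle-specific and have no SQLite equivalent; table and column names are illustrative): materialize the row numbers once, then each file becomes a cheap range scan instead of a re-run of the join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src (file_key INTEGER PRIMARY KEY, payload TEXT)")
con.executemany("INSERT INTO src VALUES (?, ?)",
                [(i, f"row-{i}") for i in range(1, 801)])

# Materialize the row numbers once (the CREATE TABLE ... AS SELECT above)...
con.execute("""CREATE TABLE bank_data AS
               SELECT ROW_NUMBER() OVER (ORDER BY file_key DESC) rn,
                      file_key, payload
               FROM src""")

# ...then each "file" is a simple slice of the staging table.
first = con.execute(
    "SELECT file_key FROM bank_data WHERE rn <= 200 ORDER BY rn").fetchall()
assert [k for (k,) in first] == list(range(800, 600, -1))
```

Because rn is frozen in the staging table, every slice sees the same numbering, which also side-steps the original ordering problem as long as the table is not rebuilt between extracts.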
Upvotes: 0
Reputation: 17643
This can happen because you have something like this:
row_number file_key
799998 same_number
799999 same_number
800000 same_number
800001 same_number
800002 same_number
800003 same_number
800004 same_number
because you order by file_key, and file_key is not unique.
How did you observe that the data are different? From your other columns. So you can break the ties by including them:
SELECT ROW_NUMBER() OVER(order by FILE_KEY desc, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO ) rn
Or (a second possible cause): your base tables changed between your queries.
UPDATE: you can use the use_hash
hint to speed up your query; 5 hours is far too long for this query.
SELECT * FROM (
SELECT /*+use_hash(rout loc)*/
ROW_NUMBER() OVER(order by FILE_KEY desc, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO ) rn,
FILE_KEY, ROUTING_NO, INTLROUT_TYPE, ABBR_COUNTRY_CODE_2D, HO_CATALOG_NO
FROM BANK_INTL_ROUT_TBL rout, BANK_INTL_LOC_TBL loc
WHERE loc.CATALOG_NO = rout.FILE_KEY)
WHERE rn BETWEEN start AND end;
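The tie-breaking idea can be sketched in SQLite (simplified, hypothetical names; SQLite 3.25+ for window functions): once the ORDER BY column list is unique per row, the ordering is total, so every client and every run assigns the same rn to the same row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (file_key INTEGER, routing_no TEXT)")
# Groups of 5 rows tie on file_key; routing_no breaks the ties.
rows = [(i // 5, f"r{i:03d}") for i in range(100)]
con.executemany("INSERT INTO t VALUES (?, ?)", rows)

got = con.execute("""
    SELECT file_key, routing_no FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY file_key DESC, routing_no) rn,
               file_key, routing_no
        FROM t)
    ORDER BY rn""").fetchall()

# With the tie-breaker the numbering is fully determined, so it matches
# the order computed independently in Python.
expected = sorted(rows, key=lambda r: (-r[0], r[1]))
assert got == expected
```

Without routing_no in the ORDER BY, rows within each file_key group could legally come back in any order, which is why the file and the query browser disagreed.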
Upvotes: 1