Reputation: 367
I am using a PL/SQL procedure to extract the tables of a particular Oracle schema to CSV format. I execute the procedure with the following command:
exec dump_table_to_csv ( 'schema.tablename', 'directory', 'tablename.csv');
The directory is created using the following command:
CREATE DIRECTORY <directory_name> AS '<operating_system_path>';
The PL/SQL procedure code is as follows:
create or replace procedure dump_table_to_csv( p_tname    in varchar2,
                                               p_dir      in varchar2,
                                               p_filename in varchar2 )
is
    l_output      utl_file.file_type;
    l_theCursor   integer default dbms_sql.open_cursor;
    l_columnValue varchar2(4000);
    l_status      integer;
    l_query       varchar2(1000)
                  default 'select * from ' || p_tname;
    l_colCnt      number := 0;
    l_separator   varchar2(1);
    l_descTbl     dbms_sql.desc_tab;
begin
    l_output := utl_file.fopen( p_dir, p_filename, 'w', 32760 );
    execute immediate 'alter session set nls_date_format=''dd-mon-yyyy hh24:mi:ss'' ';

    dbms_sql.parse( l_theCursor, l_query, dbms_sql.native );
    dbms_sql.describe_columns( l_theCursor, l_colCnt, l_descTbl );

    -- write the quoted header row and define each column as a varchar2 target
    for i in 1 .. l_colCnt loop
        utl_file.put( l_output, l_separator || '"' || l_descTbl(i).col_name || '"' );
        dbms_sql.define_column( l_theCursor, i, l_columnValue, 4000 );
        l_separator := ',';
    end loop;
    utl_file.new_line( l_output );

    -- fetch row by row, writing comma-separated column values
    l_status := dbms_sql.execute( l_theCursor );
    while ( dbms_sql.fetch_rows( l_theCursor ) > 0 ) loop
        l_separator := '';
        for i in 1 .. l_colCnt loop
            dbms_sql.column_value( l_theCursor, i, l_columnValue );
            utl_file.put( l_output, l_separator || l_columnValue );
            l_separator := ',';
        end loop;
        utl_file.new_line( l_output );
    end loop;

    dbms_sql.close_cursor( l_theCursor );
    utl_file.fclose( l_output );
    execute immediate 'alter session set nls_date_format=''dd-MON-yy'' ';
exception
    when others then
        -- release the cursor and file handle before re-raising,
        -- then restore the session date format
        if dbms_sql.is_open( l_theCursor ) then
            dbms_sql.close_cursor( l_theCursor );
        end if;
        if utl_file.is_open( l_output ) then
            utl_file.fclose( l_output );
        end if;
        execute immediate 'alter session set nls_date_format=''dd-MON-yy'' ';
        raise;
end;
/
However, as the number of tables increases, this process takes too long. My question is: how can I modify the code so that I can list the schema's tables in a file and have the procedure process every table listed in that file? If that approach is not fast or not possible, how else can I run the extraction in one go without having to execute the procedure for each table?
Upvotes: 0
Views: 7826
Reputation: 1785
Take a look at DBMS_JOB - this will allow you to submit each table extract to run as a background job, in parallel, up to the resources of your machine.
So you could start with a procedure that does something like
begin
    for table_rec in (select table_name from user_tables) loop
        submit_dump_to_csv(
            schema_var || '.' || table_rec.table_name,
            'directory',
            table_rec.table_name || '.csv' );
    end loop;
end;
Where submit_dump_to_csv wraps up your call to dump_table_to_csv and passes it to DBMS_JOB.
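A minimal sketch of what submit_dump_to_csv could look like - the wrapper name comes from above, but the body is an assumption:

create or replace procedure submit_dump_to_csv( p_tname    in varchar2,
                                                p_dir      in varchar2,
                                                p_filename in varchar2 )
is
    l_job binary_integer;
begin
    -- queue the extract as a background job; job queue processes pick
    -- these up, so several extracts can run in parallel
    dbms_job.submit(
        job  => l_job,
        what => 'dump_table_to_csv(''' || p_tname    || ''',''' ||
                                          p_dir      || ''',''' ||
                                          p_filename || ''');' );
    commit;  -- a DBMS_JOB submission only becomes runnable on commit
end;
/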
You probably want to record submitted and completed jobs in a table, and possibly send an email on task completion using UTL_MAIL or UTL_SMTP, as the jobs will execute in the background on the server.
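For instance, a minimal bookkeeping table might look like this (the name and columns are assumptions):

create table csv_extract_log (
    table_name   varchar2(30),
    job_id       number,
    submitted_at date default sysdate,
    completed_at date
);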
This does not speed up any individual table extract, just allows different tables to be extracted in parallel.
To speed up the individual extract, I would change your code so that, rather than extracting all the columns using Method 4 dynamic SQL and concatenating them in PL/SQL, you use ALL_TAB_COLUMNS to build a version of l_query that returns a single concatenated string per row.
You should then be able to use the far simpler native dynamic SQL syntax to loop over a dynamic query, calling UTL_FILE for each row returned.
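A minimal sketch of that rewrite - the name dump_table_to_csv2 is made up, and it assumes p_tname arrives as OWNER.TABLE_NAME and that the concatenated column list fits LISTAGG's limit (Oracle 11gR2 or later):

create or replace procedure dump_table_to_csv2( p_tname    in varchar2,
                                                p_dir      in varchar2,
                                                p_filename in varchar2 )
is
    l_output utl_file.file_type;
    l_header varchar2(32767);
    l_query  varchar2(32767);
    l_line   varchar2(32767);
    l_cursor sys_refcursor;
begin
    -- build a header line plus a query of the form
    --   select col1 || ',' || col2 || ... from owner.table_name
    select listagg('"' || column_name || '"', ',')
               within group (order by column_id),
           'select ' ||
           listagg(column_name, ' || '','' || ')
               within group (order by column_id) ||
           ' from ' || p_tname
    into   l_header, l_query
    from   all_tab_columns
    where  owner || '.' || table_name = upper(p_tname);

    l_output := utl_file.fopen( p_dir, p_filename, 'w', 32760 );
    utl_file.put_line( l_output, l_header );

    -- native dynamic SQL: each fetch returns one pre-concatenated row
    open l_cursor for l_query;
    loop
        fetch l_cursor into l_line;
        exit when l_cursor%notfound;
        utl_file.put_line( l_output, l_line );
    end loop;
    close l_cursor;

    utl_file.fclose( l_output );
end;
/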
Alternatively...
If you use SQL*Plus, you could do something like
SET HEADING OFF PAGES 0
SET COLSEP ","
SPOOL tablename.csv
SELECT * FROM tablename;
SPOOL OFF
If you use a SQL*Plus substitution variable for tablename - &tablename - this would work for any table, and you could easily write another script driven by user_tables to call this script for each table you want to extract (either using shell scripting, or by writing SQL*Plus that spools a script file and then runs the file it spooled).
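A sketch of that approach, assuming hypothetical script names dump_one.sql and run_all.sql:

-- dump_one.sql: spool the table passed in as &1 to <table>.csv
SET HEADING OFF PAGES 0 FEEDBACK OFF
SET COLSEP ","
SPOOL &1..csv
SELECT * FROM &1;
SPOOL OFF

The driver script then builds and runs a list of @dump_one calls:

-- driver: spool one @dump_one line per table, then execute the spooled file
SET HEADING OFF PAGES 0 FEEDBACK OFF
SPOOL run_all.sql
SELECT '@dump_one ' || table_name FROM user_tables;
SPOOL OFF
@run_all.sql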
Upvotes: 0
Reputation: 132660
what other way can I process the extraction in one run without having to execute the procedure for each table?
You could automate the running of the current procedure as follows:
begin
    for r in (select table_name from all_tables where owner = 'SCHEMA') loop
        dump_table_to_csv
            ( 'schema.' || r.table_name, 'directory', r.table_name || '.csv' );
    end loop;
end;
Or you could re-write the procedure to be run as:
exec dump_schema_to_csvs ('schema', 'directory')
i.e. put the FOR loop from my first example inside the procedure.
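A minimal sketch of that version, reusing the existing procedure:

create or replace procedure dump_schema_to_csvs( p_schema in varchar2,
                                                 p_dir    in varchar2 )
is
begin
    -- extract every table owned by p_schema into its own csv file
    for r in (select table_name
              from   all_tables
              where  owner = upper(p_schema)) loop
        dump_table_to_csv( p_schema || '.' || r.table_name,
                           p_dir,
                           r.table_name || '.csv' );
    end loop;
end;
/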
Neither of these can be expected to run any faster than the current method though.
Upvotes: 1
Reputation: 2402
I'm going to guess that your performance hit comes from the row-by-row processing found here...
while ( dbms_sql.fetch_rows( l_theCursor ) > 0 ) loop
    l_separator := '';
    for i in 1 .. l_colCnt loop
        dbms_sql.column_value( l_theCursor, i, l_columnValue );
        utl_file.put( l_output, l_separator || l_columnValue );
        l_separator := ',';
    end loop;
    utl_file.new_line( l_output );
end loop;
I would encourage you to think about making this more set-based; it should increase the speed. A quick search on the old interwebs turned up this small blog posting. The author handles all his column separation in a specific SQL statement.
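For example, for a hypothetical employees table, the separator handling moves into the SQL statement itself:

select employee_id || ',' || last_name || ',' || hire_date as csv_row
from   employees;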
I understand if you want to leave this generic so that you can run through all of the tables. If that's the case, I would encourage you to think about using some scripting rather than just PL/SQL. There are a few examples out there... here's a good starting point:
http://amardeepsidhu.com/blog/2007/06/16/spool-to-a-xls-excel-file/
http://amardeepsidhu.com/blog/2007/06/26/shell-script-to-spool-a-no-of-tables-into-xls-files/
... those both deal with exports to Excel, but you can easily adapt them to CSV.
Finally, if this is just an occasional one-off thing, you might want to consider grabbing SQL Developer - with that (free) utility you can export any table from a simple right-click menu.
Hope that helps...
Upvotes: 0