Seprom
Seprom

Reputation: 23

Reading large csv files in PL/SQL

First question here on StackOverFlow.

So, I'm working on a new project and I need to read large CSV-like files from PL/SQL and insert the data into a table. These files are about 400k lines each, and I'm not sure about the best way to do this. My expertise with PL/SQL comes a little short for the task :D

I've seen people using UTL_FILE.FOPEN. Is there a way of reading a large file by chunks or a more appropiate tool to use? I could not find any useful post about this issue.

Thank you in advance!

Upvotes: 2

Views: 2513

Answers (1)

Jon Heller
Jon Heller

Reputation: 36922

You can load data files using an inline external table or APEX_DATA_PARSER. Those techniques are often better than creating external tables because you do not have to manage as many schema objects. And those techniques are often better than using a tool like sqlldr because then your program is not dependent on a specific external program, shell script, or operating system.

Inline External Table

Find the directory to store the file:

select directory_path from all_directories where directory_name = 'DATA_PUMP_DIR';

Create the file test.csv in that directory:

Name,Salary
Alice,100
Bob,200

Read from that file without creating an external table:

select *
from external
(   
    (
        name varchar2(100),
        salary number
    )
    default directory data_pump_dir
    access parameters
    (
        records delimited by newline
        skip 1
        fields terminated by ','
    )
    location ('test.csv')
);

Results:

NAME    SALARY
-----   ------
Alice   100
Bob     200

APEX_DATA_PARSER

If you have APEX installed on your database, the package APEX_DATA_PARSER is a convenient way to query your files. Although this technique does require installing a PL/SQL object that will load the file into a blob. See this Oracle-Base article for the code behind the function FILE_TO_BLOB.

select col001 name, col002 salary
from table
(
    apex_data_parser.parse
    (
        p_content   => file_to_blob('DATA_PUMP_DIR', 'test.csv'),
        p_file_name => 'test.csv'
    )
)

SQL or PL/SQL?

Although you asked for a PL/SQL solution, the above solutions are mostly SQL only. But that's a good thing if you can even avoid creating PL/SQL objects.

If you're going to use this code inside PL/SQL, it's trivial to put those select statements inside a loop like this:

begin
    for lines in
    (
        select *
        from external
        (   
            (
                name varchar2(100),
                salary number
            )
            default directory data_pump_dir
            access parameters
            (
                records delimited by newline
                skip 1
                fields terminated by ','
            )
            location ('test.csv')
        )
    ) loop
        --Do something with the results here.
        dbms_output.put_line('Name: '||lines.name||',Salary: '||lines.salary);
    end loop;
end;
/

Upvotes: 3

Related Questions