user2112120
user2112120

Reputation: 23

Oracle Database Testing Challenge - Compare Schema Data

Wondering if someone can help point me in the right direction with this challenge, or tell me I'm crazy for trying this via sql. If sql would be too challenging, are there any free or inexpensive tools that would help me automate this?

I'm working on testing some data between an old and new Oracle database. What I'd like to do is be able to dynamically generate this query for all tables in a schema.

Select Column_1, Column_2 FROM Table_1
MINUS
Select Column_1, Column_2 FROM Table_1@"OLD_SERVER"

One catch is that the columns selected for each table should only be columns that do not begin with 'ETL' since those are expected to change with the migration.

To keep this dynamic, can I use the all_tab_columns to loop through each table?

So for a simplified example, let's say this query returned the following results, and you can expect the results from ALL_TAB_COLUMNS to be identical between the OLD and NEW database:

select TABLE_NAME, COLUMN_NAME from ALL_TAB_COLUMNS where owner = 'OWNER1'

TABLE_NAME, COLUMN_NAME
-----------------------
TABLE1, COLUMN_1
TABLE1, COLUMN_2
TABLE1, ETLCOLUMN_3

TABLE2, COLUMN_A
TABLE2, COLUMN_B
TABLE2, ETLCOLUMN_C

How would I write a query that would run a minus between the same table and columns (that do not begin with ETL) on the old and new database, and output the results along with the table name and the date ran, and then loop through to the next table and do the same thing?

Upvotes: 1

Views: 272

Answers (2)

Jon Heller
Jon Heller

Reputation: 36798

This testing can automated with SQL and PL/SQL. You're not crazy for doing this. Comparison systems like this can be incredibly helpful for testing changes to complex systems. It's not as good as automated unit tests but it can significantly enhance the typical database testing.

The code below is a fully working example. But in the real world there are many gotchas that could easily take several days to resolve. For example, dealing with CLOBs, large tables, timestamps and sequence-based values, etc.

Sample schemas and data differences

create user schema1 identified by schema1;
create user schema2 identified by schema2;
alter user schema1 quota unlimited on users;
alter user schema2 quota unlimited on users;

--Data in 1, not 2.
create table schema1.table1 as select 1 a, 1 b from dual;
create table schema2.table1(a number, b number);

--Data in 2, not 1.
create table schema1.table2(a number, b number);
create table schema2.table2 as select 1 a, 1 b from dual;

--Same data in both, excluding unused column.
create table schema1.table3 as select 1 a, 1 b, 'asdf' ETL_c from dual;
create table schema2.table3 as select 1 a, 1 b, 'fdsa' ETL_c from dual;

--Table DDL difference.
create table schema1.table4(a number);
create table schema2.table4(b number);

--Privileges can be tricky.
grant select on dba_tab_columns to <your schema>;

Procedure to print differences script

create or replace procedure print_differences(
    p_old_schema in varchar2,
    p_new_schema in varchar2) authid current_user
is
    v_table_index number := 0;
    v_row_count number;
begin
    --Print header information.
    dbms_output.put_line('--Comparison between '||p_old_schema||' and '||
        p_new_schema||', at '||to_char(sysdate, 'YYYY-MM-DD HH24:MI')||'.'||chr(10));

    --Create a SQL statement to return the differences for each table.
    for differences in
    (
        --Return number of differences and SQL statements to view them.
        select
        '
        with old_table as (select '||column_list||' from '||p_old_schema||'.'||table_name||')
        ,    new_table as (select '||column_list||' from '||p_new_schema||'.'||table_name||')
        select * from
        (
            select ''OLD'' old_or_new, old_table.* from old_table minus
            select ''OLD'' old_or_new, new_table.* from new_table
        )
        union all
        select * from
        (
            select ''NEW'' old_or_new, new_table.* from new_table minus
            select ''NEW'' old_or_new, old_table.* from old_table
        )
        ' difference_sql, table_name
        from
        (
            select table_name
                ,listagg(column_name, ',') within group (order by column_id) column_list
            from dba_tab_columns
            where owner = p_old_schema
                and column_name not like 'ETL%'
            group by table_name
        ) column_lists
    ) loop
        begin
            --Print table information:
            v_table_index := v_table_index+1;
            dbms_output.put_line(chr(10)||'--'||lpad(v_table_index, 3, '0')||':  '||differences.table_name);

            --Count differences.
            execute immediate 'select count(*) from ('||differences.difference_sql||')' into v_row_count;

            --Print SQL statements to investigate differences.
            if v_row_count = 0 then
                dbms_output.put_line('--No differences.');
            else
                dbms_output.put_line('--Differences: '||v_row_count);
                dbms_output.put_line(differences.difference_sql||';');
            end if;
        exception when others then
            dbms_output.put_line('/*Error with this statement, possible DDL difference: '
                ||differences.difference_sql||dbms_utility.format_error_stack||
                dbms_utility.format_error_backtrace||'*/');
        end;
    end loop;
end;
/

Running the procedure

begin
    print_differences('SCHEMA1', 'SCHEMA2');
end;
/

Sample output

The procedure does not output the actual differences. If there are differences, it outputs a script that will display the differences. With a decent IDE this will be a much better way to view the data, and it also helps to further analyze the differences.

--Comparison between SCHEMA1 and SCHEMA2, at 2014-03-28 23:44.


--001:  TABLE1
--Differences: 1

        with old_table as (select A,B from SCHEMA1.TABLE1)
        ,    new_table as (select A,B from SCHEMA2.TABLE1)
        select * from
        (
            select 'OLD' old_or_new, old_table.* from old_table minus
            select 'OLD' old_or_new, new_table.* from new_table
        )
        union all
        select * from
        (
            select 'NEW' old_or_new, new_table.* from new_table minus
            select 'NEW' old_or_new, old_table.* from old_table
        )
        ;

--002:  TABLE2
--Differences: 1

        with old_table as (select A,B from SCHEMA1.TABLE2)
        ,    new_table as (select A,B from SCHEMA2.TABLE2)
        select * from
        (
            select 'OLD' old_or_new, old_table.* from old_table minus
            select 'OLD' old_or_new, new_table.* from new_table
        )
        union all
        select * from
        (
            select 'NEW' old_or_new, new_table.* from new_table minus
            select 'NEW' old_or_new, old_table.* from old_table
        )
        ;

--003:  TABLE3
--No differences.

--004:  TABLE4
/*Error with this statement, possible DDL difference: 
        with old_table as (select A from SCHEMA1.TABLE4)
        ,    new_table as (select A from SCHEMA2.TABLE4)
        select * from
        (
            select 'OLD' old_or_new, old_table.* from old_table minus
            select 'OLD' old_or_new, new_table.* from new_table
        )
        union all
        select * from
        (
            select 'NEW' old_or_new, new_table.* from new_table minus
            select 'NEW' old_or_new, old_table.* from old_table
        )
        ORA-06575: Package or function A is in an invalid state
ORA-06512: at "JHELLER.PRINT_DIFFERENCES", line 48
*/

Upvotes: 0

evenro
evenro

Reputation: 2646

First - check out this: http://docs.oracle.com/cd/E11882_01/server.112/e41481/spa_upgrade.htm#RATUG210

Second - you would like to write a query that issues a query - The problem is that in user_tab_columns each column is a row. for doing that I would recommend you reading this : http://www.dba-oracle.com/t_converting_rows_columns.htm
The source table for you is USER_TAB_COLUMNS, and when running the query you can add a where that says "where column_name not like 'ETL%' etc. After that - the query would look something like:
select 'select '

 || listagg..... (from the link) || 'from table name' sql 
from user_tab_columns
where column_name not like 'ETL%'
and table_name = 'table name'
group by table_name

and btw - you're not crazy - before changing a system you need to be able to sign the upgrade will succeed - this is the only way to do it.

btw - if you'll describe in more depth the system and the upgrade - I'm sure the community will be able to help you find ways to test it in more depth, and will point you out to things to test.
Testing only the output is not enough in many cases....

GOOD LUCK!

Upvotes: 2

Related Questions