Bren
Bren

Reputation: 2206

Jdbc Oracle batch update poor performance for some tables

Hey i am really in need of help, and i am quite frustrated.

I am working on this project which is about migrating data from one db to another, both Oracle.What i am doing is to get the table from a source db and create the table at target db. And then fetch data from the source table to batch insert into target db.

Giving a a bit more detail;

I am doing this in parallel by getting 16 rowid ranges from source table and distributing select queries for data copy to 8 server, 2 for each. ( Actually on the test servers i distribute to 4 machines, 4 for each). And at then i select the data with each thread to insert to table.

Having a master module working on one server which creates the table at target and decides the rowid ranges to send the tasks to slave modules working on other machines. But for some tables it works really really slow. And i dont know why. Now i have two tables, one gets done really fast. And another one drastically slow. Both from same db and same tablespace and datafile, to another db, same tablespace and same datafile.

The one i can copy fast 13936.4375 MB in size with 107.910.833 rows and it ends in 5 min. this is the ddl for the table

    -- Create table
create table CMP_SUBS_RESPONSE_INS
(
  RESPONSE_TP_SK            NUMBER(16),
  CHURN_REASON_TP_SK        NUMBER(16),
  CONTRACT_SK               NUMBER(16),
  OFFER_SK                  NUMBER(16),
  CHANNEL_SK                NUMBER(16),
  CMP_RUN_SK                NUMBER(16),
  CAMPAIGN_CELLS_SK         NUMBER(16),
  SUBS_RESPONSE_SK          NUMBER(16),
  SUBS_RESPONSE_NK          VARCHAR2(50),
  SUBS_RESPONSE_NK2         NUMBER,
  CRC_CALCULATION_FLAG      VARCHAR2(2),
  SOURCE_SYSTEM_SK          NUMBER(16),
  INITIAL_ETL_DATE          DATE,
  MODIFICATION_SYSTIME      DATE,
  UPDATE_ETL_DATE           DATE,
  START_DATE                DATE,
  END_DATE                  DATE,
  FULFILLMENT_FLAG          VARCHAR2(1),
  CHURN_SUBREASON_TP_SK     NUMBER(16),
  LMC_REASON_CATEGORY_SK    NUMBER(16),
  LMC_REASON_TIMEPERIOD_SK  NUMBER(16),
  LMC_REASON_PAYMENTTYPE_SK NUMBER(16)
)
tablespace USERS_BTS
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );    

---target

create table BR_TEST2
(
  RESPONSE_TP_SK            NUMBER(16),
  CHURN_REASON_TP_SK        NUMBER(16),
  CONTRACT_SK               NUMBER(16),
  OFFER_SK                  NUMBER(16),
  CHANNEL_SK                NUMBER(16),
  CMP_RUN_SK                NUMBER(16),
  CAMPAIGN_CELLS_SK         NUMBER(16),
  SUBS_RESPONSE_SK          NUMBER(16),
  SUBS_RESPONSE_NK          VARCHAR2(50),
  SUBS_RESPONSE_NK2         NUMBER,
  CRC_CALCULATION_FLAG      VARCHAR2(2),
  SOURCE_SYSTEM_SK          NUMBER(16),
  INITIAL_ETL_DATE          DATE,
  MODIFICATION_SYSTIME      DATE,
  UPDATE_ETL_DATE           DATE,
  START_DATE                DATE,
  END_DATE                  DATE,
  FULFILLMENT_FLAG          VARCHAR2(1),
  CHURN_SUBREASON_TP_SK     NUMBER(16),
  LMC_REASON_CATEGORY_SK    NUMBER(16),
  LMC_REASON_TIMEPERIOD_SK  NUMBER(16),
  LMC_REASON_PAYMENTTYPE_SK NUMBER(16)
)
tablespace PUB_A_BTS
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );

And the one which takes tooo much time to get done is 11519 MB in size, with about 37 millions rows, and it takes more than 2-3, hours maybe more (actually i always got frustrated and killed the process, never seen the end)

create table DGG1
(
  YEAR_MONTH             VARCHAR2(250),
  SUBSCRIBER_ID          NUMBER,
  EXT_DATE               TIMESTAMP(6),
  CALC_MONTHS            VARCHAR2(250),
  LAST_3_MONTH_FLAG      VARCHAR2(250),
  AVEA1_YTL              NUMBER,
  AVEA1_SAVING_YTL       NUMBER,
  AVEA1_SAVING_PER       NUMBER,
  AVEA2_YTL              NUMBER,
  AVEA2_SAVING_YTL       NUMBER,
  AVEA2_SAVING_PER       NUMBER,
  AVEA3_YTL              NUMBER,
  AVEA3_SAVING_YTL       NUMBER,
  AVEA3_SAVING_PER       NUMBER,
  AVEA4_YTL              NUMBER,
  AVEA4_SAVING_YTL       NUMBER,
  AVEA4_SAVING_PER       NUMBER,
  AVEA5_YTL              NUMBER,
  AVEA5_SAVING_YTL       NUMBER,
  AVEA5_SAVING_PER       NUMBER,
  AVEA6_YTL              NUMBER,
  AVEA6_SAVING_YTL       NUMBER,
  AVEA6_SAVING_PER       NUMBER,
  AVEA7_YTL              NUMBER,
  AVEA7_SAVING_YTL       NUMBER,
  AVEA7_SAVING_PER       NUMBER,
  AVEA8_YTL              NUMBER,
  AVEA8_SAVING_YTL       NUMBER,
  AVEA8_SAVING_PER       NUMBER,
  AVEA9_YTL              NUMBER,
  AVEA9_SAVING_YTL       NUMBER,
  AVEA9_SAVING_PER       NUMBER,
  AVEA10_YTL             NUMBER,
  AVEA10_SAVING_YTL      NUMBER,
  AVEA10_SAVING_PER      NUMBER,
  TELSIM1_YTL            NUMBER,
  TELSIM1_SAVING_YTL     NUMBER,
  TELSIM1_SAVING_PER     NUMBER,
  TELSIM2_YTL            NUMBER,
  TELSIM2_SAVING_YTL     NUMBER,
  TELSIM2_SAVING_PER     NUMBER,
  TELSIM3_YTL            NUMBER,
  TELSIM3_SAVING_YTL     NUMBER,
  TELSIM3_SAVING_PER     NUMBER,
  TELSIM4_YTL            NUMBER,
  TELSIM4_SAVING_YTL     NUMBER,
  TELSIM4_SAVING_PER     NUMBER,
  TELSIM5_YTL            NUMBER,
  TELSIM5_SAVING_YTL     NUMBER,
  TELSIM5_SAVING_PER     NUMBER,
  TELSIM6_YTL            NUMBER,
  TELSIM6_SAVING_YTL     NUMBER,
  TELSIM6_SAVING_PER     NUMBER,
  TELSIM7_YTL            NUMBER,
  TELSIM7_SAVING_YTL     NUMBER,
  TELSIM7_SAVING_PER     NUMBER,
  TELSIM8_YTL            NUMBER,
  TELSIM8_SAVING_YTL     NUMBER,
  TELSIM8_SAVING_PER     NUMBER,
  TELSIM9_YTL            NUMBER,
  TELSIM9_SAVING_YTL     NUMBER,
  TELSIM9_SAVING_PER     NUMBER,
  TELSIM10_YTL           NUMBER,
  TELSIM10_SAVING_YTL    NUMBER,
  TELSIM10_SAVING_PER    NUMBER,
  ETT_DATE               TIMESTAMP(6),
  RUN_ID                 VARCHAR2(250),
  CO_ID                  NUMBER,
  UNIQUE_PARTY_ID        NUMBER,
  UNOPTIMISEABLE_CHARGES NUMBER,
  OTHER_CHARGES          NUMBER
)
tablespace USERS_BTS
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64k
    next 1m
    minextents 1
    maxextents unlimited
  );

----target

create table dds_etl.br_test
(
  YEAR_MONTH             VARCHAR2(250),
  SUBSCRIBER_ID          NUMBER,
  EXT_DATE               TIMESTAMP(6),
  CALC_MONTHS            VARCHAR2(250),
  LAST_3_MONTH_FLAG      VARCHAR2(250),
  AVEA1_YTL              NUMBER,
  AVEA1_SAVING_YTL       NUMBER,
  AVEA1_SAVING_PER       NUMBER,
  AVEA2_YTL              NUMBER,
  AVEA2_SAVING_YTL       NUMBER,
  AVEA2_SAVING_PER       NUMBER,
  AVEA3_YTL              NUMBER,
  AVEA3_SAVING_YTL       NUMBER,
  AVEA3_SAVING_PER       NUMBER,
  AVEA4_YTL              NUMBER,
  AVEA4_SAVING_YTL       NUMBER,
  AVEA4_SAVING_PER       NUMBER,
  AVEA5_YTL              NUMBER,
  AVEA5_SAVING_YTL       NUMBER,
  AVEA5_SAVING_PER       NUMBER,
  AVEA6_YTL              NUMBER,
  AVEA6_SAVING_YTL       NUMBER,
  AVEA6_SAVING_PER       NUMBER,
  AVEA7_YTL              NUMBER,
  AVEA7_SAVING_YTL       NUMBER,
  AVEA7_SAVING_PER       NUMBER,
  AVEA8_YTL              NUMBER,
  AVEA8_SAVING_YTL       NUMBER,
  AVEA8_SAVING_PER       NUMBER,
  AVEA9_YTL              NUMBER,
  AVEA9_SAVING_YTL       NUMBER,
  AVEA9_SAVING_PER       NUMBER,
  AVEA10_YTL             NUMBER,
  AVEA10_SAVING_YTL      NUMBER,
  AVEA10_SAVING_PER      NUMBER,
  TELSIM1_YTL            NUMBER,
  TELSIM1_SAVING_YTL     NUMBER,
  TELSIM1_SAVING_PER     NUMBER,
  TELSIM2_YTL            NUMBER,
  TELSIM2_SAVING_YTL     NUMBER,
  TELSIM2_SAVING_PER     NUMBER,
  TELSIM3_YTL            NUMBER,
  TELSIM3_SAVING_YTL     NUMBER,
  TELSIM3_SAVING_PER     NUMBER,
  TELSIM4_YTL            NUMBER,
  TELSIM4_SAVING_YTL     NUMBER,
  TELSIM4_SAVING_PER     NUMBER,
  TELSIM5_YTL            NUMBER,
  TELSIM5_SAVING_YTL     NUMBER,
  TELSIM5_SAVING_PER     NUMBER,
  TELSIM6_YTL            NUMBER,
  TELSIM6_SAVING_YTL     NUMBER,
  TELSIM6_SAVING_PER     NUMBER,
  TELSIM7_YTL            NUMBER,
  TELSIM7_SAVING_YTL     NUMBER,
  TELSIM7_SAVING_PER     NUMBER,
  TELSIM8_YTL            NUMBER,
  TELSIM8_SAVING_YTL     NUMBER,
  TELSIM8_SAVING_PER     NUMBER,
  TELSIM9_YTL            NUMBER,
  TELSIM9_SAVING_YTL     NUMBER,
  TELSIM9_SAVING_PER     NUMBER,
  TELSIM10_YTL           NUMBER,
  TELSIM10_SAVING_YTL    NUMBER,
  TELSIM10_SAVING_PER    NUMBER,
  ETT_DATE               TIMESTAMP(6),
  RUN_ID                 VARCHAR2(250),
  CO_ID                  NUMBER,
  UNIQUE_PARTY_ID        NUMBER,
  UNOPTIMISEABLE_CHARGES NUMBER,
  OTHER_CHARGES          NUMBER
)
tablespace PUB_A_BTS
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64k
    next 1m
    minextents 1
    maxextents unlimited
  );

No constraints or indexes in any of the tables. Not even null checks as u can see.

Now the java code i wrote is down below, I did use jvisualvm for sampling and it seems for the slow one, 99% of cpu time is spent at executeBatch method, more spefically oracle.net.ns.Packet.receive() method in the subtree of executeBatch.For the fast one it is still the most time consuming bit, but about 45%.

I comment out writing mechanism and just execute getObject method, and it ends loading the tables in 100 seconds. So reading from source db doesn't seem like where the problem is to me.

So i thought the db takes too much time to execute batch insert. I thought it gets slower when column count increases, and i only selected 20 columns from the slow copied table, it was still that slow and hanging at executeBatch. Then i thought it was because i didn't use proper getXXX() method for column types and i changed the code appropriately from getObject to whatever get method was necessary.But still it was too slow.

Then i thought db takes too much time to allocate new space. So i created the table with an initial extent of 15gb to be sure that it wont spend time to allocate space while executing the batch. Then again it didn't work.

And whenever a table gets copied too slow, i see that the CPU activity on the server i am running the slave modules(the code doing actual data copy i posted below) are really low. And for the ones it is fast, lots of CPU usage.

I tried with different batch sizes and fetch sizes, didn't do much help.

So can anyone tell me what i am doing wrong here ? I copy the same table with a fancy etl tool, it does the job quickly in about 10 mins.

It is obviously about some specific type of tables. But i just can not find what is wrong here. I got the latest jdbc drivers (ojdbc6) to be sure it is not the problem, but it was still the same.

I get query result from source db

public ResultSet getQueryResultset(Connection con, String query) throws SQLException {
        OraclePreparedStatement preparedStatement = null;
        preparedStatement = con.prepareStatement(query);
        preparedStatement.setRowPrefetch(DTSConstants.FETCH_SIZE);//1000
        return preparedStatement.executeQuery();
    }

Target connection is auto commit false

targetConnection.setAutoCommit(false);

how i prepare the insert

OraclePreparedStatement preparedStatement = null;
preparedStatement =  connection.prepareStatement(insertScript);

and here i write the data to target

private int write(ResultSet resultSet, OraclePreparedStatement preparedStatement, long taskID) throws SQLException,
            DTSException, MLException, ParseException {
        int statementCounter = 0;
        int rowsAffected = 0;
        int columnCount = columnNames.length;
        while (resultSet.next()) {
            setColumnsAndAddBatch(resultSet, preparedStatement, columnCount);
            statementCounter++;
            rowsAffected++;
            if (statementCounter >= DTSConstants.BATCH_SIZE) { /1000
                preparedStatement.executeBatch();
                statementCounter = 0;
                controllerUtil.performTaskSanityCheck(taskID);
            }
        }
        preparedStatement.executeBatch();
        return rowsAffected;
    }


private void setColumnsAndAddBatch(ResultSet resultSet, OraclePreparedStatement preparedStatement, int columnCount)
            throws SQLException, MLException, ParseException {
        for (int i = 0; i < columnCount; i++) {
            Object object = resultSet.getObject(i + 1);//Changed this to Object object = OracleDataHandler.getData(resultSet, i + 1, columntypes[i]);

            if (maskingLibGateway != null) {
                String columnName = columnNames[i];
                if (maskFields.containsKey(columnName) == true) { // never true for my examples, so the method never gets called
                    object = maskObject(object, columnName); 
                }
            }
            preparedStatement.setObject(i + 1, object);
        }
        preparedStatement.addBatch();
    }

Using this class to decide getXXX method

public class OracleDataHandler {

public static Object getData(ResultSet resultSet, int columnIndex, int columnType) throws SQLException {

    switch (columnType) {
    case Types.NUMERIC:
    case Types.DECIMAL:
        return resultSet.getBigDecimal(columnIndex);
    case Types.CHAR:
    case Types.VARCHAR:
    case Types.LONGNVARCHAR:
        return resultSet.getString(columnIndex);
    case Types.INTEGER:
        return resultSet.getInt(columnIndex);
    case Types.DATE:
        return resultSet.getDate(columnIndex);
    case Types.TIMESTAMP:
        return resultSet.getTimestamp(columnIndex);
    case Types.TIME:
        return resultSet.getTime(columnIndex);
    case Types.BIGINT:
        return resultSet.getLong(columnIndex);
    case Types.DOUBLE:
    case Types.FLOAT:
        return resultSet.getDouble(columnIndex);
    case Types.SMALLINT:
        return resultSet.getShort(columnIndex);
    case Types.TINYINT:
        return resultSet.getByte(columnIndex);
    case Types.BINARY:
    case Types.VARBINARY:
        return resultSet.getBytes(columnIndex);
    case Types.CLOB:
        return resultSet.getClob(columnIndex);
    case Types.ARRAY:
        return resultSet.getArray(columnIndex);
    case Types.BLOB:
        return resultSet.getBlob(columnIndex);
    case Types.REAL:
        return resultSet.getFloat(columnIndex);
    case Types.BIT:
    case Types.BOOLEAN:
        return resultSet.getBoolean(columnIndex);
    case Types.REF:
        return resultSet.getRef(columnIndex);
    case Types.DATALINK:
        return resultSet.getURL(columnIndex);
    case Types.LONGVARBINARY:
        return resultSet.getBinaryStream(columnIndex);
    default:
        return resultSet.getObject(columnIndex);
    }

}

}

Upvotes: 0

Views: 5367

Answers (1)

Bren
Bren

Reputation: 2206

I solved the problem;

All i did was to change setObject method to proper JDBC setXXX method for each data type.Just the way i use get methods

Though, i didn't quite understand what exactly was the problem. And why only for some tables and not the others. It can't be timestamp data type as i did select columns other than timestamps.

If anyone can tell me what exactly is the problem i would really appricate that. I assume it was because target db was doing some conversion for some data types (which one though ?) or i was making data bigger in memory at java side and sending it over the network (using setXXX method maybe decreasing size of each record ?) and it was taking too much time ? Tho these are just assumptions. (Might very well be stupid ones.)

Anyway i am happy i solved the problem. Thanks to all who replied.

Upvotes: 1

Related Questions