Panchanand Mahto
Panchanand Mahto

Reputation: 151

In JPA, is there any way to insert data in DB in batch and insert only if not exists in DB

In JPA, is there any way to insert data in DB in batch and insert only if not exists in DB. When I try to insert in batch, because of the unique key constraint, it throws exception. I want to insert all the records which are not exist in the DB and rest should be skipped.

org.springframework.dao.DataIntegrityViolationException: could not execute statement; SQL [n/a]; constraint [reference_number_master_id_key]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute statement

Upvotes: 2

Views: 2735

Answers (1)

Kenny Cason
Kenny Cason

Reputation: 12328

Me and my team ended up creating implementations that our JpaRepositories extend to add such behavior.

Primary Interface

public interface BatchInsertable <T> {
    enum Mode {NORMAL, IGNORE, REPLACE}

    void batchInsertUsingMode(Collection<T> items, final Mode mode);
}

This is how we hook it up to an existing JpaRepository.

public interface UrlRepository extends JpaRepository<UrlEntity, Long>, BatchInsertable<UrlEntity> {
    // insert common methods
}

... and it's usage

urlRepository.batchInsertUsingMode(newUrlEntities, BatchInsertable.Mode.IGNORE);

Batch Inserter implementation

@Component
@Scope("prototype")
@Lazy
public class BatchInserter<T> extends AbstractBaseBatchJdbcMutation<T> implements BatchInsertable<T> {

    @Override
    public void batchInsertUsingMode(final Collection<T> items, final Mode mode) {
        final Map<String, String> parameters = new HashMap<>();
        parameters.put("MODE", mode == Mode.IGNORE ? "INSERT IGNORE" : mode == Mode.REPLACE ? "REPLACE" : "INSERT");
        parameters.put("STAGING_TABLE", getTableName());
        parameters.put("COLUMNS", buildColumnNameList());
        parameters.put("PLACEHOLDERS", buildBindVariablePlaceholderList());

        final StrSubstitutor strSubstitutor = new StrSubstitutor(parameters);
        final String sqlTemplate = "${MODE} INTO `${STAGING_TABLE}` (${COLUMNS}) VALUES (${PLACEHOLDERS})";
        getJdbcTemplate().batchUpdate(strSubstitutor.replace(sqlTemplate), buildBindVariables(items));
    }
}

Batch Deleter implementation

@Component
@Scope("prototype")
@Lazy
public class BatchDeleter<T> extends AbstractBaseBatchJdbcMutation<T> implements BatchDeletable<T> {

    @Override
    public int batchDelete(final List<T> items) {
        final Map<String, String> parameters = new HashMap<>();
        parameters.put("STAGING_TABLE", getTableName());
        parameters.put("COLUMNS", buildColumnNameList());
        parameters.put("PLACEHOLDERS", buildBindVariablePlaceholderList());

        final StrSubstitutor strSubstitutor = new StrSubstitutor(parameters);
        final String sqlTemplate = "DELETE FROM `${STAGING_TABLE}` WHERE (${COLUMNS}) = (${PLACEHOLDERS})";
        final int[] updateCounts = getJdbcTemplate().batchUpdate(strSubstitutor.replace(sqlTemplate), buildBindVariables(items));
        return sum(updateCounts);
    }

    private static int sum(final int[] updateCounts) {
        int sum = 0;

        for (final int updateCount : updateCounts) {
            sum += updateCount;
        }

        return sum;
    }
}

An abstract class that maintains common configs binding logic

public abstract class AbstractBaseBatchJdbcMutation<T> {
    private JdbcTemplate jdbcTemplate;

    private List<ColumnValueExtractors> columnExtractors;

    private String tableName;

    public void setColumnExtractors(final List<ColumnValueExtractors> columnExtractors) {
        this.columnExtractors = new ArrayList<>(columnExtractors);
    }

    public void setTableName(final String tableName) {
        this.tableName = tableName;
    }

    protected List<Object[]> buildBindVariables(final Collection<T> items) {
        return FluentIterable.from(items).transform(new BulkBindingTransform<T>(columnExtractors)).toList();
    }

    protected JdbcTemplate getJdbcTemplate() {
        return jdbcTemplate;
    }

    protected String getTableName() {
        return tableName;
    }

    @Autowired
    public void setDataSource(final DataSource datasource) {
        this.jdbcTemplate = new JdbcTemplate(datasource);
    }

    protected String buildColumnNameList() {
        return join(extract(columnExtractors, on(ColumnValueExtractors.class).getColumnName()), ",");
    }

    protected String buildBindVariablePlaceholderList() {
        return join(nCopies(columnExtractors.size(), "?"), ",");
    }

    private static class BulkBindingTransform<T> implements Function<T, Object[]> {

        private final List<ColumnValueExtractors> columns;
        private BulkBindingTransform(final List<ColumnValueExtractors> columns) {
            this.columns = columns;
        }

        @Nullable
        @Override
        public Object[] apply(final T input) {
            final Object[] bindings = new Object[columns.size()];

            for (int i = 0; i < columns.size(); i++) {
                bindings[i] = columns.get(i).resolveValue(input);
            }

            return bindings;
        }

    }
}

This will let you also bypass some of the slowness you may encounter directly interfacing with the default save(Iterable<S> iterable) interface. We use it all over for batch SQL operations. It is amazing how complicated such a simple task is. My bet is that you can trim this down to meet your specific needs. :)

Upvotes: 4

Related Questions