Reputation: 216
We have an application that works in an EC2 cluster (2 nodes currently for testing). For searching domain models we use Hibernate Search, and since the application runs on a cluster we use Infinispan as the Lucene Directory. To survive restarts we use a JDBC cache store on MySQL, both nodes accessing the same MySQL tables. To account for adding and removing nodes, we use the "jgroups" backend for Hibernate Search worker configuration.
Our problem is about duplicate records exceptions we receive when we try to rebuild the whole entity index. We get errors with similar stacktraces as this:
ERROR [AsyncStoreProcessor-LuceneIndexesData-0] [2016-06-21 17:01:59] org.infinispan.persistence.jdbc.stringbased.JdbcStringBasedStore - ISPN008024: Error while storing string key to database; key: '_d.fdt|0|1048576|com.model.SomeModel'
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '_d.fdt|0|1048576|com.model.SomeModel' for key 'PRIMARY'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4190)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4122)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1399)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:857)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2460)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2377)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2361)
at com.zaxxer.hikari.proxy.PreparedStatementJavassistProxy.executeUpdate(PreparedStatementJavassistProxy.java)
at org.infinispan.persistence.jdbc.stringbased.JdbcStringBasedStore.write(JdbcStringBasedStore.java:174)
at org.infinispan.persistence.async.AsyncCacheWriter.applyModificationsSync(AsyncCacheWriter.java:158)
at org.infinispan.persistence.async.AsyncCacheWriter$AsyncStoreProcessor.retryWork(AsyncCacheWriter.java:330)
at org.infinispan.persistence.async.AsyncCacheWriter$AsyncStoreProcessor.run(AsyncCacheWriter.java:312)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
When we check the DB, there is an entry with this ID. When we try it with a single node there are no errors. So our guess is both nodes are trying to write the cache entry to DB. What may be causing this problem? AFAIK the jgroups backend should be preventing it.
We use hibernate 4.3.9.Final, hibernate-search 5.2.1.Final, infinispan 7.2.5.Final and jgroups 3.6.8.Final. Infinispan configuration is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns="urn:infinispan:config:7.2"
xmlns:jdbc="urn:infinispan:config:store:jdbc:7.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
urn:infinispan:config:7.2 http://www.infinispan.org/schemas/infinispan-config-7.2.xsd
urn:infinispan:config:store:jdbc:7.2 http://www.infinispan.org/schemas/infinispan-cachestore-jdbc-config-7.2.xsd">
<jgroups>
<stack-file name="tcp" path="default-configs/default-jgroups-tcp.xml"/>
<stack-file name="ec2" path="search/infinispan-jgroups-ec2.xml"/>
</jgroups>
<cache-container name="HibernateSearch" default-cache="default" statistics="false" shutdown-hook="DONT_REGISTER">
<transport stack="${infinispan.transport:tcp}"/>
<!-- Duplicate domains are allowed so that multiple deployments with default configuration
of Hibernate Search applications work - if possible it would be better to use JNDI to share
the CacheManager across applications -->
<jmx duplicate-domains="true"/>
<!-- *************************************** -->
<!-- Cache to store Lucene's file metadata -->
<!-- *************************************** -->
<replicated-cache name="LuceneIndexesMetadata" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true"/>
<indexing index="NONE"/>
<locking striping="false" acquire-timeout="10000" concurrency-level="500" write-skew="false"/>
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
<persistence passivation="false">
<jdbc:string-keyed-jdbc-store preload="true" fetch-state="true" read-only="false" purge="false">
<jdbc:data-source jndi-url="java:comp/env/jdbc/..."/>
<jdbc:string-keyed-table drop-on-exit="false" create-on-start="true" prefix="ISPN_STRING_TABLE">
<jdbc:id-column name="ID" type="VARCHAR(255)"/>
<jdbc:data-column name="METADATA" type="BLOB"/>
<jdbc:timestamp-column name="TIMESTAMP" type="BIGINT"/>
</jdbc:string-keyed-table>
<property name="key2StringMapper">org.infinispan.lucene.LuceneKey2StringMapper</property>
<write-behind/>
</jdbc:string-keyed-jdbc-store>
</persistence>
</replicated-cache>
<!-- **************************** -->
<!-- Cache to store Lucene data -->
<!-- **************************** -->
<distributed-cache name="LuceneIndexesData" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true"/>
<indexing index="NONE"/>
<locking striping="false" acquire-timeout="10000" concurrency-level="500" write-skew="false"/>
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
<persistence passivation="false">
<jdbc:string-keyed-jdbc-store preload="true" fetch-state="true" read-only="false" purge="false">
<jdbc:data-source jndi-url="java:comp/env/jdbc/..."/>
<jdbc:string-keyed-table drop-on-exit="false" create-on-start="true" prefix="ISPN_STRING_TABLE">
<jdbc:id-column name="ID" type="VARCHAR(255)"/>
<jdbc:data-column name="DATA" type="MEDIUMBLOB"/>
<jdbc:timestamp-column name="TIMESTAMP" type="BIGINT"/>
</jdbc:string-keyed-table>
<property name="key2StringMapper">org.infinispan.lucene.LuceneKey2StringMapper</property>
<write-behind/>
</jdbc:string-keyed-jdbc-store>
</persistence>
</distributed-cache>
<!-- ***************************** -->
<!-- Cache to store Lucene locks -->
<!-- ***************************** -->
<replicated-cache name="LuceneIndexesLocking" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true"/>
<indexing index="NONE"/>
<locking striping="false" acquire-timeout="10000" concurrency-level="500" write-skew="false"/>
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
</replicated-cache>
</cache-container>
</infinispan>
Respective Hibernate configuration is as follows (which is done via Spring):
<bean id="emf" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="dataSource" ref="dataSource"/>
<property name="persistenceUnitName" value="..."/>
<property name="packagesToScan" value="com...."/>
<property name="jpaVendorAdapter">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
<property name="generateDdl" value="false"/>
<property name="showSql" value="false"/>
<property name="databasePlatform" value="org.hibernate.dialect.MySQL5Dialect"/>
<property name="database" value="MYSQL"/>
</bean>
</property>
<property name="jpaPropertyMap">
<map>
<entry key="hibernate.default_batch_fetch_size" value="50"/>
<entry key="hibernate.multiTenancy" value="SCHEMA"/>
<entry key="hibernate.multi_tenant_connection_provider" value-ref="connectionProvider"/>
<entry key="hibernate.tenant_identifier_resolver" value-ref="tenantIdentifierResolver"/>
<entry key="hibernate.cache.use_second_level_cache" value="true"/>
<entry key="hibernate.cache.region.factory_class" value="com.hazelcast.hibernate.HazelcastCacheRegionFactory"/>
<entry key="hibernate.cache.hazelcast.use_native_client" value="true"/>
<entry key="hibernate.cache.hazelcast.native_client_address" value="127.0.0.1"/>
<entry key="hibernate.cache.hazelcast.native_client_group" value="dev"/>
<entry key="hibernate.cache.hazelcast.native_client_password" value="dev-pass"/>
<entry key="hibernate.connection.characterEncoding" value="UTF-8"/>
<entry key="hibernate.connection.useUnicode" value="true"/>
<entry key="hibernate.search.default.directory_provider" value="infinispan"/>
<entry key="hibernate.search.default.locking_cachename" value="LuceneIndexesLocking"/>
<entry key="hibernate.search.default.data_cachename" value="LuceneIndexesData"/>
<entry key="hibernate.search.default.metadata_cachename" value="LuceneIndexesMetadata"/>
<entry key="hibernate.search.default.chunk_size" value="1048576"/>
<entry key="hibernate.search.infinispan.configuration_resourcename" value="search/hibernatesearch-infinispan.xml"/>
<entry key="hibernate.search.default.worker.backend" value="jgroups"/>
<entry key="hibernate.search.services.jgroups.configurationFile" value="search/infinispan-jgroups-ec2.xml"/>
</map>
</property>
</bean>
Upvotes: 0
Views: 455
Reputation: 6107
You are correct on the purpose of the JGroups Backend and your configuration of Hibernate looks like correct.
The problem is on the configuration of the CacheStore component in Infinispan, these have a "shared" attribute, whose default is false.
<jdbc:string-keyed-jdbc-store
preload="true"
fetch-state="true"
read-only="false"
purge="false"
shared="true" <!-- FIX
As you suspected, each Infinispan node was going to re-write the same entry on "each" CacheStore instance, as it didn't guess that each of your nodes are actually connecting to the same database.
Setting the shared attribute to true should ensure that Infinispan's core will coordinate among nodes so that one (and only one) node will write the entry.
Upvotes: 2