Reputation: 815
We are storing complex objects in Hazelcast maps and need the possibility to search for objects not only based on the key but also on the content of these complex objects. In order to not take too large a performance hit, we are using indices on those search terms.
We are also using spring-data-hazelcast which provides repositories that allow us to use findByAbcXyz() type semantic queries. For some of the more complex queries we are using the @Query annotation (which spring-data-hazelcast internally translates to SqlPredicates).
We have now encountered an issue where under certain situations these @Query based search methods did not return any values, even if we could verify that the searched objects did in fact exist in the map.
I have managed to reproduce this issue with core hazelcast (i.e. without the use of spring-data-hazelcast).
Here is our object structure:
BetriebspunktKey.java
public class BetriebspunktKey implements Serializable {
private Integer uicLand;
private Integer nummer;
public BetriebspunktKey(final Integer uicLand, final Integer nummer) {
this.uicLand = uicLand;
this.nummer = nummer;
}
public Integer getUicLand() {
return uicLand;
}
public Integer getNummer() {
return nummer;
}
}
Betriebspunkt.java
public class Betriebspunkt implements Serializable {
private BetriebspunktKey key;
private List<BetriebspunktVersion> versionen;
public Betriebspunkt(final BetriebspunktKey key, final List<BetriebspunktVersion> versionen) {
this.key = key;
this.versionen = versionen;
}
public BetriebspunktKey getKey() {
return key;
}
}
BetriebspunktVersion.java
public class BetriebspunktVersion implements Serializable {
private List<BetriebspunktKey> zusatzbetriebspunkte;
public BetriebspunktVersion(final List<BetriebspunktKey> zusatzbetriebspunkte) {
this.zusatzbetriebspunkte = zusatzbetriebspunkte;
}
}
In my main file, I am now setting up hazelcast:
Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));
HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
IMap<BetriebspunktKey, Betriebspunkt> map = instance.getMap("points");
I am also preparing my search criteria for later on:
Predicate equalPredicate = Predicates.equal("versionen[any].zusatzbetriebspunkte[any].nummer", 53090);
Predicate sqlPredicate = new SqlPredicate("versionen[any].zusatzbetriebspunkte[any].nummer=53090");
Next, I am creating two objects, one with the "full depth" of information, the other does not contain any "zusatzbetriebspunkte":
final Betriebspunkt abc = new Betriebspunkt(
new BetriebspunktKey(80, 166),
Collections.singletonList(new BetriebspunktVersion(
Collections.singletonList(new BetriebspunktKey(80, 53090))
))
);
final Betriebspunkt def = new Betriebspunkt(
new BetriebspunktKey(83, 141),
Collections.singletonList(new BetriebspunktVersion(
Collections.emptyList()
))
);
Here is, where things become interesting. If I first insert the "full" object into the map, the search using both the EqualPredicate as well as the SqlPredicate works:
map.put(abc.getKey(), abc);
map.put(def.getKey(), def);
Collection<Betriebspunkt> equalResults = map.values(equalPredicate);
Collection<Betriebspunkt> sqlResults = map.values(sqlPredicate);
assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size()); // contains "abc"
However, if I insert the objects into my map in reverse order (i.e. first the "partial" object and then the "full" one), only the EqualPredicate works correctly, the SqlPredicate returns an empty list, no matter what the content of the map or the search criteria.
map.put(abc.getKey(), abc);
map.put(def.getKey(), def);
Collection<Betriebspunkt> equalResults = map.values(equalPredicate);
Collection<Betriebspunkt> sqlResults = map.values(sqlPredicate);
assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size()); // --> this fails, it returns en empty list
What is the reason for this behaviour? It looks like a bug in the hazelcast code.
Upvotes: 2
Views: 1541
Reputation: 815
The reason for failing
After a lot of debugging, I have found the reason for this issue. The reasons can indeed be found in the hazelcast code.
When putting a value into a hazelcast map DefaultRecordStore.putInternal
is called. At the end of this method DefaultRecordStore.saveIndex
is called which finds the corresponding indexes and then calls Indexes.saveEntryIndex
. This method iterates over each index and calls InternalIndex.saveEntryIndex
(or rather its implementation IndexImpl.saveEntryIndex
. The interesting part of that method are the following lines:
if (this.converter == null || this.converter == TypeConverters.NULL_CONVERTER) {
this.converter = entry.getConverter(this.attributeName);
}
Aparently each index stores a converter class when the first element is put into the map. Looking at QueryableEntry.getConverter
explains what happens:
TypeConverter getConverter(String attributeName) {
Object attribute = this.getAttributeValue(attributeName);
if (attribute == null) {
return TypeConverters.NULL_CONVERTER;
} else {
AttributeType attributeType = this.extractAttributeType(attributeName, attribute);
return attributeType == null ? TypeConverters.IDENTITY_CONVERTER : attributeType.getConverter();
}
}
When first inserting the "full" object, extractAttributeType()
will follow the "path" of our index definition "versionen[any].zusatzbetriebspunkte[any].nummer" and find out that nummer
is an integer type, accordingly a TypeConverters.IntegerConverter will be returned and stored.
When first inserting the "partial" object, "zusatzbetriebspunkte[any]" is emtpy, and there is no way for extractAttributeType
to find out what type nummer
hast, it therefore returns null which means that TypeConverters.IdentityConverter is used.
Also, whenever a "full" element is inserted an entry is written into the index map using nummer
as key, i.e. the index-map is of type Map.
So much for writing to the map. Let's now look at how data is read from the map. When calling map.values(predicate)
we will eventually get to QueryRunner.runUsingGlobalIndexSafely
which contains a line:
Collection<QueryableEntry> entries = indexes.query(predicate);
this will in turn after some boilerplate code call
Set<QueryableEntry> result = indexAwarePredicate.filter(queryContext);
For both of our predicates we will eventually get to IndexImpl.getRecords()
which looks as follows:
public Set<QueryableEntry> getRecords(Comparable attributeValue) {
long timestamp = this.stats.makeTimestamp();
if (this.converter == null) {
this.stats.onIndexHit(timestamp, 0L);
return new SingleResultSet((Map)null);
} else {
Set<QueryableEntry> result = this.indexStore.getRecords(this.convert(attributeValue));
this.stats.onIndexHit(timestamp, (long)result.size());
return result;
}
}
The crucial call is this.convert(attributeValue)
where attributeValue
is the value
of the predicate.
If we compare our two predicates, we can see that the EqualPredicate has two members:
attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
value = {Integer} 53090
The SqlPredicate contains the initial string (which we passed to its constructor) but which at constructions was also parsed and mapped to a internal EqualPredicate (which when evaluating the predicate is eventually used and passed to getRecords() above):
sql = "versionen[any].zusatzbetriebspunkte[any].nummer=53090"
predicate = {EqualPredicate}
attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
value = {String} "53090"
And this explains why the manually created EqualPredicate works in both cases: Its value is an integer. When passed to the converter, it does not matter whether it is the IntegerConverter or the IdentityConverter, as both will return the integer which can then be used as key in the index-map (which uses an integer as key).
With the SqlPredicate however, the value is a String. If this is passed to the IntegerConverter, it is converted to its corresponding integer value and accessing the index-map works. If it is passed to the IdentityConverter, the string is returned by the conversion and trying to access the index-map with a string will never find any results.
A possible solution
How can we solve this issue? I see several possibilities:
First we need to implement a value extractor which returns all zusatzbetriebspunkte of our Betriebspunkt in a form suitable for us
public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
@Override
public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
betriebspunkt.getVersionen().stream()
.map(BetriebspunktVersion::getZusatzbetriebspunkte)
.flatMap(List::stream)
.map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
.forEach(valueCollector::addObject);
}
}
You'll notice that I am not only returning the nummer
field but also include the uicLand
field this is something we really wanted but couldn't get working using the "...[any]..." notation. We could of course only return the nummer if we wanted the exact same behavior as outlined above.
Now we need to modify our hazelcast configuration slightly:
Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
//mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));
mapConfig.addMapIndexConfig(new MapIndexConfig("zusatzbetriebspunkt", false));
mapConfig.addMapAttributeConfig(new MapAttributeConfig("zusatzbetriebspunkt", BetriebspunktExtractor.class.getName()));
You'll notice that the "long" index definition using the "...[any]..." notation is no longer needed.
Now we can use this "pseudo attribute" to query our values and it doesn't matter in which order the objects have been added to the map:
Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", "80_53090");
Collection<Betriebspunkt> keyResults = map.values(keyPredicate);
assertEquals(1, keyResults.size()); // always contains "abc"
And in our spring-data-hazelcast repository we can now do this:
@Query("zusatzbetriebspunkt=%d_%d")
List<StammdatenBetriebspunkt> findByZusatzbetriebspunkt(Integer uicLand, Integer nummer);
If you do not need to use spring-data-hazelcast, instead of returning a string to the ValueCollector, you could return the BetriebspunktKey directly and then use it in the predicate as well. That would be the cleanest solution:
public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
@Override
public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
betriebspunkt.getVersionen().stream()
.map(BetriebspunktVersion::getZusatzbetriebspunkte)
.flatMap(List::stream)
//.map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
.forEach(valueCollector::addObject);
}
}
and then
Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", new BetriebspunktKey(80, 53090));
However, for this to work, BetriebspunktKey needs to implement Comparable
and must also provide its own equals
and hashCode
methods.
Upvotes: 1