Hazelcast not working correctly with SqlPredicate and Index on optional field

Question

We are storing complex objects in Hazelcast maps and need the possibility to search for objects not only based on the key but also on the content of these complex objects. In order to not take too large a performance hit, we are using indices on those search terms.

We are also using spring-data-hazelcast which provides repositories that allow us to use findByAbcXyz() type semantic queries. For some of the more complex queries we are using the @Query annotation (which spring-data-hazelcast internally translates to SqlPredicates).

We have now encountered an issue where under certain situations these @Query based search methods did not return any values, even if we could verify that the searched objects did in fact exist in the map.

I have managed to reproduce this issue with core hazelcast (i.e. without the use of spring-data-hazelcast).

Here is our object structure:

BetriebspunktKey.java

public class BetriebspunktKey implements Serializable {
  private Integer uicLand;
  private Integer nummer;

  public BetriebspunktKey(final Integer uicLand, final Integer nummer) {
    this.uicLand = uicLand;
    this.nummer = nummer;
  }

  public Integer getUicLand() {
    return uicLand;
  }

  public Integer getNummer() {
    return nummer;
  }
}

Betriebspunkt.java

public class Betriebspunkt implements Serializable {
  private BetriebspunktKey key;
  private List<BetriebspunktVersion> versionen;

  public Betriebspunkt(final BetriebspunktKey key, final List<BetriebspunktVersion> versionen) {
    this.key = key;
    this.versionen = versionen;
  }

  public BetriebspunktKey getKey() {
    return key;
  }

  public List<BetriebspunktVersion> getVersionen() {
    return versionen;
  }
}

BetriebspunktVersion.java

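public class BetriebspunktVersion implements Serializable {
  private List<BetriebspunktKey> zusatzbetriebspunkte;

  public BetriebspunktVersion(final List<BetriebspunktKey> zusatzbetriebspunkte) {
    this.zusatzbetriebspunkte = zusatzbetriebspunkte;
  }

  public List<BetriebspunktKey> getZusatzbetriebspunkte() {
    return zusatzbetriebspunkte;
  }
}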

In my main file, I am now setting up hazelcast:

Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));

HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);

IMap<BetriebspunktKey, Betriebspunkt> map = instance.getMap("points");

I am also preparing my search criteria for later on:

Predicate equalPredicate = Predicates.equal("versionen[any].zusatzbetriebspunkte[any].nummer", 53090);
Predicate sqlPredicate = new SqlPredicate("versionen[any].zusatzbetriebspunkte[any].nummer=53090");

Next, I am creating two objects, one with the "full depth" of information, the other does not contain any "zusatzbetriebspunkte":

final Betriebspunkt abc = new Betriebspunkt(
        new BetriebspunktKey(80, 166),
        Collections.singletonList(new BetriebspunktVersion(
            Collections.singletonList(new BetriebspunktKey(80, 53090))
        ))
    );

    final Betriebspunkt def = new Betriebspunkt(
        new BetriebspunktKey(83, 141),
        Collections.singletonList(new BetriebspunktVersion(
            Collections.emptyList()
        ))
    );

Here is where things become interesting. If I first insert the "full" object into the map, the search works with both the EqualPredicate and the SqlPredicate:

map.put(abc.getKey(), abc);
map.put(def.getKey(), def);

Collection equalResults = map.values(equalPredicate);
Collection sqlResults = map.values(sqlPredicate);

assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size());   // contains "abc"

However, if I insert the objects into my map in reverse order (i.e. first the "partial" object and then the "full" one), only the EqualPredicate works correctly. The SqlPredicate returns an empty list, no matter what the content of the map or the search criteria.

map.put(def.getKey(), def); // "partial" object first
map.put(abc.getKey(), abc); // "full" object second

Collection equalResults = map.values(equalPredicate);
Collection sqlResults = map.values(sqlPredicate);

assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size());   // --> this fails, it returns an empty list

What is the reason for this behaviour? It looks like a bug in the hazelcast code.

Urs Beeli · Accepted Answer

The reason for failing

After a lot of debugging, I have found the reason for this issue. The reasons can indeed be found in the hazelcast code.

When putting a value into a hazelcast map, DefaultRecordStore.putInternal is called. At the end of this method, DefaultRecordStore.saveIndex is called, which finds the corresponding indexes and then calls Indexes.saveEntryIndex. This method iterates over each index and calls InternalIndex.saveEntryIndex (or rather its implementation IndexImpl.saveEntryIndex). The interesting part of that method is the following lines:

if (this.converter == null || this.converter == TypeConverters.NULL_CONVERTER) {
      this.converter = entry.getConverter(this.attributeName);
}

Apparently each index stores a converter when the first element is put into the map. Looking at QueryableEntry.getConverter explains what happens:

  TypeConverter getConverter(String attributeName) {
    Object attribute = this.getAttributeValue(attributeName);
    if (attribute == null) {
      return TypeConverters.NULL_CONVERTER;
    } else {
      AttributeType attributeType = this.extractAttributeType(attributeName, attribute);
      return attributeType == null ? TypeConverters.IDENTITY_CONVERTER : attributeType.getConverter();
    }
  }

When first inserting the "full" object, extractAttributeType() will follow the "path" of our index definition "versionen[any].zusatzbetriebspunkte[any].nummer" and find out that nummer is an integer type. Accordingly, a TypeConverters.IntegerConverter will be returned and stored.

When first inserting the "partial" object, "zusatzbetriebspunkte[any]" is empty, and there is no way for extractAttributeType to find out what type nummer has. It therefore returns null, which means that TypeConverters.IdentityConverter is used.

Also, whenever a "full" element is inserted, an entry is written into the index map using nummer as the key, i.e. the index map is keyed by Integer values.
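
The "first entry decides" behaviour can be modelled without Hazelcast. The following is a simplified sketch, not Hazelcast's code: ConverterLatchSketch, its Kind enum and the detect/save methods are hypothetical names that only imitate the guard in saveEntryIndex and the fallback in getConverter quoted above.

```java
import java.util.List;

/** Simplified model (not Hazelcast code) of an index that keeps the converter chosen for its first entry. */
class ConverterLatchSketch {
    enum Kind { NULL, IDENTITY, INTEGER }

    private Kind converter; // stays null until the first entry is saved

    /** Imitates the shape of getConverter for a multi-valued attribute. */
    static Kind detect(List<?> extractedValues) {
        if (extractedValues == null) {
            return Kind.NULL;        // attribute missing entirely
        }
        for (Object value : extractedValues) {
            if (value instanceof Integer) {
                return Kind.INTEGER; // the type could be derived from a real value
            }
        }
        return Kind.IDENTITY;        // no Integer found (e.g. empty list): fall back to identity
    }

    /** Imitates the guard in saveEntryIndex: only a missing or NULL converter is replaced. */
    void save(List<?> extractedValues) {
        if (converter == null || converter == Kind.NULL) {
            converter = detect(extractedValues);
        }
    }

    Kind converter() {
        return converter;
    }
}
```

In this model, saving an empty list first leaves the index on IDENTITY even after a list containing 53090 is saved, while saving the populated list first yields INTEGER.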

So much for writing to the map. Let's now look at how data is read from the map. When calling map.values(predicate) we will eventually get to QueryRunner.runUsingGlobalIndexSafely which contains a line:

Collection entries = indexes.query(predicate);

This will in turn, after some boilerplate code, call

Set result = indexAwarePredicate.filter(queryContext);

For both of our predicates we will eventually get to IndexImpl.getRecords() which looks as follows:

  public Set getRecords(Comparable attributeValue) {
    long timestamp = this.stats.makeTimestamp();
    if (this.converter == null) {
      this.stats.onIndexHit(timestamp, 0L);
      return new SingleResultSet((Map)null);
    } else {
      Set result = this.indexStore.getRecords(this.convert(attributeValue));
      this.stats.onIndexHit(timestamp, (long)result.size());
      return result;
    }
  }

The crucial call is this.convert(attributeValue) where attributeValue is the value of the predicate.

If we compare our two predicates, we can see that the EqualPredicate has two members:

attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
value = {Integer} 53090

The SqlPredicate contains the initial string (which we passed to its constructor), but at construction that string was also parsed and mapped to an internal EqualPredicate (which is eventually used when evaluating the predicate and passed to getRecords() above):

sql = "versionen[any].zusatzbetriebspunkte[any].nummer=53090"
predicate = {EqualPredicate}
  attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
  value = {String} "53090"

And this explains why the manually created EqualPredicate works in both cases: Its value is an integer. When passed to the converter, it does not matter whether it is the IntegerConverter or the IdentityConverter, as both will return the integer which can then be used as key in the index-map (which uses an integer as key).

With the SqlPredicate however, the value is a String. If this is passed to the IntegerConverter, it is converted to its corresponding integer value and accessing the index-map works. If it is passed to the IdentityConverter, the string is returned by the conversion and trying to access the index-map with a string will never find any results.
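
The effect of the two converters on the lookup key can be reproduced with a plain HashMap. This is a minimal sketch under stated assumptions, not Hazelcast's implementation: IndexLookupSketch and its toInteger/identity helpers are hypothetical stand-ins for the index store, the IntegerConverter and the IdentityConverter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Simplified model (not Hazelcast code) of an index lookup that converts the predicate value first. */
class IndexLookupSketch {
    // Stands in for the index store: attribute value -> name of the stored object.
    private final Map<Object, String> store = new HashMap<>();
    private final Function<Object, Object> converter;

    IndexLookupSketch(Function<Object, Object> converter) {
        this.converter = converter;
        store.put(53090, "abc"); // entries are written with the Integer attribute value
    }

    /** Like getRecords above: convert the predicate value, then use it as the lookup key. */
    String lookup(Object predicateValue) {
        return store.get(converter.apply(predicateValue));
    }

    /** Behaves like an integer converter: parses strings, passes other values through. */
    static Object toInteger(Object value) {
        return value instanceof String ? Integer.valueOf((String) value) : value;
    }

    /** Behaves like an identity converter: returns the value unchanged. */
    static Object identity(Object value) {
        return value;
    }
}
```

With new IndexLookupSketch(IndexLookupSketch::toInteger), both lookup(53090) and lookup("53090") find "abc". With IndexLookupSketch::identity, lookup(53090) still finds "abc" but lookup("53090") returns null, because a String key never equals an Integer key.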

A possible solution

How can we solve this issue? I see several possibilities:

1. Insert a "fully built" dummy value into our map during startup to ensure the converter is correctly initialised. While this works, it is ugly and not maintenance friendly.
2. Avoid using SqlPredicate and use the integer based EqualPredicate. This is not an option when working with spring-data-hazelcast as it always converts @Query based searches to SqlPredicates. We could of course use hazelcast directly and circumvent the spring-data wrapper, but while that would work it means having two ways of accessing hazelcast, which is also not very maintainable.
3. Use hazelcast's ValueExtractor class. This is the elegant solution that works both natively and using spring-data-hazelcast. I will outline what that looks like:

First we need to implement a value extractor which returns all zusatzbetriebspunkte of our Betriebspunkt in a form suitable for us:

public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
  @Override
  public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
    betriebspunkt.getVersionen().stream()
                 .map(BetriebspunktVersion::getZusatzbetriebspunkte)
                 .flatMap(List::stream)
                 .map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
                 .forEach(valueCollector::addObject);
  }
}

You'll notice that I am not only returning the nummer field but also including the uicLand field. This is something we really wanted but couldn't get working using the "...[any]..." notation. We could of course return only the nummer if we wanted the exact same behavior as outlined above.
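
To see which values such an extractor would hand to the index, the flattening step can be tried on its own. The sketch below is stand-alone and uses hypothetical simplified types (CompositeKeySketch, Key, Version) in place of the real classes. It collects the strings into a list instead of passing them to a ValueCollector.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Stand-alone sketch of the flattening done in the extractor; the nested types are simplified stand-ins. */
class CompositeKeySketch {
    static final class Key {
        final int uicLand;
        final int nummer;

        Key(int uicLand, int nummer) {
            this.uicLand = uicLand;
            this.nummer = nummer;
        }
    }

    static final class Version {
        final List<Key> zusatzbetriebspunkte;

        Version(List<Key> zusatzbetriebspunkte) {
            this.zusatzbetriebspunkte = zusatzbetriebspunkte;
        }
    }

    /** Returns one "uicLand_nummer" string per Zusatzbetriebspunkt across all versions. */
    static List<String> compositeKeys(List<Version> versionen) {
        return versionen.stream()
                .flatMap(version -> version.zusatzbetriebspunkte.stream())
                .map(key -> key.uicLand + "_" + key.nummer)
                .collect(Collectors.toList());
    }
}
```

An object whose only version has an empty list simply contributes no values. Every value that does reach the index is a String, so the type no longer depends on which object is inserted first.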

Now we need to modify our hazelcast configuration slightly:

Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
//mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));
mapConfig.addMapIndexConfig(new MapIndexConfig("zusatzbetriebspunkt", false));
mapConfig.addMapAttributeConfig(new MapAttributeConfig("zusatzbetriebspunkt", BetriebspunktExtractor.class.getName()));

You'll notice that the "long" index definition using the "...[any]..." notation is no longer needed.

Now we can use this "pseudo attribute" to query our values and it doesn't matter in which order the objects have been added to the map:

Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", "80_53090");
Collection keyResults = map.values(keyPredicate);
assertEquals(1, keyResults.size()); // always contains "abc"

And in our spring-data-hazelcast repository we can now do this:

@Query("zusatzbetriebspunkt=%d_%d")
List<Betriebspunkt> findByZusatzbetriebspunkt(Integer uicLand, Integer nummer);
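
For illustration, this is the predicate text such a template would produce, assuming the %d placeholders are filled in the way String.format fills them. That substitution mechanism is an assumption about spring-data-hazelcast here, and QueryTemplateSketch is a hypothetical helper, not part of the library.

```java
import java.util.Locale;

/** Illustrates the predicate text, assuming String.format-style placeholder substitution. */
class QueryTemplateSketch {
    static String render(String template, Object... args) {
        // Locale.ROOT keeps the digits plain ASCII regardless of the default locale.
        return String.format(Locale.ROOT, template, args);
    }
}
```

Under that assumption, render("zusatzbetriebspunkt=%d_%d", 80, 53090) yields zusatzbetriebspunkt=80_53090, which matches the "uicLand_nummer" strings produced by the extractor.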

If you do not need to use spring-data-hazelcast, instead of returning a string to the ValueCollector, you could return the BetriebspunktKey directly and then use it in the predicate as well. That would be the cleanest solution:

public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
  @Override
  public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
    betriebspunkt.getVersionen().stream()
                 .map(BetriebspunktVersion::getZusatzbetriebspunkte)
                 .flatMap(List::stream)
                 //.map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
                 .forEach(valueCollector::addObject);
  }
}

and then

Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", new BetriebspunktKey(80, 53090));

However, for this to work, BetriebspunktKey needs to implement Comparable and must also provide its own equals and hashCode methods.
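
A minimal sketch of what such a key class could look like. The ordering (uicLand first, then nummer) and the rejection of null fields are assumptions made for this example. The class is package-private here only to keep the sketch in a single file.

```java
import java.io.Serializable;
import java.util.Objects;

/** Sketch of a key usable as an indexed value: value-based equality plus a total order. */
class BetriebspunktKey implements Serializable, Comparable<BetriebspunktKey> {
    private final Integer uicLand;
    private final Integer nummer;

    BetriebspunktKey(final Integer uicLand, final Integer nummer) {
        // Assumption: both parts are mandatory, so compareTo never sees null.
        this.uicLand = Objects.requireNonNull(uicLand, "uicLand");
        this.nummer = Objects.requireNonNull(nummer, "nummer");
    }

    public Integer getUicLand() {
        return uicLand;
    }

    public Integer getNummer() {
        return nummer;
    }

    /** Orders by uicLand first, then by nummer (an assumed ordering). */
    @Override
    public int compareTo(final BetriebspunktKey other) {
        final int byLand = uicLand.compareTo(other.uicLand);
        return byLand != 0 ? byLand : nummer.compareTo(other.nummer);
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof BetriebspunktKey)) {
            return false;
        }
        final BetriebspunktKey other = (BetriebspunktKey) obj;
        return uicLand.equals(other.uicLand) && nummer.equals(other.nummer);
    }

    @Override
    public int hashCode() {
        return Objects.hash(uicLand, nummer);
    }
}
```

With this in place, two keys built from the same uicLand and nummer are equal, share a hash code and compare as 0, which is what an equality lookup against the index relies on.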
