curious learner

Reputation: 41

Indexing problem with Spring Data Elastic migration from 3.x to 4.x

In our monolith application, generated with JHipster 6.10.5, we were using Spring Data Elasticsearch 3.3.1 with Elasticsearch 6.8.8. We have multiple @ManyToOne and @OneToMany relationships across more than 100 entities. In some cases up to 7 entities reference each other (interlinked, not just one pointing to another). For Elasticsearch indexing we have been using:

  1. To exclude fields from indexing: @JsonIgnoreProperties(value = { "unwanted fields" }, allowSetters = true), and @JsonIgnore where a field is not needed.
  2. To map the relations: @JsonBackReference on each ManyToOne, with a corresponding @JsonManagedReference on the respective OneToMany.

We are now migrating to JHipster 7.0.1 and have started seeing the problems below:

  1. New Spring-Data-Elastic Version: 4.1.6 with Elastic Search Version: 7.9.3
  2. With the new Spring Data Elasticsearch the Jackson-based mapper is no longer available, and we are seeing multiple StackOverflowErrors. These are the annotation changes we made during the migration:
    1. On the relationships we added @Field(type = FieldType.Nested, ignoreMalformed = true, ignoreFields = {"unwanted fields"}). This stopped the StackOverflowErrors at the Spring Data level, but they are still thrown internally at the Elasticsearch REST client level. So we are forced to use @Transient to exclude all the OneToMany relations.
    2. Even on ManyToOne relations with the above @Field annotation present, we get an ElasticsearchException with "Limit of total fields [1000] in index [] has been exceeded".
    3. I have tried to follow the Spring Data documentation, but could not resolve it.
    4. We have also kept the JSON (Jackson) annotations that were generated by JHipster, but they have no effect.

We are stalled at the moment because we are not sure how to resolve these issues. Personally, I found the Jackson annotations very convenient and well documented. We are new to both Elasticsearch and Spring Data Elasticsearch, having used them for only the past 8 months, and we cannot figure out how to fix these errors. Please ask if I missed any information you need. I will share as much as I can without violating our organization's policies.

Sample code Repository as requested on gitter: https://gitlab.com/thelearner214/spring-data-es-sample

Thank you in advance

Upvotes: 2

Views: 1510

Answers (3)

Another Good Guy

Reputation: 90

@Lina Basuni You can use java.util.Collections.emptyList()

Upvotes: 0

curious learner

Reputation: 41

Here is an approach we are using as a stopgap until we rewrite or find a better solution. We can't use separate classes for ES, as @P.J.Meisch advised, because we have a large number of entities to maintain and a "microservice migration program" is already in progress.

Posting here as it might be useful for someone else with a similar issue.

We created a utility that serializes and deserializes the entity so that the Jackson annotations on the class (for example @JsonIgnoreProperties and @JsonIgnore) still take effect. This way we can reduce the use of the @Transient annotation and still get the ID(s) of the related object(s).

package com.sample.shop.service.util;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.datatype.hibernate5.Hibernate5Module;

import org.jetbrains.annotations.NotNull;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.List;
import java.util.Optional;

public class ESUtils {

private static final Logger log = LoggerFactory.getLogger(ESUtils.class);

public static <T> Optional<T> mapForES(Class<T> type, T input) {
    ObjectMapper mapper = getObjectMapper();
    try {
        return Optional.ofNullable(mapper.readValue(mapper.writeValueAsString(input), type));
    } catch (JsonProcessingException e) {
        log.error("Parsing exception {}", e.getMessage(), e);
        return Optional.empty();
    } 
}

public static <T> List<T> mapListForES(Class<T> type, List<T> input) {
    ObjectMapper mapper = getObjectMapper();
    try {
        JavaType javaType = mapper.getTypeFactory().constructCollectionType(List.class, type);
        String serialText = mapper.writeValueAsString(input);
        return mapper.readValue(serialText, javaType);
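    } catch (JsonProcessingException e) {
        log.error("Parsing exception {}", e.getMessage(), e);
        // Fall back to an empty list so the method returns a value on every path.
        return java.util.Collections.emptyList();
    }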
}

@NotNull
private static ObjectMapper getObjectMapper() {
    ObjectMapper mapper = new ObjectMapper();
    mapper.configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false);
    mapper.configure(SerializationFeature.WRITE_SELF_REFERENCES_AS_NULL, true);
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
    Hibernate5Module module = new Hibernate5Module();
    module.disable(Hibernate5Module.Feature.FORCE_LAZY_LOADING);
    module.enable(Hibernate5Module.Feature.SERIALIZE_IDENTIFIER_FOR_LAZY_NOT_LOADED_OBJECTS);
    module.enable(Hibernate5Module.Feature.USE_TRANSIENT_ANNOTATION);
    module.enable(Hibernate5Module.Feature.REPLACE_PERSISTENT_COLLECTIONS);
    // Register the module; without this call the Hibernate settings above have no effect.
    mapper.registerModule(module);
    return mapper;
}
}

Then, to save a single entry, we adjusted the save logic to go through the above utility:

// Instead of the JHipster-generated categorySearchRepository.save(result), go through ESUtils.
// Nothing is saved if the mapping failed and the Optional is empty.
ESUtils.mapForES(Category.class, category).ifPresent(categorySearchRepository::save);

And to save a list for a bulk reindex, we use the second utility method:

Page<Category> categoryPage = jpaRepository.findAll(page);
List<Category> categoryList = ESUtils.mapListForES(Category.class, categoryPage.getContent());
elasticsearchRepository.saveAll(categoryList);

This might not be the best solution, but it got the work done for our migration.

Upvotes: 2

Abacus

Reputation: 19471

Had a look at the repository you linked on gitter (you might consider adding a link here).

First: the @Field annotation is used to write the index mapping and the ignoreFields property is needed to break circular references when the mapping is built. It is not used when the entity is written to Elasticsearch.

Here is what happens, for example, with the Address and Customer entities when writing to Elasticsearch: the Customer document has Addresses, so these addresses are converted to subdocuments embedded in the Customer document. But the Address has a Customer, so when the address is written, the Customer is embedded into this Address element, which is already a subdocument of that Customer, and so on without end.

I suppose the Customers should not be stored in the Address, and the other way round. So you need to mark these embedded documents as @org.springframework.data.annotation.Transient. You don't need the @Field annotation on them, as you do not want to store them as properties in the index.
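To make the cycle concrete, here is a minimal plain-Java sketch. It does not use Spring Data Elasticsearch at all: the `convert` method is a toy stand-in for an entity-to-document converter, and the `@Skip` annotation is a hypothetical marker playing the role of `@org.springframework.data.annotation.Transient`. All class and method names are assumptions made for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CycleSketch {

    // Hypothetical marker, standing in for Spring Data's @Transient.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Skip {}

    static class Customer {
        String name;
        List<Address> addresses = new ArrayList<>();
    }

    static class Address {
        String city;
        @Skip Customer customer; // back-reference that closes the cycle
    }

    // Toy converter: turns an object graph into nested maps and lists.
    // With honorSkip == false it ignores the marker and follows every reference.
    static Object convert(Object value, boolean honorSkip) {
        if (value == null || value instanceof String || value instanceof Number) {
            return value;
        }
        if (value instanceof List<?>) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(convert(item, honorSkip));
            }
            return out;
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field field : value.getClass().getDeclaredFields()) {
            if (honorSkip && field.isAnnotationPresent(Skip.class)) {
                continue; // marked fields are not written to the document
            }
            field.setAccessible(true);
            try {
                doc.put(field.getName(), convert(field.get(value), honorSkip));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        customer.name = "Ada";
        Address address = new Address();
        address.city = "Berlin";
        address.customer = customer;
        customer.addresses.add(address);

        // Marker honored: the back-reference is skipped and conversion ends.
        System.out.println(convert(customer, true));

        // Marker ignored: Customer -> Address -> Customer -> ... never ends.
        try {
            convert(customer, false);
        } catch (StackOverflowError e) {
            System.out.println("cycle detected: StackOverflowError");
        }
    }
}
```

The point of the sketch is only that the exclusion has to happen in the step that writes the object, which is why a mapping-time setting such as ignoreFields does not help here.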

Jackson annotations are not used by Spring Data Elasticsearch anymore.

The basic problem with the approach used here is that modelling from the relational world (linking and joining different tables with (one|many)-to-(one|many) relationships, manifested in a Java object graph by an ORM mapper) is applied to a document-based data store that does not use these concepts.

It used to work in your previous version because the older version of Spring Data Elasticsearch also used Jackson, so these fields were skipped on writing. Now you have to add the @Transient annotation, which is a Spring Data annotation.

But I don't know how @Transient might interfere with Spring Data JPA, which is another point showing that it's not a good idea to use the same Java class for different stores.
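As a rough illustration of using separate classes per store, the sketch below copies a JPA-style Customer into a search-side document class that has no back-references. It is plain Java with all persistence annotations omitted; the names `CustomerDocument`, `AddressDocument` and `toDocument` are hypothetical and not taken from the linked repository.

```java
import java.util.ArrayList;
import java.util.List;

final class DocumentMappingSketch {

    // JPA-style entities with a bidirectional relation (annotations omitted).
    static class Customer {
        Long id;
        String name;
        List<Address> addresses = new ArrayList<>();
    }

    static class Address {
        Long id;
        String city;
        Customer customer; // fine for JPA, a cycle for a document store
    }

    // Hypothetical search-side classes: the graph is a tree, not a cycle.
    static class AddressDocument {
        final Long id;
        final String city;

        AddressDocument(Long id, String city) {
            this.id = id;
            this.city = city;
        }
    }

    static class CustomerDocument {
        final Long id;
        final String name;
        final List<AddressDocument> addresses;

        CustomerDocument(Long id, String name, List<AddressDocument> addresses) {
            this.id = id;
            this.name = name;
            this.addresses = addresses;
        }
    }

    // Copies only what should be searchable; the Address -> Customer link is dropped.
    static CustomerDocument toDocument(Customer customer) {
        List<AddressDocument> addresses = new ArrayList<>();
        for (Address address : customer.addresses) {
            addresses.add(new AddressDocument(address.id, address.city));
        }
        return new CustomerDocument(customer.id, customer.name, addresses);
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        customer.id = 1L;
        customer.name = "Ada";
        Address address = new Address();
        address.id = 10L;
        address.city = "Berlin";
        address.customer = customer;
        customer.addresses.add(address);

        CustomerDocument doc = toDocument(customer);
        System.out.println(doc.name + " has " + doc.addresses.size() + " address(es)");
    }
}
```

With this split, the JPA entity keeps its relational annotations and the document class would carry only the Elasticsearch ones, so neither store's annotations can interfere with the other. The cost is one extra class and one mapping method per indexed entity.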

Upvotes: 4
