sagarpavan

Reputation: 95

ElasticSearch Painless script: Not able to access nested objects using script score

I want to search for keywords in a specific field and return the matching documents. For each of those documents, I then want to iterate over its nested objects and search for the keywords again in the same field.

If a keyword matches, I check the boolean isCurrent: if it is true, I append 0 to a list; if it is false, I compute the difference between the current datetime and the end datetime in months and append that value to the list.

Finally, I take the minimum value from each document's list and sort the documents by that minimum.
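To make the intended logic concrete, here is a minimal Python sketch of the per-document calculation described above. The function names, the `KEYWORDS` tuple, and the sample data are hypothetical; the month arithmetic only approximates what `ChronoUnit.MONTHS` does in the Painless script, and it assumes both timestamps are in UTC and that `end` is not later than `now`.

```python
from datetime import datetime, timezone

# Hypothetical keyword list; these are the same terms the regexes in the script look for.
KEYWORDS = ("electric", "vehicle")


def months_between(end, now):
    """Whole months from `end` to `now` (assumes end <= now, same time zone)."""
    months = (now.year - end.year) * 12 + (now.month - end.month)
    # Drop the last month if it is not yet complete.
    if (now.day, now.time()) < (end.day, end.time()):
        months -= 1
    return months


def min_recency(experiences, now):
    """Smallest recency value over matching experiences, or None if none match."""
    values = []
    for exp in experiences:
        description = (exp.get("description") or "").lower()
        if not any(word in description for word in KEYWORDS):
            continue
        if exp.get("isCurrent"):
            values.append(0)
        else:
            end = datetime.fromisoformat(exp["end"])
            values.append(months_between(end, now))
    return min(values) if values else None


now = datetime(2022, 1, 24, 14, 54, 26, tzinfo=timezone.utc)
sample = [
    {"description": "Built electric drivetrains", "isCurrent": False,
     "end": "2021-10-24T00:00:00+00:00"},
    {"description": "Accounting", "isCurrent": True},
]
print(min_recency(sample, now))
```

Note that the match flag is evaluated separately for every experience here, so a match in one nested object does not carry over to the next one.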

I am able to implement this custom logic through script_fields and sort the documents by the minimum value. When I use the same logic in script_score, it does not work. When I debug, I see a problem accessing the nested field through params._source.

Any help will be much appreciated.

Below is the Elasticsearch query using script_fields. The current_milliseconds value is passed in from a Python script.

    query = {
        'query': {
            "nested": {
                "path": "person.experiences",
                "query": {
                    "query_string": {
                        "query": keywords,
                        "fields": ["person.experiences.description"],
                        "default_operator": "AND"
                    }
                }
            }
        },
        "script_fields": {
            "recency": {
                "script": {
                    "lang": "painless",
                    "inline": """
                            def myString = "";
                            def isCurrent = 0;
                            def isFound = false;
                            def index_position = 0;
                            def recency_num = 0;
                            def result = 0;
                            def list = new ArrayList();

                            // for loop starts here
                            for(int i=0; i<params._source.person.experiences.size(); i++){
                            myString = params._source.person.experiences[i].description;

                            // string match starts here
                            if(myString != null && myString != ''){
                            def matcher1 = /electric/.matcher(myString.toLowerCase());
                            def matcher2 = /vehicle/.matcher(myString.toLowerCase());                    
                            //if(wordMatcher.find()){
                            if (matcher1.find() || matcher2.find()){
                            isFound = true;
                            }

                            if (isFound == true){
                            // recency check starts here
                            isCurrent = params._source.person.experiences[i].isCurrent;
                            if(isCurrent == true){
                            isCurrent=0;
                            result+=isCurrent;
                            list.add(isCurrent);
                            } else{
                            ZonedDateTime now = ZonedDateTime.ofInstant(Instant.ofEpochMilli(params['current_datetime']), ZoneId.of('Z'));
                            ZonedDateTime end_date = ZonedDateTime.parse(params._source.person.experiences[i].end);
                            isCurrent = end_date.until(now, ChronoUnit.MONTHS);
                            list.add(isCurrent);
                            result+=isCurrent;
                            recency_num = isCurrent;
                            }
                            }

                            }
                            }
                            def min = list.get(0);
                            for (int i : list){
                            min = min < i ? min : i;
                            }
                            return min;
                            """,
                    "params": {
                        "search_keywords": "Machine Learning",
                        "current_datetime": current_milliseconds
                    }

                }
            }
        }
    }
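For context, a value like current_milliseconds can be produced in Python roughly as follows. This is a minimal sketch: the helper name `epoch_millis` is hypothetical and not taken from the original script.

```python
from datetime import datetime, timezone


def epoch_millis(dt):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(dt.timestamp() * 1000)


# Value to pass as the "current_datetime" script parameter.
current_milliseconds = epoch_millis(datetime.now(timezone.utc))
```

The script then turns this number back into a date with `Instant.ofEpochMilli(params['current_datetime'])`.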

Thanks in advance.

Upvotes: 0

Views: 860

Answers (1)

Joe - Check out my books

Reputation: 16933

A valid script_score query in your case would look something like this:

GET my-index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "function_score": {
            "functions": [
              {
                "script_score": {
                  "script": {
                    "source": """
                      def myString = "";
                      def isCurrent = 0;
                      def isFound = false;
                      def index_position = 0;
                      def recency_num = 0;
                      def result = 0;
                      def list = new ArrayList();
          
                      def experiences = params._source.person.experiences;
          
                      // for loop starts here
                      for (int i=0; i<experiences.size(); i++){
                        def experience = experiences[i];
                        
                        myString = experience.description;
            
                        // string match starts here
                        if(myString != null && myString != '') {
                          def matcher1 = /electric/.matcher(myString.toLowerCase());
                          def matcher2 = /vehicle/.matcher(myString.toLowerCase());                    
                          
                          // Reset per experience so an earlier match does not carry over.
                          isFound = matcher1.find() || matcher2.find();
            
                          if (isFound == true){
                            // recency check starts here
                            isCurrent = experience.isCurrent;
                        
                            if (isCurrent == true){
                              isCurrent=0;
                              result += isCurrent;
                              list.add(isCurrent);
                            } else {
                              def now = ZonedDateTime.ofInstant(Instant.ofEpochMilli(params['current_datetime']), ZoneId.of('Z'));
                              def end_date = ZonedDateTime.parse(experience.end);
                              isCurrent = end_date.until(now, ChronoUnit.MONTHS);
                              list.add(isCurrent);
                              result += isCurrent;
                              recency_num = isCurrent;
                            }
                          }
                        }
                      }
                      
                      if (list.isEmpty()) {
                        return 0;
                      }
                      
                      def min = list.get(0);
                      
                      for (int i : list){
                        min = min < i ? min : i;
                      }
                      
                      return min;
                    """,
                    "params": {
                        "search_keywords": "Machine Learning",
                        "current_datetime": 1643036066000
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

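Note that running many regexes, iterating over nested objects, and parsing dates inside a script can drastically increase the query's execution time, because the script runs once for every document it scores.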
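One way to limit that cost is to run the script only on documents that already match the nested keyword query from the question. The sketch below builds such a request body in Python; `build_scored_query` and its arguments are hypothetical names, and the structure is an assumption about how the two pieces could be combined rather than a tested query.

```python
def build_scored_query(keywords, script_source, current_millis):
    """Wrap the nested keyword query in function_score so the script only scores matches."""
    nested_query = {
        "nested": {
            "path": "person.experiences",
            "query": {
                "query_string": {
                    "query": keywords,
                    "fields": ["person.experiences.description"],
                    "default_operator": "AND",
                }
            },
        }
    }
    return {
        "query": {
            "function_score": {
                "query": nested_query,
                "functions": [
                    {
                        "script_score": {
                            "script": {
                                "source": script_source,
                                "params": {"current_datetime": current_millis},
                            }
                        }
                    }
                ],
                # Use the script result as the score instead of multiplying it
                # with the text-relevance score.
                "boost_mode": "replace",
            }
        }
    }


body = build_scored_query("electric vehicle", "return 0;", 1643036066000)
```

Keep in mind that Elasticsearch ranks higher scores first, so a smaller month count ends up lower in the results unless the script inverts the value or the request sorts explicitly.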

Upvotes: 0
