Reputation: 4659
I have indexed the metadata of three files, and all of them have the MIME type "text/plain".
But when I search for other MIME types, they also match the "text/plain" documents.
Here is the list of MIME types that match "text/plain", with hits and score:
***********************************
1. Mime-Type text/vnd.motorola.reflex
2. Total Hits 3
3. Max Score 0.07154637
***********************************
1. Mime-Type text/vnd.ms-mediapackage
2. Total Hits 3
3. Max Score 0.034633614
***********************************
1. Mime-Type text/vnd.net2phone.commcenter.command
2. Total Hits 3
3. Max Score 0.07154637
***********************************
1. Mime-Type text/plain
2. Total Hits 3
3. Max Score 0.629606
***********************************
I want the MIME type to match exactly, so that only the last one matches. Notice that its max score is greater than all of the others above.
Search Code:
query = "text/plain"; filter = "mimeType"
public long getHitsCount(String query, String filter, Project project) {
    try {
        // Alternatives that were also tried:
        /* TermQueryBuilder termQuery = new TermQueryBuilder(filter, smartEscapeQuery(query)); */
        /* QueryStringQueryBuilder queryStringQuery = new QueryStringQueryBuilder(smartEscapeQuery(query)).field(filter); */
        MatchQueryBuilder matchQuery = QueryBuilders.matchQuery(filter, smartEscapeQuery(query));
        QueryBuilder qb = QueryBuilders
                .boolQuery()
                .must(matchQuery);
        SearchRequestBuilder requestBuilder = client.prepareSearch()
                .setIndices(getDomainIndexId(project))
                .setTypes(getProjectTypeId(project))
                .setSearchType(SEARCH_TYPE)
                .setQuery(qb);
        SearchResponse response = requestBuilder.execute().actionGet(ES_TIMEOUT_MS);
        // The total hit count is never negative, so it can be returned directly.
        return response.getHits().getTotalHits();
    } catch (IndexMissingException ex) {
        // A missing index is treated as "no hits".
        return 0L;
    }
}
/**
 * Escape the string from bad chars for the search
 *
 * @param str the String that should be escaped
 * @return an escaped String
 */
@SuppressWarnings({"ConstantConditions"})
private static String smartEscapeQuery(String str) {
    if (StringUtils.isBlank(str)) {
        return "";
    }
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')' || c == ':'
                || c == '^' || c == '[' || c == ']' || c == '\"'
                || c == '{' || c == '}' || c == '~' || c == '/'
                || c == '?' || c == '|' || c == '&' || c == ';'
                || (!Character.isSpaceChar(c) && Character.isWhitespace(c))) {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}
Match Query:
{
  "bool" : {
    "must" : {
      "match" : {
        "mimeType" : {
          "query" : "text\\/plain",
          "type" : "boolean"
        }
      }
    }
  }
}
Result: 3 Hits
Term Query:
{
  "bool" : {
    "must" : {
      "term" : {
        "mimeType" : "text\\/plain"
      }
    }
  }
}
Result: 0 Hits
I have tried both TermQuery and MatchQuery, but neither worked. I am using AutoDetectParser while indexing.
How can I match the exact value in Elasticsearch so that, in the example above, only "text/plain" matches and not the partially matching types?
Upvotes: 0
Views: 1504
Reputation: 1479
In your first example you use a match query, so your query is analyzed before the search and becomes (text OR plain). Which analyzer did you use when indexing? Would it help to map this field as "not_analyzed"? In your second example you use a term query, which also requires a "not_analyzed" field.
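To make the difference concrete, here is a minimal plain-Java sketch of the three lookups. It does not call Elasticsearch at all: the analyze method is a hypothetical stand-in that only lowercases and splits on slashes, backslashes and whitespace, which is a simplification of what a real analyzer does, and the class and method names are invented for illustration. It shows why an analyzed match query for another "text/..." type still hits a document indexed as "text/plain" (they share the token "text"), and why a term lookup only behaves as an exact match when the whole value is kept as a single term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MimeTypeMatchSketch {

    // Hypothetical stand-in for an analyzer: lowercases, then splits on
    // '/', '\' and whitespace. This is NOT the real Elasticsearch analyzer.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[/\\\\\\s]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Match-style lookup on an analyzed field: the query is analyzed too,
    // and the document matches if it shares ANY token with the query (OR).
    static boolean matchAnalyzed(String indexedValue, String query) {
        List<String> indexed = analyze(indexedValue);
        for (String token : analyze(query)) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Term-style lookup on an analyzed field: the query is NOT analyzed,
    // so the raw string must equal one of the indexed tokens.
    static boolean termOnAnalyzed(String indexedValue, String query) {
        return analyze(indexedValue).contains(query);
    }

    // Term-style lookup on a not_analyzed field: the whole value is one term.
    static boolean termOnNotAnalyzed(String indexedValue, String query) {
        return indexedValue.equals(query);
    }

    public static void main(String[] args) {
        String indexed = "text/plain";
        System.out.println(matchAnalyzed(indexed, "text/vnd.motorola.reflex"));
        System.out.println(termOnAnalyzed(indexed, "text\\/plain"));
        System.out.println(termOnNotAnalyzed(indexed, "text/plain"));
        System.out.println(termOnNotAnalyzed(indexed, "text\\/plain"));
    }
}
```

In this sketch the escaped value "text\/plain" never matches in a term-style lookup, because the backslash is compared literally; that suggests passing the raw "text/plain" to the term query rather than the output of smartEscapeQuery. On the Elasticsearch 1.x line that the code above appears to target, a not_analyzed field is typically declared in the mapping as a string with "index": "not_analyzed", and the documents have to be reindexed after the mapping change; the exact mapping syntax depends on the version in use.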
Upvotes: 1