Rose

Reputation: 1498

I want to run a wildcard query against a url field in Elasticsearch. I am using Elasticsearch 2.3.0.

My index looks like this:

GET pibtest1/_search

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 11,
    "max_score": 1,
    "hits": [
      {
        "_index": "pibtest1",
        "_type": "SearchTech",
        "_id": "_update",
        "_score": 1,
        "_source": {
          "script": "ctx._source.remove(\"wiki_collection\")"
        }
      },
      {
        "_index": "pibtest1",
        "_type": "SearchTech",
        "_id": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",
        "_score": 1,
        "_source": {
          "extension": {
            "X-Parsed-By": "org.apache.tika.parser.DefaultParser",
            "Content-Encoding": "ISO-8859-1",
            "resourceName": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1"
          },
          "keywords": "keywords-NOT-PROVIDED",
          "default_collection": true,
          "wiki_collection": false,
          "description": "description-NOT-PROVIDED",
          "connectorSpecific": {
            "discoveredBy": "http://www.searchtechnologies.com/",
            "xslt": "false",
            "pathFromSeed": "E",
            "md5": "OKTGVLEWTE5V4PWXUBM2RK3KMQ"
          },
          "title": "Title-NOT-PROVIDED",
          "url": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",
          "remove": "wiki_collection",
          "UD": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",

Now I want to use a wildcard query to find the URLs that match a pattern (e.g. http://www.searchtechnologies.com/bundles).

This is my wildcard query:

GET pibtest1/_search

    {
      "query": {
        "wildcard": {
          "url": {
            "value": "http://www.searchtechnologies.com/bundles*"
          }
        }
      }
    }

I am using the "*" wildcard, which matches any character sequence, but I am not getting any results. My output looks like this:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I want my results to include the URLs that match the "http://www.searchtechnologies.com/bundles" pattern. Any help would be appreciated.

Upvotes: 0

Views: 2359

Answers (1)

alpert

Reputation: 4655

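Based on the comments, your url field is an analyzed field. When you index a document, the URL is split into tokens such as ["www.searchtechnologies.com", "v", "jquery", "gOdOgfykTFJnypePAvGweyMPwl", ...]. A wildcard query is not analyzed; it is compared against each indexed token individually, and no single token starts with "http://www.searchtechnologies.com/bundles", so your query won't match this field.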
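To see which tokens were actually indexed, one option is the _analyze API. This is only a sketch in the same console syntax as the question; it assumes the index is still named pibtest1 and that your version accepts "field" and "text" in the request body (the 2.x API does, as far as I know).

```
# Ask Elasticsearch to analyze a sample URL with the analyzer of the url field
GET pibtest1/_analyze
{
  "field": "url",
  "text": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1"
}
```

If the response lists several short tokens instead of one token containing the whole URL, the field is analyzed and the wildcard pattern has nothing to match against.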

  • Delete your index.
  • Create it again with a mapping that marks the url field as not analyzed: {"index":"not_analyzed"}.
  • Re-index your data.
  • Run the wildcard query.
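The steps above could look roughly like this in the console syntax used in the question. It is a minimal sketch for Elasticsearch 2.x, assuming the index pibtest1 and the type SearchTech from the question, and that url is a plain string field; the mapping shows only url, so add your other fields as needed. Note that DELETE removes all documents in the index.

```
# 1. Drop the existing index (destructive)
DELETE pibtest1

# 2. Recreate it with url stored as a single, unanalyzed term
PUT pibtest1
{
  "mappings": {
    "SearchTech": {
      "properties": {
        "url": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

# 3. Re-index your documents here, the same way you loaded them before

# 4. The wildcard is now compared against the whole URL
GET pibtest1/_search
{
  "query": {
    "wildcard": {
      "url": {
        "value": "http://www.searchtechnologies.com/bundles*"
      }
    }
  }
}
```

Because the field is not analyzed, matching is exact and case-sensitive, so the pattern has to use the same case as the stored URL.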

If you don't want to delete your index because of the downtime, see: https://www.elastic.co/blog/changing-mapping-with-zero-downtime

Upvotes: 2
