Sahand

Reputation: 8350

How are numbers analysed by Elasticsearch?

I have some data like this:

{"date": "2018-04-29T00:36", "price": 11900, "sellerName": "Leif J", "description": "Nybesiktigad U.A 2018-04-28 vid 2291mil. 360 mil p\u00e5 senaste 6 \u00e5ren. Ej vinterk\u00f6rd, och varmgarage p\u00e5 varje vinterf\u00f6rvaring (i min \u00e4go.) Extra ut\u00f6ver standard: -Eluppv\u00e4rmda handtag (2l\u00e4ges) -LED lampor -Avtagbar packbox. -TK Hydrotech avgassystem (ca 10% effekt) L\u00e4nkar nedan: https://www.youtube.com/watch?v=vqa_AiNq8-4 http://www.turbokit.net/#sthash.MboS0Bf2.dpbs L\u00e4s g\u00e4rna Expressens omd\u00f6me* f\u00f6r mer info (*fel i artikeln=varvr\u00e4knare finnes) https://www.expressen.se/motor/tester/en-fatolj-pa-hjul/ Billig i skatt / f\u00f6rs\u00e4kring / drift. Kymco, Grand Dink, 150, Scooter, Maxi, Maxiscooter, Vespa, Piaggio", "location": "Malm\u00f6, Sydv\u00e4st", "id": 0, "title": "Kymco Grand Dink 250", "modelYear": 2002, "url": "https://www.blocket.se/malmo/Kymco_Grand_Dink_250_79092265.htm?ca=11&w=3", "vehicleType": "Scooter"}

As you can see, the price and modelYear fields are numbers. If I index this document with default settings, Elasticsearch seems to recognise automatically that these fields are numbers. For example, this search:

POST _search:

{
    "query": {
        "bool": {
            "must": {
                "multi_match": {
                    "fields": [
                        "title^1.0",
                        "description"
                    ],
                    "operator": "or",
                    "query": "honda",
                    "type": "cross_fields"
                }
            }
        }
    }
}

returns:

        {
            "_index": "simple",
            "_type": "motorcycle",
            "_id": "XZu-Y2MByJQ0ZKmCDVzt",
            "_score": 4.3209167,
            "_source": {
                "date": "2018-03-28T00:00",
                "price": 67900,
                "sellerName": "Honda Mc Center Vänersborg",
                "description": "Honda VFR800A Mätarställning: 2800 mil Färg: Svartmetallic Typ: Touring/Landsväg Info: en härlig v-fyra i bra skick, Nyservad, värmehandtag, däck ok, GIVI väskor, Top box Oxford Honda, Honda VFR, VFR, VFR 800, Honda touring, Honda sport touring, Touring, Bankörning, banhoj",
                "location": "Vänersborg",
                "id": 13561,
                "title": "Honda VFR800A",
                "modelYear": 2008,
                "url": "https://www.blocket.se/alvsborg/Honda_VFR800A_78000747.htm?ca=11&w=3",
                "vehicleType": "Touring"
            }
        }

The number fields have no quotes around them, so they seem to be stored as numbers. My question is: how are number fields analysed, if at all? I can't find any documentation on how Elasticsearch recognises that a field is a number, or on what analysis, if any, it performs when indexing that field.

Could somebody tell me where I can read about this?

Upvotes: 0

Views: 26

Answers (1)

paqash

Reputation: 2314

I'm not entirely sure I understand the question, but the fields in a document are typed. Elasticsearch doesn't 'recognize fields' document by document; field types are defined at the index level, in the index's mapping.

Official docs

Edit: Looks like your index was probably auto-created, so what you're looking for is dynamic field mapping.
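
To make the dynamic mapping idea concrete, here is a rough Python sketch of the default rules as I understand them from the docs: a JSON integer becomes long, a floating point number becomes float, true/false becomes boolean, and a string becomes date if date detection matches, otherwise text. This is only an illustration, not Elasticsearch's actual code. The function name infer_dynamic_type and the ISO_DATE regex are made up for this example, and the regex is a crude stand-in for the real date detection, which uses the index's configured date formats. Also note that analysis (tokenising, lowercasing and so on) applies to text fields; numeric fields are not run through an analyzer.

```python
import re

# Hypothetical, simplified stand-in for Elasticsearch's date detection.
# The real check uses the index's dynamic_date_formats setting.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?)?$")


def infer_dynamic_type(value):
    """Sketch of the field type default dynamic mapping would pick."""
    # Check bool before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "date" if ISO_DATE.match(value) else "text"
    if isinstance(value, dict):
        return "object"
    # null adds no field; arrays are typed by their first non-null
    # element, which this sketch does not handle.
    return None


# A trimmed-down version of the document from the question.
doc = {
    "date": "2018-04-29T00:36",
    "price": 11900,
    "title": "Kymco Grand Dink 250",
    "modelYear": 2002,
}

inferred = {field: infer_dynamic_type(value) for field, value in doc.items()}
print(inferred)
```

To see what your index actually ended up with, fetch its mapping (GET simple/_mapping, going by the index name in your search result) and compare.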

Upvotes: 1
