Information Technology

Reputation: 2333

How do you properly index data in elasticsearch, find it, and then delete it, all using CURL?

I've written a shell script (test.sh) that tests indexing, searching, and deleting data in an Elasticsearch cluster called my_test_cluster:

  1. I put two records of type "employee" (Jane and John) into an index called megacorp. NOTE: I believe this is where the code is failing.

  2. I search for all records. RESULT: The two records I tried to index do not show up, but the default record that comes with the instance of Elasticsearch (name=Leonus) does.

  3. I specifically search for user=Jane and nothing comes up. RESULT: I do not see a record for user=Jane.

  4. I try to delete user=Jane record by her ID=1

  5. I search for all records, again, and only see the default record of user=Leonus

My code is as follows:

echo ""
echo "------------------------------------------------------"
echo "PUT Employees into megacorp index."
echo "------------------------------------------------------"
curl -XPUT 'http://localhost:9200/megacorp/employee/1' -d '{
    "first_name" : "Jane",
    "last_name" :  "Doe",
    "age" :        25,
    "about" :      "I love to go rock climbing and write music.",
    "interests": [ "sports", "music" ]
}'

curl -XPUT 'http://localhost:9200/megacorp/employee/2' -d '{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        30,
    "about" :      "I love to go rock climbing and cooking.",
    "interests": [ "sports", "cooking" ]
}'

echo ""
echo ""
echo "------------------------------------------------------"
echo "Search all records"
echo "------------------------------------------------------"
curl -i -XGET 'http://localhost:9200/'

echo ""
echo ""
echo "------------------------------------------------------"
echo "Search specifically for user = Jane"
echo "------------------------------------------------------"
curl -XGET 'http://localhost:9200/megacorp/employee/_search?q=user:Jane'

echo ""
echo ""
echo "------------------------------------------------------"
echo "Delete employee record ID = 1'"
echo "------------------------------------------------------"

curl -XDELETE 'http://localhost:9200/megacorp/employee/1'

echo ""
echo ""
echo "------------------------------------------------------"
echo "Search all"
echo "------------------------------------------------------"
curl -i -XGET 'http://localhost:9200/'

echo ""
echo ""

When I run the script . test.sh, I get the following results...

------------------------------------------------------
PUT Employees into megacorp index.
------------------------------------------------------
{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{"_index":"megacorp","_type":"employee","_id":"2","_version":5,"_shards":{"total":2,"successful":1,"failed":0},"created":false}

------------------------------------------------------
Search all records
------------------------------------------------------
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 317

{
  "name" : "Leonus",
  "cluster_name" : "my_test_cluster",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}


------------------------------------------------------
Search specifically for user = Jane
------------------------------------------------------
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

------------------------------------------------------
Delete employee record ID = 1'
------------------------------------------------------
{"found":true,"_index":"megacorp","_type":"employee","_id":"1","_version":2,"_shards":{"total":2,"successful":1,"failed":0}}

------------------------------------------------------
Search all
------------------------------------------------------
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 317

{
  "name" : "Leonus",
  "cluster_name" : "my_test_cluster",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}

Any help on what I'm doing wrong is greatly appreciated. -- Thanks

Upvotes: 0

Views: 159

Answers (1)

Val

Reputation: 217304

To search all the records, you need to query the /_search endpoint, not the root /, which simply tells you some details about your ES install. Any one of the following will do:

curl -i -XGET 'http://localhost:9200/megacorp/employee/_search'
curl -i -XGET 'http://localhost:9200/megacorp/_search'
curl -i -XGET 'http://localhost:9200/_search'

With any of these you'll see your two records after indexing them. If you don't, you might need to call _refresh after indexing and before searching:

curl -XPOST 'http://localhost:9200/megacorp/_refresh'
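
To put the pieces together, here is a rough sketch of how the index / refresh / search steps could be wrapped in test.sh. It assumes Elasticsearch 2.x on localhost:9200 as in the question. The function names and the ES and CURL variables are made up for illustration; setting CURL=echo just prints each request instead of sending it.

```shell
#!/bin/sh
# Assumption: Elasticsearch 2.x on localhost:9200, as in the question.
ES="${ES:-http://localhost:9200}"
# Hypothetical switch: run with CURL=echo to print the requests instead of sending them.
CURL="${CURL:-curl}"

index_employee() {
    # $1 = document ID, $2 = JSON body
    $CURL -XPUT "$ES/megacorp/employee/$1" -d "$2"
}

refresh_index() {
    # Make everything indexed so far visible to search
    $CURL -XPOST "$ES/megacorp/_refresh"
}

search_all() {
    # Query the _search endpoint of the index/type, not the root /
    $CURL -XGET "$ES/megacorp/employee/_search?pretty"
}
```

In the script you would then call index_employee 1 '{"first_name": "Jane"}', followed by refresh_index, followed by search_all, in that order.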

In order to search for Jane, you need to use the proper field (first_name not user), i.e.

curl -XGET 'http://localhost:9200/megacorp/employee/_search?q=first_name:jane'

You can also search without specifying a field at all (the search will be done on a special field called _all):

curl -XGET 'http://localhost:9200/megacorp/employee/_search?q=jane'
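
If the search term may contain spaces or other special characters, it is safer to let curl URL-encode the q parameter. A minimal sketch, assuming the same megacorp/employee index as above; search_employees and the ES and CURL variables are hypothetical helpers (CURL=echo prints the request instead of sending it). Lowercase jane finds the stored Jane because, with the default mapping, the analyzer lowercases terms at index time.

```shell
#!/bin/sh
# Assumption: Elasticsearch 2.x on localhost:9200, as in the question.
ES="${ES:-http://localhost:9200}"
# Hypothetical switch: run with CURL=echo to print the request instead of sending it.
CURL="${CURL:-curl}"

search_employees() {
    # $1 = query string, e.g. "first_name:jane" (one field) or "jane" (the _all field)
    # -G sends the data as URL query parameters on a GET request;
    # --data-urlencode takes care of escaping the value of q.
    $CURL -G "$ES/megacorp/employee/_search" --data-urlencode "q=$1"
}
```

For example, search_employees 'first_name:jane' searches a single field, while search_employees 'jane' searches across all fields.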

UPDATE

I'm answering your comment here as there's more room :)

  1. There is no default user "Leonus". What you see there when querying http://localhost:9200/ is Elasticsearch basically saying "hello world". "Leonus" is just the name of your node. If you restart your node, you'll see another name (more info).

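  2. A refresh happens every second by default, so you only need to call it if you're searching right after indexing in a script. If you index a document and search for it more than a second later, you don't need to refresh. If you don't want to call _refresh explicitly, you have two options: 1) make your script pause for one second (e.g. sleep 1) or 2) add refresh=true to the index request itself, so the document is searchable as soon as the call returns. Note that setting the index's refresh interval to -1 disables automatic refreshes entirely (more info), so it will not make documents visible any sooner.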
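
As a rough sketch of how to avoid a separate _refresh call in a script: the first function simply waits out the default one-second refresh interval, and the second uses the index API's refresh=true parameter, which asks Elasticsearch to refresh right after the write. The function names and the ES and CURL variables are hypothetical (CURL=echo prints the request instead of sending it), and the one-second pause assumes the refresh interval has been left at its default.

```shell
#!/bin/sh
# Assumption: Elasticsearch 2.x on localhost:9200, as in the question.
ES="${ES:-http://localhost:9200}"
# Hypothetical switch: run with CURL=echo to print the requests instead of sending them.
CURL="${CURL:-curl}"

index_then_wait() {
    # $1 = document ID, $2 = JSON body
    $CURL -XPUT "$ES/megacorp/employee/$1" -d "$2"
    # Give the periodic refresh (1s by default) time to run before searching
    sleep 1
}

index_with_refresh() {
    # $1 = document ID, $2 = JSON body
    # refresh=true makes the document searchable once the request returns
    $CURL -XPUT "$ES/megacorp/employee/$1?refresh=true" -d "$2"
}
```

Refreshing on every write is fine for a test script like this one, but it costs more than the periodic refresh when indexing many documents.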

Upvotes: 1
