Horse Voice

Reputation: 8338

Using the Java API for Elasticsearch, how can I map the doc ID during indexing?

I am indexing data into an Elasticsearch server. I have a domain object called User, and the relevant code is below. Right now, the _id of each document in Elasticsearch is set to an incrementing counter, as in this call:

bulkRequest.add(client.prepareIndex("heros", "entry", i + "").setSource(fileResult));

But I don't want an arbitrary incrementing ID, because Wonder Woman already has a unique SN_NO that should serve as the ID of her document. How can I map the domain object's unique ID (SN_NO) to the _id in Elasticsearch? The reason is that I may have to change one of her attributes, such as waist_size, over time, and I don't want the index to end up containing two Wonder Women, one with a thin waist and one with a fat one (a silly example).

Sorry for the long question; I tried hard to make it entertaining to read.

Thank you in advance!

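import java.io.File;
import java.io.IOException;
import java.util.HashMap;

// Jackson 2.x package names assumed; with Jackson 1.x these classes live under org.codehaus.jackson
import com.fasterxml.jackson.core.JsonGenerationException;
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class TestBulkElastic {

    public static void main(String[] args) throws JsonGenerationException, JsonMappingException, IOException {

        // Create the User domain object
        User user1 = new User();
        user1.setGender(Gender.FEMALE);
        Name n = new Name();
        n.setFirst("Wonder");
        n.setLast("Woman");
        user1.setName(n);
        user1.setVerified(false);

        // Round-trip the object through JSON to get a Map usable as the document source
        ObjectMapper mapper = new ObjectMapper();
        mapper.writeValue(new File("user.json"), user1);

        HashMap<String, Object> fileResult =
                mapper.readValue(new File("user.json"), HashMap.class);

        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "MyES").build();

        Client client = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("123.123.123.123", 9350));

        BulkRequestBuilder bulkRequest = client.prepareBulk();
        int batch = 10000;
        int total = 10000000;

        for (int i = 0; i < total; i++) {
            // The loop counter serves as the document _id; this is the part in question
            bulkRequest.add(client.prepareIndex("heros", "entry", i + "")
                    .setSource(fileResult));

            // Send each full batch, then start a new one
            if ((i + 1) % batch == 0) {
                bulkRequest.execute().actionGet();
                bulkRequest = client.prepareBulk();
            }
        }

        // Send whatever is left in a final, partial batch
        if (bulkRequest.numberOfActions() > 0) {
            bulkRequest.execute().actionGet();
        }

        client.close();
    }
}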

Upvotes: 2

Views: 1659

Answers (1)

James

Reputation: 66

You can do this in the mapping for the type. Set the path of the _id field to the name of the document field whose value should be used as the _id (new_id in this example; in your case, the field that holds SN_NO):

{
    "YourType": {
        "dynamic": "true",
        "_id": {
            "path": "new_id"
        },
        "_timestamp": {
            "enabled": true,
            "store": true
        },
        "properties": {
            "new_id": {
                "type": "string",
                "fields": {
                    "raw": {
                        "index": "not_analyzed",
                        "type": "string"
                    }
                }
            }
        }
    }
}
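Whichever way the _id ends up being set, the effect the question is after is that indexing a second time under the same _id replaces the stored document instead of adding a duplicate. Here is a minimal sketch of that behaviour using a plain in-memory map as a stand-in for the index. It is not the Elasticsearch client; HeroIndexSketch, its methods, and the value "SN-0001" are hypothetical names made up for illustration. With the client from the question, the corresponding step would presumably be to pass the SN_NO value, rather than i + "", as the third argument of prepareIndex.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for an index. NOT the Elasticsearch client: it only models
// the rule "indexing under an existing _id replaces the stored document".
final class HeroIndexSketch {

    private final Map<String, Map<String, Object>> docsById = new LinkedHashMap<>();

    // Plays the role of prepareIndex(index, type, id).setSource(source):
    // the caller chooses the id, so the domain key decides document identity.
    // Returns true if a document with this id already existed and was replaced.
    boolean index(String id, Map<String, Object> source) {
        return docsById.put(id, new LinkedHashMap<>(source)) != null;
    }

    Map<String, Object> get(String id) {
        return docsById.get(id);
    }

    int size() {
        return docsById.size();
    }

    public static void main(String[] args) {
        HeroIndexSketch heros = new HeroIndexSketch();
        String snNo = "SN-0001"; // hypothetical SN_NO value

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "Wonder Woman");
        first.put("waist_size", 24);
        heros.index(snNo, first);

        // Same SN_NO, changed attribute: the earlier document is replaced
        Map<String, Object> updated = new LinkedHashMap<>(first);
        updated.put("waist_size", 26);
        boolean replaced = heros.index(snNo, updated);

        // prints "1 document(s), replaced=true, waist_size=26"
        System.out.println(heros.size() + " document(s), replaced=" + replaced
                + ", waist_size=" + heros.get(snNo).get("waist_size"));
    }
}
```

Because the ID comes from the domain object rather than from a loop counter, re-running the indexing job for the same hero updates her document in place, however many times it runs.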

Upvotes: 1
