Michael Wiles

Reputation: 21184

How do I upload a PDF to Elasticsearch using the Elasticsearch Java client?

This link explains how to use the REST API to upload an attachment.

But I want to upload an attachment with the Java client...

I assume the following classes are relevant (though I may be wrong)...

org.elasticsearch.ingest.IngestService
org.elasticsearch.ingest.PipelineStore

I realize that I can just fall back to the REST interface but I'd rather try and use the native client first...

Upvotes: 1

Views: 893

Answers (2)

Ilia P

Reputation: 646

Here are four options you can use to index PDFs into Elasticsearch:

  • Ingest Attachment Plugin
  • Apache Tika
  • FsCrawler
  • Ambar

The pros and cons of each are described in this post.

Upvotes: 0

dadoonet

Reputation: 14537

Just send the Base64-encoded PDF in a field, like this:

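import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

import java.io.InputStream;
import java.util.Base64;
import org.apache.commons.io.IOUtils;                 // Apache Commons IO
import org.elasticsearch.action.index.IndexRequest;

// Read the PDF from the classpath and Base64-encode its bytes.
String base64;
try (InputStream is = YourClass.class.getResourceAsStream(pathToYourFile)) {
    byte[] bytes = IOUtils.toByteArray(is);
    base64 = Base64.getEncoder().encodeToString(bytes);
}

// Index the encoded PDF through the ingest pipeline named "foo".
IndexRequest indexRequest = new IndexRequest("index", "type", "id")
    .setPipeline("foo")
    .source(
        jsonBuilder().startObject()
            .field("field", base64)
        .endObject()
    );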
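
If you would rather not pull in a separate library for `IOUtils` (it looks like the Apache Commons IO class), the encoding step can be done with the JDK alone. This is a minimal sketch, assuming the PDF is a regular file on disk rather than a classpath resource. The class name `PdfBase64` and its methods are hypothetical helpers for illustration, not part of any Elasticsearch API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper class; the name and methods are illustrative only.
final class PdfBase64 {

    private PdfBase64() {
        // static helpers only
    }

    // Encode raw bytes as a standard (RFC 4648) Base64 string.
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Read the whole file into memory and Base64-encode it.
    // Assumes the file is small enough to fit in the heap.
    static String encodeFile(Path pdf) throws IOException {
        return encode(Files.readAllBytes(pdf));
    }
}
```

The string returned by `PdfBase64.encodeFile(...)` can then be passed as the value of `field` when building the `IndexRequest` source.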

In case you are not aware of it, I'm also linking to the FSCrawler project, which may already do what you want.

Upvotes: 1
