Roul

Reputation: 965

Using a Base64-encoded string as the document key in Elasticsearch

I'm very new to Elasticsearch.

I'm working on a document repository application where every uploaded document goes to S3. From the S3 file key (the file path in the S3 bucket) I generate a Base64-encoded string, which we use as the document ID in Elasticsearch.

The Elasticsearch document contains data related to the uploaded file, such as content extracted from the file, which is used for searching, plus some additional metadata about the file.

Now my question: is it safe to use a Base64-encoded string as the document ID in Elasticsearch, in terms of performance and security?

Upvotes: 0

Views: 912

Answers (2)

ms_27

Reputation: 1684

If by document ID you mean the value of the _id field, then a Base64-encoded string is entirely allowed. Elasticsearch stores _id as a string internally, so whatever value you pass is treated as a string. The one thing to note is that _id is limited to 512 bytes. (ref-link)
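A minimal sketch of that check, assuming the ID is built by Base64-encoding the UTF-8 bytes of the S3 key (the question does not say exactly how it is generated). The names `s3_key_to_doc_id` and `MAX_ID_BYTES` are made up for illustration, and the URL-safe alphabet is my choice so the ID contains no `/` or `+`. Base64 grows the input by about a third, so a key longer than 384 bytes would produce an ID over the 512-byte limit.

```python
import base64

# Elasticsearch rejects _id values larger than 512 bytes.
MAX_ID_BYTES = 512


def s3_key_to_doc_id(s3_key: str) -> str:
    """Encode an S3 object key as URL-safe Base64 for use as an _id.

    Hypothetical helper: raises ValueError if the encoded ID would
    exceed the 512-byte _id limit.
    """
    encoded = base64.urlsafe_b64encode(s3_key.encode("utf-8"))
    if len(encoded) > MAX_ID_BYTES:
        raise ValueError(
            f"encoded id is {len(encoded)} bytes, limit is {MAX_ID_BYTES}"
        )
    # Base64 output is pure ASCII, so bytes and characters match 1:1.
    return encoded.decode("ascii")
```

If long keys are possible in your bucket, you would need a different scheme for those (for example a fixed-length hash of the key), but that is a separate design decision.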

Because _id is an indexed field and supports exact-match lookups, you should be fine from a performance perspective.

Regarding safety, whether it is safe depends on a couple of things:

  1. whether the application/service calling ES is public or internal to your organisation
  2. the security and access policies configured for S3

For #1: If the application is internal to your organisation, its APIs and machines are most likely behind a VPN, so it should be safe.

For #2: If your application is external and access to your S3 bucket is non-public, then even if someone gets hold of the document IDs and decodes the Base64 strings to recover the S3 file keys, your data is still protected by the access policies.
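To make the point about decoding concrete: Base64 is an encoding, not encryption, so anyone holding a document ID can recover the key with a single call. A small sketch, assuming the IDs are URL-safe Base64 of the UTF-8 key (`doc_id_to_s3_key` is a made-up name):

```python
import base64


def doc_id_to_s3_key(doc_id: str) -> str:
    """Recover the original S3 key from a Base64-encoded document ID.

    Hypothetical helper showing that the encoding is trivially
    reversible and needs no secret.
    """
    return base64.urlsafe_b64decode(doc_id.encode("ascii")).decode("utf-8")
```

So treat the S3 key as visible to anyone who can see the ID, and rely on the S3 access policies rather than on the encoding.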

Upvotes: 1

Nisarg Shah

Reputation: 389

You can have a look at this link for the basic field data types that can be used in documents and for other operations.

There is a binary field type that accepts binary data as a Base64-encoded string, but according to the documentation that field is not searchable. So I would suggest using some other field as the key for searching documents, or creating a unique ID for each document.

Upvotes: 0
