Reputation: 777
Elasticsearch experts,
I have been unable to find a simple way to just tell ElasticSearch to insert the _timestamp field for all the documents that are added in all the indices (and all document types).
I see an example for specific types: http://www.elasticsearch.org/guide/reference/mapping/timestamp-field/
and also see an example for all indices for a specific type (using _all): http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping/
but I am unable to find any documentation on adding it by default for all documents that get added irrespective of the index and type.
Upvotes: 61
Views: 133833
Reputation: 16895
You can make use of default index pipelines, leverage the script processor, and thus emulate the auto_now_add
functionality you may know from Django and DEFAULT GETDATE()
from SQL.
The process of adding a default yyyy-MM-dd HH:mm:ss
date goes like this:
1. Create the pipeline and specify which indices it'll be allowed to run on:
PUT _ingest/pipeline/auto_now_add
{
"description": "Assigns the current date if not yet present and if the index name is whitelisted",
"processors": [
{
"script": {
"source": """
// skip if not whitelisted
if (![ "myindex",
"logs-index",
"..."
].contains(ctx['_index'])) { return; }
// don't overwrite if present
if (ctx['created_at'] != null) { return; }
ctx['created_at'] = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date());
"""
}
}
]
}
Side note: the ingest processor's Painless script context is documented here.
2. Update the default_pipeline
setting in all of your indices:
PUT _all/_settings
{
"index": {
"default_pipeline": "auto_now_add"
}
}
Side note: you can restrict the target indices using the multi-target syntax:
PUT myindex,logs-2021-*/_settings?allow_no_indices=true
{
"index": {
"default_pipeline": "auto_now_add"
}
}
3. Ingest a document to one of the configured indices:
PUT myindex/_doc/1
{
"abc": "def"
}
4. Verify that the date string has been added:
GET myindex/_search
Upvotes: 9
Reputation: 957
An example for ElasticSearch 6.6.2 in Python 3:
from elasticsearch import Elasticsearch
es = Elasticsearch(hosts=["localhost"])
timestamp_pipeline_setting = {
"description": "insert timestamp field for all documents",
"processors": [
{
"set": {
"field": "ingest_timestamp",
"value": "{{_ingest.timestamp}}"
}
}
]
}
es.ingest.put_pipeline("timestamp_pipeline", timestamp_pipeline_setting)
conf = {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1,
"default_pipeline": "timestamp_pipeline"
},
"mappings": {
"articles":{
"dynamic": "false",
"_source" : {"enabled" : "true" },
"properties": {
"title": {
"type": "text",
},
"content": {
"type": "text",
},
}
}
}
}
response = es.indices.create(
index="articles_index",
body=conf,
ignore=400 # ignore 400 already exists code
)
print ('\nresponse:', response)
doc = {
'title': 'automatically adding a timestamp to documents',
'content': 'prior to version 5 of Elasticsearch, documents had a metadata field called _timestamp. When enabled, this _timestamp was automatically added to every document. It would tell you the exact time a document had been indexed.',
}
res = es.index(index="articles_index", doc_type="articles", id=100001, body=doc)
print(res)
res = es.get(index="articles_index", doc_type="articles", id=100001)
print(res)
About ES 7.x, the example should work after removing the doc_type related parameters as it's not supported any more.
Upvotes: 1
Reputation: 11
first create index and properties of the index , such as field and datatype and then insert the data using the rest API.
below is the way to create index with the field properties.execute the following in kibana console
`PUT /vfq-jenkins
{
"mappings": {
"properties": {
"BUILD_NUMBER": { "type" : "double"},
"BUILD_ID" : { "type" : "double" },
"JOB_NAME" : { "type" : "text" },
"JOB_STATUS" : { "type" : "keyword" },
"time" : { "type" : "date" }
}}}`
the next step is to insert the data into that index:
curl -u elastic:changeme -X POST http://elasticsearch:9200/vfq-jenkins/_doc/?pretty
-H Content-Type: application/json -d '{
"BUILD_NUMBER":"83","BUILD_ID":"83","JOB_NAME":"OMS_LOG_ANA","JOB_STATUS":"SUCCESS" ,
"time" : "2019-09-08'T'12:39:00" }'
Upvotes: -1
Reputation: 1983
Elasticsearch used to support automatically adding timestamps to documents being indexed, but deprecated this feature in 2.0.0
From the version 5.5 documentation:
The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side.
Upvotes: 63
Reputation: 1684
Adding another way to get indexing timestamp. Hope this may help someone.
Ingest pipeline can be used to add timestamp when document is indexed. Here, is a sample example:
PUT _ingest/pipeline/indexed_at
{
"description": "Adds indexed_at timestamp to documents",
"processors": [
{
"set": {
"field": "_source.indexed_at",
"value": "{{_ingest.timestamp}}"
}
}
]
}
Earlier, elastic search was using named-pipelines because of which 'pipeline' param needs to be specified in the elastic search endpoint which is used to write/index documents. (Ref: link) This was bit troublesome as you would need to make changes in endpoints on application side.
With Elastic search version >= 6.5, you can now specify a default pipeline for an index using index.default_pipeline
settings. (Refer link for details)
Here is the to set default pipeline:
PUT ms-test/_settings
{
"index.default_pipeline": "indexed_at"
}
I haven't tried out yet, as didn't upgraded to ES 6.5, but above command should work.
Upvotes: 29
Reputation: 17964
You can do this by providing it when creating your index.
$curl -XPOST localhost:9200/test -d '{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"_default_":{
"_timestamp" : {
"enabled" : true,
"store" : true
}
}
}
}'
That will then automatically create a _timestamp for all stuff that you put in the index. Then after indexing something when requesting the _timestamp field it will be returned.
Upvotes: 52