Reputation: 401
Using the elasticsearch Python API, I want to create an Elasticsearch index with a mapping so that when I upload a CSV file, the documents are indexed according to this mapping.
import argparse, elasticsearch, json
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
import csv
I have this (I removed some fields so the mapping doesn't look that long):
mapping = '''{
  "mappings": {
    "type": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "authEndStopCode": {
          "type": "keyword"
        },
        "expandedTripNumber": {
          "type": "integer"
        },
        "operator": {
          "type": "integer"
        },
        "path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "startStopName": {
          "type": "keyword"
        },
        "userStartStopCode": {
          "type": "keyword"
        }
      }
    }
  }
}'''
I'm creating the index this way:
es.indices.create(index=INDEX_NAME, ignore=400, body=mapping)
This is what I do to upload the data:
with open(args.file, "r", encoding="latin-1") as f:
    reader = csv.DictReader(f)
    bulk(es, reader, index=INDEX_NAME, doc_type=TYPE)
Where INDEX_NAME and TYPE are strings I already defined.
The CSV file contains only data (one document per line) and has no header row, but Elasticsearch seems to be using the first line as the headers. I don't want this; I want the documents to follow the mapping I already added to the index.
Hope someone can help. Thank you.
Upvotes: 0
Views: 4495
Reputation: 2399
I'm the author of moshe/elasticsearch_loader
I wrote ESL for this exact problem.
You can download it with pip:
pip install elasticsearch-loader
You can then load CSV files into Elasticsearch while supplying your custom mapping by running:
elasticsearch_loader --index-settings-file mappings.json \
--index incidents --type incident csv file1.csv
Upvotes: 0
Reputation: 401
The problem wasn't bulk. By default, csv.DictReader reads the first line of the file to get the headers for subsequent rows. So if you're going to use DictReader without passing fieldnames explicitly, the file needs a header row.
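For a file without a header row, csv.DictReader also accepts a fieldnames argument, in which case every line is treated as data. Here is a minimal sketch of the difference. The sample rows, the choice of three columns, and the read_rows helper are made up for illustration; only the column names come from the mapping in the question.

```python
import csv
import io

# Hypothetical headerless sample data (two rows, no header line).
SAMPLE = "1001,12,Main St\n1002,15,Oak Ave\n"

# Column names borrowed from the question's mapping; their order here is an assumption.
FIELDNAMES = ["userStartStopCode", "operator", "startStopName"]


def read_rows(f, fieldnames=None):
    """Return all rows of a CSV file object as a list of dicts."""
    return list(csv.DictReader(f, fieldnames=fieldnames))


# Without fieldnames, the first line is consumed as the header,
# so only one of the two lines comes back as a document.
default_rows = read_rows(io.StringIO(SAMPLE))

# With explicit fieldnames, both lines are returned as documents.
named_rows = read_rows(io.StringIO(SAMPLE), FIELDNAMES)

print(len(default_rows))  # 1
print(len(named_rows))    # 2
print(named_rows[0])
```

A reader built this way should be usable in the same bulk(es, reader, index=INDEX_NAME, doc_type=TYPE) call from the question. Note that DictReader yields every value as a string.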
Upvotes: 1