Reputation: 4747
I have an ETL process that I am implementing with Pentaho Kettle (Spoon). Everything works fine, except that I can't insert the generated data into my remote Elasticsearch server. I tried Kettle's "Elastic Search Bulk Insert" component, but Kettle can't find my Elasticsearch nodes (as can be seen here). Is there a reliable way to load a large amount of data into my ES server? Solutions using Kettle or independent scripts, plugins, etc. are welcome; the only constraint is that the ETL process will run on a different machine from Elasticsearch. Kettle also has a custom Java script element that could be used.
EDIT: I found out that Pentaho is using a very old version of Elasticsearch (0.16.3). I am trying to find a way to update it, with no luck so far.
Upvotes: 2
Views: 6475
Reputation: 11
One common mistake in this context is copying elasticsearch-6.4.2.jar to \data-integration\lib. This is unnecessary and counterproductive.
Steps:
Settings: cluster.name my_cluster_name // from elasticsearch.yml
PDI 8.2, 8.3, or 9.0
Elasticsearch version 6.4.2
Upvotes: 1
Reputation: 79
The current PDI release (6.0.1) supports Elasticsearch 1.5.4.
If you need a working plugin for PDI 6.* that supports the newer Elasticsearch 2.2,
you can download it from the link below. I tested it, and it works with 2.2.
https://drive.google.com/file/d/0B0hgGtBdLOBMbWtfVVFnTE1uVmM/view?usp=sharing
Upvotes: 0
Reputation: 164
First, you need to know your Elasticsearch server configuration. Open the elasticsearch.yml file on your Elasticsearch server and copy the IP address, transport.tcp.port, and cluster.name values.
Back in Kettle, open the "ElasticSearch Bulk Insert" step. Add cluster.name in the [Settings] tab, and the IP address and transport.tcp.port value in the [Servers] tab. Then try "Test Connection"; it should work.
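If "Test Connection" still fails, it can help to confirm from the ETL machine that the values copied from elasticsearch.yml are correct and that the transport port is reachable at all. The following is only a rough sketch: it assumes the settings are written as flat `key: value` lines (nested YAML is not handled), and the function names `read_flat_settings` and `can_connect`, as well as the use of `network.host` for the IP address, are illustrative assumptions rather than anything Kettle or Elasticsearch provides.

```python
import socket


def read_flat_settings(yml_text):
    """Collect flat `key: value` pairs from elasticsearch.yml text.

    Assumption: settings use the dotted one-line form
    (e.g. `cluster.name: my_cluster_name`); nested YAML is ignored.
    """
    settings = {}
    for raw in yml_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical path and fallback values; adjust to your installation.
    with open("elasticsearch.yml", encoding="utf-8") as fh:
        cfg = read_flat_settings(fh.read())
    host = cfg.get("network.host", "localhost")
    port = int(cfg.get("transport.tcp.port", "9300"))
    print("cluster.name:", cfg.get("cluster.name"))
    print("transport port reachable:", can_connect(host, port))
```

A successful TCP connection only shows that the port is open from the ETL machine; it does not prove that the client version bundled with Kettle is compatible with the server.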
Upvotes: 1
Reputation: 31
I changed the dependency jar from elasticsearch-0.16.3.jar to elasticsearch-1.6.0.jar (it also needs lucene-core-4.10.4.jar). You then need to either copy 'ElasticSearchBulk' as a new plugin (I had some help with this) or modify the source code, because some of the Elasticsearch package locations have changed: remove the imports that no longer resolve, then add the correct ones. After that, it works well with Elasticsearch 1.6.
Upvotes: 2
Reputation: 3294
Elasticsearch is a RESTful search engine, so I use Kettle's REST Client step. All you have to do is follow the REST conventions for inserting rows into your remote Elasticsearch server. It works well.
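Since independent scripts are also acceptable, the same REST approach can be sketched outside Kettle. The example below is a minimal illustration, not the configuration used in this answer: it builds the newline-delimited JSON body that the `_bulk` endpoint expects and posts it over HTTP. The index name, the sample rows, the host URL, and the helper names `build_bulk_body` and `send_bulk` are all assumptions. Older Elasticsearch versions also expect a `_type` in the action line, which is why it is left optional here.

```python
import json
import urllib.request


def build_bulk_body(index, rows, doc_type=None):
    """Serialize rows as the newline-delimited JSON body for _bulk.

    Each document is preceded by an action line; the body must end
    with a trailing newline.
    """
    lines = []
    for row in rows:
        meta = {"_index": index}
        if doc_type is not None:  # needed by older Elasticsearch versions
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


def send_bulk(base_url, body):
    """POST a prepared bulk body to <base_url>/_bulk and return the parsed reply."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical index, rows, and host; 9200 is the default HTTP port.
    rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
    body = build_bulk_body("etl_rows", rows)
    reply = send_bulk("http://my-es-host:9200", body)
    print("errors reported:", reply.get("errors"))
```

Because this goes through the HTTP port rather than the transport port, it does not depend on the Elasticsearch client jar bundled with Kettle. For large loads, sending the rows in batches of a few thousand per request is usually more practical than one huge body.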
Upvotes: 2