Tarun Sapra

Reputation: 1901

Apache Flink integration with Elasticsearch

I am trying to integrate Flink with Elasticsearch 2.1.1, using the Maven dependency


Here is the Java code. It reads events from a Kafka queue (which works fine), but the events never get posted to Elasticsearch and no error is reported. If I change any of the Elasticsearch settings in the code below (port, hostname, cluster name, or index name), I immediately see an error. With the current settings there is no error, yet no new documents are created in Elasticsearch.

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // parse user parameters
    ParameterTool parameterTool = ParameterTool.fromArgs(args);

    DataStream<String> messageStream = env.addSource(new FlinkKafkaConsumer082<>(parameterTool.getRequired("topic"), new SimpleStringSchema(), parameterTool.getProperties()));


    Map<String, String> config = new HashMap<>();
    config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS, "1");
    config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_INTERVAL_MS, "1");

    config.put("cluster.name", "FlinkDemo");

    List<InetSocketAddress> transports = new ArrayList<>();
    transports.add(new InetSocketAddress(InetAddress.getByName("localhost"), 9300));

    messageStream.addSink(new ElasticsearchSink<String>(config, transports, new TestElasticsearchSinkFunction()));

private static class TestElasticsearchSinkFunction implements ElasticsearchSinkFunction<String> {
    private static final long serialVersionUID = 1L;

    public IndexRequest createIndexRequest(String element) {
        Map<String, Object> json = new HashMap<>();
        json.put("data", element);

        return Requests.indexRequest()

    public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {

Upvotes: 4

Views: 4271

Answers (2)


Reputation: 1971

I have found a very good example of the Flink Elasticsearch connector.

First, the Maven dependency:


Second, the example Java code:

public static void writeElastic(DataStream<String> input) {

    Map<String, String> config = new HashMap<>();

    // This instructs the sink to emit after every element, otherwise they would be buffered
    config.put("bulk.flush.max.actions", "1");
    config.put("cluster.name", "es_keira");

    try {
        // Add elasticsearch hosts on startup
        List<InetSocketAddress> transports = new ArrayList<>();
        transports.add(new InetSocketAddress("", 9300)); // port is 9300 not 9200 for ES TransportClient

        ElasticsearchSinkFunction<String> indexLog = new ElasticsearchSinkFunction<String>() {
            public IndexRequest createIndexRequest(String element) {
                String[] logContent = element.trim().split("\t");
                Map<String, String> esJson = new HashMap<>();
                esJson.put("IP", logContent[0]);
                esJson.put("info", logContent[1]);

                return Requests

            public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {

        ElasticsearchSink<String> esSink = new ElasticsearchSink<>(config, transports, indexLog);
    } catch (Exception e) {
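
A side note on the parsing step in createIndexRequest above: splitting on a tab and reading logContent[1] throws an ArrayIndexOutOfBoundsException for any line that has no tab. The sketch below is plain Java with no Flink or Elasticsearch dependency. It shows one way to guard that step, assuming it is acceptable to skip malformed lines. The class name LogLineParser and the choice to return null are my own assumptions, not part of the connector API.

```java
import java.util.HashMap;
import java.util.Map;

public class LogLineParser {

    // Splits a tab-separated log line into the "IP" and "info" fields.
    // Returns null when the line has fewer than two fields, so the
    // caller can skip it instead of failing inside the sink.
    public static Map<String, String> parse(String element) {
        String[] logContent = element.trim().split("\t");
        if (logContent.length < 2) {
            return null;
        }
        Map<String, String> esJson = new HashMap<>();
        esJson.put("IP", logContent[0]);
        esJson.put("info", logContent[1]);
        return esJson;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = parse("10.0.0.1\tGET /index.html");
        System.out.println(parsed.get("IP"));      // prints 10.0.0.1
        System.out.println(parsed.get("info"));    // prints GET /index.html
        System.out.println(parse("no-tab-here"));  // prints null
    }
}
```

Inside the sink function, a null result would simply mean that no request is added for that element.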

Upvotes: 0

Tarun Sapra

Reputation: 1901

I was indeed running it on my local machine and debugging it as well, but the one thing I was missing was properly configured logging, since most Elasticsearch issues are reported in "log.warn" statements. The issue was an exception inside "BulkRequestHandler.java" in the elasticsearch-2.2.1 client API, which was throwing the error "org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: type is missing;". I had created the index but had not set a type. I find this pretty strange, as I would expect the client to be concerned primarily with the index and to create the type by default.
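
Since the failure only showed up as a warning in the logs, one way to catch it earlier is to check the index and type names yourself before building the request. Below is a minimal sketch of that idea in plain Java. IndexTarget is a hypothetical helper of my own, not part of the Flink or Elasticsearch API, and the names "my-index" and "my-type" are placeholders.

```java
public class IndexTarget {
    private final String index;
    private final String type;

    // Fails fast at construction time if either name is null or blank.
    public IndexTarget(String index, String type) {
        this.index = requireNonBlank(index, "index");
        this.type = requireNonBlank(type, "type");
    }

    private static String requireNonBlank(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " is missing");
        }
        return value;
    }

    public String index() {
        return index;
    }

    public String type() {
        return type;
    }

    public static void main(String[] args) {
        IndexTarget ok = new IndexTarget("my-index", "my-type");
        System.out.println(ok.index() + "/" + ok.type()); // prints my-index/my-type

        try {
            new IndexTarget("my-index", null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints type is missing
        }
    }
}
```

Creating such an object once, when the job is set up, would raise the problem as an exception at startup rather than as a log line per bulk request.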

Upvotes: 2
