Hazelcast Jet cluster: workload not distributed

Question

I have one huge CSV file and a Jet cluster with 3 nodes. When I submit the job, only one node processes the entire file. I want the work to be distributed across the cluster. How can I make optimal use of the resources on every node so the job finishes faster?

    Pipeline p = Pipeline.create();

    // Element type assumed to be String[]; it depends on what split(line) returns.
    BatchSource<String[]> source = Sources.filesBuilder("files")
            .glob("*.csv")
            .build(path -> Files.lines(path).skip(1).map(line -> split(line)));

    p.readFrom(source)
            .map(function1)
            .map(function2)
            .writeTo(Sinks.filesBuilder("out").build());
    instance.newJob(p).join();

Answers (1)
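
As far as I know, the Jet file source hands out work at the granularity of whole files, so a single CSV ends up on a single reader. One possible workaround is to pre-split the file into several parts so that the `*.csv` glob matches more than one file. Below is a minimal sketch using only the JDK. The class name `CsvSplitter`, the `part-N.csv` naming, and the round-robin row assignment are illustrative choices, not part of Jet. It assumes UTF-8 input and that no CSV field contains an embedded line break.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one large CSV into several smaller CSV files.
public class CsvSplitter {

    /**
     * Writes the rows of {@code input} round-robin into {@code parts} files
     * named part-0.csv, part-1.csv, ... inside {@code outDir}.
     * The header line is repeated at the top of every part, so a reader that
     * skips the first line of each file (like skip(1) in the question) still works.
     */
    public static List<Path> split(Path input, Path outDir, int parts) throws IOException {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        Files.createDirectories(outDir);
        List<Path> outputs = new ArrayList<>();
        List<BufferedWriter> writers = new ArrayList<>();
        // The input is streamed line by line, so it never has to fit in memory.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String header = reader.readLine();
            for (int i = 0; i < parts; i++) {
                Path out = outDir.resolve("part-" + i + ".csv");
                outputs.add(out);
                BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8);
                writers.add(writer);
                if (header != null) {
                    writer.write(header);
                    writer.newLine();
                }
            }
            String line;
            long row = 0;
            while ((line = reader.readLine()) != null) {
                // Row k goes to part (k mod parts).
                BufferedWriter writer = writers.get((int) (row % parts));
                writer.write(line);
                writer.newLine();
                row++;
            }
        } finally {
            for (BufferedWriter writer : writers) {
                writer.close();
            }
        }
        return outputs;
    }
}
```

For example, `CsvSplitter.split(Paths.get("big.csv"), Paths.get("files"), 12)` would fill the `files` directory that the pipeline in the question reads from. If that directory is on storage shared by all members, setting `sharedFileSystem(true)` on the files builder should let the members divide the files among themselves. Otherwise, different parts would need to be copied to each member's local directory. On Jet 4.2 or later, calling `rebalance()` on the stage right after `readFrom(source)` may be a simpler alternative, since it spreads the items across the cluster before the `map` steps run. In that case the read itself still happens on one member.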