Felipe
Felipe

Reputation: 7563

How do I serialize the ExecutorService in Java?

I created a CountMinSketch to calculate the minimum frequency of some values. I am using an ExecutorService to update the sketch asynchronously. I am using this class on my Flink project it needs to be serializable, so I am implementing Serializable interface. However, it is not enough, because ExecutorService also needs to be serializable. How do I use the ExecutorService in a serializable manner? Or is there any implementation of ExecutorService that is serializable?

import java.io.Serializable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CountMinSketch implements Serializable {

    private static final long serialVersionUID = 1123747953291780413L;

    private static final int H1 = 0;
    private static final int H2 = 1;
    private static final int H3 = 2;
    private static final int H4 = 3;
    private static final int LIMIT = 100;
    private final int[][] sketch = new int[4][LIMIT];

    final NaiveHashFunction h1 = new NaiveHashFunction(11, 9);
    final NaiveHashFunction h2 = new NaiveHashFunction(17, 15);
    final NaiveHashFunction h3 = new NaiveHashFunction(31, 65);
    final NaiveHashFunction h4 = new NaiveHashFunction(61, 101);

    private ExecutorService executor = Executors.newSingleThreadExecutor();

    public CountMinSketch() {
        // initialize sketch
    }

    public Future<Boolean> updateSketch(String value) {
        return executor.submit(() -> {
            sketch[H1][h1.getHashValue(value)]++;
            sketch[H2][h2.getHashValue(value)]++;
            sketch[H3][h3.getHashValue(value)]++;
            sketch[H4][h4.getHashValue(value)]++;
            return true;
        });
    }

    public Future<Boolean> updateSketch(String value, int count) {
        return executor.submit(() -> {
            sketch[H1][h1.getHashValue(value)] = sketch[H1][h1.getHashValue(value)] + count;
            sketch[H2][h2.getHashValue(value)] = sketch[H2][h2.getHashValue(value)] + count;
            sketch[H3][h3.getHashValue(value)] = sketch[H3][h3.getHashValue(value)] + count;
            sketch[H4][h4.getHashValue(value)] = sketch[H4][h4.getHashValue(value)] + count;
            return true;
        });
    }

    public int getFrequencyFromSketch(String value) {
        int valueH1 = sketch[H1][h1.getHashValue(value)];
        int valueH2 = sketch[H2][h2.getHashValue(value)];
        int valueH3 = sketch[H3][h3.getHashValue(value)];
        int valueH4 = sketch[H4][h4.getHashValue(value)];
        return findMinimum(valueH1, valueH2, valueH3, valueH4);
    }

    private int findMinimum(final int a, final int b, final int c, final int d) {
        return Math.min(Math.min(a, b), Math.min(c, d));
    }
}

import java.io.Serializable;

public class NaiveHashFunction implements Serializable {

    private static final long serialVersionUID = -3460094846654202562L;
    private final static int LIMIT = 100;
    private long prime;
    private long odd;

    public NaiveHashFunction(final long prime, final long odd) {
        this.prime = prime;
        this.odd = odd;
    }

    public int getHashValue(final String value) {
        int hash = value.hashCode();
        if (hash < 0) {
            hash = Math.abs(hash);
        }
        return calculateHash(hash, prime, odd);
    }

    private int calculateHash(final int hash, final long prime, final long odd) {
        return (int) ((((hash % LIMIT) * prime) % LIMIT) * odd) % LIMIT;
    }
}

Flink class:

    public static class AverageAggregator implements
            AggregateFunction<Tuple3<Integer, Tuple5<Integer, String, Integer, String, Integer>, Double>, Tuple3<Double, Long, Integer>, Tuple2<String, Double>> {

        private static final long serialVersionUID = 7233937097358437044L;
        private String functionName;
        private CountMinSketch countMinSketch = new CountMinSketch();
.....
}

error:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the AggregateFunction is not serializable. The object probably contains or references non serializable fields.
    at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1559)
    at org.apache.flink.streaming.api.datastream.WindowedStream.aggregate(WindowedStream.java:811)
    at org.apache.flink.streaming.api.datastream.WindowedStream.aggregate(WindowedStream.java:730)
    at org.apache.flink.streaming.api.datastream.WindowedStream.aggregate(WindowedStream.java:701)
    at org.sense.flink.examples.stream.MultiSensorMultiStationsReadingMqtt2.<init>(MultiSensorMultiStationsReadingMqtt2.java:39)
    at org.sense.flink.App.main(App.java:141)
Caused by: java.io.NotSerializableException: java.util.concurrent.Executors$FinalizableDelegatedExecutorService
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:534)
    at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:81)
    ... 6 more

Upvotes: 0

Views: 1746

Answers (2)

Stephen C
Stephen C

Reputation: 718856

An ExecutorService contains state that is impossible to serialize. Specifically the worker threads ... and the state of the tasks they are working on could never be serialized using the standard object serialization classes.

If you don't really need to serialize the ExecutorService, you could mark the variable that refers to it as transient ... to stop it being serialized by accident.

It is conceivable that you could serialize an ExecutorService's work queue. But serializing an executing task would require you to implement a custom mechanism to checkpoint the task's Callable / Runnable ... while it is running.


If you are trying to serialisation itself as a mechanism for checkpointing a computation, you are probably barking up the wrong tree. Serialization cannot capture state held on a thread's stack.

Upvotes: 4

jan.zanda
jan.zanda

Reputation: 146

You usually don't serialize functionality components, only data. I really don't see what you are trying to do, but if you annotate ExecutorService field with @Transient annotation, it should do the trick.

Upvotes: 0

Related Questions