Reputation: 22684
As I was learning about transducers in Clojure it suddenly struck me what they reminded me of: Java 8 streams!
Clojure:
(def xf
(comp
(filter odd?)
(map inc)
(take 5)))
(println
(transduce xf + (range 100))) ; => 30
(println
(into [] xf (range 100))) ; => [2 4 6 8 10]
Java:
// Purposely using Function and boxed primitive streams (instead of
// UnaryOperator<LongStream>) in order to keep it general.
Function<Stream<Long>, Stream<Long>> xf =
s -> s.filter(n -> n % 2L == 1L)
.map(n -> n + 1L)
.limit(5L);
System.out.println(
xf.apply(LongStream.range(0L, 100L).boxed())
.reduce(0L, Math::addExact)); // => 30
System.out.println(
xf.apply(LongStream.range(0L, 100L).boxed())
.collect(Collectors.toList())); // => [2, 4, 6, 8, 10]
Apart from the difference in static/dynamic typing, these seem quite similar to me in purpose and usage.
Is the analogy with transformations of Java streams a reasonable way of thinking about transducers? If not, how is it flawed, or how do the two differ in concept (not to speak of implementation)?
Upvotes: 15
Views: 1377
Reputation: 23506
Another important difference that I see is that Clojure Transducers are composable. I often have the situation that my stream pipelines are a bit longer than in your example, where there are just some intermediate steps that I could re-use elsewhere, e.g.:
someStream
.map(...)
.filter(...)
.map(...) // <- gee, there are at least two other
.filter(...) // <- pipelines where I could use the functionality
.map(...) // <- of just these three steps!
.filter(...)
.collect(...)
I haven't found a sane way to achieve that. What I wish I had was something like this:
Transducer<Integer,String> smallTransducer = s -> s.map(...); // usable in a stream Integer -> String
Transducer<String,MyClass> otherTransducer = s -> s.filter(...).map(...); // stream String -> MyClass
Transducer<Integer,MyClass> combinedTransducer = smallTransducer.then(otherTransducer); // compose transducers, to get an Integer -> MyClass transducer
and then use it like this:
someStream
.map(...)
.filter(...)
.transduce(smallTransducer)
.transduce(otherTransducer)
.filter(...)
.collect(...)
// or
someStream
.map(...)
.filter(...)
.transduce(combinedTransducer)
.filter(...)
.collect(...)
Upvotes: 1
Reputation: 7949
The main difference is that the set of verbs (operations) is somehow closed for streams while it's open for transducers: try for example to implement partition
on streams, it feels a bit second class:
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.Stream.Builder;
public class StreamUtils {
static <T> Stream<T> delay(final Supplier<Stream<T>> thunk) {
return Stream.of((Object) null).flatMap(x -> thunk.get());
}
static class Partitioner<T> implements Function<T, Stream<Stream<T>>> {
final Function<T, ?> f;
Object prev;
Builder<T> sb;
public Partitioner(Function<T, ?> f) {
this.f = f;
}
public Stream<Stream<T>> apply(T t) {
Object tag = f.apply(t);
if (sb != null && prev.equals(tag)) {
sb.accept(t);
return Stream.empty();
}
Stream<Stream<T>> partition = sb == null ? Stream.empty() : Stream.of(sb.build());
sb = Stream.builder();
sb.accept(t);
prev = tag;
return partition;
}
Stream<Stream<T>> flush() {
return sb == null ? Stream.empty() : Stream.of(sb.build());
}
}
static <T> Stream<Stream<T>> partitionBy(Stream<T> in, Function<T, ?> f) {
Partitioner<T> partitioner = new Partitioner<>(f);
return Stream.concat(in.flatMap(partitioner), delay(() -> partitioner.flush()));
}
}
Also like sequences and reducers, when you transform you don't create a "bigger" computation, you create a "bigger" source.
To be able to pass computations, you've introduced xf
a function from Stream to Stream to lift operations from methods to first class entities (so as to untie them from the source). By doing so you've created a transducer albeit with a too large interface.
Below is a more general version of the above code to apply any (clojure) transducer to a Stream:
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.Stream.Builder;
import clojure.lang.AFn;
import clojure.lang.IFn;
import clojure.lang.Reduced;
public class StreamUtils {
static <T> Stream<T> delay(final Supplier<Stream<T>> thunk) {
return Stream.of((Object) null).flatMap(x -> thunk.get());
}
static class Transducer implements Function {
IFn rf;
public Transducer(IFn xf) {
rf = (IFn) xf.invoke(new AFn() {
public Object invoke(Object acc) {
return acc;
}
public Object invoke(Object acc, Object item) {
((Builder<Object>) acc).accept(item);
return acc;
}
});
}
public Stream<?> apply(Object t) {
if (rf == null) return Stream.empty();
Object ret = rf.invoke(Stream.builder(), t);
if (ret instanceof Reduced) {
Reduced red = (Reduced) ret;
Builder<?> sb = (Builder<?>) red.deref();
return Stream.concat(sb.build(), flush());
}
return ((Builder<?>) ret).build();
}
Stream<?> flush() {
if (rf == null) return Stream.empty();
Builder<?> sb = (Builder<?>) rf.invoke(Stream.builder());
rf = null;
return sb.build();
}
}
static <T> Stream<?> withTransducer(Stream<T> in, IFn xf) {
Transducer transducer = new Transducer(xf);
return Stream.concat(in.flatMap(transducer), delay(() -> transducer.flush()));
}
}
Upvotes: 10