/ CLOJURE, TRANSFORMATION, REDUCER, TRANSDUCER

Learning Clojure: transducers

This week, the subject is transducers. But before diving into that subject, we first need to talk more about reducers.

This is the 8th post in the Learning Clojure focus series.Other posts include:

  1. Decoding Clojure code, getting your feet wet
  2. Learning Clojure: coping with dynamic typing
  3. Learning Clojure: the arrow and doto macros
  4. Learning Clojure: dynamic dispatch
  5. Learning Clojure: dependent types and contract-based programming
  6. Learning Clojure: comparing with Java streams
  7. Feedback on Learning Clojure: comparing with Java streams
  8. Learning Clojure: transducers (this post)

Reduction in Java

If you have some experience in Java 8, you probably already know about the Stream.reduce() function. It’s available in three different flavors:

  1. Optional<T> reduce(BinaryOperator<T> accumulator)
  2. T reduce(T identity, BinaryOperator<T> accumulator)
  3. <U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

Examples of usage are:

int[] ints = new int[] { 0, 1, 2, 3, 4 };
OptionalInt sum1 = Arrays.stream(ints).reduce(Integer::sum); (1)
int sum2 = Arrays.stream(ints).reduce(0, Integer::sum);      (2)
1 Without an initial value, yields 10
2 With an initial value, also yields 10

Note that the only difference between the first flavor and the second flavor is the presence of an initial value. That means that the second flavor is bound to return a value, even if the stream doesn’t contain any elements i.e. OptionalInt vs int.

The only use-case of the third flavor is parallel processing, so let’s put it aside.

Reduction in Kotlin

Reduction in Kotlin is pretty similar to what is available in Java. There are two reducing functions provided out-of-the-box:

  1. One without an initial value, named reduce()
  2. The other with an initial value, called fold()
val ints = arrayOf( 0, 1, 2, 3, 4 )
val sum1 = ints.reduce(Int::plus)
val sum2 = ints.fold(0, Int::plus)

Returning an unrelated type

Reduction in both Java and Kotlin is actually pretty limited: the returned type represents a single value. Reduction (also called fold) is much broader in its definition:

In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value. Typically, a fold is presented with a combining function, a top node of a data structure, and possibly some default values to be used under certain conditions. The fold then proceeds to combine elements of the data structure’s hierarchy, using the function in a systematic way.

— Fold (higher-order function)
https://en.wikipedia.org/wiki/Fold_(higher-order_function)

This definition doesn’t enforce any requirement on the result type. Since a collection is a type like any other, the following functions are also reductions:

fun reduce(acc: Set<Int>, ints: List<Int>): Set<Int> = acc + ints
fun reduce(acc: Set<Int>, aInt: Int): Set<Int> = acc + aInt
fun reduce(aInt: Int): Set<Int> = setOf(aInt)
fun reduce(ints: List<Int>): Set<Int> = setOf<Int>() + ints

Generalizing reduction

At this stage, the next step is pretty small. Let’s consider the following code:

ints.distinct()

This is the same as the last function above. Considering the broader definition, distinct() is also a reduction function.

Likewise, the two following code snippets are the same:

fun reduce(ints: List<Int>): List<Int> {
    val list = mutableListOf<Int>()
    ints.forEach { list.add(it + 1) }
    return list
}

ints.map { it + 1 }

Hence, map() is a reducing function. filter() fits the definition as well. A lot more function also fit the description above.

Transducers

Now that reducing functions have been properly defined, it’s possible to define what a transducer is:

A transducer (sometimes referred to as xform or xf) is a transformation from one reducing function to another.

— Terminology
https://clojure.org/reference/transducers

The composition of reducing functions conforms to that. Now that transducers have been defined, it’s finally time for some Clojure. In fact, Clojure makes it easy to define transducers.

Let’s start by defining a reduction function pipeline using the thread-last macro:

(defn transform [coll]
  (->> coll            (1)
    (filter even?)     (2)
    (take 5)           (3)
    (map inc)))        (4)

As a reminder, here is a step-by-step explanation:

1 Start from coll - assume a collection of numbers
2 Keep only even numbers
3 Keep only the 5 first numbers
4 Increment every number by one

Composing functions

The next step is to use the (comp) function that allows to compose functions:

(comp)(comp f)(comp f g)(comp f g & fs)

Takes a set of functions and returns a fn that is the composition of those fns. The returned fn takes a variable number of args, applies the rightmost of fns to the args, the next fn (right-to-left) to the result, etc.

— ClojureDocs
https://clojuredocs.org/clojure.core/comp

To create a composing function from the pipeline, the (comp) function is used.

(def transducer
  (comp
    (filter even?)
    (take 5)
    (map inc)))

Creating transducers

An important point is that though the (transform) and the (transducer) function implementations look the same, they are not.

In the context of thread-last i.e. ->>, the function is applied to the resulting collection at the point of the "pipeline".

(--> (range 25)
  (filter even?)
  (take 5)
  (map inc))

For example, in the above snippet, the (filter even?) function is applied to the result of the (range 25) function. As seen in the post about threading macros, this is the same as (filter even? (range 25)).

On the opposite side, in the context of the (transducer) function, (filter even?) only executes the (filter) function with a single argument. Going back to the definition of (filter):

(filter pred)(filter pred coll)

Returns a lazy sequence of the items in coll for which (pred item) returns logical true. pred must be free of side-effects. Returns a transducer when no collection is provided.

— ClojureDocs
https://clojuredocs.org/clojure.core/filter

It’s easy to validate the last claim with the REPL:

(filter even?)

=> #object[clojure.core$filter$fn__5610 0x261832c7 "[email protected]"]

Actually, a lot of out-of-the-box Clojure functions working on collections also return a transducer when no collection is provided. This not only includes the functions used so far (filter), (take) and (map) but also:

  • (distinct)
  • (drop)
  • (map-indexed)
  • (mapcat)
  • (partition-by)
  • (random-sample)
  • (remove)
  • etc.

An exhaustive list of transducers provided in core is available in this gist.

Applying a transducer

To apply a transducer to a collection, Clojure provides a (transduce) function:

(transduce xform f coll)(transduce xform f init coll)

Reduce with a transformation of f (xf). If init is not supplied, (f) will be called to produce it. f should be a reducing step function that accepts both 1 and 2 arguments, if it accepts only 2 you can add the arity-1 with 'completing'. Returns the result of applying (the transformed) xf to init and the first item in coll, then applying xf to that result and the 2nd item, etc.

— ClojureDocs
https://clojuredocs.org/clojure.core/transduce

Since the formal definition might be a bit overwhelming, here’s an example of using a (transducer):

(transduce
  (take 5)     (1)
  conj         (2)
  (range 25))  (3)
1 Create a transducer taking the 5 first elements
2 Reduce the result of (take 5 (range 25)) into a vector
3 Initial collection

Compared to the thread-last macros, there are two important differences:

  1. The transducer is a higher-order function. It can be used as a parameter and a return value.
  2. (transducer) requires an additional argument, and allows a second one. The required argument is a reducing function, while the optional one is the initial value.

Conclusion

With transducers, it’s possible to define a named a pipeline of reductions. They allow to define a transformation - or an ordered pipeline of them - independently of any source or destination.

However, transducers go way beyond that: there are stateless transducers and stateful ones. Also, (transduce) is just one way to apply transducers in Clojure, but there are other ones.

This is but the first step into transducers.

To go further:
Nicolas Fränkel

Nicolas Fränkel

Nicolas Fränkel is a Developer Advocate with 15+ years experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurances, large retail and public sector). Usually working on Java/Java EE and Spring technologies, but with focused interests like Rich Internet Applications, Testing, CI/CD and DevOps. Currently working for Hazelcast. Also double as a teacher in universities and higher education schools, a trainer and triples as a book author.

Read More