/ FUNCTIONAL PROGRAMMING, GROUP BY

From Imperative to Functional Programming: a grouping issue (and how to solve it)

There’s a whole category of problems related to grouping e.g.:

  • given a collection of person, return a list of pairs with the first value the age, and the second one the collection of persons of that age
  • given a collection of orders, return a list of pairs with some price range e.g. $0-$100, $101-$200, etc. as the first value, and the number of such orders as the second one
  • given a collection of words, return a list of pairs with the number of letters as first, and the number of words of that length second
  • etc.

This is the 3rd post in the From Imperative to Functional Programming focus series. Other posts include:

  1. From Imperative to Functional Programming using Arrow
  2. From Imperative to Functional Programming, an approach
  3. From Imperative to Functional Programming: a grouping issue (and how to solve it) (this post)
  4. From Imperative to Functional Programming: the Dijkstra algorithm

An imperative solution

Here’s a function that solves the last requirement, written in the imperative way:

fun imperative(strings: Array<String>): List<Pair<Int, Int>> {
  val map = mutableMapOf<Int, Int>()
  strings.forEach {
    val length = it.length
    if (map.containsKey(length)) map[length] = map[length] as Int + 1
    else map[length] = 1
  }
  val pairs = mutableListOf<Pair<Int, Int>>()
  map.forEach {
    pairs.add(it.key to it.value)
  }
  pairs.sortBy { it.first }
  return pairs
}

I believe this is readable enough, so that I don’t need to explain it. It’s also quite possible to propose a similar solution in one own’s favorite language.

For example, the following input arrayOf("a", "an", "the", "ace", "little", "six", "seven", "ten", "eleven") yields:

(1, 1)   // "a"
(2, 1)   // "an"
(3, 4)   // "the" "ace" "six" "ten"
(5, 1)   // "seven"
(6, 2)   // "little" "eleven"

The functional alternative

The equivalent functional code is more concise, and more importantly easier to read:

fun functional(strings: Array<String>): List<Pair<Int, Int>> = strings.groupBy { it.length }
  .map { it.key to it.value.count() }
  .sortedBy { it.first }
}

Changing the requirements

From where I stand, the functional approach seems a better fit. But let’s throw a wrench in the gears in the form of some additional constraints/requirements:

  • Every key MUST be listed, even those with no associated value
  • The key itself MUST NOT be returned, only the value
  • The order MUST be the same as the previous case (by length of the word)

The above test sample should return:

1 1 4 0 1 2

Notice the zero, it’s the crux of the matter.

Updating the functional solution

The imperative solution is pretty easy to update:

  1. Find the length of the longest word
  2. Initialize a map with keys from 0 to this number
  3. Update the map values by iterating through the words
  4. Transform, sort and transform again

The functional solution is a tad harder. How can the missing keys be added if they are not present in the first place?

My first attempt was "tainted" by imperative tricks :

fun functional2(strings: Array<String>): List<Int> {
  return strings.groupBy { it.length }
    .toMutableMap()                                  (1)
    .apply {
      val max = strings.map { it.length }.max() ?: 0 (2)
      (1..max).forEach {
        computeIfAbsent(it) { arrayListOf() }        (3)
      }
    }
    .map { it.key to it.value.count() }
    .sortedBy { it.first }
    .map { it.second }
}
1Ooops, from immutability to mutability…​
2Get the length of the longest word
3Use mutability!

While perfectly viable, that approach is not satisfactory in the functional realm.

My second attempts leverages data structures and how they can be combined. Imagine two lists, the one above and a second specific one:

First list (indices)Second list (lengths)

First value

Second value

First value

Second value

Word length

0

Word length

Number of words

Those lists can be merged, and then elements be made distinct according to the first value, the word length. Hence, if a pair exists in the second list, it will overwrite the value in the first list - 0. If not, the initial pair with 0 value will be kept.

fun functional3(strings: Array<String>): List<Int> {
  val max = strings.map { it.length }.max() ?: 0
  val indices: List<Pair<Int, Int>> = (1..max).map { it to 0 }
  val lengths: List<Pair<Int, Int>> = strings.groupBy { it.length }
    .map { it.key to it.value.count() }
  return lengths.union(indices)
    .distinctBy { it.first }
    .sortedBy { it.first }
    .map { it.second }
}

Bonus: some Clojure love

As for last week, here’s some Clojure code to achieve the same:

(use '[clojure.algo.generic.functor :only (fmap)])

(defn functional [strings]
  (let [max (apply max (map count strings))
    lengths (fmap count (group-by count strings))
    indices (zipmap (range 1 max)
                    (repeat max 0))]
    (vals (merge indices lengths)))
  )

In this snippet, maps are preferred to lists of pairs to make use of the merge function. There’s no such map merging feature available in Kotlin (at least none that I know about).

Conclusion

Compared to imperative programming, functional programming makes the code more concise. However, improved readability also requires a knowledge of a majority of available functions, both in APIs and libraries.

Nicolas Fränkel

Nicolas Fränkel

Nicolas Fränkel is a Developer Advocate with 15+ years experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurances, large retail and public sector). Usually working on Java/Java EE and Spring technologies, but with focused interests like Rich Internet Applications, Testing, CI/CD and DevOps. Currently working for Exoscale. Also double as a teacher in universities and higher education schools, a trainer and triples as a book author.

Read More