As a follow-up to point 4 of my previous article, here’s a first little cheatsheet on the Scala collections API. As in Java, knowing the API is a big step toward writing code that is more relevant, productive and maintainable. Collections play such an important part in Scala that knowing the collections API is a big step toward better Scala knowledge.
In Scala, collections are typed, which means you have to be extra careful with element types. Fortunately, constructors and companion object factories can infer the type by themselves (most of the time). For example:
scala> val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")
countries: List[java.lang.String] = List(France, Switzerland, Germany, Spain, Italy, Finland)
Now, the countries value is of type List[String], since all elements of the collection are strings.
As a corollary, if the collection is empty and you don’t explicitly set the type, you’ll have a collection typed with Nothing:
scala> val empty = List()
empty: List[Nothing] = List()
scala> 1 :: empty
res0: List[Int] = List(1)
scala> "1" :: empty
res1: List[java.lang.String] = List(1)
Adding a new element to the empty list returns a new list, typed according to the added element. This is also the case if an element of another type is added to a typed collection: the new list is typed with the closest common supertype, Any in the example below.
scala> 1 :: countries
res2: List[Any] = List(1, France, Switzerland, Germany, Spain, Italy, Finland)
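To avoid ending up with a List[Nothing] or an unintended List[Any], the element type can be stated explicitly. Here is a minimal sketch that can be pasted into the REPL; the value names are only illustrative.

```scala
// Give the empty list an explicit element type instead of letting it default to Nothing.
val emptyNames: List[String] = List.empty[String]

// Prepending a String keeps the declared element type.
val oneName: List[String] = "France" :: emptyNames

// Prepending an Int still compiles, but the element type widens to Any.
val widened: List[Any] = 1 :: emptyNames

// With an explicit String annotation, the compiler rejects the accidental widening:
// val broken: List[String] = 1 :: emptyNames   // does not compile (type mismatch)
```

Annotating the expected type is a cheap way to let the compiler catch a misplaced element early.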
In Functional Programming, state is banished in favor of “pure” functions. Since Scala is both Object-Oriented and Functional in nature, it offers both mutable and immutable collections under the same names but in different packages: scala.collection.mutable and scala.collection.immutable. For example, Set and Map are found in both packages (interestingly enough, there’s a scala.collection.immutable.List but a scala.collection.mutable.MutableList). By default, the collections that are in scope are the immutable ones, through the scala.Predef object (which is imported implicitly).
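As an illustration of the two packages, here is a small sketch contrasting the default immutable Set with its mutable counterpart. The value names are mine, and importing the mutable package under its own name is just one common convention.

```scala
import scala.collection.mutable

// Set without any import is the immutable one.
val frozen = Set("France", "Spain")

// Adding an element returns a new set; the original is left untouched.
val frozenPlusOne = frozen + "Italy"

// The mutable variant lives in scala.collection.mutable and is updated in place.
val growing = mutable.Set("France", "Spain")
growing += "Italy"
```

Prefixing the type with mutable makes it obvious at the call site which flavor is being used.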
The collections API
The heart of the matter lies in the API itself. Beyond the expected methods also found in Java (like indexOf()), Scala brings a functional approach to collections.
Filtering and partitioning
Scala collections can be filtered so that they return:
- either a new collection that retains only the elements that satisfy a predicate (filter)
- or one that retains only those that do not (filterNot)
Both take a function that takes the element as a parameter and returns a Boolean. The following example returns a collection which only retains countries whose name has more than 6 characters.
scala> countries.filter(_.length > 6)
res3: List[java.lang.String] = List(Switzerland, Germany, Finland)
Additionally, the same function type can be used to partition the original collection into a pair of two collections, one that satisfies the predicate and one that doesn’t.
scala> countries.partition(_.length > 6)
res4: (List[java.lang.String], List[java.lang.String]) = (List(Switzerland, Germany, Finland),List(France, Spain, Italy))
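The relationship between the three methods can be sketched as follows: partition gives the same two lists as filter and filterNot called with the same predicate. The isLong name is introduced here only for the example.

```scala
val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")

// A named predicate, so that the same function can be reused.
val isLong: String => Boolean = _.length > 6

val longNames  = countries.filter(isLong)     // List(Switzerland, Germany, Finland)
val shortNames = countries.filterNot(isLong)  // List(France, Spain, Italy)

// partition yields both results at once, as a pair.
val (matching, rest) = countries.partition(isLong)
```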
Taking, dropping and splitting
- Taking from a collection means returning a collection that keeps only the first n elements of the original one.
scala> countries.take(2)
res5: List[java.lang.String] = List(France, Switzerland)
- Dropping from a collection consists of returning a collection that keeps all elements but the first n elements of the original one.
scala> countries.drop(2)
res6: List[java.lang.String] = List(Germany, Spain, Italy, Finland)
- Splitting a collection consists of returning a pair of collections, the first holding the elements before the specified index, the second holding the rest.
scala> countries.splitAt(2)
res7: (List[java.lang.String], List[java.lang.String]) = (List(France, Switzerland),List(Germany, Spain, Italy, Finland))
Scala also offers takeRight(Int) and dropRight(Int) variant methods that do the same but start from the end of the collection.
Additionally, there are takeWhile(f: A => Boolean) and dropWhile(f: A => Boolean) variant methods that respectively take and drop elements from the collection sequentially (starting from the left) for as long as the predicate is satisfied.
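Here is a short sketch of these four variants on the same list. The predicate is arbitrary and was chosen only so that it stops partway through the list.

```scala
val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")

val lastTwo       = countries.takeRight(2)  // List(Italy, Finland)
val allButLastTwo = countries.dropRight(2)  // List(France, Switzerland, Germany, Spain)

// takeWhile stops at the first element that fails the predicate ("Spain" here)...
val leading   = countries.takeWhile(_.length > 5)  // List(France, Switzerland, Germany)

// ...and dropWhile returns everything from that element onwards.
val remaining = countries.dropWhile(_.length > 5)  // List(Spain, Italy, Finland)
```

Note that Finland satisfies the predicate but still ends up in remaining: unlike filter, these methods stop evaluating at the first failure.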
Grouping
The elements of a Scala collection can be grouped into key/value pairs according to a key computed from each element. The following example groups countries by the first character of their name.
res8: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map(F -> List(France, Finland), S -> List(Switzerland, Spain), G -> List(Germany), I -> List(Italy))
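A grouping like the one above can be obtained with groupBy. The sketch below is one plausible way to write it, not necessarily the exact expression that produced the output shown; the value names are mine.

```scala
val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")

// Key each country by the first character of its name.
val byInitial: Map[Char, List[String]] = countries.groupBy(_.head)

// Each key maps to the countries sharing that initial, in their original order.
val startingWithS = byInitial('S')  // List(Switzerland, Spain)
```

The order of the keys in the resulting Map is not something to rely on.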
Set algebra
Three methods are available in the set algebra domain:
- union (union)
- difference (diff)
- intersection (intersect)
Their names are self-explanatory.
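For illustration, here are the three operations applied to two small sets. The two sets are arbitrary samples made up for this sketch.

```scala
// Two arbitrary sample sets that share exactly one element.
val first  = Set("France", "Switzerland", "Germany")
val second = Set("Germany", "Spain", "Italy")

val either    = first.union(second)      // elements found in at least one set
val onlyFirst = first.diff(second)       // elements of first that are not in second
val both      = first.intersect(second)  // elements found in both sets

// Set also offers symbolic aliases, e.g. | for union and & for intersect.
val alsoBoth = first & second
```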
Mapping
The map(f: A => B) method returns a new collection of the same length as the original one, in which each element is the result of applying the function to the corresponding original element.
For example, the following returns a new collection in which each name is reversed.
scala> countries.map(_.reverse)
res9: List[String] = List(ecnarF, dnalreztiwS, ynamreG, niapS, ylatI, dnalniF)
Folding
Folding consists of starting from an initial value and repeatedly applying a function to a pair made of an accumulator and the element under scrutiny. As such, it can play the role of map if the accumulator is a collection (since each element is prepended to the accumulator, the result comes out in reverse order), like so:
scala> countries.foldLeft(List[String]())((list, x) => x.reverse :: list)
res10: List[String] = List(dnalniF, ylatI, niapS, ynamreG, dnalreztiwS, ecnarF)
Alternatively, you can provide other types of accumulator, like a string, to get different results:
scala> countries.foldLeft("")((concat, x) => concat + x.reverse)
res11: java.lang.String = ecnarFdnalreztiwSynamreGniapSylatIdnalniF
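Two more variations, as a sketch: a numeric accumulator, and foldRight, which traverses from the right so that prepending keeps the original order. The value names are only illustrative.

```scala
val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")

// Int accumulator: total number of characters across all names.
val totalLength = countries.foldLeft(0)((sum, name) => sum + name.length)  // 41

// foldRight takes the element first and the accumulator second.
// Because it starts from the last element, prepending preserves the original order.
val reversedNames =
  countries.foldRight(List.empty[String])((name, acc) => name.reverse :: acc)
```

Here reversedNames is equal to countries.map(_.reverse), unlike the foldLeft version above.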
Zipping
Zipping creates a list of pairs from a list of single elements. There are two variants:
- zipWithIndex forms each pair from the element itself and its index, like so:
scala> countries.zipWithIndex
res12: List[(java.lang.String, Int)] = List((France,0), (Switzerland,1), (Germany,2), (Spain,3), (Italy,4), (Finland,5))
Note: zipping with index is handy when you iterate over a collection but still need each element’s index. It keeps you from declaring a counter variable outside the iteration and incrementing it inside.
- Additionally, you can also zip two lists together:
scala> countries.zip(List("Paris", "Bern", "Berlin", "Madrid", "Rome", "Helsinki"))
res13: List[(java.lang.String, java.lang.String)] = List((France,Paris), (Switzerland,Bern), (Germany,Berlin), (Spain,Madrid), (Italy,Rome), (Finland,Helsinki))
Note that the original collections don’t need to have the same size. The returned collection’s size will be the smaller of the two sizes, and the extra elements of the longer collection are dropped.
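Both points can be sketched briefly: using the index without an external counter, and zipping lists of different sizes. The numbering format is made up for the example.

```scala
val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")

// Pattern-match on each (element, index) pair; no external counter is needed.
val numbered = countries.zipWithIndex.map { case (name, index) =>
  s"${index + 1}. $name"
}

// zip stops as soon as the shorter list is exhausted.
val partial = countries.zip(List("Paris", "Bern"))
```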
The reverse operation is also available, in the form of the unzip method, which returns a pair of lists when provided with a list of pairs. The unzip3 method does the same with a list of triples, returning three lists.
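A minimal sketch of both methods follows. The triples are derived from the pairs by adding the length of each name, purely to have a third component to work with.

```scala
val pairs = List(("France", "Paris"), ("Switzerland", "Bern"), ("Germany", "Berlin"))

// unzip turns a list of pairs into a pair of lists.
val (names, capitals) = pairs.unzip

// Build triples by adding the length of each country name as a third component.
val triples = pairs.map { case (name, capital) => (name, capital, name.length) }

// unzip3 turns a list of triples into three lists.
val (names3, capitals3, lengths) = triples.unzip3
```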
I’ve written this article in the form of a simple fact-oriented cheatsheet, so you can use it as such. In the coming months, I’ll try to add other such cheatsheets.
To go further:
I’ve found the following references around the web: