Attention: Seq is not immutable!

Update: Thanks to gerferra (see comment below) I have added a paragraph explaining that you should use chained package clauses in combination with retargeting Seq.

One of Scala’s guiding principles is its bias towards immutability: While we can use vars and mutable objects, we are encouraged to use vals and immutable objects. This manifests itself very clearly in the collection library which contains both immutable and mutable collections. Actually there are three main packages:

  1. scala.collection
  2. scala.collection.immutable
  3. scala.collection.mutable

scala.collection contains the base collection types, which are extended by the immutable collections in scala.collection.immutable and the mutable ones in scala.collection.mutable.

It is important to understand that the basic collections are neither immutable nor mutable; they just provide common functionality. If an API makes use of a basic collection, e.g. for the type of a method parameter, we can pass either an immutable or a mutable one as the argument.

Let’s look at a simple example:

def basicSetSize[A](as: scala.collection.Set[A]): Int =
  as.size

scala> basicSetSize(scala.collection.immutable.Set(1, 2, 3))
res0: Int = 3

scala> basicSetSize(scala.collection.mutable.Set(1, 2, 3))
res1: Int = 3

As you can see, we can use an immutable or a mutable Set for a basic Set. So far so good.

Now back to Scala’s guiding principle of immutability. In order to give preference to the immutable collections, the Predef singleton object and the scala package object contain a number of aliases that bring the immutable collections into scope without any imports. That means that we can use many immutable collections even if we don’t import anything from either scala.collection.immutable or scala.collection.mutable.

Let’s rewrite the above example:

def basicSetSize[A](as: Set[A]): Int =
  as.size

scala> basicSetSize(scala.collection.immutable.Set(1, 2, 3))
res0: Int = 3

scala> basicSetSize(scala.collection.mutable.Set(1, 2, 3))
:9: error: type mismatch;
found : scala.collection.mutable.Set[Int]
required: scala.collection.immutable.Set[?]

scala> basicSetSize(scala.collection.Set(1, 2, 3))
:9: error: type mismatch;
found : scala.collection.Set[Int]
required: scala.collection.immutable.Set[?]

As you can see, Set without imports or qualifying package essentially means scala.collection.immutable.Set and we can’t use a mutable or basic Set.

Now this is expected, because Scala encourages us to use immutable objects. Many think that this principle holds for all major collection types, but this is not true! Guess where the aliases for Seq point: to scala.collection.immutable.Seq, as for Set and other collections? No!

type Seq[+A] = scala.collection.Seq[A]
val Seq = scala.collection.Seq

These aliases are defined in the scala package object. The reason for this exception is that one should be able to use arrays, which are mutable, for the “default” sequence. Predef contains implicit conversions from Array to WrappedArray, which mixes in various mutable collection traits.
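
To make the array case concrete, here is a minimal sketch. The names ArrayAsSeq and demo are made up for illustration, and the parameter type is spelled out as scala.collection.Seq so that the example does not depend on where the unqualified alias points. The wrapper shares the underlying array, so a later write to the array shows through the sequence:

```scala
object ArrayAsSeq {
  // Hypothetical helper: wraps an array as a base Seq, then mutates the array.
  def demo(): (Int, Int) = {
    val array = Array(1, 2, 3)
    // An implicit conversion from Predef wraps the array; no elements are copied.
    val seq: scala.collection.Seq[Int] = array
    val before = seq.head
    array(0) = 42 // mutate the underlying array
    (before, seq.head)
  }
}
```

The second component of the result differs from the first, although seq itself was never touched.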

While this makes it comfortable to work with arrays, I consider this very dangerous, because it is too easy to forget to import scala.collection.immutable.Seq. If we forget this import and hence use the basic sequence in our API, users can pass in mutable sequences, which will most certainly lead to trouble in any concurrent program. Just imagine objects used as messages in an Akka-based system …
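
A small sketch of the problem, under the assumption of a made-up message type (Message and SeqAliasing are not part of any library). The field is typed with the base scala.collection.Seq, which is what the unqualified Seq alias described above resolves to:

```scala
import scala.collection.mutable.ArrayBuffer

object SeqAliasing {
  // Hypothetical message type; looks immutable, but accepts any Seq.
  final case class Message(recipients: scala.collection.Seq[String])

  def demo(): (Int, Int) = {
    val buffer = ArrayBuffer("alice", "bob")
    val message = Message(buffer) // compiles: ArrayBuffer is a scala.collection.Seq
    val before = message.recipients.size
    buffer += "mallory"           // the sender mutates the buffer afterwards
    (before, message.recipients.size)
  }
}
```

The message changes after it has been created, which is exactly what we do not want when it is shared between threads or actors.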

If you want to be sure your code is using immutable sequences, I recommend adding the following lines to the top-level package object in your projects:

type Seq[+A] = scala.collection.immutable.Seq[A]

val Seq = scala.collection.immutable.Seq

If you use chained package clauses (see below) in all your Scala source files, this will automatically “retarget” Seq to the immutable sequence and all should be good.

package name.heikoseeberger.toplevel
package subpackage

Name based extractors in Scala 2.11

Update 2: Thanks to @SeanTAllen I have fixed a typo in one of the code examples.

Update: Thanks to @xuwei_k and @eed3si9n I have learned that value classes which mix in a universal trait incur the cost of allocation. Therefore I had to change the second example (NameOpt), which – by the way – led to a significant LOC reduction 😉

The recently released milestone M5 for Scala 2.11.0 contains a nice little gem: extractors which don’t need any allocations. This should improve performance significantly and make extractors ready for prime time.

So far, an extractor had to be an object with an unapply method taking an arbitrary argument and returning an Option, potentially destructuring the given argument into a value of a different type:

def unapply(any: A): Option[B]

There are variations, e.g. to extract several values or sequences of such, but the basic shape remains the same. Here is a simple example:

object PositiveInt {
  def unapply(n: Int): Option[Int] =
    if (n > 0) Some(n) else None
}
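
The variations mentioned above can be sketched as follows. The names FullName, Words and describe are made up for illustration: returning an Option of a tuple extracts several values, and unapplySeq extracts a sequence of values.

```scala
object ExtractorVariations {
  // Extracts two values by returning an Option of a tuple.
  object FullName {
    def unapply(s: String): Option[(String, String)] =
      s.split(" ") match {
        case Array(first, last) => Some((first, last))
        case _                  => None
      }
  }

  // Extracts a variable number of values via unapplySeq.
  object Words {
    def unapplySeq(s: String): Option[Seq[String]] =
      if (s.trim.isEmpty) None else Some(s.trim.split("\\s+").toSeq)
  }

  def describe(s: String): String = s match {
    case FullName(first, last)  => s"first=$first, last=$last"
    case Words(head, rest @ _*) => s"head=$head, rest=${rest.size}"
    case _                      => "empty"
  }
}
```

Note that every successful match still goes through a Some, just like in the simple example.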

Extractors are quite useful to write concise and expressive code. Yet unapply returning an Option might have a negative impact on runtime performance, because an instance of Some needs to be created for each successful extraction.

Scala 2.11 introduces name based extractors which no longer require unapply to return an Option. Instead, unapply can return a value of any type which defines the two methods isEmpty and get:

isEmpty: Boolean
get: A

Clearly this could be an Option, but we can also make use of value classes, which were introduced in Scala 2.10. Value classes, which extend AnyVal, usually don’t get allocated, because the compiler represents them by their underlying value and turns their methods into static calls. Let’s rewrite the above simple example:

class PositiveIntOpt(val n: Int) extends AnyVal {
  def isEmpty: Boolean =
    n <= 0
  def get: Int =
    n
}

object PositiveInt {
  def unapply(n: Int): PositiveIntOpt =
    new PositiveIntOpt(n)
}

Let’s give it a spin in the REPL:

scala> val PositiveInt(n) = 1
n: Int = 1

scala> val PositiveInt(n) = 0
scala.MatchError: 0 (of class java.lang.Integer)
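
The same extractor can of course be used in a match expression. Here is a self-contained sketch which repeats the definitions from above inside a wrapper object; the wrapper PositiveIntDemo and the method classify are only there for illustration:

```scala
object PositiveIntDemo {
  // Same definitions as above, nested in an object to keep the sketch self-contained.
  class PositiveIntOpt(val n: Int) extends AnyVal {
    def isEmpty: Boolean = n <= 0
    def get: Int = n
  }

  object PositiveInt {
    def unapply(n: Int): PositiveIntOpt = new PositiveIntOpt(n)
  }

  def classify(n: Int): String = n match {
    case PositiveInt(p) => s"positive: $p" // isEmpty was false, p is the result of get
    case _              => "not positive"  // isEmpty was true
  }
}
```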

Woot, that works! Here is another example, this time extracting multiple values. We are making use of a value class and null as a marker for “no match” to minimize allocations:

class NameOpt(val parts: (String, String)) extends AnyVal {
  def isEmpty: Boolean =
    parts == null
  def get: (String, String) =
    parts
}

object Name {
  def unapply(name: String): NameOpt = {
    val parts = name split " "
    if (parts.length == 2)
      new NameOpt((parts(0), parts(1)))
    else
      new NameOpt(null)
  }
}

Of course, this adds quite a bit of boilerplate compared to simply returning an Option. But the performance gain might be worth it.
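
For completeness, here is a sketch of how the Name extractor could be used. Because get returns a tuple, the pattern can bind both parts directly. The wrapper object NameDemo and the method greet are assumptions made for this illustration:

```scala
object NameDemo {
  // Same definitions as above, nested in an object to keep the sketch self-contained.
  class NameOpt(val parts: (String, String)) extends AnyVal {
    def isEmpty: Boolean = parts == null
    def get: (String, String) = parts
  }

  object Name {
    def unapply(name: String): NameOpt = {
      val parts = name split " "
      if (parts.length == 2) new NameOpt((parts(0), parts(1)))
      else new NameOpt(null) // null marks "no match"
    }
  }

  def greet(s: String): String = s match {
    case Name(first, last) => s"Hello $first, of family $last"
    case other             => s"Hello $other"
  }
}
```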

If we control the class we want to destructure, we can even add the extractor logic to the class itself, thereby avoiding any allocations at all. In this case, in order to extract multiple values, we have to define the corresponding product selectors _1, _2, etc.:

object Name {
  def unapply(name: Name) =
    name
}

class Name(first: String, last: String) {
  def isEmpty: Boolean =
    false
  def get: Name =
    this
  def _1: String =
    first
  def _2: String =
    last
}
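
A usage sketch for this variant, again with the definitions repeated inside a made-up wrapper object (SelfExtractingName and initials are not part of the original example). The pattern binds the results of _1 and _2:

```scala
object SelfExtractingName {
  // Same definitions as above, nested in an object to keep the sketch self-contained.
  object Name {
    def unapply(name: Name): Name = name
  }

  class Name(first: String, last: String) {
    def isEmpty: Boolean = false
    def get: Name = this
    def _1: String = first
    def _2: String = last
  }

  def initials(name: Name): String = name match {
    case Name(first, last) => s"${first.head}.${last.head}."
  }
}
```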

All right, that’s it. Before you start using name based extractors, please make sure to check the next milestones and finally the 2.11 release for any changes to this new feature.
