Tech/Scala LiftOff London 2010 - Day 1

The first day of this year’s Scala Lift Off London exceeded my modest expectations. It was full of excellent presentations and interesting discussions attended by very friendly folk. I trust tomorrow will not be worse; in the mean time here are some notes from the sessions I attended.

Session 1: From Java to Scala

Graham Tackley of guardian.co.uk told us about the experience of his team with embracing Scala. Guardian.co.uk had been a Java shop and when a project to provide API to access all of their content had been started their technology stack of choice was Java, Guice and Apache Solr. Then, at some point, Graham started playing with Scala and decided to use it for testing. The experience was positive enough for him and his team to decide to rewrite the project in Scala, which has been aided by a Java-to-Scala conversion feature of IntelliJ IDEA. The team then progressed from, as Graham put it, “Java without semicolons”, to more advanced features of Scala, embracing functional concepts like map and flatMap. Overall the experience was a very positive one, with the use of a new programming language putting back excitement into what used to be a fairly mundane Java/Spring/Hibernate-based routine and with no major complaints about the tooling available. The message, confirmed by voices from the audience, was that it is easiest to start using Scala for the test code, and after a while the developers will be so put off by the unnecessary verbosity, noisiness and lack of expresiveness of Java that they will push for Scala all way through. It might be a bit more difficult to sell the language to the managers and customers though.

The slides from Graham’s talk are available here.

Session 2: Akka

Presented by Jonas Bonér, the creator of Akka, this session was an overview of the framework. Working as a consultant and faced with the challenges of high-concurrency, scalability and fault-tolerance Jonas wanted to employ the tried and tested mechanisms of Erlang. This, more often than not, did not go down well with the clients who treated Erlang as an exotic platform and would prefer something they are familiar with, e.g. the JVM. This prompted him to create a framework which could provide these mechanisms on the platform customers wanted. As a result Akka implements actors, multiple dispatch mechanisms to choose from, supervisors (actors that can react to failures of other actors) which can be organised into hierarchies, and strategies for restarting failed actors or groups thereof. All that is supported by interfaces in both Java and Scala. Furthermore, Akka actors and supervisors can be configured as Spring beans and the framework can make use of integration features provided by Apache Camel.

Martin Odersky (the creator and main designer of Scala) turned up at this talk and a discussion ensued on Scala actors vs. Akka actors and their relatvie merits. The consensus seems to be that either have been designed with slightly different goals and that Scala actors are more feature-rich and closer to their Erlang counterparts at the expense of performance. The primary source of the performance difference is the fact that unhandled messages are left in Scala actor’s mailbox and make the search for further messages take longer, while Akka throws an exception whenever a message is not matched by any of the receive patterns. Another interesting comparison discussed was between Akka (or, more general, JVM) and Erlang. In Erlang each process (i.e. actor) has its own heap and messages have to be physically copied between process memories. In contrast all actors running in a single JVM share the same heap which provides an opportunity for better performance, but also for all the pitfalls of shared-state concurrency in case a reference to a mutable object is passed in a message. Finally, Jonas has shown how big an impact synchronicity has on the performance: using !! (synchronous send which waits on a Future to receive a response) as opposed to ! (asynchronous send which returns immediately) reduced the throughput of the tested system by up to 80%.

An interesting piece of Scala trivia (from Martin Odersky): asInstanceOf was intentionally chosen to be long and unwieldy to discourage explicit casts.

Session 3: Writing Compiler Plug-ins

I have been some ten minutes late to this session due to an unfortunate chain of events, which involved the Akka session taking half an hour more than the allocated time and me desiring to eat a proper lunch at a nearby Italian café. I have therefore missed out on Iulian Dragos explaining the basics of noboxing plug-in. Nevertheless, it could be inferred from the words of Iulian and Kevin Wright, who co-hosted this session, that Scala compiler’s plug-in infrastructure allows to easily instert a new compilation phase, but not modify what happens during the existing phases (parsing, typing, erasure, code generation). This leads to problems when trying to implement a Transformer that would add new definitions to the code: if it is run before the typer phase it won’t have access to all type information that might be required to generate the methods; if it is run after typer then any code that relies on the presence of definitions generated by the plug-in will report error during type-checking. The work-around used by Kevin in his autoproxy plug-in is to run a custom type-checking phase inside the plug-in, but it has been agreed that there is currently no easy way to make plug-in insert new methods based on the type information. Plug-ins that modify code withing existing methods or just analyse the AST are, on the other hand, relatively simple to write.

Session 4: Scala 2.8 Collections

The final session of the day was somewhat whimsically named “Scala Wizardry” and was meant to discuss more exotic features of the language like implicits, higher-kinded types, cake pattern and explicit self-typing. However, it has been swiftly taken over by Mr. Odersky who covered the re-design of collections in Scala 2.8 while touching upon most of these advanced subjects. He described how the requirement for methods like filter to return the type of collection it was invoked on led to a lot of code duplication and unsightly type casts in Scala 2.7. The initial attempt to tackle it in 2.8 involved higher-kinded types:

trait TraversableLike[A, CC[_]] {
  def filter(p: A => Boolean): CC[A]
  def map[B](f: A => B): CC[B]
}

trait Traversable[A] extends TraversableLike[A, Traversable]

TraversableLike is a higher-kinded type because its second parameter is not a type but a *type constructor//. When Traversable[A] is defined, Traversable (without type parameters) denotes said type constructor.

This initial design dealt with 90%+ of cases, but there were still some issues remaining; one of them was that map needs to change the type of the collection when the return type of supplied function requires it. For example:

val bits = BitSet(1, 2, 4)
bits map (_.toString)  // should return Set[String]

This required some sort of rule engine to work out the appropriate type in each case. One can imagine a predicate CanBuildFrom[From, Elem, To] where From and To are collection types and Elem is the return type of the function passed to map. The rules could then be expressed as:

CanBuildFrom[BitSet, Int, BitSet]
CanBuildFrom[BitSet, B, Set[B]]
...

We would also like to be able to express constraints like this (in Prolog-like notation):

CanBuildFrom[SortedSet[A], B, SortedSet[B]] :- Ordering[B]

It turns out these rules can be encoded in Scala using implicits. I’ve taken the below snippet from my notes and it might lack necessary details (or be partially incorrect, for that matter), but it should give an idea of the sort of machinery that is involved – and sophistication of the language that allows this to work:

implicit def bf1[A, B]: CanBuildFrom[Set[A], B, Set[B]]
implicit def bf2: CanBuildFrom[Set[
implicit def bf3[A]: CanBuildFrom[BitSet, A, Set[A]]
// ...
implicit def bfn[A, B](implicit ord: Ordering[B]): 
	        CanBuildFrom[SortedSet[A], B, SortedSet[B]]

trait TraversableLike[A, Coll] { this: Coll =>
  def map[B, To](f: A => B)
                (implicit cbf: CanBuildFrom[Coll, B, To]): To = {
    val b = cbf(this)
    foreach (x => b += f(x))
    b.result
  }

  // ...
}

Martin did admit that these weird method signatures can scare people off and while they are relevant for library developers the users of the library should not be bothered with them. The rule of thumb is: if what you desire can be expressed in simple types then use simple types. Which seems as a special case of the universal KISS rule. Definitely advisable when dealing with implicits and higher-kinded types.

The collections re-design is described in a paper written by Odersky and Adriaan Moors.

Day Two ->