Pages

Saturday, 13 December 2014

A great example of Type Classes in Scala

I was recently working with RxScala and processing twitter stream data together with clustering strategies. To solve my problem I had to use maps representing word vectors embedded in other maps and at the end I had to merge them and I was surprised on how nicely this was solved by the Scalaz library using the Semigroup Type Class.

What I would like to do in this post is to exploit the usage of implicit to implement the Type Class concept (familiar to the haskell developers) into the real world.

 The Type Class is for me the inverse of the Extension mechanism in OOP.

What does it mean ?

Lets exploit it...

First lets start with a little bit of math.
A Semigroup in math is an operation defined over a set of elements where the result of the operation will produce an element in the same set.

A simple example can be the operator + for the Natural numbers. If I add 2 natural numbers the result will still be a natural number.

This property is not truth if we use the operator /, because is not truth that the division of 2 natural numbers produce always a natural number.

In scala we can define this concept as a trait:

trait Semigroup[M] {
  def combine(m1:M, m2:M):M
}

With that definition we can create higher order functions able to reduce lists of elements which belong to the Semigroup Trait, for example:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }
}

object MyApp extends App {
  import Semigroup._
  
  def sum[M:Semigroup](xs:List[M]): M =  xs.reduce(implicitly[Semigroup[M]].combine)

  val myList = 1::2::3::4::Nil

  val result = sum(myList) // > result = 10
}

The method in line 14 is accepting a List of generic type M where M is a member of Semigroup.
In line 18 we pass to the sum method a List of Int. Because we import all the methods from Semigroup object, the implicit val intToSemigroup will transform the Int type into a Semigroup[Int] and we can then use the combine method to reduce them.

In this way we have defined the Type Int as member of the Class Semigroup[Int].
We can then apply the same logic for Strings:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }

  implicit val stringToSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }
}

object MyApp extends App {
  import Semigroup._
  
  def sum[M:Semigroup](xs:List[M]): M =  xs.reduce(implicitly[Semigroup[M]].combine)

  val strList = "h" :: "e" :: "l" :: "l" :: "o" :: Nil

  val result = sum(strList) // result = "result"
}

We can also introduce the & operator to make the syntax more elegant:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }

  implicit val stringToSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }

  implicit def wrapOp[A:Semigroup](a:A) = SemigroupOperator(a)
}

case class SemigroupOperator[A:Semigroup](a:A) {
  def &(other:A):A = implicitly[Semigroup[A]].combine(a, other)
}

object MyApp extends App {
  import Semigroup._
 
  def sum[M:Semigroup](xs:List[M]): M = xs.reduce{(x1, x2) => x1 & x2 }

  val strList = "h" :: "e" :: "l" :: "l" :: "o" :: Nil

  val result = sum(strList) // result = "result"
}

From line 17 to 19 we have defined the SemigroupOperator case class, this takes as argument a type member of the class Semigroup and defines a new method & which accepts another instance of the same type. What this case class is doing is to transform the combine method of the Semigroup trait in a more familiar object oriented syntax, so that instead to call:

combine(a1, a2)

we can simply use the following syntax:

a1 & a2

In order to achieve our target we need to defined an implicit method which can wrap a type member of Semigroup class with the SemigroupOperator case class.
The effect of these changes can be seen in line 24, where the sum method now is using the & operator because x1 will be wrapped by the SemigroupOperator.

Ok now that we got familiar with Semigroups lets look on a more interesting case.
Think about the problem to merge two Maps together, how can we do that ?
In general if we have two maps and we want to merge them into one, the result should be a map containing all the keys from map1 and map2 with the values merged together. How can we merge the values ?
Well if the Value Type is member of Semigroup we can use the combine method.
This is a very powerful concept because we can now state that:

If the Value Type of a Map is a member of Semigroup, the Map itself is a Semigroup member.

As per the above example a Map[K, Int], whatever value is K, because Int is member of Semigroup then Map[K, Int] is a semigroup itself.

If we apply recursively this concept then the following: Map[K1, Map[K2, V]] where V:Semigroup is again a Semigroup.

Lets write some code:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
trait Semigroup[M] {
  def combine(m1:M, m2:M):M
}

object Semigroup {
  implicit val intSemigroup = new Semigroup[Int] {
    def combine(m1:Int, m2:Int) = m1 + m2 
  }
  
  implicit val strSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }
  
  /**
   * A Map is a Semigroup if the Value type of the map is a Semigroup itself.
   */
  implicit def mapSemigroup[K, A:Semigroup] = new Semigroup[Map[K, A]] {
    def combine(map1:Map[K, A], map2:Map[K, A]) = map1.foldRight(Map[K, A]()) { case ((k, a1), acc) =>
      val sg = implicitly[Semigroup[A]]
      map2.get(k) match {
        case Some(a2) => acc + (k -> sg.combine(a1, a2))
        case None => acc + (k -> a1)
      }
    } ++ (map2 -- map1.keys)
  }
  
  implicit def wrapOp[A:Semigroup](a:A) = SemigroupOperator(a) 
}

case class SemigroupOperator[A:Semigroup](a:A) {
  def &(other:A):A = implicitly[Semigroup[A]].combine(a, other)
}


object MergeApp extends App {
  
  import Semigroup._
  
  
  def sum[A:Semigroup](xs:List[A]): A = xs.reduce { (a1, a2) =>
    a1 & a2
  }
  
  val map1:Map[String, Int] = Map("apple" -> 1, "banana" -> 2)
  val map2:Map[String, Int] = Map("orange" -> 3, "apple" -> 2)
  
  val r = map1 & map2
  
  r.foreach{ case (k, v) => println(s"$k -> $v") }
  
}

Line 17 defines an implicit method that states: if the value of a Map is a Semigroup the Map itself is a semigroup. This is a powerful definition because is recursive.

Map[String, Int] => because Int is a Semigroup then Map[String, Int] is a Semigroup and I can use the & operator between those maps.

NOTE: But also Map[String, Map[String, Int]] is a Semigroup. Lets analyse:

lets look at the inner map: Map[String, Int]. Because Int is a Semigroup then Map[String, Int] is also a Semigroup

Now the outer map is made by a key of type String and a value type Map[String, Int] which is also a Semigroup so Map[String, Map[String, Int]] is also a Semigroup and can use the & operator.

Method Extension in Scala with implicit

Scala language introduces a new powerful tool called "implicit" which allows the usage of the "Method Extension" pattern like in other languages such as Kotlin and C#.

The usage of implicit can be found every time we try to convert a string into an integer, or boolean or any other basic type:


1
2
3
  val myInt:Int = "3".toInt
  val myBool:Boolean = "true".toBoolean
  val myDouble:Double = "1.0".toDouble

The interesting aspect of these methods are that they are "attached" to the type String, but in Java the String class is marked as final so cannot be extended.

How is it possible that in scala we have these new methods ?

Here is where the power of implicit comes in place. This is also called "dynamic polymorphism".

All the magic is made by the compiler.

How does it work ?

In scala you can define a method as implicit.

Lets make an example and consider we want to write a class to represent the rational numbers.
We can define that as a case class as below:



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
case class Rational(n:Int, d:Int) {
  require(d!=0)
  
  private val g = gcd(n.abs, d.abs)
  val num:Int = n/g
  val den:Int = d/g
  
  def +(other:Rational):Rational = 
       Rational(
         this.num*other.den + this.den*other.num, 
         this.den*other.den
       )
  def *(other:Rational):Rational = 
       Rational(
         this.num*other.num, 
         this.den*other.den
       )
  
  private def gcd(a: Int, b: Int): Int = 
      if (b == 0) a else gcd(b, a % b)
}


We can now use the rational numbers in the following way:


1
2
3
4
5
6
object RationalApp extends App {
  val r1 = Rational(3, 4)
  val r2 = Rational(5, 4)
  
  val r3:Rational = r1 + r2
}


The only limitation we have in our implementation is if we want to use Integer numbers together with our rationals, we will have a compilation error if we try to do the following:



1
2
3
val r1 = Rational(3, 4)
  val r2 = r1 + 1 //(1) compilation error: method Rational.+(Int) is not defined
  val r3 = 1 + r1 //(2) compilation error: method Int.+(Rational) is not defined


As commented in the code above:

  1. In this case the class Rational doesn't have an overloaded definition for method + which accepts as argument an Int. This can be fixed by simply overloading the + method.
  2. This case is more complex as the method + belongs to the type Int which in Java is final so we cannot extend and eventually overload with a Rational argument.
In order to solve this problem we need to use a little trick.

Lets first write the code to solve the problem:



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
package object converters {
  implicit def intToRational(v:Int):Rational = Rational(v, 1)
}
...

import converters._

object RationalApp extends App {
  val r1 = Rational(3,4)
  val r2 = Rational(5, 4)
  
  val r3:Rational = r1 + r2
  
  val r4 = r1 + 1
  val r5 = 1 + r1
}


All works fine and r4 and r5 are Rational values.

What did happen ?

We defined a new method "implicit" inside the package converters, the method with the signature
          Int => Rational
Then before our main class we have imported everything from the package converters.
When the compiler had to evaluate the expression:

 val r4 = r1 + 1

it initially failed because there is no method + for rational which accepts an Int, however there is a method implicitly defined which is able to convert an Int into a Rational, so the compiler will convert the Int into Rational and then the method + is defined for rationals and everything works as expected.

What is more interesting is the next expression:

  val r5 = 1 + r1

In this case the compiler will initially fail because there is no method + defined for Int which accepts a Rational, however there is an implicit method imported that can convert the Int => Rational, once the Int is converted into Rational then all works because Rational has a method + which accepts another Rational as argument.

So the 2 expressions above will be translated by the compiler in the followings:

  val r4 = r1.+(intToRational(1))
  val r5 = intToRational(1).+(r1)


The implicit mechanism is a powerful tool especially when it is used to implement the Typeclass concept, a mechanism used in Haskell programming model which I will exploit into another post.