Pages

Saturday 13 December 2014

A great example of Type Classes in Scala

I was recently working with RxScala and processing twitter stream data together with clustering strategies. To solve my problem I had to use maps representing word vectors embedded in other maps and at the end I had to merge them and I was surprised on how nicely this was solved by the Scalaz library using the Semigroup Type Class.

What I would like to do in this post is to exploit the usage of implicit to implement the Type Class concept (familiar to the haskell developers) into the real world.

 The Type Class is for me the inverse of the Extension mechanism in OOP.

What does it mean ?

Lets exploit it...

First lets start with a little bit of math.
A Semigroup in math is an operation defined over a set of elements where the result of the operation will produce an element in the same set.

A simple example can be the operator + for the Natural numbers. If I add 2 natural numbers the result will still be a natural number.

This property is not truth if we use the operator /, because is not truth that the division of 2 natural numbers produce always a natural number.

In scala we can define this concept as a trait:

trait Semigroup[M] {
  def combine(m1:M, m2:M):M
}

With that definition we can create higher order functions able to reduce lists of elements which belong to the Semigroup Trait, for example:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }
}

object MyApp extends App {
  import Semigroup._
  
  def sum[M:Semigroup](xs:List[M]): M =  xs.reduce(implicitly[Semigroup[M]].combine)

  val myList = 1::2::3::4::Nil

  val result = sum(myList) // > result = 10
}

The method in line 14 is accepting a List of generic type M where M is a member of Semigroup.
In line 18 we pass to the sum method a List of Int. Because we import all the methods from Semigroup object, the implicit val intToSemigroup will transform the Int type into a Semigroup[Int] and we can then use the combine method to reduce them.

In this way we have defined the Type Int as member of the Class Semigroup[Int].
We can then apply the same logic for Strings:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }

  implicit val stringToSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }
}

object MyApp extends App {
  import Semigroup._
  
  def sum[M:Semigroup](xs:List[M]): M =  xs.reduce(implicitly[Semigroup[M]].combine)

  val strList = "h" :: "e" :: "l" :: "l" :: "o" :: Nil

  val result = sum(strList) // result = "result"
}

We can also introduce the & operator to make the syntax more elegant:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
trait Semigroup[M] {
  def combine(m1:M, m2:M): M
}

object Semigroup {
  implicit val intToSemigroup = new Semigroup[Int] {
    def combine(v1:Int, v2:Int) = v1 + v2
  }

  implicit val stringToSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }

  implicit def wrapOp[A:Semigroup](a:A) = SemigroupOperator(a)
}

case class SemigroupOperator[A:Semigroup](a:A) {
  def &(other:A):A = implicitly[Semigroup[A]].combine(a, other)
}

object MyApp extends App {
  import Semigroup._
 
  def sum[M:Semigroup](xs:List[M]): M = xs.reduce{(x1, x2) => x1 & x2 }

  val strList = "h" :: "e" :: "l" :: "l" :: "o" :: Nil

  val result = sum(strList) // result = "result"
}

From line 17 to 19 we have defined the SemigroupOperator case class, this takes as argument a type member of the class Semigroup and defines a new method & which accepts another instance of the same type. What this case class is doing is to transform the combine method of the Semigroup trait in a more familiar object oriented syntax, so that instead to call:

combine(a1, a2)

we can simply use the following syntax:

a1 & a2

In order to achieve our target we need to defined an implicit method which can wrap a type member of Semigroup class with the SemigroupOperator case class.
The effect of these changes can be seen in line 24, where the sum method now is using the & operator because x1 will be wrapped by the SemigroupOperator.

Ok now that we got familiar with Semigroups lets look on a more interesting case.
Think about the problem to merge two Maps together, how can we do that ?
In general if we have two maps and we want to merge them into one, the result should be a map containing all the keys from map1 and map2 with the values merged together. How can we merge the values ?
Well if the Value Type is member of Semigroup we can use the combine method.
This is a very powerful concept because we can now state that:

If the Value Type of a Map is a member of Semigroup, the Map itself is a Semigroup member.

As per the above example a Map[K, Int], whatever value is K, because Int is member of Semigroup then Map[K, Int] is a semigroup itself.

If we apply recursively this concept then the following: Map[K1, Map[K2, V]] where V:Semigroup is again a Semigroup.

Lets write some code:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
trait Semigroup[M] {
  def combine(m1:M, m2:M):M
}

object Semigroup {
  implicit val intSemigroup = new Semigroup[Int] {
    def combine(m1:Int, m2:Int) = m1 + m2 
  }
  
  implicit val strSemigroup = new Semigroup[String] {
    def combine(s1:String, s2:String) = s1 concat s2
  }
  
  /**
   * A Map is a Semigroup if the Value type of the map is a Semigroup itself.
   */
  implicit def mapSemigroup[K, A:Semigroup] = new Semigroup[Map[K, A]] {
    def combine(map1:Map[K, A], map2:Map[K, A]) = map1.foldRight(Map[K, A]()) { case ((k, a1), acc) =>
      val sg = implicitly[Semigroup[A]]
      map2.get(k) match {
        case Some(a2) => acc + (k -> sg.combine(a1, a2))
        case None => acc + (k -> a1)
      }
    } ++ (map2 -- map1.keys)
  }
  
  implicit def wrapOp[A:Semigroup](a:A) = SemigroupOperator(a) 
}

case class SemigroupOperator[A:Semigroup](a:A) {
  def &(other:A):A = implicitly[Semigroup[A]].combine(a, other)
}


object MergeApp extends App {
  
  import Semigroup._
  
  
  def sum[A:Semigroup](xs:List[A]): A = xs.reduce { (a1, a2) =>
    a1 & a2
  }
  
  val map1:Map[String, Int] = Map("apple" -> 1, "banana" -> 2)
  val map2:Map[String, Int] = Map("orange" -> 3, "apple" -> 2)
  
  val r = map1 & map2
  
  r.foreach{ case (k, v) => println(s"$k -> $v") }
  
}

Line 17 defines an implicit method that states: if the value of a Map is a Semigroup the Map itself is a semigroup. This is a powerful definition because is recursive.

Map[String, Int] => because Int is a Semigroup then Map[String, Int] is a Semigroup and I can use the & operator between those maps.

NOTE: But also Map[String, Map[String, Int]] is a Semigroup. Lets analyse:

lets look at the inner map: Map[String, Int]. Because Int is a Semigroup then Map[String, Int] is also a Semigroup

Now the outer map is made by a key of type String and a value type Map[String, Int] which is also a Semigroup so Map[String, Map[String, Int]] is also a Semigroup and can use the & operator.

3 comments:

  1. Valeu Patrick vc é nota 1000. Parabens garoto.

    ReplyDelete
  2. Patrick, you missed the "associativity" property of the semigroup.

    ReplyDelete
    Replies
    1. Hi IO, very good point :-), I will make sure to add it in the post! Thanks.

      Delete