31 \emph{any} part of the coursework in \emph{any} programming |
31 \emph{any} part of the coursework in \emph{any} programming |
32 language you like. I use Scala for showing you code during the |
32 language you like. I use Scala for showing you code during the |
33 lectures because its functional programming-style allows me to |
33 lectures because its functional programming-style allows me to |
34 implement the functions we will discuss with very small |
34 implement the functions we will discuss with very small |
35 code-snippets. If I had to do this in Java, for example, I |
35 code-snippets. If I had to do this in Java, for example, I |
36 would first have to run through heaps of boilerplate code and |
36 would first have to go through heaps of boilerplate code and |
37 the code-snippets would not look pretty. Since the Scala |
37 the code-snippets would not look pretty. Since the Scala |
38 compiler is free, you can download the code-snippets and run |
38 compiler is free, you can download the code-snippets and run |
39 every example I give. But if you prefer, you can also easily |
39 every example I give. But if you prefer, you can also easily |
40 translate them into any other functional language, for example |
40 translate them into any other functional language, for example |
41 Haskell, Standard ML, F$^\#$, Ocaml and so on. |
41 Haskell, Standard ML, F$^\#$, Ocaml and so on. |
85 string. Once you are more familiar with the functional |
85 string. Once you are more familiar with the functional |
86 programming-style, you will know what the difference is |
86 programming-style, you will know what the difference is |
87 between a function that returns a result, like addition, and a |
87 between a function that returns a result, like addition, and a |
88 function that causes a side-effect, like \code{print}. We |
88 function that causes a side-effect, like \code{print}. We |
89 shall come back to this point later, but if you are curious |
89 shall come back to this point later, but if you are curious |
90 now, the latter kind of functions always have as return type |
90 now, the latter kind of functions always has as return type |
91 \code{Unit}. |
91 \code{Unit}. |
92 |
92 |
93 If you want to write a stand-alone app in Scala, you can |
93 If you want to write a stand-alone app in Scala, you can |
94 implement an object that is an instance of \code{App}, say |
94 implement an object that is an instance of \code{App}, say |
95 |
95 |
118 $ scalac hello-world.scala |
118 $ scalac hello-world.scala |
119 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
119 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
120 hello world |
120 hello world |
121 \end{lstlisting} |
121 \end{lstlisting} |
122 |
122 |
|
123 \noindent You might need to adapt the path to where you have |
|
124 installed Scala. |
123 |
125 |
124 \subsection*{Inductive Datatypes} |
126 \subsection*{Inductive Datatypes} |
125 |
127 |
126 The elegance and conciseness of Scala programs are often a |
128 The elegance and conciseness of Scala programs are often a |
127 result of inductive datatypes that can be easily defined. For |
129 result of inductive datatypes that can be easily defined in |
128 example in ``every-day mathematics'' we define regular |
130 Scala. For example in ``every-day mathematics'' we define |
129 expressions simply by giving the grammar |
131 regular expressions simply by giving the grammar |
130 |
132 |
131 \begin{center} |
133 \begin{center} |
132 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} |
134 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} |
133 $r$ & $::=$ & $\varnothing$ & null\\ |
135 $r$ & $::=$ & $\varnothing$ & null\\ |
134 & $\mid$ & $\epsilon$ & empty string\\ |
136 & $\mid$ & $\epsilon$ & empty string\\ |
176 grammar above, I hope you can see the underlying pattern. If |
178 grammar above, I hope you can see the underlying pattern. If |
177 you want to play further with such definitions of inductive |
179 you want to play further with such definitions of inductive |
178 datatypes, feel free to define for example binary trees. |
180 datatypes, feel free to define for example binary trees. |
179 |
181 |
180 Once you make a definition like the one above in Scala, you |
182 Once you make a definition like the one above in Scala, you |
181 can represent, for example, the regular expression for $a + b$ |
183 can represent the regular expression for $a + b$, for example, |
182 as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
184 as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
183 \code{'a'} stand for ASCII characters, though in the output |
185 \code{'a'} stand for ASCII characters, though in the output |
184 syntax, as you can see below, the quotes are omitted. In a |
186 syntax, as you can see below, the quotes are omitted. In a |
185 later section we will see how we can support the more |
187 later section we will see how we can support the more |
186 mathematical infix notation for regular expression operators |
188 mathematical infix notation for regular expression operators |
351 \code{collatz}-function from above as follows: |
352 \code{collatz}-function from above as follows: |
352 |
353 |
353 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
354 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
354 |
355 |
355 |
356 |
356 \noindent Although in this case the pattern-match does not |
357 \noindent Although in this particular case the pattern-match |
357 improve the code in any way. In cases like \code{rev} it |
358 does not improve the code in any way. In cases like \code{rev} |
358 is really crucial. The underscore in the last case indicates |
359 it is really crucial. The underscore in Line 4 indicates that |
359 that we do not care what the pattern looks like. Thus Line 4 |
360 we do not care what the patter8n looks like. Thus this case |
360 acts like a default case whenever the cases above did not |
361 acts like a default case whenever the cases above did not |
361 match. Cases are always tried out from top to bottom. |
362 match. Cases are always tried out from top to bottom. |
362 |
363 |
363 \subsection*{Loops, or the Absence of} |
364 \subsection*{Loops, or the Absence of} |
364 |
365 |
365 Coming from Java or C, you might be surprised that Scala does |
366 Coming from Java or C, you might be surprised that Scala does |
366 not really have loops. It has instead, what is in functional |
367 not really have loops. It has instead, what is in functional |
367 programming called \emph{maps}. To illustrate how they work, |
368 programming called \emph{maps}. To illustrate how they work, |
368 lets assume you have a list of numbers from 1 to 10 and want to |
369 lets assume you have a list of numbers from 1 to 8 and want to |
369 build the list of squares. The list of numbers from 1 to 10 |
370 build the list of squares. The list of numbers from 1 to 8 |
370 can be constructed in Scala as follows: |
371 can be constructed in Scala as follows: |
371 |
372 |
372 \begin{lstlisting}[language=Scala,numbers=none] |
373 \begin{lstlisting}[language=Scala,numbers=none] |
373 scala> (1 to 10).toList |
374 scala> (1 to 8).toList |
374 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) |
375 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8) |
375 \end{lstlisting} |
376 \end{lstlisting} |
376 |
377 |
377 \noindent Generating from this list the list of squares in a |
378 \noindent Generating from this list the list of squares in a |
378 programming language such as Java, you would assume the list |
379 programming language such as Java, you would assume the list |
379 is given as a kind of array. You would then iterate, or loop, |
380 is given as a kind of array. You would then iterate, or loop, |
383 |
384 |
384 A map essentially takes a function that describes how each |
385 A map essentially takes a function that describes how each |
385 element is transformed (for example squared) and a list over |
386 element is transformed (for example squared) and a list over |
386 which this function should work. There are two forms to |
387 which this function should work. There are two forms to |
387 express such maps in Scala. The first way is called a |
388 express such maps in Scala. The first way is called a |
388 for-comprehension. Squaring the numbers from 1 to 10 would |
389 \emph{for-comprehension}. Squaring the numbers from 1 to 8 |
389 look in this form as follows: |
390 would look in this form as follows: |
390 |
391 |
391 \begin{lstlisting}[language=Scala,numbers=none] |
392 \begin{lstlisting}[language=Scala,numbers=none] |
392 scala> for (n <- (1 to 10).toList) yield n * n |
393 scala> for (n <- (1 to 8).toList) yield n * n |
393 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
394 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) |
394 \end{lstlisting} |
395 \end{lstlisting} |
395 |
396 |
396 \noindent The important keywords are \code{for} and |
397 \noindent The important keywords are \code{for} and |
397 \code{yield}. This for-comprehension roughly states that from |
398 \code{yield}. This for-comprehension roughly states that from |
398 the list of numbers we draw \code{n}s and compute the result |
399 the list of numbers we draw \code{n}s and compute the result |
399 of \code{n * n}. As you can see, we specified the list where |
400 of \code{n * n}. As you can see, we specified the list where |
400 each \code{n} comes from, namely \code{(1 to 10).toList}, and |
401 each \code{n} comes from, namely \code{(1 to 8).toList}, and |
401 how each element needs to be transformed. This can also be |
402 how each element needs to be transformed. This can also be |
402 expressed in a second way in Scala by using directly |
403 expressed in a second way in Scala by using directly |
403 \code{map}s as follows: |
404 \code{map}s as follows: |
404 |
405 |
405 \begin{lstlisting}[language=Scala,numbers=none] |
406 \begin{lstlisting}[language=Scala,numbers=none] |
406 scala> (1 to 10).toList.map(n => n * n) |
407 scala> (1 to 8).toList.map(n => n * n) |
407 res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
408 res3 = List(1, 4, 9, 16, 25, 36, 49, 64) |
408 \end{lstlisting} |
409 \end{lstlisting} |
409 |
410 |
410 \noindent In this way, the expression \code{n => n * n} stands |
411 \noindent In this way, the expression \code{n => n * n} stands |
411 for the function that calculates the square (this is how the |
412 for the function that calculates the square (this is how the |
412 \code{n}s are transformed). This expression for functions |
413 \code{n}s are transformed). This expression for functions |
422 collection, such as lists, sets, vectors, options and so on. |
423 collection, such as lists, sets, vectors, options and so on. |
423 For example if we instead compute the reminders modulo 3 of |
424 For example if we instead compute the reminders modulo 3 of |
424 this list, we can write |
425 this list, we can write |
425 |
426 |
426 \begin{lstlisting}[language=Scala,numbers=none] |
427 \begin{lstlisting}[language=Scala,numbers=none] |
427 scala> (1 to 10).toList.map(n => n % 3) |
428 scala> (1 to 8).toList.map(n => n % 3) |
428 res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) |
429 res4 = List(1, 2, 0, 1, 2, 0, 1, 2) |
429 \end{lstlisting} |
430 \end{lstlisting} |
430 |
431 |
431 \noindent If we, however, transform the numbers 1 to 10 not |
432 \noindent If we, however, transform the numbers 1 to 8 not |
432 into a list, but into a set, and then compute the reminders |
433 into a list, but into a set, and then compute the reminders |
433 modulo 3 we obtain |
434 modulo 3 we obtain |
434 |
435 |
435 \begin{lstlisting}[language=Scala,numbers=none] |
436 \begin{lstlisting}[language=Scala,numbers=none] |
436 scala> (1 to 10).toSet[Int].map(n => n % 3) |
437 scala> (1 to 8).toSet[Int].map(n => n % 3) |
437 res5 = Set(2, 1, 0) |
438 res5 = Set(2, 1, 0) |
438 \end{lstlisting} |
439 \end{lstlisting} |
439 |
440 |
440 \noindent This is the correct result for sets, as there are |
441 \noindent This is the correct result for sets, as there are |
441 only three equivalence classes of integers modulo 3. Note that |
442 only three equivalence classes of integers modulo 3. Note that |
470 m <- (1 to 3).toList; |
471 m <- (1 to 3).toList; |
471 if (n + m) % 2 == 0) yield (n, m) |
472 if (n + m) % 2 == 0) yield (n, m) |
472 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) |
473 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) |
473 \end{lstlisting} |
474 \end{lstlisting} |
474 |
475 |
475 \noindent |
476 \noindent The \code{if}-condition in the for-comprehension |
476 The \code{if}-condition filters out all pairs where the |
477 filters out all pairs where the sum is not even. |
477 sum is not even. |
|
478 |
478 |
479 While hopefully this all looks reasonable, there is one |
479 While hopefully this all looks reasonable, there is one |
480 complication: In the examples above we always wanted to |
480 complication: In the examples above we always wanted to |
481 transform one list into another list (e.g.~list of squares), |
481 transform one list into another list (e.g.~list of squares), |
482 or one set into another set (set of numbers into set of |
482 or one set into another set (set of numbers into set of |
565 important role. Scala is such a language. You have already |
551 important role. Scala is such a language. You have already |
566 seen built-in types, like \code{Int}, \code{Boolean}, |
552 seen built-in types, like \code{Int}, \code{Boolean}, |
567 \code{String} and \code{BigInt}, but also user-defined ones, |
553 \code{String} and \code{BigInt}, but also user-defined ones, |
568 like \code{Rexp}. Unfortunately, types can be a thorny |
554 like \code{Rexp}. Unfortunately, types can be a thorny |
569 subject, especially in Scala. For example, why do we need to |
555 subject, especially in Scala. For example, why do we need to |
570 give the type to \code{toSet[Int]} but not to \code{toList}? |
556 give the type to \code{toSet[Int]}, but not to \code{toList}? |
571 The reason is the power of Scala, which sometimes means it |
557 The reason is the power of Scala, which sometimes means it |
572 cannot infer all necessary typing information. At the |
558 cannot infer all necessary typing information. At the |
573 beginning while getting familiar with Scala, I recommend a |
559 beginning while getting familiar with Scala, I recommend a |
574 ``play-it-by-ear-approach'' to types. Fully understanding |
560 ``play-it-by-ear-approach'' to types. Fully understanding |
575 type-systems, especially complicated ones like in Scala, can |
561 type-systems, especially complicated ones like in Scala, can |
589 example \code{List[Int]} or \code{Set[List[String]]} or |
575 example \code{List[Int]} or \code{Set[List[String]]} or |
590 \code{Map[Int, Int]}. |
576 \code{Map[Int, Int]}. |
591 |
577 |
592 There are a few special type-constructors that fall outside |
578 There are a few special type-constructors that fall outside |
593 this pattern. One is for tuples, where the type is written |
579 this pattern. One is for tuples, where the type is written |
594 with parentheses. For example \code{(Int, Int, String)} for a |
580 with parentheses. For example |
595 triple consisting of two integers and a string. Tuples are |
581 |
596 helpful if you want to define functions with multiple |
582 \begin{lstlisting}[language=Scala, numbers=none] |
597 results, say the function returning the quotient and reminder |
583 (Int, Int, String) |
598 of two numbers. For this you might define: |
584 \end{lstlisting} |
|
585 |
|
586 \noindent is for a triple (a tuple with three components---two |
|
587 integers and a string). Tuples are helpful if you want to |
|
588 define functions with multiple results, say the function |
|
589 returning the quotient and reminder of two numbers. For this |
|
590 you might define: |
599 |
591 |
600 |
592 |
601 \begin{lstlisting}[language=Scala, numbers=none] |
593 \begin{lstlisting}[language=Scala, numbers=none] |
602 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n) |
594 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n) |
603 \end{lstlisting} |
595 \end{lstlisting} |
604 |
596 |
605 |
597 |
606 \noindent |
598 \noindent Since this function returns a pair of integers, its |
607 Since this function returns a pair of integers, its |
599 return type needs to be of type \code{(Int, Int)}. |
608 return type needs to be \code{(Int, Int)}. Incidentally, |
600 Incidentally, this is also the input type of this function. |
609 this is also the input type of this function. Notice it takes |
601 Notice this function takes \emph{two} arguments, namely |
610 \emph{two} arguments, namely \code{m} and \code{n}, both |
602 \code{m} and \code{n}, both of which are integers. They are |
611 of which are integers. They are ``packaged'' in a pair. |
603 ``packaged'' in a pair. Consequently the complete type of |
612 Consequently the complete type of \code{quo_rem} is |
604 \code{quo_rem} is |
613 |
605 |
614 \begin{lstlisting}[language=Scala, numbers=none] |
606 \begin{lstlisting}[language=Scala, numbers=none] |
615 (Int, Int) => (Int, Int) |
607 (Int, Int) => (Int, Int) |
616 \end{lstlisting} |
608 \end{lstlisting} |
617 |
609 |
618 Another special type-constructor is for functions, written |
610 Another special type-constructor is for functions, written |
619 as the arrow \code{=>}. For example, the type |
611 as the arrow \code{=>}. For example, the type |
620 \code{Int => String} is for a function that takes an integer as argument |
612 \code{Int => String} is for a function that takes an integer as argument |
621 and produces a string. A function of this type is for instance |
613 and produces a string. A function of this type is for instance |
622 |
|
623 |
614 |
624 \begin{lstlisting}[language=Scala,numbers=none] |
615 \begin{lstlisting}[language=Scala,numbers=none] |
625 def mk_string(n: Int) : String = n match { |
616 def mk_string(n: Int) : String = n match { |
626 case 0 => "zero" |
617 case 0 => "zero" |
627 case 1 => "one" |
618 case 1 => "one" |
628 case 2 => "two" |
619 case 2 => "two" |
629 case _ => "many" |
620 case _ => "many" |
630 } |
621 } |
631 \end{lstlisting} |
622 \end{lstlisting} |
632 |
623 |
633 |
|
634 \noindent It takes an integer as argument and returns a |
624 \noindent It takes an integer as argument and returns a |
635 string. Unlike other functional programming languages, there |
625 string. Unlike other functional programming languages, there |
636 is in Scala no easy way to find out the types of existing |
626 is in Scala no easy way to find out the types of existing |
637 functions, except by looking into the documentation |
627 functions, except by looking into the documentation |
638 |
628 |
646 and the result of the function is a boolean. Though silly, a |
636 and the result of the function is a boolean. Though silly, a |
647 function of this type would be |
637 function of this type would be |
648 |
638 |
649 |
639 |
650 \begin{lstlisting}[language=Scala,numbers=none] |
640 \begin{lstlisting}[language=Scala,numbers=none] |
651 def chk_string(n: Int, s: String) : Boolean = |
641 def chk_string(n: Int)(s: String) : Boolean = |
652 mk_string(n) == s |
642 mk_string(n) == s |
653 \end{lstlisting} |
643 \end{lstlisting} |
654 |
644 |
655 |
645 |
656 \noindent which checks whether the integer \code{n} corresponds |
646 \noindent which checks whether the integer \code{n} |
657 to the name \code{s} given by the function \code{mk\_string}. |
647 corresponds to the name \code{s} given by the function |
|
648 \code{mk\_string}. Notice the unusual way of specifying the |
|
649 arguments of this function: the arguments are given one after |
|
650 the other, instead of being in a pair (what would be the type |
|
651 of this function then?). This way of specifying the arguments |
|
652 can be useful, for example in situations like this |
|
653 |
|
654 \begin{lstlisting}[language=Scala,numbers=none] |
|
655 scala> List("one", "two", "three", "many").map(chk_string(2)) |
|
656 res4 = List(false, true, false, false) |
|
657 |
|
658 scala> List("one", "two", "three", "many").map(chk_string(3)) |
|
659 res5 = List(false, false, false, true) |
|
660 \end{lstlisting} |
|
661 |
|
662 \noindent In each case we can give to \code{map} a specialised |
|
663 version of \code{chk_string}---once specialised to 2 and once |
|
664 to 3. This kind of ``specialising'' a function is called |
|
665 \emph{partial application}---we have not yet given to this |
|
666 function all arguments it needs, but only one of them. |
658 |
667 |
659 Coming back to the type \code{Int => String => Boolean}. The |
668 Coming back to the type \code{Int => String => Boolean}. The |
660 rule about such function types is that the right-most type |
669 rule about such function types is that the right-most type |
661 specifies what the function returns (a boolean in this case). |
670 specifies what the function returns (a boolean in this case). |
662 The types before that specify how many arguments the function |
671 The types before that specify how many arguments the function |
663 expects and what is their type (in this case two arguments, |
672 expects and what their type is (in this case two arguments, |
664 one of type \code{Int} and another of type \code{String}). |
673 one of type \code{Int} and another of type \code{String}). |
665 Given this rule, what kind of function has type |
674 Given this rule, what kind of function has type |
666 \mbox{\code{(Int => String) => Boolean}}? Well, it returns a |
675 \mbox{\code{(Int => String) => Boolean}}? Well, it returns a |
667 boolean. More interestingly, though, it only takes a single |
676 boolean. More interestingly, though, it only takes a single |
668 argument (because of the parentheses). The single argument |
677 argument (because of the parentheses). The single argument |
669 happens to be another function (taking an integer as input and |
678 happens to be another function (taking an integer as input and |
670 returning a string). |
679 returning a string). Remember that \code{mk_string} is just |
671 |
680 such a function. So how can we use it? For this define |
672 Now you might ask, what is the point of having function as |
681 the somewhat silly function \code{apply_3} |
673 arguments to other functions? In Java there is no need of this |
682 |
674 kind of feature. But in all functional programming languages, |
683 \begin{lstlisting}[language=Scala,numbers=none] |
675 including Scala, it is really essential. Above you already |
684 def apply_3(f: Int => String): Bool = f(3) == "many" |
676 seen \code{map} and \code{foreach} which need this. Consider |
685 |
677 the functions \code{print} and \code{println}, which both |
686 scala> apply_3(mk_string) |
678 print out strings, but the latter adds a line break. You can |
687 res6 = true |
679 call \code{foreach} with either of them and thus changing how, |
688 \end{lstlisting} |
680 for example, five numbers are printed. |
689 |
|
690 You might ask, apart from silly functions like above, what is |
|
691 the point of having function as arguments to other functions? |
|
692 In Java there is indeed no need of this kind of feature. But |
|
693 in all functional programming languages, including Scala, it |
|
694 is really essential. Above you already seen \code{map} and |
|
695 \code{foreach} which need this. Consider the functions |
|
696 \code{print} and \code{println}, which both print out strings, |
|
697 but the latter adds a line break. You can call \code{foreach} |
|
698 with either of them and thus changing how, for example, five |
|
699 numbers are printed. |
681 |
700 |
682 |
701 |
683 \begin{lstlisting}[language=Scala,numbers=none] |
702 \begin{lstlisting}[language=Scala,numbers=none] |
684 scala> (1 to 5).toList.foreach(print) |
703 scala> (1 to 5).toList.foreach(print) |
685 12345 |
704 12345 |
750 My second wow-moment I had with a feature of Scala that other |
769 My second wow-moment I had with a feature of Scala that other |
751 functional programming languages also do not have. This |
770 functional programming languages also do not have. This |
752 feature is about implicit type conversions. If you have |
771 feature is about implicit type conversions. If you have |
753 regular expressions and want to use them for language |
772 regular expressions and want to use them for language |
754 processing you often want to recognise keywords in a language, |
773 processing you often want to recognise keywords in a language, |
755 for example \code{for}, \code{if}, \code{yield} and so on. But |
774 for example \code{for},{} \code{if},{} \code{yield} and so on. But |
756 the basic regular expression, \code{CHAR}, can only recognise |
775 the basic regular expression, \code{CHAR}, can only recognise |
757 a single character. In order to recognise a whole string, like |
776 a single character. In order to recognise a whole string, like |
758 \code{ for}, you have to put many of those together using |
777 \code{ for}, you have to put many of those together using |
759 \code{SEQ}: |
778 \code{SEQ}: |
760 |
779 |