7 \begin{document} |
7 \begin{document} |
8 |
8 |
9 \section*{A Crash-Course on Scala} |
9 \section*{A Crash-Course on Scala} |
10 |
10 |
11 Scala is a programming language that combines functional and |
11 Scala is a programming language that combines functional and |
12 object-oriented programming-styles, and has received in the |
12 object-oriented programming-styles. It has received in the |
13 last five years or so quite a bit of attention. One reason for |
13 last five years or so quite a bit of attention. One reason for |
14 this attention is that, like the Java programming language, |
14 this attention is that, like the Java programming language, |
15 Scala compiles to the Java Virtual Machine (JVM) and therefore |
15 Scala compiles to the Java Virtual Machine (JVM) and therefore |
16 Scala programs can run under MacOSX, Linux and |
16 Scala programs can run under MacOSX, Linux and |
17 Windows.\footnote{There are also experimental backends for |
17 Windows.\footnote{There are also experimental backends for |
31 Why do I use Scala in the AFL module? Actually, you can do |
31 Why do I use Scala in the AFL module? Actually, you can do |
32 \emph{any} part of the coursework in \emph{any} programming |
32 \emph{any} part of the coursework in \emph{any} programming |
33 language you like. I use Scala for showing you code during the |
33 language you like. I use Scala for showing you code during the |
34 lectures because its functional programming-style allows me to |
34 lectures because its functional programming-style allows me to |
35 implement the functions we will discuss with very small |
35 implement the functions we will discuss with very small |
36 code-snippets. If I had to do this in Java, for example, I |
36 code-snippets. If I had to do this in Java, I would first have |
37 would first have to go through heaps of boilerplate code and |
37 to go through heaps of boilerplate code and the code-snippets |
38 the code-snippets would not look pretty. Since the Scala |
38 would not look pretty. Since the Scala compiler is free, you |
39 compiler is free, you can download the code-snippets and run |
39 can download the code-snippets and run every example I give. |
40 every example I give. But if you prefer, you can also easily |
40 But if you prefer, you can also easily translate them into any |
41 translate them into any other functional language, for example |
41 other functional language, for example Haskell, Standard ML, |
42 Haskell, Standard ML, F$^\#$, Ocaml and so on. |
42 F$^\#$, Ocaml and so on. |
43 |
43 |
44 Developing programs in Scala can be done with the Eclipse IDE |
44 Developing programs in Scala can be done with the Eclipse IDE |
45 and also with IntelliJ IDE, but for the small programs I will |
45 and also with IntelliJ IDE, but for the small programs I will |
46 develop the good old Emacs-editor is adequate for me and I |
46 develop the good old Emacs-editor is adequate for me and I |
47 will run the programs on the command line. One advantage of |
47 will run the programs on the command line. One advantage of |
63 |
63 |
64 \noindent The precise response may vary due to the platform |
64 \noindent The precise response may vary due to the platform |
65 where you installed Scala. At the Scala prompt you can type |
65 where you installed Scala. At the Scala prompt you can type |
66 things like \code{2 + 3} \keys{Ret} and the output will be |
66 things like \code{2 + 3} \keys{Ret} and the output will be |
67 |
67 |
68 \begin{lstlisting}[language=Scala,numbers=none] |
68 \begin{lstlisting}[numbers=none] |
69 scala> 2 + 3 |
69 scala> 2 + 3 |
70 res0: Int = 5 |
70 res0: Int = 5 |
71 \end{lstlisting} |
71 \end{lstlisting} |
72 |
72 |
73 \noindent indicating that the result of the addition is of |
73 \noindent indicating that the result of the addition is of |
74 type \code{Int} and the actual result is 5. Another classic |
74 type \code{Int} and the actual result is 5. Another classic |
75 example you can try out is |
75 example you can try out is |
76 |
76 |
77 \begin{lstlisting}[language=Scala,numbers=none] |
77 \begin{lstlisting}[numbers=none] |
78 scala> print("hello world") |
78 scala> print("hello world") |
79 hello world |
79 hello world |
80 \end{lstlisting} |
80 \end{lstlisting} |
81 |
81 |
82 \noindent Note that in this case there is no result. The |
82 \noindent Note that in this case there is no result. The |
158 clause in the grammar. The cases for $\varnothing$ and |
158 clause in the grammar. The cases for $\varnothing$ and |
159 $\epsilon$ do not have any arguments, while in all the other |
159 $\epsilon$ do not have any arguments, while in all the other |
160 cases we do have arguments. For example the character regular |
160 cases we do have arguments. For example the character regular |
161 expression needs to take as an argument the character it is |
161 expression needs to take as an argument the character it is |
162 supposed to recognise. In Scala, the cases without arguments |
162 supposed to recognise. In Scala, the cases without arguments |
163 are called \emph{case objects}, while the ones with arguments |
163 are called \emph{case objects}, whereas the ones with |
164 are \emph{case classes}. The corresponding Scala code is as |
164 arguments are \emph{case classes}. The corresponding Scala |
165 follows: |
165 code is as follows: |
166 |
166 |
167 \begin{lstlisting}[language=Scala,numbers=none] |
167 \begin{lstlisting}[numbers=none] |
168 abstract class Rexp |
168 abstract class Rexp |
169 case object NULL extends Rexp |
169 case object NULL extends Rexp |
170 case object EMPTY extends Rexp |
170 case object EMPTY extends Rexp |
171 case class CHAR (c: Char) extends Rexp |
171 case class CHAR (c: Char) extends Rexp |
172 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
172 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
188 later section we will see how we can support the more |
188 later section we will see how we can support the more |
189 mathematical infix notation for regular expression operators |
189 mathematical infix notation for regular expression operators |
190 in Scala. If you want to assign this regular expression to a |
190 in Scala. If you want to assign this regular expression to a |
191 variable, you can use the keyword \code{val} and type |
191 variable, you can use the keyword \code{val} and type |
192 |
192 |
193 \begin{lstlisting}[language=Scala,numbers=none] |
193 \begin{lstlisting}[numbers=none] |
194 scala> val r = ALT(CHAR('a'), CHAR('b')) |
194 scala> val r = ALT(CHAR('a'), CHAR('b')) |
195 r: ALT = ALT(CHAR(a),CHAR(b)) |
195 r: ALT = ALT(CHAR(a),CHAR(b)) |
196 \end{lstlisting} |
196 \end{lstlisting} |
197 |
197 |
198 \noindent As you can see, in order to make such assignments, |
198 \noindent As you can see, in order to make such assignments, |
211 abstract class. But in this case there is no need to give |
211 abstract class. But in this case there is no need to give |
212 \code{r} the more general type of \code{Rexp}. This is |
212 \code{r} the more general type of \code{Rexp}. This is |
213 different if you want to form a list of regular expressions, |
213 different if you want to form a list of regular expressions, |
214 for example |
214 for example |
215 |
215 |
216 \begin{lstlisting}[language=Scala,numbers=none] |
216 \begin{lstlisting}[numbers=none] |
217 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
217 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
218 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
218 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
219 \end{lstlisting} |
219 \end{lstlisting} |
220 |
220 |
221 \noindent In this case, Scala needs to assign a common type to |
221 \noindent In this case, Scala needs to assign a common type to |
241 easily in Scala. To show you this, lets first consider a |
241 easily in Scala. To show you this, lets first consider a |
242 problem from number theory, called the \emph{Collatz-series}, |
242 problem from number theory, called the \emph{Collatz-series}, |
243 which corresponds to a famous unsolved problem in |
243 which corresponds to a famous unsolved problem in |
244 mathematics.\footnote{See for example |
244 mathematics.\footnote{See for example |
245 \url{http://mathworld.wolfram.com/CollatzProblem.html}.} |
245 \url{http://mathworld.wolfram.com/CollatzProblem.html}.} |
246 Mathematician define this series as: |
246 Mathematicians define this series as: |
247 |
247 |
248 \[ |
248 \[ |
249 collatz_{n + 1} \dn |
249 collatz_{n + 1} \dn |
250 \begin{cases} |
250 \begin{cases} |
251 \frac{1}{2} * collatz_n & \text{if $collatz_n$ is even}\\ |
251 \frac{1}{2} * collatz_n & \text{if $collatz_n$ is even}\\ |
252 3 * collatz_n + 1 & \text{if $collatz_n$ is odd} |
252 3 * collatz_n + 1 & \text{if $collatz_n$ is odd} |
253 \end{cases} |
253 \end{cases} |
254 \] |
254 \] |
255 |
255 |
256 \noindent The famous unsolved question is whether this |
256 \noindent The famous unsolved question is whether this |
257 series started with any $n > 0$ as $collaz_0$ will always |
257 series started with any $n > 0$ as $collatz_0$ will always |
258 return to $1$. This is obvious when started with $1$, and also |
258 return to $1$. This is obvious when started with $1$, and also |
259 with $2$, but already needs a bit of head-scratching for the |
259 with $2$, but already needs a bit of head-scratching for the |
260 case of $3$. |
260 case of $3$. |
261 |
261 |
262 If we want to avoid the head-scratching, we could implement |
262 If we want to avoid the head-scratching, we could implement |
263 this as the following function in Scala: |
263 this as the following function in Scala: |
264 |
264 |
265 |
265 \lstinputlisting[numbers=none]{../progs/collatz.scala} |
266 \lstinputlisting[language=Scala,numbers=none]{../progs/collatz.scala} |
|
267 |
|
268 |
266 |
269 \noindent The keyword for function definitions is \code{def} |
267 \noindent The keyword for function definitions is \code{def} |
270 followed by the name of the function. After that you have a |
268 followed by the name of the function. After that you have a |
271 list of arguments (enclosed in parentheses and separated by |
269 list of arguments (enclosed in parentheses and separated by |
272 commas). Each argument in this list needs its type annotated. |
270 commas). Each argument in this list needs its type to be |
273 In this case we only have one argument, which is of type |
271 annotated. In this case we only have one argument, which is of |
274 \code{BigInt}. This type stands in Scala for arbitrary precision |
272 type \code{BigInt}. This type stands in Scala for arbitrary |
275 integers (in case you want to try out the function on really |
273 precision integers (in case you want to try out the function |
276 big numbers). After the arguments comes the type of what the |
274 on really big numbers). After the arguments comes the type of |
277 function returns---a Boolean in this case for indicating that |
275 what the function returns---a Boolean in this case for |
278 the function has reached 1. Finally, after the \code{=} |
276 indicating that the function has reached 1. Finally, after the |
279 comes the \emph{body} of the function implementing what the |
277 \code{=} comes the \emph{body} of the function implementing |
280 function is supposed to do. What the \code{collatz} function |
278 what the function is supposed to do. What the \code{collatz} |
281 does should be pretty self-explanatory: the function first |
279 function does should be pretty self-explanatory: the function |
282 tests whether \code{n} is equal to 1 in which case it returns |
280 first tests whether \code{n} is equal to 1 in which case it |
283 \code{true} and so on. |
281 returns \code{true} and so on. |
284 |
282 |
285 Notice a quirk in Scala's syntax for \code{if}s: The condition |
283 Notice a quirk in Scala's syntax for \code{if}s: The condition |
286 needs to be enclosed in parentheses and the then-case comes |
284 needs to be enclosed in parentheses and the then-case comes |
287 right after the condition---there is no \code{then} keyword in |
285 right after the condition---there is no \code{then} keyword in |
288 Scala. |
286 Scala. |
291 define functions by \emph{pattern matching}. In the |
289 define functions by \emph{pattern matching}. In the |
292 \code{collatz} function above we need to test each case using a |
290 \code{collatz} function above we need to test each case using a |
293 sequence of \code{if}s. This can be very cumbersome and brittle |
291 sequence of \code{if}s. This can be very cumbersome and brittle |
294 if there are many cases. If we wanted to define a function |
292 if there are many cases. If we wanted to define a function |
295 over regular expressions in Java, for example, which does not |
293 over regular expressions in Java, for example, which does not |
296 have pattern-matching, the resulting code would be just |
294 have pattern-matching, the resulting code would just be |
297 awkward. |
295 awkward. |
298 |
296 |
299 Mathematicians already use the power of pattern-matching when |
297 Mathematicians already use the power of pattern-matching when |
300 they define the function that takes a regular expression and |
298 they define the function that takes a regular expression and |
301 produces another regular expression that can recognise the |
299 produces another regular expression that can recognise the |
302 reversed strings. The resulting recursive function is often |
300 reversed strings. They define this function as follows: |
303 defined as follows: |
|
304 |
301 |
305 \begin{center} |
302 \begin{center} |
306 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
303 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
307 $rev(\varnothing)$ & $\dn$ & $\varnothing$\\ |
304 $rev(\varnothing)$ & $\dn$ & $\varnothing$\\ |
308 $rev(\epsilon)$ & $\dn$ & $\epsilon$\\ |
305 $rev(\epsilon)$ & $\dn$ & $\epsilon$\\ |
311 $rev(r_1 \cdot r_2)$ & $\dn$ & $rev(r_2) \cdot rev(r_1)$\\ |
308 $rev(r_1 \cdot r_2)$ & $\dn$ & $rev(r_2) \cdot rev(r_1)$\\ |
312 $rev(r^*)$ & $\dn$ & $rev(r)^*$\\ |
309 $rev(r^*)$ & $\dn$ & $rev(r)^*$\\ |
313 \end{tabular} |
310 \end{tabular} |
314 \end{center} |
311 \end{center} |
315 |
312 |
316 \noindent This function is defined by recursion analysing each |
313 \noindent It is defined by recursion analysing each pattern of |
317 pattern of what the regular expression could look like. The |
314 what the regular expression might look like. The corresponding |
318 corresponding Scala code looks very similar to this |
315 Scala code looks very similar to this definition, thanks to |
319 definition, thanks to pattern-matching. |
316 pattern-matching. |
320 |
317 |
321 \lstinputlisting[language=Scala]{../progs/rev.scala} |
318 \lstinputlisting[language=Scala]{../progs/rev.scala} |
322 |
319 |
323 \noindent The keyword for starting a pattern-match is |
320 \noindent The keyword for starting a pattern-match is |
324 \code{match} followed by a list of \code{case}s. Before the match |
321 \code{match} followed by a list of \code{case}s. Before the |
325 keyword can be another pattern, but often as in the case |
322 match keyword can be another pattern, but often, as in the |
326 above, it is just a variable you want to pattern-match |
323 case above, it is just a variable you want to pattern-match |
327 (the \code{r} after \code{=} in Line 1). |
324 (the \code{r} after \code{=} in Line 1). |
328 |
325 |
329 Each case in this definition follows the structure of how we |
326 Each case in this definition follows the structure of how we |
330 defined regular expressions as inductive datatype. For example |
327 defined regular expressions as inductive datatype. For example |
331 the case in Line 3 you can read as: if the regular expression |
328 the case in Line 3 you can read as: if the regular expression |
343 \code{rev} produces $ba + ca$, which can recognise the reversed |
340 \code{rev} produces $ba + ca$, which can recognise the reversed |
344 strings $ba$ and $ca$. |
341 strings $ba$ and $ca$. |
345 |
342 |
346 In Scala each pattern-match can also be guarded as in |
343 In Scala each pattern-match can also be guarded as in |
347 |
344 |
348 \begin{lstlisting}[language=Scala, numbers=none] |
345 \begin{lstlisting}[ numbers=none] |
349 case Pattern if Condition => Do_Something |
346 case Pattern if Condition => Do_Something |
350 \end{lstlisting} |
347 \end{lstlisting} |
351 |
348 |
352 \noindent This allows us, for example, to re-write the |
349 \noindent This allows us, for example, to re-write the |
353 \code{collatz}-function from above as follows: |
350 \code{collatz}-function from above as follows: |
354 |
351 |
355 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
352 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
356 |
353 |
357 |
354 |
358 \noindent Although in this particular case the pattern-match |
355 \noindent Although in this particular case the pattern-match |
359 does not improve the code in any way. In cases like \code{rev} |
356 does not improve the code in any way. In cases like |
360 it is really crucial. The underscore in Line 4 indicates that |
357 \code{rev}, however, it is really crucial. The underscore in |
361 we do not care what the patter8n looks like. Thus this case |
358 Line 4 indicates that we do not care what the pattern looks |
362 acts like a default case whenever the cases above did not |
359 like. Thus this case acts like a default case whenever the |
363 match. Cases are always tried out from top to bottom. |
360 cases above did not match. Cases are always tried out from top |
|
361 to bottom. |
364 |
362 |
365 \subsection*{Loops, or the Absence of} |
363 \subsection*{Loops, or the Absence of} |
366 |
364 |
367 Coming from Java or C, you might be surprised that Scala does |
365 Coming from Java or C, you might be surprised that Scala does |
368 not really have loops. It has instead, what is in functional |
366 not really have loops. It has instead, what is in functional |
369 programming called \emph{maps}. To illustrate how they work, |
367 programming called ,\emph{maps}. To illustrate how they work, |
370 lets assume you have a list of numbers from 1 to 8 and want to |
368 lets assume you have a list of numbers from 1 to 8 and want to |
371 build the list of squares. The list of numbers from 1 to 8 |
369 build the list of squares. The list of numbers from 1 to 8 |
372 can be constructed in Scala as follows: |
370 can be constructed in Scala as follows: |
373 |
371 |
374 \begin{lstlisting}[language=Scala,numbers=none] |
372 \begin{lstlisting}[numbers=none] |
375 scala> (1 to 8).toList |
373 scala> (1 to 8).toList |
376 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8) |
374 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8) |
377 \end{lstlisting} |
375 \end{lstlisting} |
378 |
376 |
379 \noindent Generating from this list the list of squares in a |
377 \noindent Generating from this list, the list of squares in a |
380 programming language such as Java, you would assume the list |
378 programming language such as Java, you would assume the list |
381 is given as a kind of array. You would then iterate, or loop, |
379 is given as a kind of array. You would then iterate, or loop, |
382 an index over this array and replace each entry in the array |
380 an index over this array and replace each entry in the array |
383 by the square. Right? In Scala, and in other functional |
381 by the square. Right? In Scala, and in other functional |
384 programming languages, you use maps to achieve the same. |
382 programming languages, you use maps to achieve the same. |
388 which this function should work. There are two forms to |
386 which this function should work. There are two forms to |
389 express such maps in Scala. The first way is called a |
387 express such maps in Scala. The first way is called a |
390 \emph{for-comprehension}. Squaring the numbers from 1 to 8 |
388 \emph{for-comprehension}. Squaring the numbers from 1 to 8 |
391 would look in this form as follows: |
389 would look in this form as follows: |
392 |
390 |
393 \begin{lstlisting}[language=Scala,numbers=none] |
391 \begin{lstlisting}[numbers=none] |
394 scala> for (n <- (1 to 8).toList) yield n * n |
392 scala> for (n <- (1 to 8).toList) yield n * n |
395 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) |
393 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) |
396 \end{lstlisting} |
394 \end{lstlisting} |
397 |
395 |
398 \noindent The important keywords are \code{for} and |
396 \noindent The important keywords are \code{for} and |
402 each \code{n} comes from, namely \code{(1 to 8).toList}, and |
400 each \code{n} comes from, namely \code{(1 to 8).toList}, and |
403 how each element needs to be transformed. This can also be |
401 how each element needs to be transformed. This can also be |
404 expressed in a second way in Scala by using directly |
402 expressed in a second way in Scala by using directly |
405 \code{map}s as follows: |
403 \code{map}s as follows: |
406 |
404 |
407 \begin{lstlisting}[language=Scala,numbers=none] |
405 \begin{lstlisting}[numbers=none] |
408 scala> (1 to 8).toList.map(n => n * n) |
406 scala> (1 to 8).toList.map(n => n * n) |
409 res3 = List(1, 4, 9, 16, 25, 36, 49, 64) |
407 res3 = List(1, 4, 9, 16, 25, 36, 49, 64) |
410 \end{lstlisting} |
408 \end{lstlisting} |
411 |
409 |
412 \noindent In this way, the expression \code{n => n * n} stands |
410 \noindent In this way, the expression \code{n => n * n} stands |
423 for-comprehensions can be written for any kind of data |
421 for-comprehensions can be written for any kind of data |
424 collection, such as lists, sets, vectors, options and so on. |
422 collection, such as lists, sets, vectors, options and so on. |
425 For example if we instead compute the reminders modulo 3 of |
423 For example if we instead compute the reminders modulo 3 of |
426 this list, we can write |
424 this list, we can write |
427 |
425 |
428 \begin{lstlisting}[language=Scala,numbers=none] |
426 \begin{lstlisting}[numbers=none] |
429 scala> (1 to 8).toList.map(n => n % 3) |
427 scala> (1 to 8).toList.map(n => n % 3) |
430 res4 = List(1, 2, 0, 1, 2, 0, 1, 2) |
428 res4 = List(1, 2, 0, 1, 2, 0, 1, 2) |
431 \end{lstlisting} |
429 \end{lstlisting} |
432 |
430 |
433 \noindent If we, however, transform the numbers 1 to 8 not |
431 \noindent If we, however, transform the numbers 1 to 8 not |
434 into a list, but into a set, and then compute the reminders |
432 into a list, but into a set, and then compute the reminders |
435 modulo 3 we obtain |
433 modulo 3 we obtain |
436 |
434 |
437 \begin{lstlisting}[language=Scala,numbers=none] |
435 \begin{lstlisting}[numbers=none] |
438 scala> (1 to 8).toSet[Int].map(n => n % 3) |
436 scala> (1 to 8).toSet[Int].map(n => n % 3) |
439 res5 = Set(2, 1, 0) |
437 res5 = Set(2, 1, 0) |
440 \end{lstlisting} |
438 \end{lstlisting} |
441 |
439 |
442 \noindent This is the correct result for sets, as there are |
440 \noindent This is the correct result for sets, as there are |
445 numbers into a set of integers by explicitly annotating the |
443 numbers into a set of integers by explicitly annotating the |
446 type \code{Int}. Since maps and for-comprehensions are |
444 type \code{Int}. Since maps and for-comprehensions are |
447 just syntactic variants of each other, the latter can also be |
445 just syntactic variants of each other, the latter can also be |
448 written as |
446 written as |
449 |
447 |
450 \begin{lstlisting}[language=Scala,numbers=none] |
448 \begin{lstlisting}[numbers=none] |
451 scala> for (n <- (1 to 8).toSet[Int]) yield n % 3 |
449 scala> for (n <- (1 to 8).toSet[Int]) yield n % 3 |
452 res5 = Set(2, 1, 0) |
450 res5 = Set(2, 1, 0) |
453 \end{lstlisting} |
451 \end{lstlisting} |
454 |
452 |
455 For-comprehensions can also be nested and the selection of |
453 For-comprehensions can also be nested and the selection of |
456 elements can be guarded. For example if we want to pair up |
454 elements can be guarded. For example if we want to pair up |
457 the numbers 1 to 4 with the letters a to c, we can write |
455 the numbers 1 to 4 with the letters a to c, we can write |
458 |
456 |
459 \begin{lstlisting}[language=Scala,numbers=none] |
457 \begin{lstlisting}[numbers=none] |
460 scala> for (n <- (1 to 4).toList; |
458 scala> for (n <- (1 to 4).toList; |
461 l <- ('a' to 'c').toList) yield (n, l) |
459 m <- ('a' to 'c').toList) yield (n, m) |
462 res6 = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), |
460 res6 = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), |
463 (3,a), (3,b), (3,c), (4,a), (4,b), (4,c)) |
461 (3,a), (3,b), (3,c), (4,a), (4,b), (4,c)) |
464 \end{lstlisting} |
462 \end{lstlisting} |
465 |
463 |
466 \noindent |
464 \noindent |
467 Or if we want to find all pairs of numbers between 1 and 3 |
465 Or if we want to find all pairs of numbers between 1 and 3 |
468 where the sum is an even number, we can write |
466 where the sum is an even number, we can write |
469 |
467 |
470 \begin{lstlisting}[language=Scala,numbers=none] |
468 \begin{lstlisting}[numbers=none] |
471 scala> for (n <- (1 to 3).toList; |
469 scala> for (n <- (1 to 3).toList; |
472 m <- (1 to 3).toList; |
470 m <- (1 to 3).toList; |
473 if (n + m) % 2 == 0) yield (n, m) |
471 if (n + m) % 2 == 0) yield (n, m) |
474 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) |
472 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) |
475 \end{lstlisting} |
473 \end{lstlisting} |
487 guessed it, does not produce any result, but only produces |
485 guessed it, does not produce any result, but only produces |
488 what is in the functional-programming-lingo called a |
486 what is in the functional-programming-lingo called a |
489 side-effect. Printing out the list of numbers from 1 to 5 |
487 side-effect. Printing out the list of numbers from 1 to 5 |
490 would look as follows |
488 would look as follows |
491 |
489 |
492 \begin{lstlisting}[language=Scala,numbers=none] |
490 \begin{lstlisting}[numbers=none] |
493 scala> for (n <- (1 to 5).toList) print(n) |
491 scala> for (n <- (1 to 5).toList) print(n) |
494 12345 |
492 12345 |
495 \end{lstlisting} |
493 \end{lstlisting} |
496 |
494 |
497 \noindent |
495 \noindent |
498 where you need to omit the keyword \code{yield}. You can |
496 where you need to omit the keyword \code{yield}. You can |
499 also do more elaborate calculations such as |
497 also do more elaborate calculations such as |
500 |
498 |
501 \begin{lstlisting}[language=Scala,numbers=none] |
499 \begin{lstlisting}[numbers=none] |
502 scala> for (n <- (1 to 5).toList) { |
500 scala> for (n <- (1 to 5).toList) { |
503 val square_n = n * n |
501 val square_n = n * n |
504 println(s"$n * $n = $square_n") |
502 println(s"$n * $n = $square_n") |
505 } |
503 } |
506 1 * 1 = 1 |
504 1 * 1 = 1 |
509 4 * 4 = 16 |
507 4 * 4 = 16 |
510 5 * 5 = 25 |
508 5 * 5 = 25 |
511 \end{lstlisting} |
509 \end{lstlisting} |
512 |
510 |
513 \noindent In this code I use a variable assignment (\code{val |
511 \noindent In this code I use a variable assignment (\code{val |
514 square_n = ...} ) and what is called a |
512 square_n = ...} ) and also what is called in Scala a |
515 \emph{string interpolation}, written \code{s"..."}, in order to |
513 \emph{string interpolation}, written \code{s"..."}. The latter |
516 print out an equation. The string interpolation allows me to |
514 is for printing out an equation. It allows me to refer to the |
517 refer to the integer values \code{n} and \code{square\_n} inside |
515 integer values \code{n} and \code{square\_n} inside a string. |
518 a string. This is very convenient for printing out ``things''. |
516 This is very convenient for printing out ``things''. |
519 |
517 |
520 The corresponding map construction for functions with |
518 The corresponding map construction for functions with |
521 side-effects is in Scala called \code{foreach}. So you |
519 side-effects is in Scala called \code{foreach}. So you |
522 could also write |
520 could also write |
523 |
521 |
524 |
522 |
525 \begin{lstlisting}[language=Scala,numbers=none] |
523 \begin{lstlisting}[numbers=none] |
526 scala> (1 to 5).toList.foreach(n => print(n)) |
524 scala> (1 to 5).toList.foreach(n => print(n)) |
527 12345 |
525 12345 |
528 \end{lstlisting} |
526 \end{lstlisting} |
529 |
527 |
530 |
528 |
531 \noindent or even just |
529 \noindent or even just |
532 |
530 |
533 \begin{lstlisting}[language=Scala,numbers=none] |
531 \begin{lstlisting}[numbers=none] |
534 scala> (1 to 5).toList.foreach(print) |
532 scala> (1 to 5).toList.foreach(print) |
535 12345 |
533 12345 |
536 \end{lstlisting} |
534 \end{lstlisting} |
537 |
535 |
538 \noindent Again I hope this reminds you a bit of your |
536 \noindent Again I hope this reminds you a bit of your |
546 above. Scala will still allow \code{map} with side-effect |
544 above. Scala will still allow \code{map} with side-effect |
547 functions, but then reacts with a slightly interesting result. |
545 functions, but then reacts with a slightly interesting result. |
548 |
546 |
549 \subsection*{Types} |
547 \subsection*{Types} |
550 |
548 |
551 In most functional programming languages types play an |
549 In most functional programming languages, types play an |
552 important role. Scala is such a language. You have already |
550 important role. Scala is such a language. You have already |
553 seen built-in types, like \code{Int}, \code{Boolean}, |
551 seen built-in types, like \code{Int}, \code{Boolean}, |
554 \code{String} and \code{BigInt}, but also user-defined ones, |
552 \code{String} and \code{BigInt}, but also user-defined ones, |
555 like \code{Rexp}. Unfortunately, types can be a thorny |
553 like \code{Rexp}. Unfortunately, types can be a thorny |
556 subject, especially in Scala. For example, why do we need to |
554 subject, especially in Scala. For example, why do we need to |
578 |
576 |
579 There are a few special type-constructors that fall outside |
577 There are a few special type-constructors that fall outside |
580 this pattern. One is for tuples, where the type is written |
578 this pattern. One is for tuples, where the type is written |
581 with parentheses. For example |
579 with parentheses. For example |
582 |
580 |
583 \begin{lstlisting}[language=Scala, numbers=none] |
581 \begin{lstlisting}[ numbers=none] |
584 (Int, Int, String) |
582 (Int, Int, String) |
585 \end{lstlisting} |
583 \end{lstlisting} |
586 |
584 |
587 \noindent is for a triple (a tuple with three components---two |
585 \noindent is for a triple (a tuple with three components---two |
588 integers and a string). Tuples are helpful if you want to |
586 integers and a string). Tuples are helpful if you want to |
589 define functions with multiple results, say the function |
587 define functions with multiple results, say the function |
590 returning the quotient and reminder of two numbers. For this |
588 returning the quotient and reminder of two numbers. For this |
591 you might define: |
589 you might define: |
592 |
590 |
593 |
591 |
594 \begin{lstlisting}[language=Scala, numbers=none] |
592 \begin{lstlisting}[ numbers=none] |
595 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n) |
593 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n) |
596 \end{lstlisting} |
594 \end{lstlisting} |
597 |
595 |
598 |
596 |
599 \noindent Since this function returns a pair of integers, its |
597 \noindent Since this function returns a pair of integers, its |
602 Notice this function takes \emph{two} arguments, namely |
600 Notice this function takes \emph{two} arguments, namely |
603 \code{m} and \code{n}, both of which are integers. They are |
601 \code{m} and \code{n}, both of which are integers. They are |
604 ``packaged'' in a pair. Consequently the complete type of |
602 ``packaged'' in a pair. Consequently the complete type of |
605 \code{quo_rem} is |
603 \code{quo_rem} is |
606 |
604 |
607 \begin{lstlisting}[language=Scala, numbers=none] |
605 \begin{lstlisting}[ numbers=none] |
608 (Int, Int) => (Int, Int) |
606 (Int, Int) => (Int, Int) |
609 \end{lstlisting} |
607 \end{lstlisting} |
610 |
608 |
611 Another special type-constructor is for functions, written |
609 Another special type-constructor is for functions, written |
612 as the arrow \code{=>}. For example, the type |
610 as the arrow \code{=>}. For example, the type |
613 \code{Int => String} is for a function that takes an integer as argument |
611 \code{Int => String} is for a function that takes an integer as argument |
614 and produces a string. A function of this type is for instance |
612 and produces a string. A function of this type is for instance |
615 |
613 |
616 \begin{lstlisting}[language=Scala,numbers=none] |
614 \begin{lstlisting}[numbers=none] |
617 def mk_string(n: Int) : String = n match { |
615 def mk_string(n: Int) : String = n match { |
618 case 0 => "zero" |
616 case 0 => "zero" |
619 case 1 => "one" |
617 case 1 => "one" |
620 case 2 => "two" |
618 case 2 => "two" |
621 case _ => "many" |
619 case _ => "many" |
650 arguments of this function: the arguments are given one after |
648 arguments of this function: the arguments are given one after |
651 the other, instead of being in a pair (what would be the type |
649 the other, instead of being in a pair (what would be the type |
652 of this function then?). This way of specifying the arguments |
650 of this function then?). This way of specifying the arguments |
653 can be useful, for example in situations like this |
651 can be useful, for example in situations like this |
654 |
652 |
655 \begin{lstlisting}[language=Scala,numbers=none] |
653 \begin{lstlisting}[numbers=none] |
656 scala> List("one", "two", "three", "many").map(chk_string(2)) |
654 scala> List("one", "two", "three", "many").map(chk_string(2)) |
657 res4 = List(false, true, false, false) |
655 res4 = List(false, true, false, false) |
658 |
656 |
659 scala> List("one", "two", "three", "many").map(chk_string(3)) |
657 scala> List("one", "two", "three", "many").map(chk_string(3)) |
660 res5 = List(false, false, false, true) |
658 res5 = List(false, false, false, true) |
677 boolean. More interestingly, though, it only takes a single |
675 boolean. More interestingly, though, it only takes a single |
678 argument (because of the parentheses). The single argument |
676 argument (because of the parentheses). The single argument |
679 happens to be another function (taking an integer as input and |
677 happens to be another function (taking an integer as input and |
680 returning a string). Remember that \code{mk_string} is just |
678 returning a string). Remember that \code{mk_string} is just |
681 such a function. So how can we use it? For this define |
679 such a function. So how can we use it? For this define |
682 the somewhat silly function \code{apply_3} |
680 the somewhat silly function \code{apply_3}: |
683 |
681 |
684 \begin{lstlisting}[language=Scala,numbers=none] |
682 \begin{lstlisting}[numbers=none] |
685 def apply_3(f: Int => String): Bool = f(3) == "many" |
683 def apply_3(f: Int => String): Bool = f(3) == "many" |
686 |
684 |
687 scala> apply_3(mk_string) |
685 scala> apply_3(mk_string) |
688 res6 = true |
686 res6 = true |
689 \end{lstlisting} |
687 \end{lstlisting} |
690 |
688 |
691 You might ask, apart from silly functions like above, what is |
689 You might ask: Apart from silly functions like above, what is |
692 the point of having function as arguments to other functions? |
690 the point of having function as arguments to other functions? |
693 In Java there is indeed no need of this kind of feature. But |
691 In Java there is indeed no need of this kind of feature. But |
694 in all functional programming languages, including Scala, it |
692 in all functional programming languages, including Scala, it |
695 is really essential. Above you already seen \code{map} and |
693 is really essential. Above you already seen \code{map} and |
696 \code{foreach} which need this. Consider the functions |
694 \code{foreach} which need this. Consider the functions |
747 |
745 |
748 The first wow-moment I had with Scala was when I came across |
746 The first wow-moment I had with Scala was when I came across |
749 the following code-snippet for reading a web-page. |
747 the following code-snippet for reading a web-page. |
750 |
748 |
751 |
749 |
752 \begin{lstlisting}[language=Scala, numbers=none] |
750 \begin{lstlisting}[ numbers=none] |
753 import io.Source |
751 import io.Source |
754 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" |
752 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" |
755 Source.fromURL(url).take(10000).mkString |
753 Source.fromURL(url).take(10000).mkString |
756 \end{lstlisting} |
754 \end{lstlisting} |
757 |
755 |
758 |
756 |
759 \noindent These three lines return a string containing the |
757 \noindent These three lines return a string containing the |
760 HTML-code of my webpage. It actually already does something |
758 HTML-code of my webpage. It actually already does something |
761 more sophisticated, namely only returns the first 10000 |
759 more sophisticated, namely only returns the first 10000 |
762 characters of a webpage in case a ``webpage'' is too large. |
760 characters of a webpage in case it is too large. Why is that |
763 Why is that code-snippet of any interest? Well, try |
761 code-snippet of any interest? Well, try implementing |
764 implementing reading-from-a-webpage in Java. I also like the |
762 reading-from-a-webpage in Java. I also like the possibility of |
765 possibility of triple-quoting strings, which I have only seen |
763 triple-quoting strings, which I have only seen in Scala so |
766 in Scala so far. The idea behind this is that in such a string |
764 far. The idea behind this is that in such a string all |
767 all characters are interpreted literally---there are no |
765 characters are interpreted literally---there are no escaped |
768 escaped characters, like \verb|\n| for newlines. |
766 characters, like \verb|\n| for newlines. |
769 |
767 |
770 My second wow-moment I had with a feature of Scala that other |
768 My second wow-moment I had with a feature of Scala that other |
771 functional programming languages also do not have. This |
769 functional programming languages do not have. This feature is |
772 feature is about implicit type conversions. If you have |
770 about implicit type conversions. If you have regular |
773 regular expressions and want to use them for language |
771 expressions and want to use them for language processing you |
774 processing you often want to recognise keywords in a language, |
772 often want to recognise keywords in a language, for example |
775 for example \code{for},{} \code{if},{} \code{yield} and so on. But |
773 \code{for},{} \code{if},{} \code{yield} and so on. But the |
776 the basic regular expression, \code{CHAR}, can only recognise |
774 basic regular expression \code{CHAR} can only recognise a |
777 a single character. In order to recognise a whole string, like |
775 single character. In order to recognise a whole string, like |
778 \code{ for}, you have to put many of those together using |
776 \code{for}, you have to put many of those together using |
779 \code{SEQ}: |
777 \code{SEQ}: |
780 |
778 |
781 |
779 |
782 \begin{lstlisting}[language=Scala,numbers=none] |
780 \begin{lstlisting}[numbers=none] |
783 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) |
781 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) |
784 \end{lstlisting} |
782 \end{lstlisting} |
785 |
|
786 |
783 |
787 \noindent This gets quickly unreadable when the strings and |
784 \noindent This gets quickly unreadable when the strings and |
788 regular expressions get more complicated. In other functional |
785 regular expressions get more complicated. In other functional |
789 programming language, you can explicitly write a conversion |
786 programming language, you can explicitly write a conversion |
790 function that takes a string, say \code{for}, and generates the |
787 function that takes a string, say \dq{\pcode{for}}, and |
791 regular expression above. But then your code is littered with |
788 generates the regular expression above. But then your code is |
792 such conversion function. |
789 littered with such conversion functions. |
793 |
790 |
794 In Scala you can do better by ``hiding'' the conversion |
791 In Scala you can do better by ``hiding'' the conversion |
795 functions. The keyword for doing this is \code{implicit} and |
792 functions. The keyword for doing this is \code{implicit} and |
796 it needs a built-in library called |
793 it needs a built-in library called |
797 |
794 |
798 \begin{lstlisting}[language=Scala,numbers=none] |
795 \begin{lstlisting}[numbers=none] |
799 scala.language.implicitConversions |
796 scala.language.implicitConversions |
800 \end{lstlisting} |
797 \end{lstlisting} |
801 |
798 |
802 \noindent |
799 \noindent |
803 Consider the code |
800 Consider the code |
819 |
816 |
820 \noindent where the first seven lines implement a function |
817 \noindent where the first seven lines implement a function |
821 that given a list of characters generates the corresponding |
818 that given a list of characters generates the corresponding |
822 regular expression. In Lines 9 and 10, this function is used |
819 regular expression. In Lines 9 and 10, this function is used |
823 for transforming a string into a regular expression. Since the |
820 for transforming a string into a regular expression. Since the |
824 \code{string2rexp}-function is declared as \code{implicit} the |
821 \code{string2rexp}-function is declared as \code{implicit}, |
825 effect will be that whenever Scala expects a regular |
822 the effect will be that whenever Scala expects a regular |
826 expression, but I only give it a string, it will automatically |
823 expression, but I only give it a string, it will automatically |
827 insert a call to the \code{string2rexp}-function. I can now |
824 insert a call to the \code{string2rexp}-function. I can now |
828 write for example |
825 write for example |
829 |
826 |
830 \begin{lstlisting}[language=Scala,numbers=none] |
827 \begin{lstlisting}[numbers=none] |
831 scala> ALT("ab", "ac") |
828 scala> ALT("ab", "ac") |
832 res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
829 res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
833 \end{lstlisting} |
830 \end{lstlisting} |
834 |
831 |
835 \noindent \code{ALT} expects two regular expressions |
832 \noindent Recall that \code{ALT} expects two regular |
836 as arguments, but I only supply two strings. The implicit |
833 expressions as arguments, but I only supply two strings. The |
837 conversion function will transform the string into |
834 implicit conversion function will transform the string into a |
838 a regular expression. |
835 regular expression. |
839 |
836 |
840 Using implicit definitions, Scala allows me to introduce |
837 Using implicit definitions, Scala allows me to introduce |
841 some further syntactic sugar for regular expressions: |
838 some further syntactic sugar for regular expressions: |
842 |
839 |
843 |
840 |
844 \begin{lstlisting}[language=Scala, numbers=none] |
841 \begin{lstlisting}[ numbers=none] |
845 implicit def RexpOps(r: Rexp) = new { |
842 implicit def RexpOps(r: Rexp) = new { |
846 def | (s: Rexp) = ALT(r, s) |
843 def | (s: Rexp) = ALT(r, s) |
847 def ~ (s: Rexp) = SEQ(r, s) |
844 def ~ (s: Rexp) = SEQ(r, s) |
848 def % = STAR(r) |
845 def % = STAR(r) |
849 } |
846 } |
861 \noindent This might seem a bit overly complicated, but its effect is |
858 \noindent This might seem a bit overly complicated, but its effect is |
862 that I can now write regular expressions such as $ab + ac$ |
859 that I can now write regular expressions such as $ab + ac$ |
863 even simpler as |
860 even simpler as |
864 |
861 |
865 |
862 |
866 \begin{lstlisting}[language=Scala,numbers=none] |
863 \begin{lstlisting}[numbers=none] |
867 scala> "ab" | "ac" |
864 scala> "ab" | "ac" |
868 res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
865 res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
869 \end{lstlisting} |
866 \end{lstlisting} |
870 |
867 |
871 |
868 |
872 \noindent I leave you to figure out what the other |
869 \noindent I leave you to figure out what the other |
873 syntactic sugar in the code above stands for. |
870 syntactic sugar in the code above stands for. |
874 |
871 |
875 One more useful feature of Scala is the ability to define |
872 One more useful feature of Scala is the ability to define |
876 functions with variable argument lists. This is a feature that |
873 functions with varying argument lists. This is a feature that |
877 is already present in old languages, like C, but seems to have |
874 is already present in old languages, like C, but seems to have |
878 been forgotten in the meantime---Java does not have it. In the |
875 been forgotten in the meantime---Java does not have it. In the |
879 context of regular expressions this feature comes in handy: |
876 context of regular expressions this feature comes in handy: |
880 Say you are fed up with writing many alternatives as |
877 Say you are fed up with writing many alternatives as |
881 |
878 |
882 |
879 |
883 \begin{lstlisting}[language=Scala,numbers=none] |
880 \begin{lstlisting}[numbers=none] |
884 ALT(..., ALT(..., ALT(..., ...))) |
881 ALT(..., ALT(..., ALT(..., ...))) |
885 \end{lstlisting} |
882 \end{lstlisting} |
886 |
883 |
887 |
884 |
888 \noindent To make it difficult, you do not know how deep such |
885 \noindent To make it difficult, you do not know how deep such |
909 function which uses the feature of varying argument lists. The |
906 function which uses the feature of varying argument lists. The |
910 effect of this code is that I can write the regular |
907 effect of this code is that I can write the regular |
911 expression for keywords as |
908 expression for keywords as |
912 |
909 |
913 |
910 |
914 \begin{lstlisting}[language=Scala,numbers=none] |
911 \begin{lstlisting}[numbers=none] |
915 ALTS("for", "def", "yield", "implicit", "if", "match", "case") |
912 ALTS("for", "def", "yield", "implicit", "if", "match", "case") |
916 \end{lstlisting} |
913 \end{lstlisting} |
917 |
914 |
918 |
915 |
919 \noindent Again I leave you to it to find out how much this |
916 \noindent Again I leave it to you to find out how much this |
920 simplifies the regular expression in comparison if I had to |
917 simplifies the regular expression in comparison with if I had |
921 write this by hand using only the ``plain'' regular |
918 to write this by hand using only the ``plain'' regular |
922 expressions from the inductive datatype. |
919 expressions from the inductive datatype. |
923 |
920 |
924 \subsection*{More Info} |
921 \subsection*{More Info} |
925 |
922 |
926 There is much more to Scala than I can possibly describe in |
923 There is much more to Scala than I can possibly describe in |
944 also many ``deep'' ideas about types in Scala, which even to |
941 also many ``deep'' ideas about types in Scala, which even to |
945 me as seasoned functional programmer are puzzling. Whilst |
942 me as seasoned functional programmer are puzzling. Whilst |
946 implicits are great, they can also be a source of great |
943 implicits are great, they can also be a source of great |
947 headaches, for example consider the code: |
944 headaches, for example consider the code: |
948 |
945 |
949 |
946 \begin{lstlisting}[numbers=none] |
950 \begin{lstlisting}[language=Scala,numbers=none] |
|
951 scala> List (1, 2, 3) contains "your mom" |
947 scala> List (1, 2, 3) contains "your mom" |
952 res1: Boolean = false |
948 res1: Boolean = false |
953 \end{lstlisting} |
949 \end{lstlisting} |
954 |
|
955 |
950 |
956 \noindent Rather than returning \code{false}, this code should |
951 \noindent Rather than returning \code{false}, this code should |
957 throw a typing-error. There are also many limitations Scala |
952 throw a typing-error. There are also many limitations Scala |
958 inherited from the JVM that can be really annoying. For |
953 inherited from the JVM that can be really annoying. For |
959 example a fixed stack size. One can work around this |
954 example a fixed stack size. One can work around this |
962 Even if Scala has been a success in several high-profile |
957 Even if Scala has been a success in several high-profile |
963 companies, there is also a company (Yammer) that first used |
958 companies, there is also a company (Yammer) that first used |
964 Scala in their production code, but then moved away from it. |
959 Scala in their production code, but then moved away from it. |
965 Allegedly they did not like the steep learning curve of Scala |
960 Allegedly they did not like the steep learning curve of Scala |
966 and also that new versions of Scala often introduced |
961 and also that new versions of Scala often introduced |
967 incompatibilities in old code. |
962 incompatibilities in old code. In the past two months |
|
963 there have also been two forks of the Scala compiler. |
|
964 It needs to be seen what the future brings for Scala. |
968 |
965 |
969 So all in all, Scala might not be a great teaching language, |
966 So all in all, Scala might not be a great teaching language, |
970 but I hope this is mitigated by the fact that I never require |
967 but I hope this is mitigated by the fact that I never require |
971 you to write any Scala code. You only need to be able to read |
968 you to write any Scala code. You only need to be able to read |
972 it. In the coursework you can use any programming language you |
969 it. In the coursework you can use any programming language you |