| author | Christian Urban <urbanc@in.tum.de> | 
| Tue, 29 Oct 2019 11:11:44 +0000 | |
| changeset 280 | a56a6c28b700 | 
| parent 278 | 57b5bba67467 | 
| child 301 | aa0e86419773 | 
| permissions | -rw-r--r-- | 
| 197 | 1 | % !TEX program = xelatex | 
| 123 | 2 | \documentclass{article}
 | 
| 3 | \usepackage{../style}
 | |
| 4 | \usepackage{../langs}
 | |
| 272 | 5 | \usepackage{tikz}
 | 
| 6 | \usepackage{pgf}
 | |
| 123 | 7 | \usepackage{marvosym}
 | 
| 184 | 8 | \usepackage{boxedminipage}
 | 
| 123 | 9 | |
| 272 | 10 | |
| 123 | 11 | %cheat sheet | 
| 12 | %http://worldline.github.io/scala-cheatsheet/ | |
| 13 | ||
| 181 | 14 | % case class, apply, unapply | 
| 170 | 15 | % see https://medium.com/@thejasbabu/scala-pattern-matching-9c9e73ba9a8a | 
| 16 | ||
| 191 | 17 | % the art of programming | 
| 18 | % https://www.youtube.com/watch?v=QdVFvsCWXrA | |
| 19 | ||
| 20 | % functional programming in Scala | |
| 21 | %https://www.amazon.com/gp/product/1449311032/ref=as_li_ss_tl?ie=UTF8&tag=aleottshompag-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1449311032 | |
| 22 | ||
| 197 | 23 | % functional programming in C | 
| 191 | 24 | %https://www.amazon.com/gp/product/0201419505/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=0201419505&linkCode=as2&tag=aleottshompag-20 | 
| 25 | ||
| 26 | %speeding through haskell | |
| 27 | %https://openlibra.com/en/book/download/speeding-through-haskell | |
| 28 | ||
| 29 | % fp books --- ocaml | |
| 30 | % http://courses.cms.caltech.edu/cs134/cs134b/book.pdf | |
| 31 | % http://alexott.net/en/fp/books/ | |
| 32 | ||
| 257 | 33 | %John Hughes’ simple words: | 
| 34 | %A combinator is a function which builds program fragments | |
| 35 | %from program fragments. | |
| 36 | ||
| 37 | ||
| 264 | 38 | %explain graph coloring program (examples from) | 
| 39 | %https://www.metalevel.at/prolog/optimization | |
| 40 | ||
| 41 | % nice example for map and reduce using Harry potter characters | |
| 42 | % https://www.matthewgerstman.com/map-filter-reduce/ | |
| 43 | ||
| 44 | ||
| 123 | 45 | \begin{document}
 | 
| 271 | 46 | \fnote{\copyright{} Christian Urban, King's College London, 2017, 2018, 2019}
 | 
| 123 | 47 | |
| 125 | 48 | \section*{A Crash-Course in Scala}
 | 
| 123 | 49 | |
| 182 | 50 | \mbox{}\hfill\textit{``Scala --- \underline{S}lowly \underline{c}ompiled 
 | 
| 51 | \underline{a}cademic \underline{la}nguage''}\smallskip\\
 | |
| 192 | 52 | \mbox{}\hfill\textit{ --- a joke(?) found on Twitter}\bigskip
 | 
| 195 | 53 | |
| 54 | \subsection*{Introduction}
 | |
| 55 | ||
| 178 | 56 | \noindent | 
| 170 | 57 | Scala is a programming language that combines functional and | 
| 58 | object-oriented programming-styles. It has received quite a bit of | |
| 181 | 59 | attention in the last five or so years. One reason for this attention is | 
| 60 | that, like the Java programming language, Scala compiles to the Java | |
| 61 | Virtual Machine (JVM) and therefore Scala programs can run under MacOSX, | |
| 195 | 62 | Linux and Windows. Because of this it also has access to | 
| 181 | 63 | the myriads of Java libraries. Unlike Java, however, Scala often allows | 
| 186 | 64 | programmers to write very concise and elegant code. Some therefore say | 
| 182 | 65 | ``Scala is the better Java''.\footnote{from
 | 
| 188 | 66 | \url{https://www.slideshare.net/maximnovak/joy-of-scala}} 
 | 
| 67 | ||
| 191 | 68 | A number of companies---the Guardian, Twitter, Coursera, FourSquare, | 
| 69 | Netflix, LinkedIn, ITV to name a few---either use Scala exclusively in | |
| 188 | 70 | production code, or at least to some substantial degree. Scala also seems | 
| 71 | useful in job-interviews (especially in data science) according to | |
| 72 | this anecdotal report | |
| 170 | 73 | |
| 181 | 74 | \begin{quote}
 | 
| 75 | \url{http://techcrunch.com/2016/06/14/scala-is-the-new-golden-child}
 | |
| 170 | 76 | \end{quote}
 | 
| 77 | ||
| 78 | \noindent | |
| 79 | The official Scala compiler can be downloaded from | |
| 80 | ||
| 81 | \begin{quote}
 | |
| 195 | 82 | \url{http://www.scala-lang.org}\medskip
 | 
| 83 | \end{quote}
 | |
| 170 | 84 | |
| 85 | \noindent | |
| 265 | 86 | If you are interested, there are also experimental backends of Scala | 
| 195 | 87 | for producing code under Android (\url{http://scala-android.org}); for
 | 
| 88 | generating JavaScript code (\url{https://www.scala-js.org}); and there
 | |
| 89 | is work under way to have a native Scala compiler generating X86-code | |
| 90 | (\url{http://www.scala-native.org}). Though be warned these backends
 | |
| 91 | are still rather beta or even alpha. | |
| 92 | ||
| 93 | \subsection*{VS Code and Scala}
 | |
| 94 | ||
| 184 | 95 | I found a convenient IDE for writing Scala programs is Microsoft's | 
| 181 | 96 | \textit{Visual Studio Code} (VS Code) which runs under MacOSX, Linux and
 | 
| 269 | 97 | obviously Windows.\footnote{\ldots{}unlike \emph{Microsoft Visual Studio}---note
 | 
| 191 | 98 | the minuscule difference in the name---which is a heavy-duty, | 
| 269 | 99 | Windows-only IDE\ldots{}jeez, with all their money could they not have come
 | 
| 191 | 100 | up with a completely different name for a completely different project? | 
| 101 | For the pedantic, Microsoft Visual Studio is an IDE, whereas Visual | |
| 269 | 102 | Studio Code is considered to be a \emph{source code editor}. Does anybody know what the
 | 
| 191 | 103 | difference is?} It can be downloaded for free from | 
| 181 | 104 | |
| 105 | \begin{quote}
 | |
| 106 | \url{https://code.visualstudio.com}
 | |
| 107 | \end{quote}
 | |
| 108 | ||
| 109 | \noindent | |
| 110 | and should already come pre-installed in the Department (together with | |
| 195 | 111 | the Scala compiler). Being a project that just started in 2015, VS Code is | 
| 189 | 112 | relatively new and thus far from perfect. However it includes a | 
| 182 | 113 | \textit{Marketplace} from which a multitude of extensions can be
 | 
| 184 | 114 | downloaded that make editing and running Scala code a little easier (see | 
| 115 | Figure~\ref{vscode} for my setup).
 | |
| 181 | 116 | |
| 117 | \begin{figure}[t]
 | |
| 184 | 118 | \begin{boxedminipage}{\textwidth}  
 | 
| 181 | 119 | \begin{center}  
 | 
| 120 | \includegraphics[scale=0.15]{../pics/vscode.png}\\[-10mm]\mbox{}
 | |
| 121 | \end{center}
 | |
| 195 | 122 | \caption{My installation of VS Code includes the following
 | 
| 272 | 123 |   packages from Marketplace: \textbf{Scala Syntax (official)} 0.3.4,
 | 
| 277 | 124 |   \textbf{Code Runner} 0.9.13, \textbf{Code Spell Checker} 1.7.17,
 | 
| 195 | 125 |   \textbf{Rewrap} 1.9.1 and \textbf{Subtle Match
 | 
| 126 |   Brackets} 3.0.0. I have also bound the keys \keys{Ctrl} \keys{Ret} to the
 | |
| 127 | action ``Run-Selected-Text-In-Active-Terminal'' in order to quickly | |
| 272 | 128 | evaluate small code snippets in the Scala REPL. I use the internal | 
| 129 |   terminal to run Scala.\label{vscode}}
 | |
| 184 | 130 | \end{boxedminipage}
 | 
| 181 | 131 | \end{figure}  
 | 
| 132 | ||
| 184 | 133 | What I like most about VS Code is that it provides easy access to the | 
| 186 | 134 | Scala REPL. But if you prefer another editor for coding, it is also | 
| 135 | painless to work with Scala completely on the command line (as you might | |
| 136 | have done with \texttt{g++} in the earlier part of PEP). For the
 | |
| 195 | 137 | lazybones among us, there are even online editors and environments for | 
| 197 | 138 | developing and running Scala programs: \textit{ScalaFiddle}
 | 
| 139 | and \textit{Scastie} are two of them. They require zero setup 
 | |
| 140 | (assuming you have a browser handy). You can access them at | |
| 181 | 141 | |
| 142 | \begin{quote}
 | |
| 195 | 143 |   \url{https://scalafiddle.io}\\
 | 
| 144 |   \url{https://scastie.scala-lang.org}\medskip
 | |
| 181 | 145 | \end{quote}
 | 
| 146 | ||
| 195 | 147 | \noindent | 
| 197 | 148 | But you should be careful if you use them for your coursework: they | 
| 149 | are meant for playing around, not really for serious work. | |
| 150 | ||
| 269 | 151 | As one might expect, Scala can be used with the heavy-duty IDEs Eclipse and IntelliJ. | 
| 182 | 152 | A ready-made Scala bundle for Eclipse is available from | 
| 170 | 153 | |
| 154 | \begin{quote}
 | |
| 155 | \url{http://scala-ide.org/download/sdk.html}
 | |
| 156 | \end{quote}
 | |
| 157 | ||
| 158 | \noindent | |
| 191 | 159 | Also IntelliJ includes plugins for Scala. \underline{\textbf{BUT}}, 
 | 
| 160 | I do \textbf{not} recommend the usage of either Eclipse or IntelliJ for PEP: these IDEs
 | |
| 182 | 161 | seem to make your life harder, rather than easier, for the small | 
| 269 | 162 | programs that we will write in this module. They are really meant to be used | 
| 163 | when you have a million-line codebase rather than with our small | |
| 164 | ``toy-programs''\ldots{}for example why on earth am I required to create a
 | 
| 182 | 165 | completely new project with several subdirectories when I just want to | 
| 272 | 166 | try out 20 lines of Scala code? Your mileage may vary though.~\texttt{;o)}
 | 
| 182 | 167 | |
| 168 | \subsection*{Why Functional Programming?}
 | |
| 169 | ||
| 186 | 170 | Before we go on, let me explain a bit more why we want to inflict upon | 
| 171 | you another programming language. You hopefully have mastered Java and | |
| 269 | 172 | C++\ldots{}the world should be your oyster, no? Well, this is not as
 | 
| 173 | simple as one might wish. We do require Scala in PEP, but actually we | |
| 174 | do not religiously care whether you learn Scala---after all it is just | |
| 175 | a programming language (albeit a nifty one IMHO). What we do care | |
| 176 | about is that you learn about \textit{functional programming}. Scala
 | |
| 177 | is just the vehicle for that. Still, you need to learn Scala well | |
| 178 | enough to get good marks in PEP, but functional programming could | |
| 179 | equally be taught with Haskell, F\#, SML, Ocaml, Kotlin, Clojure, | |
| 180 | Scheme, Elm and many other functional programming languages. | |
| 181 | %Your | |
| 182 | %friendly lecturer just happens to like Scala | |
| 186 | 183 | %and the Department agreed that it is a good idea to inflict Scala upon | 
| 184 | %you. | |
| 182 | 185 | |
| 186 | Very likely writing programs in a functional programming language is | |
| 183 | 187 | quite different from what you are used to in your study so far. It | 
| 188 | might even be totally alien to you. The reason is that functional | |
| 189 | programming seems to go against the core principles of | |
| 272 | 190 | \textit{imperative programming} (which is what you do in Java and C/C++
 | 
| 183 | 191 | for example). The main idea of imperative programming is that you have | 
| 277 | 192 | some form of \emph{state} in your program and you continuously change
 | 
| 193 | this state by issuing some commands---for example for updating a field | |
| 194 | in an array or for adding one to a variable and so on. The classic | |
| 195 | example for this style of programming is a \texttt{for}-loop in C/C++.
 | |
| 196 | Consider the snippet: | |
| 182 | 197 | |
| 198 | \begin{lstlisting}[language=C,numbers=none]
 | |
| 184 | 199 | for (int i = 10; i < 20; i++) { 
 | 
| 269 | 200 |   //...do something with i... | 
| 184 | 201 | } | 
| 182 | 202 | \end{lstlisting}
 | 
| 203 | ||
| 184 | 204 | \noindent Here the integer variable \texttt{i} embodies the state, which
 | 
| 205 | is first set to \texttt{10} and then increased by one in each
 | |
| 206 | loop-iteration until it reaches \texttt{20} at which point the loop
 | |
| 207 | exits. When this code is compiled and actually runs, there will be some | |
| 186 | 208 | dedicated space reserved for \texttt{i} in memory. This space of
 | 
| 188 | 209 | typically 32 bits contains \texttt{i}'s current value\ldots\texttt{10}
 | 
| 269 | 210 | at the beginning, and then the content will be overwritten with | 
| 191 | 211 | new content in every iteration. The main point here is that this kind of | 
| 277 | 212 | updating, or overwriting, of memory is 25.806\ldots or \textbf{THE ROOT OF
 | 
| 191 | 213 | ALL EVIL}!! | 
| 186 | 214 | |
| 215 | \begin{center}
 | |
| 216 | \includegraphics[scale=0.25]{../pics/root-of-all-evil.png}
 | |
| 217 | \end{center}  
 | |
| 218 | ||
| 182 | 219 | |
| 220 | \noindent | |
| 221 | \ldots{}Well, it is perfectly benign if you have a sequential program
 | |
| 222 | that gets run instruction by instruction\ldots{}nicely one after another. | |
| 223 | This kind of running code uses a single core of your CPU and goes as | |
| 184 | 224 | fast as your CPU frequency, also called clock-speed, allows. The problem | 
| 225 | is that this clock-speed has not much increased over the past decade and | |
| 226 | no dramatic increases are predicted for any time soon. So you are a bit | |
| 277 | 227 | stuck. This is unlike previous generations of developers who could rely | 
| 278 | 228 | upon the fact that approximately every 2 years their code would run | 
| 229 | twice as fast because the clock-speed of their CPUs got twice as fast. | |
| 277 | 230 | Unfortunately this does not happen any more nowadays. To get you out of | 
| 231 | this dreadful situation, CPU producers pile more and more cores into | |
| 232 | CPUs in order to make them more powerful and potentially make software | |
| 233 | faster. The task for you as developer is to somehow take advantage of | |
| 234 | these cores by running as much of your code as possible in parallel on | |
| 235 | as many cores as you have available (typically 4 in modern laptops and | |
| 278 | 236 | sometimes much more on high-end machines). In this situation | 
| 277 | 237 | \textit{mutable} variables like \texttt{i} above are evil, or at least a
 | 
| 238 | major nuisance: Because if you want to distribute some of the | |
| 183 | 239 | loop-iterations over the cores that are currently idle in your system, | 
| 277 | 240 | you need to be extremely careful about who can read and overwrite the | 
| 241 | variable \texttt{i}.\footnote{If you are of the mistaken belief that
 | |
| 242 | nothing nasty can happen to \texttt{i} inside the \texttt{for}-loop,
 | |
| 243 | then you need to go back over the C++ material.} Especially the writing | |
| 244 | operation is critical because you do not want conflicting writes to | |
| 245 | mess about with \texttt{i}. Take my word: an untold amount of misery has
 | |
| 246 | arisen from this problem. The catch is that if you try to solve this | |
| 247 | problem in C/C++ or Java, and be as defensive as possible about reads | |
| 248 | and writes to \texttt{i}, then you need to synchronise access to it. The
 | |
| 249 | result is that very often your program waits more than it runs, thereby | |
| 183 | 250 | defeating the point of trying to run the program in parallel in the | 
| 251 | first place. If you are less defensive, then usually all hell breaks | |
| 252 | loose by obtaining seemingly random results. And forget the idea of | |
| 253 | being able to debug such code. | |
| 182 | 254 | |
| 184 | 255 | The central idea of functional programming is to eliminate any state | 
| 195 | 256 | from programs---or at least from the ``interesting bits'' of the | 
| 257 | programs. Because then it is easy to parallelise the resulting | |
| 258 | programs: if you do not have any state, then once created, all memory | |
| 259 | content stays unchanged and reads to such memory are absolutely safe | |
| 260 | without the need of any synchronisation. An example is given in | |
| 261 | Figure~\ref{mand} where in the absence of the annoying state, Scala
 | |
| 262 | makes it very easy to calculate the Mandelbrot set on as many cores of | |
| 263 | your CPU as possible. Why is it so easy in this example? Because each | |
| 264 | pixel in the Mandelbrot set can be calculated independently and the | |
| 265 | calculation does not need to update any variable. It is so easy in | |
| 266 | fact that going from the sequential version of the Mandelbrot program | |
| 267 | to the parallel version can be achieved by adding just eight | |
| 268 | characters---in two places you have to add \texttt{.par}. Try the same
 | |
| 272 | 269 | in C/C++ or Java! | 
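
To make the ``no state'' point concrete, here is a minimal sketch that is
not part of the Mandelbrot program; the function \texttt{square} and the
range \texttt{1 to 8} are made up for illustration. Each element is
computed by a function that only reads its argument, so no iteration can
interfere with another one:

\begin{lstlisting}[numbers=none]
// a pure function: the result depends only on the argument
def square(x: Int) : Int = x * x

// every element is computed independently of all the others;
// no variable is updated along the way
val results = (1 to 8).map(square)

println(results.sum)   // 204
\end{lstlisting}

\noindent
Because the iterations share no mutable variable, they could be evaluated
in any order, or on different cores, without changing the result. Note
that with Scala 2.13 the \texttt{.par} method is provided by the separate
\texttt{scala-parallel-collections} module, which is why this sketch
sticks to a plain \texttt{map}.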
| 182 | 270 | |
| 271 | \begin{figure}[p]
 | |
| 184 | 272 | \begin{boxedminipage}{\textwidth}
 | 
| 187 | 273 | |
| 191 | 274 | A Scala program for generating pretty pictures of the Mandelbrot set.\smallskip\\ | 
| 275 | (See \url{https://en.wikipedia.org/wiki/Mandelbrot_set} or\\
 | |
| 276 | \phantom{(See }\url{https://www.youtube.com/watch?v=aSg2Db3jF_4}):
 | |
| 184 | 277 | \begin{center}    
 | 
| 278 | \begin{tabular}{c}  
 | |
| 191 | 279 | \includegraphics[scale=0.11]{../pics/mand1.png}\\[-8mm]\mbox{}
 | 
| 184 | 280 | \end{tabular}
 | 
| 187 | 281 | \end{center}
 | 
| 184 | 282 | |
| 187 | 283 | \begin{center}
 | 
| 284 | \begin{tabular}{@{}p{0.45\textwidth}|p{0.45\textwidth}@{}}
 | |
| 191 | 285 | \bf sequential version: & \bf parallel version on 4 cores:\smallskip\\ | 
| 182 | 286 | |
| 191 | 287 |   {\hfill\includegraphics[scale=0.11]{../pics/mand4.png}\hfill} &
 | 
| 288 |   {\hfill\includegraphics[scale=0.11]{../pics/mand3.png}\hfill} \\
 | |
| 187 | 289 | |
| 290 | {\footnotesize\begin{lstlisting}[xleftmargin=-1mm]
 | |
| 186 | 291 | for (y <- (0 until H)) {
 | 
| 292 |   for (x <- (0 until W)) {
 | |
| 293 | ||
| 294 | val c = start + | |
| 295 | (x * d_x + y * d_y * i) | |
| 296 | val iters = iterations(c, max) | |
| 191 | 297 | val colour = | 
| 186 | 298 | if (iters == max) black | 
| 299 | else colours(iters % 16) | |
| 300 | ||
| 191 | 301 | pixel(x, y, colour) | 
| 186 | 302 | } | 
| 303 | viewer.updateUI() | |
| 304 | } | |
| 187 | 305 | \end{lstlisting}}   
 | 
| 306 | & | |
| 307 | {\footnotesize\begin{lstlisting}[xleftmargin=0mm]
 | |
| 188 | 308 | for (y <- (0 until H)/*@\keys{\texttt{.par}}@*/) {
 | 
| 309 |   for (x <- (0 until W)/*@\keys{\texttt{.par}}@*/) {
 | |
| 187 | 310 | |
| 311 | val c = start + | |
| 312 | (x * d_x + y * d_y * i) | |
| 313 | val iters = iterations(c, max) | |
| 191 | 314 | val colour = | 
| 187 | 315 | if (iters == max) black | 
| 316 | else colours(iters % 16) | |
| 317 | ||
| 191 | 318 | pixel(x, y, colour) | 
| 187 | 319 | } | 
| 320 | viewer.updateUI() | |
| 321 | } | |
| 191 | 322 | \end{lstlisting}}\\[-2mm]
 | 
| 187 | 323 | |
| 324 | \centering\includegraphics[scale=0.5]{../pics/cpu2.png} &
 | |
| 188 | 325 | \centering\includegraphics[scale=0.5]{../pics/cpu1.png}
 | 
| 184 | 326 | \end{tabular}
 | 
| 327 | \end{center}
 | |
| 270 | 328 | \caption{The code of the ``main'' loops in my version of the Mandelbrot program.
 | 
| 191 | 329 | The parallel version differs only in \texttt{.par} being added to the
 | 
| 195 | 330 | ``ranges'' of the x and y coordinates. As can be seen from the CPU loads, in | 
| 331 | the sequential version there is a lower peak for an extended period, | |
| 191 | 332 | while in the parallel version there is a short sharp burst for | 
| 333 | essentially the same workload\ldots{}meaning you get more work done 
 | |
| 195 | 334 | in a shorter amount of time. This easy \emph{parallelisation} 
 | 
| 335 | only works reliably with an immutable program. | |
| 188 | 336 | \label{mand}} 
 | 
| 184 | 337 | \end{boxedminipage}
 | 
| 182 | 338 | \end{figure}  
 | 
| 339 | ||
| 275 | 340 | But remember this easy parallelisation of code requires that we have no | 
| 341 | state in our programs\ldots{}that is no counters like \texttt{i} in
 | |
| 342 | \texttt{for}-loops. You might then ask, how do I write loops without
 | |
| 343 | such counters? Well, teaching you that this is possible is one of the | |
| 344 | main points of the Scala-part in PEP. I can assure you it is possible, | |
| 345 | but you have to get your head around it. Once you have mastered this, it | |
| 346 | will be fun to have no state in your programs (a side product is that it | |
| 347 | is much easier to debug state-less code and also more often than not easier | |
| 348 | to understand). So have fun with Scala!\footnote{If you are still not
 | |
| 349 | convinced about the functional programming ``thing'', there are a few more | |
| 350 | arguments: a lot of research in programming languages happens to take | |
| 351 | place in functional programming languages. This has resulted in | |
| 352 | ultra-useful features such as pattern-matching, strong type-systems, | |
| 353 | laziness, implicits, algebraic datatypes to name a few. Imperative | |
| 354 | languages seem to often lag behind in adopting them: I know, for | |
| 355 | example, that Java will at some point in the future support | |
| 356 | pattern-matching, which has been used for example in SML for at least | |
| 357 | 40(!) years. See | |
| 186 | 358 | \url{http://cr.openjdk.java.net/~briangoetz/amber/pattern-match.html}.
 | 
| 275 | 359 | Automatic garbage collection was included in Java in 1995; the | 
| 360 | functional language LISP had this already in 1958. Generics were added | |
| 361 | to Java 5 in 2004; the functional language SML had it since 1990. | |
| 277 | 362 | Higher-order functions were added to C\# in 2007, to Java 8 in | 
| 275 | 363 | 2014; again LISP had them since 1958. Also Rust, a C-like programming | 
| 364 | language that has been developed since 2010 and is gaining quite some | |
| 365 | interest, borrows many ideas from functional programming from | |
| 277 | 366 | yesteryear.}\medskip | 
| 170 | 367 | |
| 277 | 368 | \noindent | 
| 369 | If you need any after-work distractions, you might have fun reading this | |
| 370 | about FP (functional programming): | |
| 371 | ||
| 372 | \begin{quote}
 | |
| 373 | \url{https://medium.com/better-programming/fp-toy-7f52ea0a947e}
 | |
| 374 | \end{quote}
 | |
| 188 | 375 | |
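
As a small foretaste of how loops without counters can look, consider the
following sketch. It is only an illustration with made-up names
(\texttt{squares} and \texttt{total}) and revisits the numbers
\texttt{10} to \texttt{19} from the C-loop earlier, but without any
variable that gets overwritten:

\begin{lstlisting}[numbers=none]
// a new collection is built from the range 10, 11, ..., 19;
// nothing is updated in place
val squares = for (i <- 10 until 20) yield i * i

// adding up the same numbers without a running counter
val total = (10 until 20).sum

println(squares)
println(total)     // 145
\end{lstlisting}
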
| 123 | 376 | \subsection*{The Very Basics}
 | 
| 377 | ||
| 378 | One advantage of Scala over Java is that it includes an interpreter (a | |
| 379 | REPL, or | |
| 380 | \underline{R}ead-\underline{E}val-\underline{P}rint-\underline{L}oop)
 | |
| 181 | 381 | with which you can run and test small code snippets without the need | 
| 123 | 382 | of a compiler. This helps a lot with interactively developing | 
| 188 | 383 | programs. It is my preferred way of writing small Scala | 
| 123 | 384 | programs. Once you have installed Scala, you can start the interpreter by | 
| 385 | typing on the command line: | |
| 386 | ||
| 387 | \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
 | |
| 388 | $ scala | |
| 271 | 389 | Welcome to Scala 2.13.0 (Java HotSpot(TM) 64-Bit Server VM, Java 9). | 
| 123 | 390 | Type in expressions for evaluation. Or try :help. | 
| 391 | ||
| 392 | scala> | |
| 393 | \end{lstlisting}%$
 | |
| 394 | ||
| 395 | \noindent The precise response may vary depending | |
| 396 | on the version and platform where you installed Scala. At the Scala | |
| 397 | prompt you can type things like \code{2 + 3}\;\keys{Ret} and
 | |
| 398 | the output will be | |
| 399 | ||
| 400 | \begin{lstlisting}[numbers=none]
 | |
| 401 | scala> 2 + 3 | |
| 402 | res0: Int = 5 | |
| 403 | \end{lstlisting}
 | |
| 404 | ||
| 188 | 405 | \noindent The answer means that the result of the addition is of type | 
| 124 | 406 | \code{Int} and the actual result is 5; \code{res0} is a name that
 | 
| 125 | 407 | Scala gives automatically to the result. You can reuse this name later | 
| 188 | 408 | on, for example | 
| 181 | 409 | |
| 410 | \begin{lstlisting}[numbers=none]
 | |
| 411 | scala> res0 + 4 | |
| 412 | res1: Int = 9 | |
| 413 | \end{lstlisting}
 | |
| 414 | ||
| 415 | \noindent | |
| 416 | Another classic example you can try out is | |
| 123 | 417 | |
| 418 | \begin{lstlisting}[numbers=none]
 | |
| 419 | scala> print("hello world")
 | |
| 420 | hello world | |
| 421 | \end{lstlisting}
 | |
| 422 | ||
| 423 | \noindent Note that in this case there is no result. The | |
| 424 | reason is that \code{print} does not actually produce a result
 | |
| 124 | 425 | (there is no \code{resX} and no type), rather it is a
 | 
| 123 | 426 | function that causes the \emph{side-effect} of printing out a
 | 
| 427 | string. Once you are more familiar with the functional | |
| 428 | programming-style, you will know what the difference is | |
| 429 | between a function that returns a result, like addition, and a | |
| 430 | function that causes a side-effect, like \code{print}. We
 | |
| 431 | shall come back to this point later, but if you are curious | |
| 432 | now, the latter kind of functions always has \code{Unit} as
 | |
| 188 | 433 | return type. It is just not printed by Scala. | 
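
The difference can be made visible with a small sketch (the names
\texttt{r} and \texttt{u} are made up for illustration): Scala accepts
the type \code{Unit} for the outcome of \code{print}, and the only value
of this type is written \texttt{()}.

\begin{lstlisting}[numbers=none]
val r: Int = 2 + 3                  // addition returns a result
val u: Unit = print("hello world")  // print only causes a side-effect
println(u)                          // prints ()
\end{lstlisting}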
| 123 | 434 | |
| 181 | 435 | You can try more examples with the Scala REPL, but feel free to | 
| 436 | first guess what the result is (not all answers by Scala are obvious): | |
| 123 | 437 | |
| 438 | \begin{lstlisting}[numbers=none]
 | |
| 439 | scala> 2 + 2 | |
| 440 | scala> 1 / 2 | |
| 441 | scala> 1.0 / 2 | |
| 442 | scala> 1 / 2.0 | |
| 443 | scala> 1 / 0 | |
| 444 | scala> 1.0 / 0.0 | |
| 445 | scala> true == false | |
| 446 | scala> true && false | |
| 447 | scala> 1 > 1.0 | |
| 448 | scala> "12345".length | |
| 181 | 449 | scala> List(1,2,1).size | 
| 450 | scala> Set(1,2,1).size | |
| 265 | 451 | scala> List(1) == List(1) | 
| 452 | scala> Array(1) == Array(1) | |
| 453 | scala> Array(1).sameElements(Array(1)) | |
| 181 | 454 | \end{lstlisting}\smallskip
 | 
| 123 | 455 | |
| 181 | 456 | \noindent | 
| 457 | Please take the Scala REPL seriously: If you want to take advantage of my | |
| 458 | reference implementation for the assignments, you will need to be | |
| 459 | able to ``play around'' with it! | |
| 460 | ||
| 461 | \subsection*{Standalone Scala Apps}
 | |
| 123 | 462 | |
| 277 | 463 | If you want to write a standalone app in Scala, you can | 
| 197 | 464 | implement an object that is an instance of \code{App}. For example
 | 
| 465 | write | |
| 123 | 466 | |
| 467 | \begin{lstlisting}[numbers=none]
 | |
| 468 | object Hello extends App {
 | |
| 469 |     println("hello world")
 | |
| 470 | } | |
| 471 | \end{lstlisting}
 | |
| 472 | ||
| 197 | 473 | \noindent save it in a file, say {\tt hello-world.scala}, and
 | 
| 188 | 474 | then run the compiler (\texttt{scalac}) and start the runtime
 | 
| 181 | 475 | environment (\texttt{scala}):
 | 
| 123 | 476 | |
| 477 | \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
 | |
| 478 | $ scalac hello-world.scala | |
| 479 | $ scala Hello | |
| 480 | hello world | |
| 481 | \end{lstlisting}
 | |
| 482 | ||
| 124 | 483 | \noindent | 
| 123 | 484 | Like Java, Scala targets the JVM and consequently | 
| 485 | Scala programs can also be executed by the bog-standard Java | |
| 486 | Runtime. This only requires the inclusion of {\tt
 | |
| 487 | scala-library.jar}, which on my computer can be done as | |
| 488 | follows: | |
| 489 | ||
| 490 | \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
 | |
| 491 | $ scalac hello-world.scala | |
| 492 | $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello | |
| 493 | hello world | |
| 494 | \end{lstlisting}
 | |
| 495 | ||
| 496 | \noindent You might need to adapt the path to where you have | |
| 497 | installed Scala. | |
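
A sketch of an alternative, which is not used in the rest of this
document, is to spell out the \texttt{main} method explicitly, much like
in Java. The object name \texttt{Hello} is kept from the example above,
so the file can be compiled with \texttt{scalac} and started with
\texttt{scala Hello} as before:

\begin{lstlisting}[numbers=none]
object Hello {
  // entry point of the program; args holds the command-line arguments
  def main(args: Array[String]) : Unit = {
    println("hello world")
  }
}
\end{lstlisting}
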
| 498 | ||
| 499 | \subsection*{Values}
 | |
| 500 | ||
| 124 | 501 | In the lectures I will try to avoid as much as possible the term | 
| 502 | \emph{variables} familiar from other programming languages. The reason
 | |
| 503 | is that Scala has \emph{values}, which can be seen as abbreviations of
 | |
| 271 | 504 | larger expressions. The keyword for defining values is \code{val}.
 | 
| 505 | For example | |
| 123 | 506 | |
| 507 | \begin{lstlisting}[numbers=none]
 | |
| 508 | scala> val x = 42 | |
| 509 | x: Int = 42 | |
| 510 | ||
| 511 | scala> val y = 3 + 4 | |
| 512 | y: Int = 7 | |
| 513 | ||
| 514 | scala> val z = x / y | |
| 515 | z: Int = 6 | |
| 516 | \end{lstlisting}
 | |
| 517 | ||
| 518 | \noindent | |
| 272 | 519 | As can be seen, we first define \code{x} and \code{y} with admittedly some silly
 | 
| 271 | 520 | expressions, and then reuse these values in the definition of \code{z}.
 | 
| 272 | 521 | All easy, right? Why the kerfuffle about values? Well, values are | 
| 271 | 522 | \emph{immutable}. You cannot change their value after you defined them.
 | 
| 523 | If you try to reassign \code{z} above, Scala will yell at you:
 | |
| 123 | 524 | |
| 525 | \begin{lstlisting}[numbers=none]
 | |
| 526 | scala> z = 9 | |
| 527 | error: reassignment to val | |
| 528 | z = 9 | |
| 529 | ^ | |
| 530 | \end{lstlisting}
 | |
| 531 | ||
| 532 | \noindent | |
| 533 | So it would be a bit absurd to call values variables\ldots{}you cannot | |
| 195 | 534 | change them; they cannot vary. You might think you can reassign them like | 
| 123 | 535 | |
| 536 | \begin{lstlisting}[numbers=none]
 | |
| 537 | scala> val x = 42 | |
| 538 | scala> val z = x / 7 | |
| 539 | scala> val x = 70 | |
| 540 | scala> println(z) | |
| 541 | \end{lstlisting}
 | |
| 542 | ||
| 124 | 543 | \noindent but try to guess what Scala will print out | 
| 123 | 544 | for \code{z}?  Will it be \code{6} or \code{10}? A final word about
 | 
| 545 | values: Try to stick to the convention that names of values should be | |
| 188 | 546 | lower case, like \code{x}, \code{y}, \code{foo41} and so on. Upper-case
 | 
| 271 | 547 | names you should reserve for what is called \emph{constructors}. And 
 | 
| 548 | forgive me when I call values variables\ldots{}it is just something that
 | |
| 549 | has been imprinted into my developer-DNA during my early days and | |
| 272 | 550 | is difficult to get rid of.~\texttt{;o)}  
 | 
| 123 | 551 | |
| 552 | ||
| 553 | \subsection*{Function Definitions}
 | |
| 554 | ||
| 181 | 555 | We do functional programming! So defining functions will be our main occupation. | 
| 182 | 556 | As an example, a function named \code{f} taking a single argument of type 
 | 
| 181 | 557 | \code{Int} can be defined in Scala as follows:
 | 
| 123 | 558 | |
| 559 | \begin{lstlisting}[numbers=none]
 | |
| 181 | 560 | def f(x: Int) : String = ...EXPR... | 
| 123 | 561 | \end{lstlisting} 
 | 
| 562 | ||
| 563 | \noindent | |
| 124 | 564 | This function returns the value resulting from evaluating the expression | 
| 271 | 565 | \code{EXPR} (whatever is substituted for this). Since we declared
 | 
| 566 | \code{String}, the result of this function will be of type
 | |
| 567 | \code{String}. It is a good habit to always include this information
 | |
| 272 | 568 | about the return type, while it is only strictly necessary to give this | 
| 569 | type in recursive functions. Simple examples of Scala functions are: | |
| 123 | 570 | |
| 571 | \begin{lstlisting}[numbers=none]
 | |
| 572 | def incr(x: Int) : Int = x + 1 | |
| 573 | def double(x: Int) : Int = x + x | |
| 574 | def square(x: Int) : Int = x * x | |
| 575 | \end{lstlisting}
 | |
| 576 | ||
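\noindent As a small sketch of how such functions are used, calls can be
nested and combined like any other expression. The helper names
\code{twice} and \code{sumsq} below are made up for this illustration only:

\begin{lstlisting}[numbers=none]
def incr(x: Int) : Int = x + 1
def double(x: Int) : Int = x + x
def square(x: Int) : Int = x * x

// hypothetical helpers built from the functions above
def twice(x: Int) : Int = incr(incr(x))
def sumsq(x: Int, y: Int) : Int = square(x) + square(y)

val a = twice(40)          // incr(incr(40))
val b = sumsq(3, 4)        // square(3) + square(4)
val c = square(double(3))  // square(6)
\end{lstlisting}

\noindent Here \code{a} evaluates to \code{42}, \code{b} to \code{25} and
\code{c} to \code{36}.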
| 577 | \noindent | |
| 578 | The general scheme for a function is | |
| 579 | ||
| 580 | \begin{lstlisting}[numbers=none]
 | |
| 581 | def fname(arg1: ty1, arg2: ty2,..., argn: tyn): rty = {
 | |
| 271 | 582 | ...BODY... | 
| 123 | 583 | } | 
| 584 | \end{lstlisting}
 | |
| 585 | ||
| 586 | \noindent | |
| 197 | 587 | where each argument, \texttt{arg1}, \texttt{arg2} and so on, requires 
 | 
| 588 | its type and the result type of the | |
| 589 | function, \code{rty}, should also be given. If the body of the function is
 | |
| 124 | 590 | more complex, then it can be enclosed in braces, like above. If it | 
| 591 | is just a simple expression, like \code{x + 1}, you can omit the
 | |
| 195 | 592 | braces. Very often functions are recursive (that is, they call themselves), | 
| 593 | like the venerable factorial function: | |
| 123 | 594 | |
| 595 | \begin{lstlisting}[numbers=none]
 | |
| 271 | 596 | def fact(n: Int) : Int = | 
| 123 | 597 | if (n == 0) 1 else n * fact(n - 1) | 
| 598 | \end{lstlisting}
 | |
| 188 | 599 | |
| 600 | \noindent | |
| 272 | 601 | We could also have written this with braces as | 
| 271 | 602 | |
| 603 | \begin{lstlisting}[numbers=none]
 | |
| 604 | def fact(n: Int) : Int = {
 | |
| 605 | if (n == 0) 1 | |
| 606 | else n * fact(n - 1) | |
| 607 | } | |
| 608 | \end{lstlisting}
 | |
| 609 | ||
| 610 | \noindent | |
| 272 | 611 | but this seems a bit overkill for a small function like \code{fact}.
 | 
| 188 | 612 | Note that Scala (up to version 2) does not use a \code{then}-keyword in an \code{if}-statement.
 | 
| 271 | 613 | Note also that there are a few other ways to define a function. We | 
| 272 | 614 | will see some of them in the next sections. | 
| 615 | ||
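\noindent One point that might be worth a small sketch: an
\code{if}-\code{else} in Scala is itself an expression that produces a
value, which is why it can be the whole body of \code{fact}. The function
\code{parity} below is made up for this illustration:

\begin{lstlisting}[numbers=none]
def fact(n: Int) : Int =
  if (n == 0) 1 else n * fact(n - 1)

// the if-else on the right-hand side yields a String
def parity(n: Int) : String =
  if (n % 2 == 0) "even" else "odd"

val p = parity(fact(3))   // fact(3) is 6
\end{lstlisting}

\noindent With these definitions \code{fact(5)} gives \code{120} and
\code{p} is the string \code{"even"}.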
| 616 | Before we go on, let me explain one tricky point in function | |
| 617 | definitions, especially in larger definitions. What does a Scala function | |
| 618 | actually return? Scala has a \code{return} keyword, but it is
 | |
| 619 | used for something different than in Java (and C/C++). Therefore please | |
| 620 | make sure no \code{return} slips into your Scala code.
 | |
| 621 | ||
| 622 | So in the absence of \code{return}, what value does a Scala function
 | |
| 623 | actually produce? A rule-of-thumb is whatever is in the last line of the | |
| 624 | function is the value that will be returned. Consider the following | |
| 625 | example:\footnote{We could have written this function in just one line,
 | |
| 626 | but for the sake of argument let's keep the two intermediate values.} | |
| 627 | ||
| 628 | \begin{lstlisting}[numbers=none]
 | |
| 277 | 629 | def average(xs: List[Int]) : Int = {
 | 
| 272 | 630 | val s = xs.sum | 
| 631 | val n = xs.length | |
| 632 | s / n | |
| 633 | } | |
| 634 | \end{lstlisting}
 | |
| 635 | ||
| 636 | \noindent In this example the expression \code{s / n} is in the last
 | |
| 637 | line of the function---so this will be the result the function | |
| 638 | calculates. The two lines before just calculate intermediate values. | |
| 277 | 639 | This principle of the ``last-line'' comes in handy when you need to print | 
| 272 | 640 | out values, for example, for debugging purposes. Suppose you want | 
| 641 | to rewrite the function as | |
| 642 | ||
| 643 | \begin{lstlisting}[numbers=none]
 | |
| 277 | 644 | def average(xs: List[Int]) : Int = {
 | 
| 272 | 645 | val s = xs.sum | 
| 646 | val n = xs.length | |
| 647 | val h = xs.head | |
| 648 | println(s"Input $xs with first element $h") | |
| 649 | s / n | |
| 650 | } | |
| 651 | \end{lstlisting}
 | |
| 652 | ||
| 653 | \noindent | |
| 654 | Here the function still only returns the expression in the last line. | |
| 655 | The \code{println} before just prints out some information about the
 | |
| 656 | input of this function, but does not contribute to the result of the | |
| 657 | function. Similarly, the value \code{h} is used in the \code{println}
 | |
| 658 | but does not contribute to what integer is returned. However note that | |
| 659 | the idea with the ``last line'' is only a rough rule-of-thumb. A better | |
| 277 | 660 | rule might be: the last expression that is evaluated in the function. | 
| 272 | 661 | Consider the following version of \code{average}:
 | 
| 662 | ||
| 663 | \begin{lstlisting}[numbers=none]
 | |
| 277 | 664 | def average(xs: List[Int]) : Int = {
 | 
| 272 | 665 | if (xs.length == 0) 0 | 
| 666 | else xs.sum / xs.length | |
| 667 | } | |
| 668 | \end{lstlisting}
 | |
| 669 | ||
| 670 | \noindent | |
| 671 | What does this function return? Well, there are two possibilities: either the | |
| 672 | result of \code{xs.sum / xs.length} in the last line provided the list
 | |
| 673 | \code{xs} is nonempty, \textbf{or} if the list is empty, then it will
 | |
| 674 | return \code{0} from the \code{if}-branch (which is technically not the
 | |
| 675 | last line, but the last expression evaluated by the function in the | |
| 676 | empty-case). | |
| 677 | ||
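\noindent To see why the \code{else} matters here, consider the following
deliberately flawed sketch (the name \code{flawed} is made up; this is
not code you should write). Without the \code{else}, the \code{if} is no
longer part of the last expression:

\begin{lstlisting}[numbers=none]
// hypothetical, deliberately flawed variant of average
def flawed(xs: List[Int]) : Int = {
  if (xs.length == 0) 0     // evaluated, but the result is discarded
  xs.sum / xs.length        // last expression: always evaluated
}
\end{lstlisting}

\noindent For a non-empty list this behaves like \code{average}, but for
the empty list the division in the last line is still carried out and
fails with a division-by-zero error. The compiler will typically only
warn that the \code{0} does nothing in this position.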
| 678 | Summing up, do not use \code{return} in your Scala code! A function
 | |
| 679 | returns what is evaluated by the function as the last expression. There | |
| 680 | is always only one such last expression. Previous expressions might | |
| 277 | 681 | calculate intermediate values, but they are not returned. If your | 
| 682 | function is supposed to return multiple things, then one way in Scala is | |
| 683 | to use tuples. For example returning the minimum, average and maximum | |
| 684 | can be achieved by | |
| 271 | 685 | |
| 277 | 686 | \begin{lstlisting}[numbers=none]
 | 
| 687 | def avr_minmax(xs: List[Int]) : (Int, Int, Int) = {
 | |
| 688 | if (xs.length == 0) (0, 0, 0) | |
| 689 | else (xs.min, xs.sum / xs.length, xs.max) | |
| 690 | } | |
| 691 | \end{lstlisting}
 | |
| 692 | ||
| 693 | \noindent | |
| 694 | which still satisfies the rule-of-thumb. | |
| 695 | ||
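\noindent A sketch of how the result of such a function might be used:
the components of a tuple can be bound to names in one go, or selected
by position with \code{._1}, \code{._2} and so on. The names \code{mn},
\code{avg}, \code{mx}, \code{t} and \code{largest} are chosen only for
this illustration:

\begin{lstlisting}[numbers=none]
def avr_minmax(xs: List[Int]) : (Int, Int, Int) = {
  if (xs.length == 0) (0, 0, 0)
  else (xs.min, xs.sum / xs.length, xs.max)
}

// bind all three components at once
val (mn, avg, mx) = avr_minmax(List(4, 8, 3, 9))

// or keep the tuple and select a component by position
val t = avr_minmax(List(4, 8, 3, 9))
val largest = t._3
\end{lstlisting}

\noindent For this input the minimum is \code{3}, the (integer) average
is \code{6} and the maximum is \code{9}.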
| 696 | ||
| 697 | \subsection*{Loops, or Better the Absence Thereof}
 | |
| 123 | 698 | |
| 272 | 699 | Coming from Java or C/C++, you might be surprised that Scala does | 
| 123 | 700 | not really have loops. Instead it has what are called, in functional | 
| 701 | programming, \emph{maps}. To illustrate how they work,
 | |
| 702 | let us assume you have a list of numbers from 1 to 8 and want to | |
| 703 | build the list of squares. The list of numbers from 1 to 8 | |
| 704 | can be constructed in Scala as follows: | |
| 705 | ||
| 706 | \begin{lstlisting}[numbers=none]
 | |
| 707 | scala> (1 to 8).toList | |
| 708 | res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8) | |
| 709 | \end{lstlisting}
 | |
| 710 | ||
| 197 | 711 | \noindent To generate from this list the list of corresponding | 
| 712 | squares in a programming language such as Java, you would assume | |
| 713 | the list is given as a kind of array. You would then iterate, or loop, | |
| 123 | 714 | an index over this array and replace each entry in the array | 
| 715 | by the square. Right? In Scala, and in other functional | |
| 716 | programming languages, you use maps to achieve the same. | |
| 717 | ||
| 272 | 718 | A map essentially takes a function that describes how each element is | 
| 719 | transformed (in this example the function is $n \rightarrow n * n$) and | |
| 720 | a list over which this function should work. Pictorially you can think | |
| 721 | of the idea behind maps as follows: | |
| 722 | ||
| 723 | \begin{center}
 | |
| 724 | \begin{tikzpicture}
 | |
| 725 | ||
| 726 |   \node (A0) at (1.2,0) {\texttt{List(}};
 | |
| 727 |   \node (A1) at (2.0,0) {\texttt{1\makebox[0mm]{ ,}}};
 | |
| 728 |   \node (A2) at (2.9,0) {\texttt{2\makebox[0mm]{ ,}}};
 | |
| 729 |   \node (A3) at (3.8,0) {\texttt{3\makebox[0mm]{ ,}}};
 | |
| 730 |   \node (A4) at (4.7,0) {\texttt{4\makebox[0mm]{ ,}}};
 | |
| 731 |   \node (A5) at (5.6,0) {\texttt{5\makebox[0mm]{ ,}}};
 | |
| 732 |   \node (A6) at (6.5,0) {\texttt{6\makebox[0mm]{ ,}}};
 | |
| 733 |   \node (A7) at (7.4,0) {\texttt{7\makebox[0mm]{ ,}}};
 | |
| 734 |   \node (A8) at (8.3,0) {\texttt{8)}};
 | |
| 735 | ||
| 736 |   \node (B0) at (1.2,-3) {\texttt{List(}};
 | |
| 737 |   \node (B1) at (2.0,-3) {\texttt{1\makebox[0mm]{ ,}}};
 | |
| 738 |   \node (B2) at (3.0,-3) {\texttt{4\makebox[0mm]{ ,}}};
 | |
| 739 |   \node (B3) at (4.1,-3) {\texttt{9\makebox[0mm]{ ,}}};
 | |
| 740 |   \node (B4) at (5.2,-3) {\texttt{16\makebox[0mm]{ ,}}};
 | |
| 741 |   \node (B5) at (6.3,-3) {\texttt{25\makebox[0mm]{ ,}}};
 | |
| 742 |   \node (B6) at (7.4,-3) {\texttt{36\makebox[0mm]{ ,}}};
 | |
| 743 |   \node (B7) at (8.4,-3) {\texttt{49\makebox[0mm]{ ,}}};
 | |
| 744 |   \node (B8) at (9.4,-3) {\texttt{64\makebox[0mm]{ )}}};
 | |
| 745 | ||
| 746 | \draw [->,line width=1mm] (A1.south) -- (B1.north); | |
| 747 | \draw [->,line width=1mm] (A2.south) -- (B2.north); | |
| 748 | \draw [->,line width=1mm] (A3.south) -- (B3.north); | |
| 749 | \draw [->,line width=1mm] (A4.south) -- (B4.north); | |
| 750 | \draw [->,line width=1mm] (A5.south) -- (B5.north); | |
| 751 | \draw [->,line width=1mm] (A6.south) -- (B6.north); | |
| 752 | \draw [->,line width=1mm] (A7.south) -- (B7.north); | |
| 753 | \draw [->,line width=1mm] (A8.south) -- (B8.north); | |
| 754 | ||
| 277 | 755 |   \node [red] (Q0) at (-0.3,-0.3) {\large\texttt{n}}; 
 | 
| 756 |   \node (Q1) at (-0.3,-0.4) {};
 | |
| 757 |   \node (Q2) at (-0.3,-2.5) {};
 | |
| 758 |   \node [red] (Q3) at (-0.3,-2.65) {\large\texttt{n\,*\,n}};
 | |
| 272 | 759 | \draw [->,red,line width=1mm] (Q1.south) -- (Q2.north); | 
| 760 | ||
| 761 |   \node [red] at (-1.3,-1.5) {\huge{}\it\textbf{map}};
 | |
| 762 |  \end{tikzpicture}
 | |
| 763 | \end{center}
 | |
| 764 | ||
| 765 | \noindent | |
| 766 | On top is the ``input'' list we want to transform; on the left is the | |
| 767 | ``map'' function for how to transform each element in the input list | |
| 768 | (the square function in this case); at the bottom is the result list of | |
| 277 | 769 | the map. This means that a map generates a \emph{new} list, unlike a
 | 
| 273 | 770 | for-loop in Java or C/C++ which would most likely just update the | 
| 277 | 771 | existing list/array. | 
| 272 | 772 | |
| 277 | 773 | Now there are two ways for expressing such maps in Scala. The first way is | 
| 272 | 774 | called a \emph{for-comprehension}. The keywords are \code{for} and
 | 
| 775 | \code{yield}. Squaring the numbers from 1 to 8 with a for-comprehension
 | |
| 123 | 776 | would look as follows: | 
| 777 | ||
| 778 | \begin{lstlisting}[numbers=none]
 | |
| 779 | scala> for (n <- (1 to 8).toList) yield n * n | |
| 780 | res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) | |
| 781 | \end{lstlisting}
 | |
| 782 | ||
| 272 | 783 | \noindent This for-comprehension states that from the list of numbers | 
| 277 | 784 | we draw some elements. We use the name \code{n} to range over these
 | 
| 785 | elements (whereby the name is arbitrary; we could use something more | |
| 786 | descriptive if we wanted to). Using \code{n} we compute the result of
 | |
| 787 | \code{n * n} after the \code{yield}. This way of writing a map resembles
 | |
| 788 | a bit the for-loops from imperative languages, even though the ideas | |
| 789 | behind for-loops and for-comprehensions are quite different. Also, this | |
| 790 | is a simple example---what comes after \code{yield} can be a complex
 | |
| 791 | expression enclosed in \texttt{\{...\}}. A more complicated example
 | |
| 792 | might be | |
| 272 | 793 | |
| 794 | \begin{lstlisting}[numbers=none]
 | |
| 795 | scala> for (n <- (1 to 8).toList) yield {
 | |
| 796 | val i = n + 1 | |
| 797 | val j = n - 1 | |
| 273 | 798 | i * j + 1 | 
| 272 | 799 | } | 
| 273 | 800 | res3: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) | 
| 272 | 801 | \end{lstlisting}
 | 
| 802 | ||
| 803 | As you can see in the for-comprehensions above, we specified the list where | |
| 804 | each \code{n} comes from, namely \code{(1 to 8).toList}, and how each
 | |
| 805 | element needs to be transformed. This can also be expressed in a second | |
| 806 | way in Scala by directly using the function \code{map} as follows:
 | |
| 123 | 807 | |
| 808 | \begin{lstlisting}[numbers=none]
 | |
| 809 | scala> (1 to 8).toList.map(n => n * n) | |
| 810 | res3 = List(1, 4, 9, 16, 25, 36, 49, 64) | |
| 811 | \end{lstlisting}
 | |
| 812 | ||
| 272 | 813 | \noindent In this way, the expression \code{n => n * n} stands for the
 | 
| 814 | function that calculates the square (this is how the \code{n}s are
 | |
| 815 | transformed by the map). It might not be obvious, but | |
| 277 | 816 | the for-comprehensions above are just syntactic sugar: when compiling such | 
| 273 | 817 | code, Scala translates for-comprehensions into equivalent maps. This | 
| 818 | even works when for-comprehensions get more complicated (see below). | |
| 123 | 819 | |
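\noindent The claim that both notations mean the same can be checked
with a small sketch. The names \code{xs}, \code{viaFor} and
\code{viaMap} are chosen only for this illustration:

\begin{lstlisting}[numbers=none]
val xs = (1 to 8).toList

// the same transformation written in both styles
val viaFor = for (n <- xs) yield n * n
val viaMap = xs.map(n => n * n)
\end{lstlisting}

\noindent Both values are the list of squares, and comparing them with
\code{viaFor == viaMap} gives \code{true}. The input list \code{xs} is
untouched by either version; it still contains the numbers 1 to 8.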
| 820 | A very charming feature of Scala is that such maps or | |
| 272 | 821 | for-comprehensions can be written for any kind of data collection, such | 
| 822 | as lists, sets, vectors, options and so on. For example if we instead | |
| 823 | compute the remainders modulo 3 of this list, we can write | |
| 123 | 824 | |
| 825 | \begin{lstlisting}[numbers=none]
 | |
| 826 | scala> (1 to 8).toList.map(n => n % 3) | |
| 827 | res4 = List(1, 2, 0, 1, 2, 0, 1, 2) | |
| 828 | \end{lstlisting}
 | |
| 829 | ||
| 830 | \noindent If we, however, transform the numbers 1 to 8 not | |
| 270 | 831 | into a list, but into a set, and then compute the remainders | 
| 123 | 832 | modulo 3 we obtain | 
| 833 | ||
| 834 | \begin{lstlisting}[numbers=none]
 | |
| 835 | scala> (1 to 8).toSet[Int].map(n => n % 3) | |
| 836 | res5 = Set(2, 1, 0) | |
| 837 | \end{lstlisting}
 | |
| 838 | ||
| 839 | \noindent This is the correct result for sets, as there are | |
| 840 | only three equivalence classes of integers modulo 3. Note that | |
| 841 | in this example we need to ``help'' Scala to transform the | |
| 842 | numbers into a set of integers by explicitly annotating the | |
| 843 | type \code{Int}. Since maps and for-comprehensions are
 | |
| 844 | just syntactic variants of each other, the latter can also be | |
| 845 | written as | |
| 846 | ||
| 847 | \begin{lstlisting}[numbers=none]
 | |
| 848 | scala> for (n <- (1 to 8).toSet[Int]) yield n % 3 | |
| 849 | res5 = Set(2, 1, 0) | |
| 850 | \end{lstlisting}
 | |
| 851 | ||
| 852 | For-comprehensions can also be nested and the selection of | |
| 853 | elements can be guarded. For example if we want to pair up | |
| 854 | the numbers 1 to 4 with the letters a to c, we can write | |
| 855 | ||
| 856 | \begin{lstlisting}[numbers=none]
 | |
| 857 | scala> for (n <- (1 to 4).toList; | |
| 858 |             m <- ('a' to 'c').toList) yield (n, m)
 | |
| 859 | res6 = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), | |
| 860 | (3,a), (3,b), (3,c), (4,a), (4,b), (4,c)) | |
| 861 | \end{lstlisting}
 | |
| 862 | ||
| 863 | \noindent | |
| 272 | 864 | In this example the for-comprehension ranges over two lists, and | 
| 277 | 865 | produces a list of pairs as output. Or, if we want to find all pairs of | 
| 272 | 866 | numbers between 1 and 3 where the sum is an even number, we can write | 
| 123 | 867 | |
| 868 | \begin{lstlisting}[numbers=none]
 | |
| 869 | scala> for (n <- (1 to 3).toList; | |
| 870 | m <- (1 to 3).toList; | |
| 871 | if (n + m) % 2 == 0) yield (n, m) | |
| 872 | res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) | |
| 873 | \end{lstlisting}
 | |
| 874 | ||
| 272 | 875 | \noindent The \code{if}-condition in this for-comprehension filters out
 | 
| 277 | 876 | all pairs where the sum is not even (therefore \code{(1, 2)}, \code{(2,
 | 
| 877 | 1)} and \code{(3, 2)} are not in the result because their sum is odd). 
 | |
| 272 | 878 | |
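\noindent For the curious, here is a sketch of how such a nested and
guarded for-comprehension can be written without the syntactic sugar. It
uses \code{flatMap} and \code{withFilter}, two standard collection
functions that are not otherwise explained in this handout; the names
\code{sugared} and \code{desugared} are made up:

\begin{lstlisting}[numbers=none]
val sugared =
  for (n <- (1 to 3).toList;
       m <- (1 to 3).toList;
       if (n + m) % 2 == 0) yield (n, m)

// roughly the translation of the code above
val desugared =
  (1 to 3).toList.flatMap(n =>
    (1 to 3).toList
      .withFilter(m => (n + m) % 2 == 0)
      .map(m => (n, m)))
\end{lstlisting}

\noindent Both values are the same list of five pairs. The outer
generator becomes a \code{flatMap}, the guard a \code{withFilter} and
the innermost generator together with the \code{yield} a \code{map}.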
| 278 | 879 | To summarise, maps (or for-comprehensions) transform one collection into | 
| 273 | 880 | another. For example a list of \code{Int}s into a list of squares, and
 | 
| 881 | so on. There is no need for for-loops in Scala. But please do not be | |
| 882 | tempted to write anything like | |
| 272 | 883 | |
| 884 | \begin{lstlisting}[numbers=none]
 | |
| 885 | scala> val cs = ('a' to 'h').toList
 | |
| 886 | scala> for (n <- (0 until cs.length).toList) | |
| 887 | yield cs(n).toUpper | |
| 888 | res8: List[Char] = List(A, B, C, D, E, F, G, H) | |
| 889 | \end{lstlisting}
 | |
| 890 | ||
| 891 | \noindent | |
| 277 | 892 | This is accepted Scala-code, but utterly bad style (it is more like | 
| 893 | Java). It can be written much more clearly as: | |
| 272 | 894 | |
| 895 | \begin{lstlisting}[numbers=none]
 | |
| 896 | scala> val cs = ('a' to 'h').toList
 | |
| 897 | scala> for (c <- cs) yield c.toUpper | |
| 898 | res9: List[Char] = List(A, B, C, D, E, F, G, H) | |
| 899 | \end{lstlisting}
 | |
| 123 | 900 | |
| 271 | 901 | \subsection*{Results and Side-Effects}
 | 
| 902 | ||
| 903 | While hopefully all this about maps looks reasonable, there is one | |
| 273 | 904 | complication: In the examples above we always wanted to transform one | 
| 905 | list into another list (e.g.~list of squares), or one set into another | |
| 906 | set (set of numbers into set of remainders modulo 3). What happens if we | |
| 907 | just want to print out a list of integers? In these cases the | |
| 908 | for-comprehensions need to be modified. The reason is that \code{print},
 | |
| 909 | you guessed it, does not produce any result, but only produces what is | |
| 910 | called, in functional-programming lingo, a \emph{side-effect}\ldots{}it
 | |
| 911 | prints something out on the screen. Printing out the list of numbers | |
| 912 | from 1 to 5 would look as follows | |
| 123 | 913 | |
| 914 | \begin{lstlisting}[numbers=none]
 | |
| 915 | scala> for (n <- (1 to 5).toList) print(n) | |
| 916 | 12345 | |
| 917 | \end{lstlisting}
 | |
| 918 | ||
| 919 | \noindent | |
| 920 | where you need to omit the keyword \code{yield}. You can
 | |
| 921 | also do more elaborate calculations such as | |
| 922 | ||
| 923 | \begin{lstlisting}[numbers=none]
 | |
| 924 | scala> for (n <- (1 to 5).toList) {
 | |
| 197 | 925 | val square = n * n | 
| 926 | println(s"$n * $n = $square") | |
| 123 | 927 | } | 
| 928 | 1 * 1 = 1 | |
| 929 | 2 * 2 = 4 | |
| 930 | 3 * 3 = 9 | |
| 931 | 4 * 4 = 16 | |
| 932 | 5 * 5 = 25 | |
| 933 | \end{lstlisting}%$
 | |
| 934 | ||
| 935 | \noindent In this code I use a variable assignment (\code{val
 | |
| 197 | 936 | square = ...} ) and also what is called in Scala a | 
| 123 | 937 | \emph{string interpolation}, written \code{s"..."}. The latter
 | 
| 938 | is for printing out an equation. It allows me to refer to the | |
| 270 | 939 | integer values \code{n} and \code{square} inside a string.
 | 
| 123 | 940 | This is very convenient for printing out ``things''. | 
| 941 | ||
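\noindent A small sketch of what string interpolation allows: besides
plain names written as \code{$n}, whole expressions can be embedded with
\code{$\{...\}}. The names \code{s1}, \code{s2} and \code{s3} are chosen
only for this illustration:

\begin{lstlisting}[numbers=none]
val n = 7
val square = n * n

// $name inserts a value; ${...} inserts the result of an expression
val s1 = s"$n * $n = $square"
val s2 = s"$n * $n = ${n * n}"
val s3 = s"The list ${(1 to 3).toList} has ${(1 to 3).length} elements"
\end{lstlisting}%$

\noindent The first two strings are both \code{7 * 7 = 49}; the third
one reads \code{The list List(1, 2, 3) has 3 elements}.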
| 942 | The corresponding map construction for functions with | |
| 943 | side-effects is in Scala called \code{foreach}. So you 
 | |
| 944 | could also write | |
| 945 | ||
| 946 | ||
| 947 | \begin{lstlisting}[numbers=none]
 | |
| 948 | scala> (1 to 5).toList.foreach(n => print(n)) | |
| 949 | 12345 | |
| 950 | \end{lstlisting}
 | |
| 951 | ||
| 952 | ||
| 953 | \noindent or even just | |
| 954 | ||
| 955 | \begin{lstlisting}[numbers=none]
 | |
| 956 | scala> (1 to 5).toList.foreach(print) | |
| 957 | 12345 | |
| 958 | \end{lstlisting}
 | |
| 959 | ||
| 273 | 960 | \noindent | 
| 123 | 961 | If you want to find out more about maps and functions with | 
| 962 | side-effects, you can ponder about the response Scala gives if | |
| 963 | you replace \code{foreach} by \code{map} in the expression
 | |
| 964 | above. Scala will still allow \code{map} with side-effect
 | |
| 965 | functions, but then reacts with a slightly interesting result. | |
| 966 | ||
| 273 | 967 | \subsection*{Aggregates}
 | 
| 968 | ||
| 969 | There is one more usage of for-loops in Java, C/C++ and the like: | |
| 970 | sometimes you want to \emph{aggregate} something about a list, for
 | |
| 278 | 971 | example summing up all its elements. In this case you cannot use maps, | 
| 273 | 972 | because maps \emph{transform} one data collection into another data
 | 
| 973 | collection. They cannot be used to generate a single integer | |
| 278 | 974 | representing an aggregate. So how is this kind of aggregation done in | 
| 975 | Scala? Let us suppose you want to sum up all elements from a list. You | |
| 976 | might be tempted to write something like | |
| 273 | 977 | |
| 978 | \begin{lstlisting}[numbers=none]
 | |
| 979 | var cnt = 0 | |
| 980 | for (n <- (1 to 8).toList) {
 | |
| 981 | cnt += n | |
| 982 | } | |
| 983 | print(cnt) | |
| 984 | \end{lstlisting}
 | |
| 985 | ||
| 986 | \noindent | |
| 277 | 987 | and indeed this is accepted Scala code and produces the expected result, | 
| 273 | 988 | namely \code{36}, \textbf{BUT} this is imperative style and not
 | 
| 277 | 989 | permitted in PEP. It uses a \code{var} and therefore violates the
 | 
| 278 | 990 | immutability property I ask for in your code. Sorry! | 
| 273 | 991 | |
| 992 | So how to do that same thing without using a \code{var}? Well there are
 | |
| 993 | several ways. One way is to define the following recursive | |
| 994 | \code{sum}-function:
 | |
| 995 | ||
| 996 | \begin{lstlisting}[numbers=none]
 | |
| 997 | def sum(xs: List[Int]) : Int = | |
| 998 | if (xs.isEmpty) 0 else xs.head + sum(xs.tail) | |
| 999 | \end{lstlisting}  
 | |
| 1000 | ||
| 1001 | \noindent | |
| 1002 | You can then call \code{sum((1 to 8).toList)} and obtain the same result
 | |
| 278 | 1003 | without a mutable variable and without a for-loop. Obviously for simple things like | 
| 277 | 1004 | sum, you could have written \code{xs.sum} in the first place. But not
 | 
| 1005 | all aggregate functions are pre-defined and often you have to write your | |
| 1006 | own recursive function for this. | |
| 273 | 1007 | |
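\noindent As a sketch of what such hand-written aggregate functions
might look like, here are two more in the same recursive style as
\code{sum}. The function name \code{evens} is made up for this
illustration, and \code{prod} is written out by hand only to show the
pattern:

\begin{lstlisting}[numbers=none]
// product of all elements; the empty list yields 1
def prod(xs: List[Int]) : Int = {
  if (xs.isEmpty) 1 else xs.head * prod(xs.tail)
}

// number of even elements in a list
def evens(xs: List[Int]) : Int = {
  if (xs.isEmpty) 0
  else if (xs.head % 2 == 0) 1 + evens(xs.tail)
  else evens(xs.tail)
}
\end{lstlisting}

\noindent Calling \code{prod((1 to 5).toList)} gives \code{120} and
\code{evens((1 to 8).toList)} gives \code{4}, again without any
\code{var}.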
| 1008 | ||
| 271 | 1009 | \subsection*{Higher-Order Functions}
 | 
| 1010 | ||
| 274 | 1011 | TBD | 
| 1012 | ||
| 123 | 1013 | \subsection*{Types}
 | 
| 1014 | ||
| 1015 | In most functional programming languages, types play an | |
| 1016 | important role. Scala is such a language. You have already | |
| 1017 | seen built-in types, like \code{Int}, \code{Boolean},
 | |
| 1018 | \code{String} and \code{BigInt}, but also user-defined ones,
 | |
| 195 | 1019 | like \code{Rexp} (see coursework). Unfortunately, types can be a thorny
 | 
| 123 | 1020 | subject, especially in Scala. For example, why do we need to | 
| 1021 | give the type to \code{toSet[Int]}, but not to \code{toList}?
 | |
| 1022 | The reason is the power of Scala, which sometimes means it | |
| 1023 | cannot infer all necessary typing information. At the | |
| 195 | 1024 | beginning, while getting familiar with Scala, I recommend a | 
| 123 | 1025 | ``play-it-by-ear-approach'' to types. Fully understanding | 
| 1026 | type-systems, especially complicated ones like in Scala, can | |
| 1027 | take a module on their own.\footnote{Still, such a study can
 | |
| 1028 | be a rewarding training: If you are in the business of | |
| 1029 | designing new programming languages, you will not be able to | |
| 1030 | turn a blind eye to types. They essentially help programmers | |
| 1031 | to avoid common programming errors and help with maintaining | |
| 1032 | code.} | |
| 1033 | ||
| 1034 | In Scala, types are needed whenever you define an inductive | |
| 1035 | datatype and also whenever you define functions (their | |
| 1036 | arguments and their results need a type). Base types are types | |
| 1037 | that do not take any (type)arguments, for example \code{Int}
 | |
| 1038 | and \code{String}. Compound types take one or more arguments,
 | |
| 1039 | which as seen earlier need to be given in square brackets, for | |
| 1040 | example \code{List[Int]} or \code{Set[List[String]]} or 
 | |
| 1041 | \code{Map[Int, Int]}.
 | |
| 1042 | ||
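\noindent To make these compound types a bit more concrete, here is a
sketch with one value of each type. The names are made up, and the type
annotations could be left out because Scala can infer them here:

\begin{lstlisting}[numbers=none]
val ns  : List[Int]         = List(1, 2, 3)
val wss : Set[List[String]] = Set(List("a", "b"), List("c"))
val sq  : Map[Int, Int]     = Map(1 -> 1, 2 -> 4, 3 -> 9)
\end{lstlisting}

\noindent If an annotation does not fit the value on the right-hand
side, for example \code{List[String]} for \code{List(1, 2, 3)}, Scala
rejects the definition with a type error.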
| 1043 | There are a few special type-constructors that fall outside | |
| 1044 | this pattern. One is for tuples, where the type is written | |
| 1045 | with parentheses. For example | |
| 1046 | ||
| 1047 | \begin{lstlisting}[ numbers=none]
 | |
| 1048 | (Int, Int, String) | |
| 1049 | \end{lstlisting}
 | |
| 1050 | ||
| 1051 | \noindent is for a triple (a tuple with three components---two | |
| 1052 | integers and a string). Tuples are helpful if you want to | |
| 1053 | define functions with multiple results, say the function | |
| 270 | 1054 | returning the quotient and remainder of two numbers. For this | 
| 123 | 1055 | you might define: | 
| 1056 | ||
| 1057 | ||
| 1058 | \begin{lstlisting}[ numbers=none]
 | |
| 1059 | def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n) | |
| 1060 | \end{lstlisting}
 | |
| 1061 | ||
| 1062 | ||
| 1063 | \noindent Since this function returns a pair of integers, its | |
| 277 | 1064 | \emph{return type} needs to be of type \code{(Int, Int)}. Incidentally,
 | 
| 1065 | this is also the \emph{input type} of this function. To see this, notice that
 | |
| 1066 | \code{quo_rem} takes \emph{two} arguments, namely \code{m} and \code{n},
 | |
| 1067 | both of which are integers. They are ``packaged'' in a pair. | |
| 1068 | Consequently the complete type of \code{quo_rem} is
 | |
| 123 | 1069 | |
| 1070 | \begin{lstlisting}[ numbers=none]
 | |
| 1071 | (Int, Int) => (Int, Int) | |
| 1072 | \end{lstlisting}
 | |
| 1073 | ||
| 277 | 1074 | This uses another special type-constructor, written as the arrow | 
| 1075 | \code{=>}. For example, the type \code{Int => String} is for a function
 | |
| 1076 | that takes an integer as input argument and produces a string as result. | |
| 1077 | A function of this type is for instance | |
| 123 | 1078 | |
| 1079 | \begin{lstlisting}[numbers=none]
 | |
| 1080 | def mk_string(n: Int) : String = n match {
 | |
| 1081 | case 0 => "zero" | |
| 1082 | case 1 => "one" | |
| 1083 | case 2 => "two" | |
| 1084 | case _ => "many" | |
| 1085 | } | |
| 1086 | \end{lstlisting}
 | |
| 1087 | ||
| 1088 | \noindent It takes an integer as input argument and returns a | |
| 277 | 1089 | string. | 
| 1090 | ||
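\noindent One way to convince yourself of these function types is to
store the functions in values that are annotated with them; Scala only
accepts the definitions if the types fit. This is a sketch, and the
names \code{f}, \code{g}, \code{qr} and \code{s} are made up:

\begin{lstlisting}[numbers=none]
def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n)

def mk_string(n: Int) : String = n match {
  case 0 => "zero"
  case 1 => "one"
  case 2 => "two"
  case _ => "many"
}

// the annotations spell out the function types discussed above
val f : (Int, Int) => (Int, Int) = quo_rem
val g : Int => String = mk_string

val qr = f(17, 5)   // quotient and remainder of 17 and 5
val s  = g(2)
\end{lstlisting}

\noindent Here \code{qr} is the pair \code{(3, 2)} and \code{s} is the
string \code{"two"}.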
| 1091 | Unfortunately, unlike in other functional programming languages, there is | |
| 1092 | no easy way in Scala to find out the types of existing functions, except | |
| 1093 | by looking into the documentation | |
| 123 | 1094 | |
| 1095 | \begin{quote}
 | |
| 1096 | \url{http://www.scala-lang.org/api/current/}
 | |
| 1097 | \end{quote}
 | |
| 1098 | ||
| 1099 | The function arrow can also be iterated, as in | |
| 1100 | \code{Int => String => Boolean}. This is the type for a function
 | |
| 1101 | taking an integer as first argument and a string as second, | |
| 1102 | and the result of the function is a boolean. Though silly, a | |
| 1103 | function of this type would be | |
| 1104 | ||
| 1105 | ||
| 1106 | \begin{lstlisting}[numbers=none]
 | |
| 1107 | def chk_string(n: Int)(s: String) : Boolean = | |
| 1108 | mk_string(n) == s | |
| 1109 | \end{lstlisting}
 | |
| 1110 | ||
| 1111 | ||
| 1112 | \noindent which checks whether the integer \code{n}
 | |
| 1113 | corresponds to the name \code{s} given by the function
 | |
| 1114 | \code{mk\_string}. Notice the unusual way of specifying the
 | |
| 1115 | arguments of this function: the arguments are given one after | |
| 1116 | the other, instead of being in a pair (what would be the type | |
| 1117 | of this function then?). This way of specifying the arguments | |
| 1118 | can be useful, for example in situations like this | |
| 1119 | ||
| 1120 | \begin{lstlisting}[numbers=none]
 | |
| 1121 | scala> List("one", "two", "three", "many").map(chk_string(2))
 | |
| 1122 | res4 = List(false, true, false, false) | |
| 1123 | ||
| 1124 | scala> List("one", "two", "three", "many").map(chk_string(3))
 | |
| 1125 | res5 = List(false, false, false, true) | |
| 1126 | \end{lstlisting}
 | |
| 1127 | ||
| 1128 | \noindent In each case we can give to \code{map} a specialised
 | |
| 1129 | version of \code{chk_string}---once specialised to 2 and once
 | |
| 1130 | to 3. This kind of ``specialising'' a function is called | |
| 1131 | \emph{partial application}---we have not yet given to this
 | |
| 1132 | function all arguments it needs, but only some of them. | |
| 1133 | ||
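\noindent A partially applied function can also be given a name and
used later. The following is a sketch; the name \code{isTwo} is made up,
and the type annotation tells Scala that a function is expected:

\begin{lstlisting}[numbers=none]
def mk_string(n: Int) : String = n match {
  case 0 => "zero"
  case 1 => "one"
  case 2 => "two"
  case _ => "many"
}

def chk_string(n: Int)(s: String) : Boolean =
  mk_string(n) == s

// supplying only the first argument leaves a function on strings
val isTwo : String => Boolean = chk_string(2)

val r1 = isTwo("two")
val r2 = isTwo("one")
val r3 = chk_string(2)("two")   // both arguments at once
\end{lstlisting}

\noindent Here \code{r1} and \code{r3} are \code{true}, while \code{r2}
is \code{false}.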
| 1134 | Coming back to the type \code{Int => String => Boolean}. The
 | |
| 1135 | rule about such function types is that the right-most type | |
| 1136 | specifies what the function returns (a boolean in this case). | |
| 1137 | The types before that specify how many arguments the function | |
| 1138 | expects and what their type is (in this case two arguments, | |
| 1139 | one of type \code{Int} and another of type \code{String}).
 | |
| 1140 | Given this rule, what kind of function has type | |
| 1141 | \mbox{\code{(Int => String) => Boolean}}? Well, it returns a
 | |
| 1142 | boolean. More interestingly, though, it only takes a single | |
| 1143 | argument (because of the parentheses). The single argument | |
| 1144 | happens to be another function (taking an integer as input and | |
| 1145 | returning a string). Remember that \code{mk_string} is just 
 | |
| 1146 | such a function. So how can we use it? For this define | |
| 1147 | the somewhat silly function \code{apply_3}:
 | |
| 1148 | ||
| 1149 | \begin{lstlisting}[numbers=none]
 | |
| 1150 | def apply_3(f: Int => String): Boolean = f(3) == "many" | |
| 1151 | ||
| 1152 | scala> apply_3(mk_string) | |
| 1153 | res6 = true | |
| 1154 | \end{lstlisting}
 | |
| 1155 | ||
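\noindent The argument of \code{apply_3} does not have to be a named
function. As a sketch, any function of type \code{Int => String} will
do, including anonymous ones written with the \code{=>} notation used
earlier for \code{map}:

\begin{lstlisting}[numbers=none]
def mk_string(n: Int) : String = n match {
  case 0 => "zero"
  case 1 => "one"
  case 2 => "two"
  case _ => "many"
}

def apply_3(f: Int => String) : Boolean = f(3) == "many"

val r1 = apply_3(mk_string)        // the named function from above
val r2 = apply_3(n => n.toString)  // anonymous: turns 3 into "3"
val r3 = apply_3(n => "many")      // anonymous: ignores its argument
\end{lstlisting}

\noindent The results are \code{true}, \code{false} and \code{true},
respectively.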
| 1156 | You might ask: Apart from silly functions like above, what is | |
| 1157 | the point of having functions as input arguments to other | |
| 1158 | functions? In Java there is indeed no need of this kind of | |
| 1159 | feature: at least in the past it did not allow such | |
| 197 | 1160 | constructions. I think the point of Java 8 and successors was to lift this | 
| 123 | 1161 | restriction. But in all functional programming languages, | 
| 1162 | including Scala, it is really essential to allow functions as | |
| 1163 | input argument. Above you have already seen \code{map} and
 | |
| 1164 | \code{foreach} which need this. Consider the functions
 | |
| 1165 | \code{print} and \code{println}, which both print out strings,
 | |
| 1166 | but the latter adds a line break. You can call \code{foreach}
 | |
| 1167 | with either of them and thus change how, for example, five | |
| 1168 | numbers are printed. | |
| 1169 | ||
| 1170 | ||
| 1171 | \begin{lstlisting}[numbers=none]
 | |
| 1172 | scala> (1 to 5).toList.foreach(print) | |
| 1173 | 12345 | |
| 1174 | scala> (1 to 5).toList.foreach(println) | |
| 1175 | 1 | |
| 1176 | 2 | |
| 1177 | 3 | |
| 1178 | 4 | |
| 1179 | 5 | |
| 1180 | \end{lstlisting}
 | |
| 1181 | ||
| 1182 | ||
\noindent This is actually one of the main design principles
in functional programming. You have generic functions like
\code{map} and \code{foreach} that can traverse data containers,
like lists or sets. They then take a function to specify what
should be done with each element during the traversal. This
requires that the generic traversal functions can cope with
any kind of function (not just functions that, for example,
take an integer as input and produce a string like above).
This means we cannot fix the types of the generic traversal
functions, but have to keep them
\emph{polymorphic}.\footnote{Polymorphism is another interesting topic about
types, but we omit it here for the sake of brevity.}

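As a rough illustration of what ``polymorphic'' means here, the
following sketch defines a traversal function for lists by hand. The
name \code{my_map} is made up for this example and this is not how the
library implements \code{map}; the point is only that the type
parameters \code{A} and \code{B} are left open, so the same function
works for any element type and any result type.

\begin{lstlisting}[numbers=none]
// A and B are type parameters; they are fixed only
// when my_map is called
def my_map[A, B](xs: List[A], f: A => B): List[B] =
  xs match {
    case Nil => Nil
    case y :: ys => f(y) :: my_map(ys, f)
  }

// integers to strings: List("1", "2", "3")
val strs = my_map(List(1, 2, 3), (n: Int) => n.toString)

// strings to integers: List(1, 2)
val lens = my_map(List("a", "bb"), (s: String) => s.length)
\end{lstlisting}
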
There is one more type that is rather special. It
is called \code{Unit}. Recall that \code{Boolean} has two
values, namely \code{true} and \code{false}. This can be used,
for example, to test something and decide whether the test
succeeds or not. In contrast, the type \code{Unit} has only a
single value, written \code{()}. This seems like a completely
useless type and return value for a function, but it is actually
quite useful. It indicates that a function does not return
any meaningful result. The purpose of such functions is to cause
something to be written on the screen or into a file,
for example. This is called a \emph{side-effect}: some new
content is displayed on the screen or some
new data is stored in a file. Scala uses the \code{Unit} type to indicate
that a function does not have a result, but potentially causes
some side-effect. Typical examples are the printing functions,
like \code{print}.

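The contrast between a function that returns a result and one that
only causes a side-effect can be sketched as follows. Both function
names are invented for this illustration.

\begin{lstlisting}[numbers=none]
// returns a result and causes no side-effect
def square(n: Int): Int = n * n

// returns no meaningful result (only the value ()),
// but prints something as a side-effect
def print_square(n: Int): Unit = println(square(n))

val r: Int = square(4)         // r is 16, nothing is printed
val u: Unit = print_square(4)  // prints 16, u is ()
\end{lstlisting}
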
\subsection*{User-Defined Types}

| 143 | 1215 | % \subsection*{Cool Stuff}
 | 
| 123 | 1216 | |
| 143 | 1217 | % The first wow-moment I had with Scala was when I came across | 
| 1218 | % the following code-snippet for reading a web-page. | |
| 123 | 1219 | |
| 1220 | ||
| 143 | 1221 | % \begin{lstlisting}[ numbers=none]
 | 
| 1222 | % import io.Source | |
| 1223 | % val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" | |
| 1224 | % Source.fromURL(url)("ISO-8859-1").take(10000).mkString
 | |
| 1225 | % \end{lstlisting}
 | |
| 123 | 1226 | |
| 1227 | ||
| 143 | 1228 | % \noindent These three lines return a string containing the | 
| 1229 | % HTML-code of my webpage. It actually already does something | |
| 1230 | % more sophisticated, namely only returns the first 10000 | |
| 1231 | % characters of a webpage in case it is too large. Why is that | |
| 1232 | % code-snippet of any interest? Well, try implementing | |
| 1233 | % reading-from-a-webpage in Java. I also like the possibility of | |
| 1234 | % triple-quoting strings, which I have only seen in Scala so | |
| 1235 | % far. The idea behind this is that in such a string all | |
| 1236 | % characters are interpreted literally---there are no escaped | |
| 1237 | % characters, like \verb|\n| for newlines. | |
| 123 | 1238 | |
| 143 | 1239 | % My second wow-moment I had with a feature of Scala that other | 
| 1240 | % functional programming languages do not have. This feature is | |
| 1241 | % about implicit type conversions. If you have regular | |
| 1242 | % expressions and want to use them for language processing you | |
| 1243 | % often want to recognise keywords in a language, for example | |
| 1244 | % \code{for},{} \code{if},{} \code{yield} and so on. But the
 | |
| 1245 | % basic regular expression \code{CHAR} can only recognise a
 | |
| 1246 | % single character. In order to recognise a whole string, like | |
| 1247 | % \code{for}, you have to put many of those together using
 | |
| 1248 | % \code{SEQ}:
 | |
| 123 | 1249 | |
| 1250 | ||
| 143 | 1251 | % \begin{lstlisting}[numbers=none]
 | 
| 1252 | % SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r')))
 | |
| 1253 | % \end{lstlisting}
 | |
| 123 | 1254 | |
| 143 | 1255 | % \noindent This gets quickly unreadable when the strings and | 
| 1256 | % regular expressions get more complicated. In other functional | |
| 1257 | % programming languages, you can explicitly write a conversion | |
| 1258 | % function that takes a string, say \dq{\pcode{for}}, and
 | |
| 1259 | % generates the regular expression above. But then your code is | |
| 1260 | % littered with such conversion functions. | |
| 123 | 1261 | |
| 143 | 1262 | % In Scala you can do better by ``hiding'' the conversion | 
| 1263 | % functions. The keyword for doing this is \code{implicit} and
 | |
| 1264 | % it needs a built-in library called | |
| 123 | 1265 | |
| 143 | 1266 | % \begin{lstlisting}[numbers=none]
 | 
| 1267 | % scala.language.implicitConversions | |
| 1268 | % \end{lstlisting}
 | |
| 123 | 1269 | |
| 143 | 1270 | % \noindent | 
| 1271 | % Consider the code | |
| 123 | 1272 | |
| 1273 | ||
| 143 | 1274 | % \begin{lstlisting}[language=Scala]
 | 
| 1275 | % import scala.language.implicitConversions | |
| 123 | 1276 | |
| 143 | 1277 | % def charlist2rexp(s: List[Char]) : Rexp = s match {
 | 
| 1278 | % case Nil => EMPTY | |
| 1279 | % case c::Nil => CHAR(c) | |
| 1280 | % case c::s => SEQ(CHAR(c), charlist2rexp(s)) | |
| 1281 | % } | |
| 123 | 1282 | |
| 143 | 1283 | % implicit def string2rexp(s: String) : Rexp = | 
| 1284 | % charlist2rexp(s.toList) | |
| 1285 | % \end{lstlisting}
 | |
| 123 | 1286 | |
| 1287 | ||
| 143 | 1288 | % \noindent where the first seven lines implement a function | 
| 1289 | % that given a list of characters generates the corresponding | |
| 1290 | % regular expression. In Lines 9 and 10, this function is used | |
| 1291 | % for transforming a string into a regular expression. Since the | |
| 1292 | % \code{string2rexp}-function is declared as \code{implicit},
 | |
| 1293 | % the effect will be that whenever Scala expects a regular | |
| 1294 | % expression, but I only give it a string, it will automatically | |
| 1295 | % insert a call to the \code{string2rexp}-function. I can now
 | |
| 1296 | % write for example | |
| 123 | 1297 | |
| 143 | 1298 | % \begin{lstlisting}[numbers=none]
 | 
| 1299 | % scala> ALT("ab", "ac")
 | |
| 1300 | % res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) | |
| 1301 | % \end{lstlisting}
 | |
| 123 | 1302 | |
| 143 | 1303 | % \noindent Recall that \code{ALT} expects two regular
 | 
| 1304 | % expressions as arguments, but I only supply two strings. The | |
| 1305 | % implicit conversion function will transform the string into a | |
| 1306 | % regular expression. | |
| 123 | 1307 | |
| 143 | 1308 | % Using implicit definitions, Scala allows me to introduce | 
| 1309 | % some further syntactic sugar for regular expressions: | |
| 123 | 1310 | |
| 1311 | ||
| 143 | 1312 | % \begin{lstlisting}[ numbers=none]
 | 
| 1313 | % implicit def RexpOps(r: Rexp) = new {
 | |
| 1314 | % def | (s: Rexp) = ALT(r, s) | |
| 1315 | % def ~ (s: Rexp) = SEQ(r, s) | |
| 1316 | % def % = STAR(r) | |
| 1317 | % } | |
| 123 | 1318 | |
| 143 | 1319 | % implicit def stringOps(s: String) = new {
 | 
| 1320 | % def | (r: Rexp) = ALT(s, r) | |
| 1321 | % def | (r: String) = ALT(s, r) | |
| 1322 | % def ~ (r: Rexp) = SEQ(s, r) | |
| 1323 | % def ~ (r: String) = SEQ(s, r) | |
| 1324 | % def % = STAR(s) | |
| 1325 | % } | |
| 1326 | % \end{lstlisting}
 | |
| 123 | 1327 | |
| 1328 | ||
| 143 | 1329 | % \noindent This might seem a bit overly complicated, but its effect is | 
| 1330 | % that I can now write regular expressions such as $ab + ac$ | |
| 1331 | % simply as | |
| 123 | 1332 | |
| 1333 | ||
| 143 | 1334 | % \begin{lstlisting}[numbers=none]
 | 
| 1335 | % scala> "ab" | "ac" | |
| 1336 | % res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) | |
| 1337 | % \end{lstlisting}
 | |
| 123 | 1338 | |
| 1339 | ||
| 143 | 1340 | % \noindent I leave you to figure out what the other | 
| 1341 | % syntactic sugar in the code above stands for. | |
| 123 | 1342 | |
| 143 | 1343 | % One more useful feature of Scala is the ability to define | 
| 1344 | % functions with varying argument lists. This is a feature that | |
| 1345 | % is already present in old languages, like C, but seems to have | |
| 1346 | % been forgotten in the meantime---Java does not have it. In the | |
| 1347 | % context of regular expressions this feature comes in handy: | |
| 1348 | % Say you are fed up with writing many alternatives as | |
| 123 | 1349 | |
| 1350 | ||
| 143 | 1351 | % \begin{lstlisting}[numbers=none]
 | 
| 1352 | % ALT(..., ALT(..., ALT(..., ...))) | |
| 1353 | % \end{lstlisting}
 | |
| 123 | 1354 | |
| 1355 | ||
| 143 | 1356 | % \noindent To make it difficult, you do not know how deep such | 
| 1357 | % alternatives are nested. So you need something flexible that | |
| 1358 | % can take as many alternatives as needed. In Scala one can | |
| 1359 | % achieve this by adding a \code{*} to the type of an argument.
 | |
| 1360 | % Consider the code | |
| 123 | 1361 | |
| 1362 | ||
| 143 | 1363 | % \begin{lstlisting}[language=Scala]
 | 
| 1364 | % def Alts(rs: List[Rexp]) : Rexp = rs match {
 | |
| 1365 | % case Nil => NULL | |
| 1366 | % case r::Nil => r | |
| 1367 | % case r::rs => ALT(r, Alts(rs)) | |
| 1368 | % } | |
| 123 | 1369 | |
| 143 | 1370 | % def ALTS(rs: Rexp*) = Alts(rs.toList) | 
| 1371 | % \end{lstlisting}
 | |
| 123 | 1372 | |
| 1373 | ||
| 143 | 1374 | % \noindent The function in Lines 1 to 5 takes a list of regular | 
| 1375 | % expressions and converts it into an appropriate alternative | |
| 1376 | % regular expression. In Line 7 there is a wrapper for this | |
| 1377 | % function which uses the feature of varying argument lists. The | |
| 1378 | % effect of this code is that I can write the regular | |
| 1379 | % expression for keywords as | |
| 123 | 1380 | |
| 1381 | ||
| 143 | 1382 | % \begin{lstlisting}[numbers=none]
 | 
| 1383 | % ALTS("for", "def", "yield", "implicit", "if", "match", "case")
 | |
| 1384 | % \end{lstlisting}
 | |
| 123 | 1385 | |
| 1386 | ||
| 143 | 1387 | % \noindent Again I leave it to you to find out how much this | 
| 1388 | % simplifies the regular expression in comparison with if I had | |
| 1389 | % to write this by hand using only the ``plain'' regular | |
| 1390 | % expressions from the inductive datatype. | |
| 1391 | ||
| 197 | 1392 | %\bigskip\noindent | 
| 1393 | %\textit{More TBD.}
 | |
| 123 | 1394 | |
| 197 | 1395 | %\subsection*{Coursework}
 | 
| 181 | 1396 | |
| 195 | 1397 | |
| 1398 | ||
\subsection*{More Info}

There is much more to Scala than I can possibly describe in
this document and teach in the lectures. Fortunately there are a
number of free books
about Scala and of course lots of help online. For example:

\begin{itemize}
\item \url{http://www.scala-lang.org/docu/files/ScalaByExample.pdf}
\item \url{http://www.scala-lang.org/docu/files/ScalaTutorial.pdf}
\item \url{https://www.youtube.com/user/ShadowofCatron}
\item \url{http://docs.scala-lang.org/tutorials}
\item \url{https://www.scala-exercises.org}
\item \url{https://twitter.github.io/scala_school}
\end{itemize}

\noindent There is also an online course at Coursera on Functional
Programming Principles in Scala by Martin Odersky, the main
developer of the Scala language. There is also a document that explains
Scala for Java programmers:

\begin{itemize}
\item \small\url{http://docs.scala-lang.org/tutorials/scala-for-java-programmers.html}
\end{itemize}

While I am quite enthusiastic about Scala, I am also happy to
admit that it has more than its fair share of faults. The
problem seen earlier of having to give an explicit type to
\code{toSet}, but not to \code{toList}, is one of them. There are
also many ``deep'' ideas about types in Scala, which even to
me as a seasoned functional programmer are puzzling. Whilst
implicits are great, they can also be a source of great
headaches. The type system can surprise you in other ways too;
for example, consider the code:

\begin{lstlisting}[numbers=none]
scala> List (1, 2, 3) contains "your mom"
res1: Boolean = false
\end{lstlisting}

\noindent Rather than returning \code{false}, this code should
be rejected with a type error. There are also many limitations Scala
inherited from the JVM that can be really annoying, for
example a fixed stack size. One can work around this
particular limitation, but why does one have to?
More such `puzzles' can be found at

\begin{center}
  \url{http://scalapuzzlers.com} and
  \url{http://latkin.org/blog/2017/05/02/when-the-scala-compiler-doesnt-help/}
\end{center}

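To give an idea of what working around the fixed stack size can look
like, here is a small sketch. It assumes the workaround is to rewrite
a recursive function in tail-recursive form; this is only one common
option (another is to start the JVM with a larger stack). The function
names are made up for this example.

\begin{lstlisting}[numbers=none]
import scala.annotation.tailrec

// not tail-recursive: every call needs its own stack frame,
// so a large n may end in a StackOverflowError
def sum_to(n: Long): Long =
  if (n == 0) 0L else n + sum_to(n - 1)

// tail-recursive: the compiler turns the recursion into a
// loop; @tailrec makes it report an error if it cannot
@tailrec
def sum_acc(n: Long, acc: Long = 0L): Long =
  if (n == 0) acc else sum_acc(n - 1, acc + n)

val small = sum_to(100)      // 5050
val big = sum_acc(1000000)   // 500000500000
\end{lstlisting}

\noindent Note that inside a class, \code{@tailrec} is only accepted
for methods that cannot be overridden (for example \code{final} or
\code{private} ones).
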
Even though Scala has been a success in several high-profile companies,
there is also a company (Yammer) that first used Scala in its
production code, but then moved away from it. Allegedly they did not
like the steep learning curve of Scala and also that new versions of
Scala often introduced incompatibilities with old code. Also the Java
language is lately developing at lightning speed (in comparison to the past),
taking on many
features of Scala and other languages, and it even seems to introduce
new features of its own.

| 152 | 1460 | %So all in all, Scala might not be a great teaching language, | 
| 1461 | %but I hope this is mitigated by the fact that I never require | |
| 1462 | %you to write any Scala code. You only need to be able to read | |
| 1463 | %it. In the coursework you can use any programming language you | |
| 1464 | %like. If you want to use Scala for this, then be my guest; if | |
| 1465 | %you do not want, stick with the language you are most familiar | |
| 1466 | %with. | |
| 123 | 1467 | |
| 1468 | ||
\subsection*{Conclusion}

I hope you liked the short journey through the Scala language, but remember we
would like you to take on board the functional programming point of view,
rather than just learning another language. There is an interesting
blog article about Scala by a convert:

\begin{center}
\url{https://www.skedulo.com/tech-blog/technology-scala-programming/}
\end{center}

\noindent
He makes pretty much the same arguments about functional programming and
immutability (one section is teasingly called \textit{``Where Did all
the Bugs Go?''}). If you happen to moan about all the idiotic features
of Scala, well, I guess this is part of the package according to this
quote:\bigskip

| 1487 | %\begin{itemize}
 | |
| 1488 | %\item no exceptions....there two kinds, one ``global'' exceptions, like | |
| 1489 | %out of memory (not much can be done about this by the ``individual'' | |
| 1490 | %programmer); and ``local one'' open a file that might not exists - in | |
| 1491 | %the latter you do not want to use exceptions, but Options | |
| 1492 | %\end{itemize}
 | |
| 123 | 1493 | |
\begin{flushright}\it
There are only two kinds of languages: the ones people complain
about\\ and the ones nobody uses.\smallskip\\
\mbox{}\hfill\small{}---Bjarne Stroustrup (the inventor of C++)
\end{flushright}

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: