handouts/pep-ho.tex
changeset 272 da3d30ae67ec
parent 271 48e12e7aee6e
child 273 b19227660752
--- a/handouts/pep-ho.tex	Tue Aug 06 12:46:27 2019 +0100
+++ b/handouts/pep-ho.tex	Wed Aug 07 14:32:21 2019 +0100
@@ -2,9 +2,12 @@
 \documentclass{article}
 \usepackage{../style}
 \usepackage{../langs}
+\usepackage{tikz}
+\usepackage{pgf}
 \usepackage{marvosym}
 \usepackage{boxedminipage}
 
+
 %cheat sheet
 %http://worldline.github.io/scala-cheatsheet/
 
@@ -119,12 +122,13 @@
 \includegraphics[scale=0.15]{../pics/vscode.png}\\[-10mm]\mbox{}
 \end{center}
 \caption{My installation of VS Code includes the following
-  packages from Marketplace: \textbf{Scala Syntax (official)} 0.2.0,
-  \textbf{Code Runner} 0.9.5, \textbf{Code Spell Checker} 1.6.10,
+  packages from Marketplace: \textbf{Scala Syntax (official)} 0.3.4,
+  \textbf{Code Runner} 0.9.12, \textbf{Code Spell Checker} 1.7.17,
   \textbf{Rewrap} 1.9.1 and \textbf{Subtle Match
   Brackets} 3.0.0. I have also bound the keys \keys{Ctrl} \keys{Ret} to the
   action ``Run-Selected-Text-In-Active-Terminal'' in order to quickly
-  evaluate small code snippets in the Scala REPL.\label{vscode}}
+  evaluate small code snippets in the Scala REPL. I use the internal
+  terminal to run Scala.\label{vscode}}
 \end{boxedminipage}
 \end{figure}  
 
@@ -161,7 +165,7 @@
 when you have a million-lines codebase than with our small
 ``toy-programs''\ldots{}for example why on earth am I required to create a
 completely new project with several subdirectories when I just want to
-try out 20-lines of Scala code? Your mileage may vary though. ;o)
+try out 20-lines of Scala code? Your mileage may vary though.~\texttt{;o)}
 
 \subsection*{Why Functional Programming?}
 
@@ -185,7 +189,7 @@
 quite different from what you are  used to in your study so far. It
 might even be totally alien to you. The reason is that functional
 programming seems to go against the core principles of
-\textit{imperative programming} (which is what you do in Java and C++
+\textit{imperative programming} (which is what you do in Java and C/C++
 for example). The main idea of imperative programming  is that you have
 some form of \emph{state} in your program and you continuously change this
 state by issuing some commands---for example for updating a field in an
@@ -242,7 +246,7 @@
 is critical because you do not want that conflicting writes mess about
 with \texttt{i}. Take my word: an untold amount of misery has arisen
 from this problem. The catch is that if you try to solve this problem in
-C++ or Java, and be as defensive as possible about reads and writes to
+C/C++ or Java, and be as defensive as possible about reads and writes to
 \texttt{i}, then you need to synchronise access to it. The result is that
 very often your program waits more than it runs, thereby
 defeating the point of trying to run the program in parallel in the
@@ -264,7 +268,7 @@
 fact that going from the sequential version of the Mandelbrot program
 to the parallel version can be achieved by adding just eight
 characters---in two places you have to add \texttt{.par}. Try the same
-in C++ or Java!
+in C/C++ or Java!
 
 \begin{figure}[p]
 \begin{boxedminipage}{\textwidth}
@@ -503,9 +507,9 @@
 \end{lstlisting}
 
 \noindent
-As can be seen we first define \code{x} and {y} with some silly
+As can be seen, we first define \code{x} and {y} with admittedly some silly
 expressions, and then reuse these values in the definition of \code{z}.
-All easy, no? Why the kerfuffle about values? Well, values are
+All easy, right? Why the kerfuffle about values? Well, values are
 \emph{immutable}. You cannot change their value after you defined them.
 If you try to reassign \code{z} above, Scala will yell at you:
 
@@ -534,7 +538,7 @@
 names you should reserve for what is called \emph{constructors}. And 
 forgive me when I call values as variables\ldots{}it is just something that
 has been in imprinted into my developer-DNA during my early days and
-difficult to get rid of.~\texttt{;o)}  
+is difficult to get rid of.~\texttt{;o)}  
 
 
 \subsection*{Function Definitions}
@@ -552,7 +556,8 @@
 \code{EXPR} (whatever is substituted for this). Since we declared
 \code{String}, the result of this function will be of type
 \code{String}. It is a good habit to always include this information
-about the return type. Simple examples of Scala functions are:
+about the return type, while it is only strictly necessary to give this
+type in recursive functions. Simple examples of Scala functions are:
 
 \begin{lstlisting}[numbers=none]
 def incr(x: Int) : Int = x + 1
@@ -584,7 +589,7 @@
 \end{lstlisting}
 
 \noindent
-We could also have written this as
+We could also have written this with braces as
 
 \begin{lstlisting}[numbers=none]
 def fact(n: Int) : Int = {
@@ -594,14 +599,81 @@
 \end{lstlisting}
 
 \noindent
-but this seems a bit over-kill for a small function like \code{fact}.
+but this seems a bit overkill for a small function like \code{fact}.
 Note that Scala does not have a \code{then}-keyword in an \code{if}-statement.
 Note also that there are a few other ways of how to define a function. We 
-will see some in the next sections.
+will see some of them in the next sections.
+
+Before we go on, let me explain one tricky point in function
+definitions, especially in larger definitions. What does a Scala function
+actually return? Scala has a \code{return} keyword, but it is
+used for something different than in Java (and C/C++). Therefore please
+make sure no \code{return} slips into your Scala code.
+
+So in the absence of \code{return}, what value does a Scala function
+actually produce? A rule-of-thumb is whatever is in the last line of the
+function is the value that will be returned. Consider the following
+example:\footnote{We could have written this function in just one line,
+but for the sake of argument lets keep the two intermediate values.}
+
+\begin{lstlisting}[numbers=none]
+def iaverage(xs: List[Int]) : Int = {
+  val s = xs.sum
+  val n = xs.length
+  s / n
+}    
+\end{lstlisting}
+
+\noindent In this example the expression \code{s / n} is in the last
+line of the function---so this will be the result the function
+calculates. The two lines before just calculate intermediate values.
+This principle of the `last-line' comes in handy when you need to print
+out values, for example, for debugging purposes. Suppose you want
+rewrite the function as
+
+\begin{lstlisting}[numbers=none]
+def iaverage(xs: List[Int]) : Int = {
+  val s = xs.sum
+  val n = xs.length
+  val h = xs.head
+  println(s"Input $xs with first element $h")
+  s / n
+}    
+\end{lstlisting}
+
+\noindent
+Here the function still only returns the expression in the last line.
+The \code{println} before just prints out some information about the
+input of this function, but does not contribute to the result of the
+function. Similarly, the value \code{h} is used in the \code{println}
+but does not contribute to what integer is returned. However note that
+the idea with the ``last line'' is only a rough rule-of-thumb. A better
+rule is probably, the last expression that is evaluated in the function.
+Consider the following version of \code{iaverage}:
+
+\begin{lstlisting}[numbers=none]
+def iaverage(xs: List[Int]) : Int = {
+  if (xs.length == 0) 0
+  else xs.sum / xs.length
+}    
+\end{lstlisting}
+
+\noindent
+What does this function return? Well are two possibilities: either the
+result of \code{xs.sum / xs.length} in the last line provided the list
+\code{xs} is nonempty, \textbf{or} if the list is empty, then it will
+return \code{0} from the \code{if}-branch (which is technically not the
+last line, but the last expression evaluated by the function in the
+empty-case).
+
+Summing up, do not use \code{return} in your Scala code! A function
+returns what is evaluated by the function as the last expression. There
+is always only one such last expression. Previous expressions might
+calculate intermediate values, but they are not returned.
 
 \subsection*{Loops, or better the Absence thereof}
 
-Coming from Java or C++, you might be surprised that Scala does
+Coming from Java or C/C++, you might be surprised that Scala does
 not really have loops. It has instead, what is in functional
 programming called, \emph{maps}. To illustrate how they work,
 let us assume you have a list of numbers from 1 to 8 and want to
@@ -620,11 +692,64 @@
 by the square. Right? In Scala, and in other functional
 programming languages, you use maps to achieve the same. 
 
-A map essentially takes a function that describes how each
-element is transformed (for example squared) and a list over
-which this function should work. There are two forms to
-express such maps in Scala. The first way is called a
-\emph{for-comprehension}. Squaring the numbers from 1 to 8
+A map essentially takes a function that describes how each element is
+transformed (in this example the function is $n \rightarrow n * n$) and
+a list over which this function should work. Pictorially you can think
+of the idea behind maps as follows:
+
+\begin{center}
+\begin{tikzpicture}
+                      
+  \node (A0) at (1.2,0) {\texttt{List(}};
+  \node (A1) at (2.0,0) {\texttt{1\makebox[0mm]{ ,}}};
+  \node (A2) at (2.9,0) {\texttt{2\makebox[0mm]{ ,}}};
+  \node (A3) at (3.8,0) {\texttt{3\makebox[0mm]{ ,}}};
+  \node (A4) at (4.7,0) {\texttt{4\makebox[0mm]{ ,}}};
+  \node (A5) at (5.6,0) {\texttt{5\makebox[0mm]{ ,}}};
+  \node (A6) at (6.5,0) {\texttt{6\makebox[0mm]{ ,}}};
+  \node (A7) at (7.4,0) {\texttt{7\makebox[0mm]{ ,}}};
+  \node (A8) at (8.3,0) {\texttt{8)}};
+
+  \node (B0) at (1.2,-3) {\texttt{List(}};
+  \node (B1) at (2.0,-3) {\texttt{1\makebox[0mm]{ ,}}};
+  \node (B2) at (3.0,-3) {\texttt{4\makebox[0mm]{ ,}}};
+  \node (B3) at (4.1,-3) {\texttt{9\makebox[0mm]{ ,}}};
+  \node (B4) at (5.2,-3) {\texttt{16\makebox[0mm]{ ,}}};
+  \node (B5) at (6.3,-3) {\texttt{25\makebox[0mm]{ ,}}};
+  \node (B6) at (7.4,-3) {\texttt{36\makebox[0mm]{ ,}}};
+  \node (B7) at (8.4,-3) {\texttt{49\makebox[0mm]{ ,}}};
+  \node (B8) at (9.4,-3) {\texttt{64\makebox[0mm]{ )}}};
+
+  \draw [->,line width=1mm] (A1.south) -- (B1.north);
+  \draw [->,line width=1mm] (A2.south) -- (B2.north);
+  \draw [->,line width=1mm] (A3.south) -- (B3.north);
+  \draw [->,line width=1mm] (A4.south) -- (B4.north);
+  \draw [->,line width=1mm] (A5.south) -- (B5.north);
+  \draw [->,line width=1mm] (A6.south) -- (B6.north);
+  \draw [->,line width=1mm] (A7.south) -- (B7.north);
+  \draw [->,line width=1mm] (A8.south) -- (B8.north);
+
+  \node [red] (Q0) at (-0.3,0) {\large\texttt{n}}; 
+  \node (Q1) at (-0.3,-0.1) {};
+  \node (Q2) at (-0.3,-2.8) {};
+  \node [red] (Q3) at (-0.3,-2.95) {\large\texttt{n\,*\,n}};
+  \draw [->,red,line width=1mm] (Q1.south) -- (Q2.north);
+
+  \node [red] at (-1.3,-1.5) {\huge{}\it\textbf{map}};
+ \end{tikzpicture}
+\end{center}
+
+\noindent
+On top is the ``input'' list we want to transform; on the left is the
+``map'' function for how to transform each element in the input list
+(the square function in this case); at the bottom is the result list of
+the map. This means that a map produces a \emph{new} list as a result,
+unlike a for-loop in Java or C/C++ which would most likely update the list
+exists in memory after the map.
+
+Now there are two ways to express such maps in Scala. The first way is
+called a \emph{for-comprehension}. The keywords are \code{for} and
+\code{yield}. Squaring the numbers from 1 to 8 with a for-comprehension
 would look as follows:
 
 \begin{lstlisting}[numbers=none]
@@ -632,38 +757,45 @@
 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64)
 \end{lstlisting}
 
-\noindent The important keywords are \code{for} and
-\code{yield}. This for-comprehension roughly states that from
-the list of numbers we draw elements that are given the name 
-\code{n} and compute the result
-of \code{n * n}. This is a simple example---what comes after 
+\noindent  This for-comprehension states that from the list of numbers
+we draw elements that are given the name \code{n} (which can be
+arbitrary, not just \code{n}) and compute the result of \code{n * n}.
+This way of writing a map resembles a bit the for-loops from imperative
+languages, even though the idea behind for-loops and for-comprehensions
+is quite different. Also, this is a simple example---what comes after
 \code{yield} can be a complex expression enclosed in \texttt{\{...\}}.
-As you can see, we specified the list where
-each \code{n} comes from, namely \code{(1 to 8).toList}, and
-how each element needs to be transformed. This can also be
-expressed in a second way in Scala by using directly
-\code{map}s as follows:
+An example might be
+
+\begin{lstlisting}[numbers=none]
+scala> for (n <- (1 to 8).toList) yield {
+         val i = n + 1
+         val j = n - 1
+         i * j
+       }
+res3: List[Int] = List(0, 3, 8, 15, 24, 35, 48, 63)
+\end{lstlisting}
+
+As you can see in for-comprehensions above, we specified the list where
+each \code{n} comes from, namely \code{(1 to 8).toList}, and how each
+element needs to be transformed. This can also be expressed in a second
+way in Scala by using directly the function \code{map} as follows:
 
 \begin{lstlisting}[numbers=none]
 scala> (1 to 8).toList.map(n => n * n)
 res3 = List(1, 4, 9, 16, 25, 36, 49, 64)
 \end{lstlisting}
 
-\noindent In this way, the expression \code{n => n * n} stands
-for the function that calculates the square (this is how the
-\code{n}s are transformed). This expression for functions
-might remind you of your lessons about the lambda-calculus
-where this would have been written as $\lambda n.\,n * n$. It
-might not be obvious, but for-comprehensions are just
-syntactic sugar: when compiling, Scala translates
-for-comprehensions into equivalent maps. This even works
-when for-comprehensions get more complicated (see below).
+\noindent In this way, the expression \code{n => n * n} stands for the
+function that calculates the square (this is how the \code{n}s are
+transformed by the map).  It might not be obvious, but
+for-comprehensions above are just syntactic sugar: when compiling, Scala
+translates for-comprehensions into equivalent maps. This even works when
+for-comprehensions get more complicated (see below).
 
 The very charming feature of Scala is that such maps or
-for-comprehensions can be written for any kind of data
-collection, such as lists, sets, vectors, options and so on.
-For example if we instead compute the remainders modulo 3 of
-this list, we can write
+for-comprehensions can be written for any kind of data collection, such
+as lists, sets, vectors, options and so on. For example if we instead
+compute the remainders modulo 3 of this list, we can write
 
 \begin{lstlisting}[numbers=none]
 scala> (1 to 8).toList.map(n => n % 3)
@@ -704,8 +836,9 @@
 \end{lstlisting}
 
 \noindent 
-Or if we want to find all pairs of numbers between 1 and 3
-where the sum is an even number, we can write
+In this example the for-comprehension ranges over two lists, and
+produces a list of pairs as output. Or if we want to find all pairs of
+numbers between 1 and 3 where the sum is an even number, we can write
 
 \begin{lstlisting}[numbers=none]
 scala> for (n <- (1 to 3).toList; 
@@ -714,8 +847,31 @@
 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3))
 \end{lstlisting}
 
-\noindent The \code{if}-condition in the for-comprehension
-filters out all pairs where the sum is not even.
+\noindent The \code{if}-condition in this for-comprehension filters out
+all pairs where the sum is not even (therefore \code{(1, 2)} is not in
+the result because the sum is odd). 
+
+To sum up, maps (or for-comprehensions) transform one collection into
+another. For example a list of \code{Int}s into a list of squares, or a
+list of \code{Int}s into a set of \code{Int}s and so on. There is no need
+for for-loops in Scala. But please do not be tempted to write anything like
+
+\begin{lstlisting}[numbers=none]
+scala> val cs = ('a' to 'h').toList
+scala> for (n <- (0 until cs.length).toList) 
+          yield cs(n).capitalize
+res8: List[Char] = List(A, B, C, D, E, F, G, H)
+\end{lstlisting}
+
+\noindent
+This is accepted Scala-code, but utterly bad style. It can be written
+much clearer as:
+
+\begin{lstlisting}[numbers=none]
+scala> val cs = ('a' to 'h').toList
+scala> for (c <- cs) yield c.capitalize
+res9: List[Char] = List(A, B, C, D, E, F, G, H)
+\end{lstlisting}
 
 \subsection*{Results and Side-Effects}
 
@@ -990,6 +1146,7 @@
 some side-effect. Typical examples are the printing functions, 
 like \code{print}.
 
+\subsection*{User-Defined Types}
 
 % \subsection*{Cool Stuff}