--- a/handouts/pep-ho.tex Sat Jun 16 15:13:48 2018 +0100
+++ b/handouts/pep-ho.tex Sat Jun 16 20:54:58 2018 +0100
@@ -2,6 +2,7 @@
\usepackage{../style}
\usepackage{../langs}
\usepackage{marvosym}
+\usepackage{boxedminipage}
%cheat sheet
%http://worldline.github.io/scala-cheatsheet/
@@ -16,7 +17,7 @@
\mbox{}\hfill\textit{``Scala --- \underline{S}lowly \underline{c}ompiled
\underline{a}cademic \underline{la}nguage''}\smallskip\\
-\mbox{}\hfill\textit{ --- a joke on Twitter}\bigskip
+\mbox{}\hfill\textit{ --- a joke(?) on Twitter}\bigskip
\noindent
Scala is a programming language that combines functional and
@@ -37,7 +38,7 @@
of companies---the Guardian, Twitter, Coursera, FourSquare, LinkedIn,
Netflix to name a few---either use Scala exclusively in production code,
or at least to some substantial degree. Scala seems also useful in
-job-interviews (especially in Data Science) according to this anecdotal
+job-interviews (especially in data science) according to this anecdotal
report
\begin{quote}
@@ -52,7 +53,7 @@
\end{quote}
\noindent
-I found a convenient IDE for Scala programming is Microsoft's
+I found that a convenient IDE for writing Scala programs is Microsoft's
\textit{Visual Studio Code} (VS Code) which runs under MacOSX, Linux and
obviously Windows. It can be downloaded for free from
@@ -64,32 +65,34 @@
and should already come pre-installed in the Department (together with
the Scala compiler). VS Code is far from perfect, however it includes a
\textit{Marketplace} from which a multitude of extensions can be
-downloaded that make editing and running Scala code easier (see
-Figure~\ref{vscode} for my own setup).
+downloaded that make editing and running Scala code a little easier (see
+Figure~\ref{vscode} for my setup).
\begin{figure}[t]
+\begin{boxedminipage}{\textwidth}
\begin{center}
\includegraphics[scale=0.15]{../pics/vscode.png}\\[-10mm]\mbox{}
\end{center}
\caption{My personal installation of VS Code includes the following
packages from Marketplace: Scala Syntax (official), Code Runner, Code
Spell Checker, Rewrap and Subtle Match Brackets. I have also bound keys
-\keys{\^{}} \keys{Ret} to the action
+\keys{Ctrl} \keys{Ret} to the action
``Run-Selected-Text-In-Active-Terminal'' in order to quickly evaluate
small code snippets in the Scala REPL.\label{vscode}}
+\end{boxedminipage}
\end{figure}
-What I like most about VS Code is that it provides an easy access to the
+What I like most about VS Code is that it provides easy access to the
Scala REPL. But if you prefer your own editor for coding, it
is also painless to work with Scala completely on the command line (like you
might have done with \texttt{g++} in the earlier part of PEP). For the
lazybones among us, there is even an online editor and environment for
developing and running Scala programs called \textit{ScalaFiddle}, which
requires zero setup (assuming you have a browser handy). You can access
-it from:
+it from:
\begin{quote}
-\url{https://scalafiddle.io}
+\url{https://scalafiddle.io}\medskip
\end{quote}
@@ -108,14 +111,14 @@
when you have a million-lines codebase, rather than our
``toy-programs''\ldots{}for example why on earth am I required to create a
completely new project with several subdirectories when I just want to
-try out 20-lines of Scala code? ;o)
+try out 20 lines of Scala code? Your mileage may vary. ;o)
\subsection*{Why Functional Programming?}
Before we go on, let me explain a bit more why we want to inflict
upon you another programming language. You hopefully have mastered Java and
C++\ldots{}the world should be your oyster, no? Well, it is not that
-easy. We require Scala in PEP, but actually we do not deeply care
+easy. We do require Scala in PEP, but actually we do not deeply care
whether you learn Scala---after all it is just a programming language
(albeit a nifty one IMHO). What we do care about is that you learn about
\textit{functional programming}. Scala is just the vehicle for that.
@@ -132,88 +135,108 @@
of programming is a \texttt{for}-loop, say
\begin{lstlisting}[language=C,numbers=none]
- for (int i = 10; i < 20; i++) {
+for (int i = 10; i < 20; i++) {
...Do something interesting with i...
- }
+}
\end{lstlisting}
-\noindent Here the variable \texttt{i} embodies the state, which is
-first set to \texttt{10} and then increased by one in each
-loop-iteration until it reaches \texttt{20} when the loop is exited.
-When this code is compiled and actually runs, there will be some
-dedicated space reserved in memory which contains the value of
-\texttt{i}\ldots\texttt{10} at the beginning, and then the content will
-be updated, or replaced, by some new content in every iteration. The
-main point here is that this kind of updating, or manipulating, memory
-is \textbf{PURE EVIL}!!
+\noindent Here the integer variable \texttt{i} embodies the state, which
+is first set to \texttt{10} and then increased by one in each
+loop-iteration until it reaches \texttt{20} at which point the loop
+exits. When this code is compiled and actually runs, there will be some
+dedicated space reserved for \texttt{i} in memory which contains its
+current value\ldots\texttt{10} at the beginning, and then the content
+will be updated, or replaced, by some new content in every iteration.
+The main point here is that this kind of updating, or manipulating,
+memory is \textbf{PURE EVIL}!!
\noindent
\ldots{}Well, it is perfectly benign if you have a sequential program
that gets run instruction by instruction...nicely one after another.
This kind of running code uses a single core of your CPU and goes as
-fast as your CPU frequency (or clock-speed) allows. Unfortunately, this
-clock-speed has not much increased over the past few years and no
-dramatic increases are predicted any time soon. So you are a bit stuck,
-unlike previous generations of developers who could rely upon the fact
-that every 2 years or so their code run twice as fast (in ideal
-circumstances) because the clock-speed of their CPUs got twice as fast.
-This unfortunately does not happen any more nowadays. To get you out of
-this embarrassing situation, CPU producers pile more and more cores into
-CPUs in order to make them more powerful and potentially make software
-faster. The task for you as developer is to take somehow advantage of
-these cores by running as much of your code as possible in parallel on
-as many core you have available (typically 4 in modern laptops and
-sometimes much more on high-end machines). In this situation,
-\textit{mutable} variables like \texttt{i} above are evil, or at least a
-major nuisance. Because if you want to distribute some of the
+fast as your CPU frequency, also called clock-speed, allows. The problem
+is that this clock-speed has not increased much over the past decade and
+no dramatic increases are predicted for any time soon. So you are a bit
+stuck, unlike previous generations of developers who could rely upon the
+fact that every 2 years or so their code would run twice as fast (in
+ideal circumstances) because the clock-speed of their CPUs got twice as
+fast. This unfortunately does not happen any more nowadays. To get you
+out of this dreadful situation, CPU producers pile more and more
+cores into CPUs in order to make them more powerful and potentially make
+software faster. The task for you as a developer is to somehow take
+advantage of these cores by running as much of your code as possible in
+parallel on as many cores as you have available (typically 4 in modern
+laptops and sometimes many more on high-end machines). In this
+situation, \textit{mutable} variables like \texttt{i} above are evil, or
+at least a major nuisance: because if you want to distribute some of the
loop-iterations over the cores that are currently idle in your system,
you need to be extremely careful about who can read and write (update)
the variable \texttt{i}.\footnote{If you are of the belief that nothing
nasty can happen to \texttt{i} inside the \texttt{for}-loop, then you
need to go back over the C++ material.} Especially the writing operation
is critical because you do not want that conflicting writes mess about
-with \texttt{i}. An untold amount of misery has arisen from this
-problem. The catch is that if you try to solve this problem in C++ or
-Java, and be as defensive as possible about reads and writes to
-\texttt{i}, then you need to synchronise access to it and as a result
+with \texttt{i}. Take my word: an untold amount of misery has arisen
+from this problem. The catch is that if you try to solve this problem in
+C++ or Java, and be as defensive as possible about reads and writes to
+\texttt{i}, then you need to synchronise access to it. The result is that
your program more often than not waits more than it runs, thereby
defeating the point of trying to run the program in parallel in the
first place. If you are less defensive, then usually all hell breaks
loose by seemingly obtaining random results. And forget the idea of
being able to debug such code.
-The idea of functional programming is to eliminate any state from
-programs. Because then it is easy to parallelize the resulting programs:
-if you do not have any state, then once created all memory content stays
-unchanged and reads to such memory are absolutely safe without the need
-of any synchronisations. An example is given in Figure~\ref{mand} where
-in the absence of annoying state, Scala makes it easy to calculate the
-Mandelbrot set on as many cores of your CPU as possible. Why is it so
-easy in this example? Because each pixel in the Mandelbrot set can be
-calculated independently and the calculation does not need to update any
-variable. It is so easy in fact, that going from the sequential version
-of the program to the parallel version can be done by adding just eight
-characters. What is not to be liked about that (try the same in C++)?
+The central idea of functional programming is to eliminate any state
+from programs---or at least from the ``interesting'' (computationally
+intensive) parts. Because then it is easy to parallelize the resulting
+programs: if you do not have any state, then once created, all memory
+content stays unchanged and reads from such memory are absolutely safe
+without the need for any synchronisation. An example is given in
+Figure~\ref{mand} where in the absence of the annoying state, Scala
+makes it very easy to calculate the Mandelbrot set on as many cores of
+your CPU as possible. Why is it so easy in this example? Because each
+pixel in the Mandelbrot set can be calculated independently and the
+calculation does not need to update any variable. It is so easy in fact
+that going from the sequential version of the Mandelbrot program to the
+parallel version can be achieved by adding just eight characters.
+Try the same in C++ or Java!
\begin{figure}[p]
- \includegraphics[scale=0.15]{../pics/mand1.png}
+\begin{boxedminipage}{\textwidth}
+\begin{center}
+\begin{tabular}{c}
+\includegraphics[scale=0.15]{../pics/mand1.png}\\
+\end{tabular}
+
+Well-known Mandelbrot program for generating pretty pictures due to
+Benoit Mandelbrot. (\url{https://en.wikipedia.org/wiki/Mandelbrot_set})
+\bigskip
+
- \includegraphics[scale=0.15]{../pics/mand4.png}
- \includegraphics[scale=0.15]{../pics/mand3.png}
-\caption{\label{mand}}
+\begin{tabular}[t]{p{5cm}|p{5cm}}
+ \includegraphics[scale=0.15]{../pics/mand4.png} &
+ \includegraphics[scale=0.15]{../pics/mand3.png} \\
+\begin{minipage}{0.5\textwidth}\small
+a
+\begin{lstlisting}[numbers=none]
+ww
+\end{lstlisting}
+\end{minipage}
+& \\
+\end{tabular}
+\end{center}
+\caption{Test \label{mand}}
+\end{boxedminipage}
\end{figure}
But remember that this easy parallelisation of code requires that we
-have no state in our program\ldots{} that is no counters like\texttt{i}
+have no state in our program\ldots{} that is no counters like \texttt{i}
in \texttt{for}-loops. You might then ask, how do I write loops without
-such counters? Well, teaching you that this is possible is the main
-point of the Scala-part in PEP. I can assure you it is possible, but you
+such counters? Well, teaching you that this is possible is one of the main
+points of the Scala-part in PEP. I can assure you it is possible, but you
have to get your head around it. Once you mastered this, it will be fun
to have no state in your programs (a side product is that it much easier
-to debug state-less code; and the memory we might waste by not allowing
-in-place updates is taken care of by the memory garbage collector of
-Java and Scala).
-
+to debug state-less code and also more often than not easier to understand).
+So good luck with Scala!
\subsection*{The Very Basics}