updated
authorChristian Urban <urbanc@in.tum.de>
Tue, 09 May 2017 12:31:55 +0100
changeset 490 4fee50f38305
parent 489 e28d7a327870
child 491 d5776c6018f0
updated
handouts/ho03.pdf
handouts/ho03.tex
handouts/ho04.pdf
handouts/ho04.tex
langs.sty
progs/nfa.scala
progs/re0.scala
Binary file handouts/ho03.pdf has changed
--- a/handouts/ho03.tex	Sun May 07 03:01:29 2017 +0100
+++ b/handouts/ho03.tex	Tue May 09 12:31:55 2017 +0100
@@ -40,7 +40,7 @@
 \item $\delta$ is the transition function.
 \end{itemize}
 
-\noindent I am sure you have seen this defininition already
+\noindent I am sure you have seen this definition already
 before. The transition function determines how to ``transition'' from
 one state to the next state with respect to a character. We have the
 assumption that these transition functions do not need to be defined
@@ -122,13 +122,11 @@
 \]
 
 \noindent I let you think about a definition that describes the set of
-all strings accepted by a determinsitic finite automaton.
+all strings accepted by a deterministic finite automaton.
 
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-                  {../progs/display/dfa.scala}
+\lstinputlisting[numbers=left]{../progs/display/dfa.scala}
 \caption{A Scala implementation of DFAs using partial functions.
   Note some subtleties: \texttt{deltas} implements the delta-hat
   construction by lifting the (partial) transition  function to lists
@@ -171,8 +169,7 @@
 sink state (catch-all-state), you can use Scala's pattern matching and
 write something like
 
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+{\small\begin{lstlisting}[language=Scala]
 abstract class State
 ...
 case object Sink extends State
@@ -267,8 +264,7 @@
 mathematical definition of NFAs. An example of a transition function
 in Scala for the NFA shown above is
 
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+{\small\begin{lstlisting}[language=Scala]
 val nfa_delta : (State, Char) :=> Set[State] = 
   { case (Q0, 'a') => Set(Q1, Q2)
     case (Q0, 'b') => Set(Q0)
@@ -276,7 +272,7 @@
     case (Q1, 'b') => Set(Q0, Q1) }
 \end{lstlisting}}  
 
-\noindent Like in the mathematical definition, \texttt{starts} is in
+Like in the mathematical definition, \texttt{starts} is in
 NFAs a set of states; \texttt{fins} is again a function from states to
 booleans. The \texttt{next} function calculates the set of next states
 reachable from a single state \texttt{q} by a character~\texttt{c}. In
@@ -287,9 +283,7 @@
  
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-                  {../progs/display/nfa.scala}
+\lstinputlisting[numbers=left]{../progs/display/nfa.scala}
 \caption{A Scala implementation of NFAs using partial functions.
   Notice that the function \texttt{accepts} implements the
   acceptance of a string in a breath-first search fashion. This can be a costly
@@ -311,8 +305,7 @@
 \texttt{accepts} using Scala's \texttt{exists}-function as follows:
 
 
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+{\small\begin{lstlisting}[language=Scala]
 def search(q: A, s: List[C]) : Boolean = s match {
   case Nil => fins(q)
   case c::cs => next(q, c).exists(search(_, cs)) 
@@ -332,7 +325,7 @@
 Ruby and Python. I like to show you this in the next two sections.
 
 
-\subsubsection*{Epsilon NFAs}
+\subsection*{Epsilon NFAs}
 
 In order to get an idea what calculations are performed by Java \&
 friends, we need a method for transforming a regular expression into
@@ -389,7 +382,7 @@
 this example, if you are in the starting state $Q_0$, you can silently
 move either to $Q_1$ or $Q_2$. You can see that once you are in $Q_1$,
 respectively $Q_2$, you cannot ``go back'' to the other states. So it
-seems allowing $\epsilon$-transitions is a rather substancial
+seems allowing $\epsilon$-transitions is a rather substantial
 extension to NFAs. On first appearances, $\epsilon$-transitions might
 even look rather strange, or even silly. To start with, silent
 transitions make the decision whether a string is accepted by an
@@ -407,7 +400,7 @@
 unfortunately.  If we were to follow the many textbooks on the
 subject, we would now start with defining what $\epsilon$NFAs
 are---that would require extending the transition relation of
-NFAs. Next, we woudl show that the $\epsilon$NFAs are equivalent to
+NFAs. Next, we would show that the $\epsilon$NFAs are equivalent to
 NFAs and so on. Once we have done all this on paper, we would need to
 implement $\epsilon$NFAs. Lets try to take a shortcut instead. We are
 not really interested in $\epsilon$NFAs; they are only a convenient
@@ -464,12 +457,10 @@
 
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-                  {../progs/display/enfa.scala}
+\lstinputlisting[numbers=left]{../progs/display/enfa.scala}
 
 \caption{A Scala function that translates $\epsilon$NFA into NFAs. The
-  transtion function of $\epsilon$NFA takes as input an \texttt{Option[C]}.
+  transition function of $\epsilon$NFA takes as input an \texttt{Option[C]}.
   \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
   for a ``proper'' transition consuming a character. The functions in
   Lines 18--26 calculate
@@ -480,13 +471,12 @@
 Also look carefully how the transitions of $\epsilon$NFAs are
 implemented. The additional possibility of performing silent
 transitions is encoded by using \texttt{Option[C]} as the type for the
-``input''. The \texttt{Some}s stand for ``propper'' transitions where
+``input''. The \texttt{Some}s stand for ``proper'' transitions where
 a character is consumed; \texttt{None} stands for
 $\epsilon$-transitions. The transition functions for the two
 $\epsilon$NFAs from the beginning of this section can be defined as
 
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+{\small\begin{lstlisting}[language=Scala]
 val enfa_trans1 : (State, Option[Char]) :=> Set[State] = 
   { case (Q0, Some('a')) => Set(Q0)
     case (Q0, None) => Set(Q1, Q2)
@@ -503,7 +493,7 @@
 I hope you agree now with my earlier statement that the $\epsilon$NFAs
 are just an API for NFAs.
 
-\subsubsection*{Thompson Construction}
+\subsection*{Thompson Construction}
 
 Having the translation of $\epsilon$NFAs to NFAs in place, we can
 finally return to the problem of translating regular expressions into
@@ -543,9 +533,7 @@
 
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-                  {../progs/display/thompson1.scala}
+\lstinputlisting[numbers=left]{../progs/display/thompson1.scala}
 \caption{The first part of the Thompson Construction. Lines 7--16
   implement a way of how to create new states that are all
   distinct by virtue of a counter. This counter is
@@ -558,11 +546,9 @@
 
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-                  {../progs/display/thompson2.scala}
+\lstinputlisting[numbers=left]{../progs/display/thompson2.scala}
 \caption{The second part of the Thompson Construction implementing
-  the composition of NFAs according to $\cdot$, $+$ and $\_^*$.
+  the composition of NFAs according to $\cdot$, $+$ and ${}^*$.
   The implicit class about rich partial functions
   implements the infix operation \texttt{+++} which
   combines an $\epsilon$NFA transition with a NFA transition
@@ -650,25 +636,31 @@
 string is first recognised by the first NFA, then we silently change
 to the second NFA; the ending of the string is recognised by the
 second NFA...just like matching of a string by the regular expression
-$r_1\cdot r_2$. The Scala code for this constrction is given in
+$r_1\cdot r_2$. The Scala code for this construction is given in
 Figure~\ref{thompson2} in Lines 16--23. The starting states of the
 $\epsilon$NFA are the starting states of the first NFA (corresponding
 to $r_1$); the accepting function is the accepting function of the
 second NFA (corresponding to $r_2$). The new transition function is
 all the ``old'' transitions plus the $\epsilon$-transitions connecting
 the accepting states of the first NFA to the starting states of the
-first NFA (Lines 18 and 19). The $\epsilon$NFA is then immedately
+first NFA (Lines 18 and 19). The $\epsilon$NFA is then immediately
 translated in a NFA.
 
 
-The case for the choice regular expression $r_1 + r_2$ is slightly
-different: We are given by recursion two NFAs representing $r_1$ and
-$r_2$ respectively.
+The case for the alternative regular expression $r_1 + r_2$ is
+slightly different: We are given by recursion two NFAs representing
+$r_1$ and $r_2$ respectively. Each NFA has some starting states and
+some accepting states. We obtain a NFA for the regular expression $r_1
++ r_2$ by composing the transition functions (this crucially depends
+on knowing that the states of each component NFA are distinct); and
+also combine the starting states and accepting functions.
 
 \begin{center}
+\begin{tabular}[t]{ccc}
 \begin{tikzpicture}[node distance=3mm,
     >=stealth',very thick,
-    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.center)]
 \node at (0,0)  (1)  {$\mbox{}$};
 \node  (2)  [above=10mm of 1] {};
 \node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
@@ -691,18 +683,13 @@
 \node [yshift=3mm] at (2.north) {$r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
-
-\noindent Each NFA has some starting states and some accepting
-states. We obtain a NFA for the regular expression $r_1 + r_2$
-by composing the transition functions (this crucially depends
-on knowing that the states of each component NFA are distinct);
-and also combine the starting states and accepting functions:
-
-\begin{center}
+&
+\mbox{}\qquad\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}\quad\mbox{}
+&
 \begin{tikzpicture}[node distance=3mm,
     >=stealth',very thick,
-    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.center)]
 \node at (0,0) (1)  {$\mbox{}$};
 \node (2)  [above=10mm of 1] {$$};
 \node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
@@ -727,14 +714,23 @@
 \node [yshift=3mm] at (3.north) {$r_1+ r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
+\end{tabular}
 \end{center}
 
 \noindent The code for this construction is in Figure~\ref{thompson2}
-in Lines 25--33. Finally for the $*$-case we have a NFA for $r$
+in Lines 25--33.
+
+Finally for the $*$-case we have a NFA for $r$ and connect its
+accepting states to a new starting state via
+$\epsilon$-transitions. This new starting state is also an accepting
+state, because $r^*$ can recognise the empty string.
 
 \begin{center}
+\begin{tabular}[b]{@{\hspace{-4mm}}ccc@{}}  
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.north)]
 \node at (0,0)  (1)  {$\mbox{}$};
 \node[state, initial]  (2)  [right=16mm of 1] {$\mbox{}$};
 \node (a)  [right=of 2] {$\ldots$};
@@ -746,18 +742,13 @@
 \node [yshift=3mm] at (1.north) {$r$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
-
-\noindent and connect its accepting states to a new starting state via
-$\epsilon$-transitions. This new starting state is also an accepting
-state, because $r^*$ can recognise the empty string. This gives the
-following $\epsilon$NFA for $r^*$ (the corresponding code is in
-Figure~\ref{thompson2} in Lines 35--43:
-
-\begin{center}  
+&
+\raisebox{-16mm}{\;\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}}
+&
 \begin{tikzpicture}[node distance=3mm,
     >=stealth',very thick,
-    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.north)]
 \node at (0,0) [state, initial,accepting]  (1)  {$\mbox{}$};
 \node[state]  (2)  [right=16mm of 1] {$\mbox{}$};
 \node (a)  [right=of 2] {$\ldots$};
@@ -772,21 +763,23 @@
 \node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3)] {};
 \node [yshift=3mm] at (2.north) {$r^*$};
 \end{pgfonlayer}
-\end{tikzpicture}
+\end{tikzpicture}    
+\end{tabular}
 \end{center}
 
+\noindent 
+The corresponding code is in Figure~\ref{thompson2} in Lines 35--43)
 
-To sum ap, you can see in the sequence and star cases the need of
+To sum up, you can see in the sequence and star cases the need of
 having silent $\epsilon$-transitions. Similarly the alternative case
-shows the need of the NFA-nondeterminsim. It seems awkward to form the
+shows the need of the NFA-nondeterminism. It seems awkward to form the
 `alternative' composition of two DFAs, because DFA do not allow
 several starting and successor states. All these constructions can now
 be put together in the following recursive function:
 
 
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
-                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
-def thompson (r: Rexp) : NFAt = r match {
+{\small\begin{lstlisting}[language=Scala]
+def thompson(r: Rexp) : NFAt = r match {
   case ZERO => NFA_ZERO()
   case ONE => NFA_ONE()
   case CHAR(c) => NFA_CHAR(c)  
@@ -797,16 +790,25 @@
 \end{lstlisting}}
 
 \noindent
-It calculates a NFA from a regular expressions. At last we can run a
-NFA for the our evil regular expression examples.
+It calculates a NFA from a regular expressions. At last we can run
+NFAs for the our evil regular expression examples.  The graph on the
+left shows that when translating a regular expression such as
+$a^{\{n\}}$ into a NFA, the size can blow up and then even the
+relative fast (on small examples) breadth-first search can be
+slow. The graph on the right shows that with `evil' regular
+expressions the depth-first search can be abysmally slow. Even if the
+graphs not completely overlap with the curves of Python, Ruby and
+Java, they are similar enough. OK\ldots now you know why regular
+expression matchers in those languages are so slow. 
 
 
 \begin{center}
 \begin{tabular}{@{\hspace{-1mm}}c@{\hspace{1mm}}c@{}}  
 \begin{tikzpicture}
 \begin{axis}[
-    title={Graph: $\texttt{a?\{n\}\,a{\{n\}}}$ and strings 
+    title={Graph: $a^{?\{n\}} \cdot a^{\{n\}}$ and strings 
            $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    title style={yshift=-2ex},
     xlabel={$n$},
     x label style={at={(1.05,0.0)}},
     ylabel={time in secs},
@@ -818,9 +820,9 @@
     scaled ticks=false,
     axis lines=left,
     width=5.5cm,
-    height=4.5cm, 
-    legend entries={Python,Ruby},  
-    legend pos=south east,
+    height=4cm, 
+    legend entries={Python,Ruby, breadth-first NFA},
+    legend style={at={(0.5,-0.25)},anchor=north,font=\small},
     legend cell align=left]
   \addplot[blue,mark=*, mark options={fill=white}] table {re-python.data};
   \addplot[brown,mark=triangle*, mark options={fill=white}] table {re-ruby.data};
@@ -845,8 +847,9 @@
 &
 \begin{tikzpicture}
 \begin{axis}[
-    title={Graph: $\texttt{(a*)*\,b}$ and strings 
+    title={Graph: $(a^*)^* \cdot b$ and strings 
            $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    title style={yshift=-2ex},
     xlabel={$n$},
     x label style={at={(1.05,0.0)}},
     ylabel={time in secs},
@@ -858,9 +861,9 @@
     scaled ticks=false,
     axis lines=left,
     width=5.5cm,
-    height=4.5cm, 
-    legend entries={Python, Java},  
-    legend pos=outer north east,
+    height=4cm, 
+    legend entries={Python, Java, depth-first NFA},
+    legend style={at={(0.5,-0.25)},anchor=north,font=\small},
     legend cell align=left]
   \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
   \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};  
@@ -881,166 +884,318 @@
 
 
 
-\subsubsection*{Subset Construction}
+\subsection*{Subset Construction}
+
+Of course, some developers of regular expression matchers are aware
+of these problems with sluggish NFAs and try to address them. One
+common technique for this I like to show you in this section. It will
+also explain why we insisted on polymorphic types in our DFA code
+(remember I used \texttt{A} and \texttt{C} for the types of states and
+the input, see Figure~\ref{dfa} on Page~\pageref{dfa}).\bigskip
 
-Remember that we did not bother with defining and implementing
-$\epsilon$NFA; we immediately translated them into equivalent
-NFAs. Equivalent in the sense of accepting the same language (though
-we only claimed this and did not prove it rigorously). Remember also
-that NFAs have a non-deterministic transitions, given as a relation.
-This non-determinism makes it harder to decide when a string is
-accepted or not; such a decision is rather straightforward with DFAs
-(remember their transition function).
+\noindent
+To start, remember that we did not bother with defining and
+implementing $\epsilon$NFA; we immediately translated them into
+equivalent NFAs. Equivalent in the sense of accepting the same
+language (though we only claimed this and did not prove it
+rigorously). Remember also that NFAs have non-deterministic
+transitions defined as a relation or implemented as function returning
+sets of states.  This non-determinism is crucial for the Thompson
+Construction to work (recall the cases for $\cdot$, $+$ and
+${}^*$). But this non-determinism makes it harder with NFAs to decide
+when a string is accepted or not; such a decision is rather
+straightforward with DFAs: recall their transition function is a
+\emph{function} that returns a single state. So we do not have to
+search at all.  What is perhaps interesting is the fact that for every
+NFA we can find a DFA that also recognises the same language. This
+might sound a bit paradoxical: NFA $\rightarrow$ decision of
+acceptance hard; DFA $\rightarrow$ decision easy. But this \emph{is}
+true\ldots but of course there is always a caveat---nothing is ever
+for free in life.
 
-What is interesting is that for every NFA we can find a DFA that also
-recognises the same language. This might sound like a bit paradoxical,
-but I litke to show you this next. There are a number of ways of
-transforming a NFA into an equivalent DFA, but the most famous is
-\emph{subset construction}. Consider again the NFA below on the left,
-consisting of nodes labelled, say, with $0$, $1$ and $2$.
+There are a number of techniques for transforming a NFA into an
+equivalent DFA, but the most famous one is the \emph{subset
+  construction}. Consider the following NFA where the states are
+labelled with, say, $0$, $1$ and $2$.
 
 \begin{center}
 \begin{tabular}{c@{\hspace{10mm}}c}
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick,
                     every state/.style={minimum size=0pt,
                     draw=blue!50,very thick,fill=blue!20},
-                    baseline=0mm]
+                    baseline=(current bounding box.center)]
 \node[state,initial]  (Q_0)  {$0$};
-\node[state] (Q_1) [above=of Q_0] {$1$};
-\node[state, accepting] (Q_2) [below=of Q_0] {$2$};
-\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_1);
-\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_2);
-\path[->] (Q_0) edge [loop right] node  {$a$} ();
-\path[->] (Q_1) edge [loop above] node  {$a$} ();
-\path[->] (Q_2) edge [loop below] node  {$b$} ();
+\node[state] (Q_1) [below=of Q_0] {$1$};
+\node[state, accepting] (Q_2) [below=of Q_1] {$2$};
+
+\path[->] (Q_0) edge node [right]  {$b$} (Q_1);
+\path[->] (Q_1) edge node [right]  {$a,b$} (Q_2);
+\path[->] (Q_0) edge [loop above] node  {$a, b$} ();
 \end{tikzpicture}
 &
-\begin{tabular}{r|cl}
-nodes & $a$ & $b$\\
+\begin{tabular}{r|ll}
+states & $a$ & $b$\\
 \hline
 $\{\}\phantom{\star}$ & $\{\}$ & $\{\}$\\
-$\{0\}\phantom{\star}$       & $\{0,1,2\}$   & $\{2\}$\\
-$\{1\}\phantom{\star}$       & $\{1\}$       & $\{\}$\\
-$\{2\}\star$  & $\{\}$ & $\{2\}$\\
-$\{0,1\}\phantom{\star}$     & $\{0,1,2\}$   & $\{2\}$\\
-$\{0,2\}\star$ & $\{0,1,2\}$   & $\{2\}$\\
-$\{1,2\}\star$ & $\{1\}$       & $\{2\}$\\
-s: $\{0,1,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\
+start: $\{0\}\phantom{\star}$       & $\{0\}$   & $\{0,1\}$\\
+$\{1\}\phantom{\star}$       & $\{2\}$       & $\{2\}$\\
+$\{2\}\star$  & $\{\}$ & $\{\}$\\
+$\{0,1\}\phantom{\star}$     & $\{0,2\}$   & $\{0,1,2\}$\\
+$\{0,2\}\star$ & $\{0\}$   & $\{0,1\}$\\
+$\{1,2\}\star$ & $\{2\}$       & $\{2\}$\\
+$\{0,1,2\}\star$ & $\{0,2\}$ & $\{0,1,2\}$\\
 \end{tabular}
 \end{tabular}
 \end{center}
 
-\noindent The nodes of the DFA are given by calculating all
-subsets of the set of nodes of the NFA (seen in the nodes
-column on the right). The table shows the transition function
-for the DFA. The first row states that $\{\}$ is the
-sink node which has transitions for $a$ and $b$ to itself.
-The next three lines are calculated as follows: 
+\noindent The states of the corresponding DFA are given by generating
+all subsets of the set of states of the NFA (seen in the states column
+in the table on the right). The other columns define the transition
+function for the DFA for input $a$ and $b$. The first row states that
+$\{\}$ is the sink state which has transitions for $a$ and $b$ to
+itself.  The next three lines are calculated as follows:
 
 \begin{itemize}
-\item suppose you calculate the entry for the transition for
-      $a$ and the node $\{0\}$
-\item start from the node $0$ in the NFA
-\item do as many $\epsilon$-transition as you can obtaining a
-      set of nodes, in this case $\{0,1,2\}$
-\item filter out all notes that do not allow an $a$-transition
-      from this set, this excludes $2$ which does not permit a
-      $a$-transition
-\item from the remaining set, do as many $\epsilon$-transition
-      as you can, this yields again $\{0,1,2\}$     
-\item the resulting set specifies the transition from $\{0\}$
-      when given an $a$ 
+\item Suppose you calculate the entry for the $a$-transition for state
+  $\{0\}$. Look for all states in the NFA that can be reached by such
+  a transition from this state; this is only state $0$; therefore from
+  state $\{0\}$ we can go to state $\{0\}$ via an $a$-transition.
+\item Do the same for the $b$-transition; you can reach states $0$ and
+  $1$ in the NFA; therefore in the DFA we can go from state $\{0\}$ to
+  state $\{0,1\}$ via an $b$-transition.
+\item Continue with the states $\{1\}$ and $\{2\}$.
+\item Once you filled in the transitions for `simple' state, you only
+  have to build the union for the compound states $\{0,1\}$, $\{0,2\}$
+  and so on. For example for $\{0,1\}$ you take the union of line
+  $\{0\}$ and line $\{1\}$, which gives $\{0,2\}$ for $a$, and
+  $\{0,1,2\}$ for $b$. And so on.
+\item The starting state of the DFA can be calculated from the
+  starting states of the NFA, that is in this case $0$. But in general
+  there can be many starting states in the NFA and you would take the
+  corresponding subset as \emph{the} starting state of the DFA.
+\item The accepting states in the DFA are given by all sets that
+  contain a $2$, which is the only accpting state in this NFA. But
+  again in general if the subset contains an accepting state from the
+  NFA, then the corresponding state in the DFA is accepting as well.
 \end{itemize}
 
-\noindent So the transition from the state $\{0\}$ reading an
-$a$ goes to the state $\{0,1,2\}$. Similarly for the other
-entries in the rows for $\{0\}$, $\{1\}$ and $\{2\}$. The
-other rows are calculated by just taking the union of the
-single node entries. For example for $a$ and $\{0,1\}$ we need
-to take the union of $\{0,1,2\}$ (for $0$) and $\{1\}$ (for
-$1$). The starting state of the DFA can be calculated from the
-starting state of the NFA, that is $0$, and then do as many
-$\epsilon$-transitions as possible. This gives $\{0,1,2\}$
-which is the starting state of the DFA. The terminal states in
-the DFA are given by all sets that contain a $2$, which is the
-terminal state of the NFA. This completes the subset
-construction. So the corresponding DFA to the NFA from 
-above is
+\noindent This completes the subset construction. The corresponding
+DFA for the NFA shown above is:
 
 \begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
+\begin{tikzpicture}[scale=0.8,>=stealth',very thick,
                     every state/.style={minimum size=0pt,
                     draw=blue!50,very thick,fill=blue!20},
                     baseline=0mm]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
-\node[state,accepting] (q02) [right=of q012] {$0,2$};
-\node[state] (q01) [above=of q02] {$0,1$};
-\node[state,accepting] (q12) [below=of q02] {$1,2$};
-\node[state] (q0) [right=2cm of q01] {$0$};
-\node[state] (q1) [right=2.5cm of q02] {$1$};
-\node[state,accepting] (q2) [right=1.5cm of q12] {$2$};
-\node[state] (qn) [right=of q1] {$\{\}$};
+\node[state,initial]  (q0)  {$0$};
+\node[state] (q01) [right=of q0] {$0,1$};
+\node[state,accepting] (q02) [below=of q01] {$0,2$};
+\node[state,accepting] (q012) [right=of q02] {$0,1,2$};
+\node[state] (q1) [below=0.5cm of q0] {$1$};
+\node[state,accepting] (q2) [below=1cm of q1] {$2$};
+\node[state] (qn) [below left=1cm of q2] {$\{\}$};
+\node[state,accepting] (q12) [below right=1cm of q2] {$1,2$};
+
+\path[->] (q0) edge node [above] {$b$} (q01);
+\path[->] (q01) edge node [above] {$b$} (q012);
+\path[->] (q0) edge [loop above] node  {$a$} ();
+\path[->] (q012) edge [loop right] node  {$b$} ();
+\path[->] (q012) edge node [below] {$a$} (q02);
+\path[->] (q02) edge node [below] {$a$} (q0);
+\path[->] (q01) edge [bend left] node [left]  {$a$} (q02);
+\path[->] (q02) edge [bend left] node [right]  {$b$} (q01);
+\path[->] (q1) edge node [left] {$a,b$} (q2);
+\path[->] (q12) edge node [right] {$a, b$} (q2);
+\path[->] (q2) edge node [right] {$a, b$} (qn);
+\path[->] (qn) edge [loop left] node  {$a,b$} ();
+\end{tikzpicture}
+\end{center}
+
+\noindent
+Please check that this is indeed a DFA. The big question is whether
+this DFA can recognise the same language as the NFA we started with.
+I let you ponder about this question.
+
+
+There are also two points to note: One is that very often the
+resulting DFA contains a number of ``dead'' states that are never
+reachable from the starting state. This is obvious in this case, where
+state $\{1\}$, $\{2\}$, $\{1,2\}$ and $\{\}$ can never be reached from
+the starting state.  In effect the DFA in this example is not a
+\emph{minimal} DFA (more about this in a minute). Such dead states can
+be safely removed without changing the language that is recognised by
+the DFA. Another point is that in some cases, however, the subset
+construction produces a DFA that does \emph{not} contain any dead
+states\ldots{}and further calculates a minimal DFA. Which in turn
+means that in some cases the number of states can by going from NFAs
+to DFAs exponentially increase, namely by $2^n$ (which is the number
+of subsets you can form for $n$ states).  This blow up the number of
+states in the DFA is again bad news for how quickly you can decide
+whether a string is accepted by a DFA or not. So the caveat with DFAs
+is that they might make the task of finding the next state trival, but
+might require $2^n$ times as many states as a NFA.\bigskip
+
+Lastly, can we 
+
+{\small\begin{lstlisting}[language=Scala]
+def subset[A, C](nfa: NFA[A, C]) : DFA[Set[A], C] = {
+  DFA(nfa.starts, 
+      { case (qs, c) => nfa.nexts(qs, c) }, 
+      _.exists(nfa.fins))
+}
+\end{lstlisting}}  
+
+
+
+\subsection*{DFA Minimisation}
+
+As seen in the subset construction, the translation 
+of a NFA to a DFA can result in a rather ``inefficient'' 
+DFA. Meaning there are states that are not needed. A
+DFA can be \emph{minimised} by the following algorithm:
+
+\begin{enumerate}
+\item Take all pairs $(q, p)$ with $q \not= p$
+\item Mark all pairs that accepting and non-accepting states
+\item For all unmarked pairs $(q, p)$ and all characters $c$
+      test whether 
+      
+      \begin{center} 
+      $(\delta(q, c), \delta(p,c))$ 
+      \end{center} 
+      
+      are marked. If there is one, then also mark $(q, p)$.
+\item Repeat last step until no change.
+\item All unmarked pairs can be merged.
+\end{enumerate}
+
+\noindent To illustrate this algorithm, consider the following 
+DFA.
 
-\path[->] (q012) edge [loop below] node  {$a$} ();
-\path[->] (q012) edge node [above]  {$b$} (q2);
-\path[->] (q12) edge [bend left] node [below,pos=0.4]  {$a$} (q1);
-\path[->] (q12) edge node [below]  {$b$} (q2);
-\path[->] (q02) edge node [above]  {$a$} (q012);
-\path[->] (q02) edge [bend left] node [above, pos=0.8]  {$b$} (q2);
-\path[->] (q01) edge node [below]  {$a$} (q012);
-\path[->] (q01) edge [bend left] node [above]  {$b$} (q2);
-\path[->] (q0) edge node [below]  {$a$} (q012);
-\path[->] (q0) edge node [right, pos=0.2]  {$b$} (q2);
-\path[->] (q1) edge [loop above] node  {$a$} ();
-\path[->] (q1) edge node [above]  {$b$} (qn);
-\path[->] (q2) edge [loop right] node  {$b$} ();
-\path[->] (q2) edge node [below]  {$a$} (qn);
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick,auto,
+                    every state/.style={minimum size=0pt,
+                    inner sep=2pt,draw=blue!50,very thick,
+                    fill=blue!20}]
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
+\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
+\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
+\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
+\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
+\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
+\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
+\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
+\path[->] (Q_2) edge [loop left] node  {$b$} ();
+\path[->] (Q_3) edge [bend left=95, looseness=1.3] node 
+  [below]  {$b$} (Q_0);
+\end{tikzpicture}
+\end{center}
+
+\noindent In Step 1 and 2 we consider essentially a triangle
+of the form
+
+\begin{center}
+\begin{tikzpicture}[scale=0.6,line width=0.8mm]
+\draw (0,0) -- (4,0);
+\draw (0,1) -- (4,1);
+\draw (0,2) -- (3,2);
+\draw (0,3) -- (2,3);
+\draw (0,4) -- (1,4);
+
+\draw (0,0) -- (0, 4);
+\draw (1,0) -- (1, 4);
+\draw (2,0) -- (2, 3);
+\draw (3,0) -- (3, 2);
+\draw (4,0) -- (4, 1);
+
+\draw (0.5,-0.5) node {$Q_0$}; 
+\draw (1.5,-0.5) node {$Q_1$}; 
+\draw (2.5,-0.5) node {$Q_2$}; 
+\draw (3.5,-0.5) node {$Q_3$};
+ 
+\draw (-0.5, 3.5) node {$Q_1$}; 
+\draw (-0.5, 2.5) node {$Q_2$}; 
+\draw (-0.5, 1.5) node {$Q_3$}; 
+\draw (-0.5, 0.5) node {$Q_4$}; 
+
+\draw (0.5,0.5) node {\large$\star$}; 
+\draw (1.5,0.5) node {\large$\star$}; 
+\draw (2.5,0.5) node {\large$\star$}; 
+\draw (3.5,0.5) node {\large$\star$};
+\end{tikzpicture}
+\end{center}
+
+\noindent where the lower row is filled with stars, because in
+the corresponding pairs there is always one state that is
+accepting ($Q_4$) and a state that is non-accepting (the other
+states).
+
+Now in Step 3 we need to fill in more stars according whether 
+one of the next-state pairs are marked. We have to do this 
+for every unmarked field until there is no change anymore.
+This gives the triangle
+
+\begin{center}
+\begin{tikzpicture}[scale=0.6,line width=0.8mm]
+\draw (0,0) -- (4,0);
+\draw (0,1) -- (4,1);
+\draw (0,2) -- (3,2);
+\draw (0,3) -- (2,3);
+\draw (0,4) -- (1,4);
+
+\draw (0,0) -- (0, 4);
+\draw (1,0) -- (1, 4);
+\draw (2,0) -- (2, 3);
+\draw (3,0) -- (3, 2);
+\draw (4,0) -- (4, 1);
+
+\draw (0.5,-0.5) node {$Q_0$}; 
+\draw (1.5,-0.5) node {$Q_1$}; 
+\draw (2.5,-0.5) node {$Q_2$}; 
+\draw (3.5,-0.5) node {$Q_3$};
+ 
+\draw (-0.5, 3.5) node {$Q_1$}; 
+\draw (-0.5, 2.5) node {$Q_2$}; 
+\draw (-0.5, 1.5) node {$Q_3$}; 
+\draw (-0.5, 0.5) node {$Q_4$}; 
+
+\draw (0.5,0.5) node {\large$\star$}; 
+\draw (1.5,0.5) node {\large$\star$}; 
+\draw (2.5,0.5) node {\large$\star$}; 
+\draw (3.5,0.5) node {\large$\star$};
+\draw (0.5,1.5) node {\large$\star$}; 
+\draw (2.5,1.5) node {\large$\star$}; 
+\draw (0.5,3.5) node {\large$\star$}; 
+\draw (1.5,2.5) node {\large$\star$}; 
+\end{tikzpicture}
+\end{center}
+
+\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$
+and $Q_3$ can be merged. This gives the following minimal DFA
+
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick,auto,
+                    every state/.style={minimum size=0pt,
+                    inner sep=2pt,draw=blue!50,very thick,
+                    fill=blue!20}]
+\node[state,initial]  (Q_02)  {$Q_{0, 2}$};
+\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};
+\node[state, accepting] (Q_4) [right=of Q_13] 
+  {$Q_{4\phantom{,0}}$};
+\path[->] (Q_02) edge [bend left] node [above]  {$a$} (Q_13);
+\path[->] (Q_13) edge [bend left] node [below]  {$b$} (Q_02);
+\path[->] (Q_02) edge [loop below] node  {$b$} ();
+\path[->] (Q_13) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop above] node  {$a, b$} ();
 \end{tikzpicture}
 \end{center}
 
 
-
-There are two points to note: One is that very often the
-resulting DFA contains a number of ``dead'' nodes that are
-never reachable from the starting state. For example
-there is no way to reach node $\{0,2\}$ from the starting
-state $\{0,1,2\}$. I let you find the other dead states.
-In effect the DFA in this example is not a minimal DFA. Such
-dead nodes can be safely removed without changing the language
-that is recognised by the DFA. Another point is that in some
-cases, however, the subset construction produces a DFA that
-does \emph{not} contain any dead nodes\ldots{}that means it
-calculates a minimal DFA. Which in turn means that in some
-cases the number of nodes by going from NFAs to DFAs
-exponentially increases, namely by $2^n$ (which is the number
-of subsets you can form for $n$ nodes). 
-
-Removing all the dead states in the automaton above,
-gives a much more legible automaton, namely
-
-\begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
-                    every state/.style={minimum size=0pt,
-                    draw=blue!50,very thick,fill=blue!20},
-                    baseline=0mm]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
-\node[state,accepting] (q2) [right=of q012] {$2$};
-\node[state] (qn) [right=of q2] {$\{\}$};
-
-\path[->] (q012) edge [loop below] node  {$a$} ();
-\path[->] (q012) edge node [above]  {$b$} (q2);
-\path[->] (q2) edge [loop below] node  {$b$} ();
-\path[->] (q2) edge node [below]  {$a$} (qn);
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
-\end{tikzpicture}
-\end{center}
-
-\noindent Now the big question is whether this DFA
-can recognise the same language as the NFA we started with.
-I let you ponder about this question.
-
-\subsubsection*{Brzozowski's Method}
+\subsection*{Brzozowski's Method}
 
 As said before, we can also go into the other direction---from
 DFAs to regular expressions. Brzozowski's method calculates
@@ -1190,154 +1345,8 @@
 an equivalent  regular expression for the automaton. But
 for the purposes of this module, we omit this.
 
-\subsubsection*{Automata Minimization}
 
-As seen in the subset construction, the translation 
-of a NFA to a DFA can result in a rather ``inefficient'' 
-DFA. Meaning there are states that are not needed. A
-DFA can be \emph{minimised} by the following algorithm:
-
-\begin{enumerate}
-\item Take all pairs $(q, p)$ with $q \not= p$
-\item Mark all pairs that accepting and non-accepting states
-\item For all unmarked pairs $(q, p)$ and all characters $c$
-      test whether 
-      
-      \begin{center} 
-      $(\delta(q, c), \delta(p,c))$ 
-      \end{center} 
-      
-      are marked. If there is one, then also mark $(q, p)$.
-\item Repeat last step until no change.
-\item All unmarked pairs can be merged.
-\end{enumerate}
-
-\noindent To illustrate this algorithm, consider the following 
-DFA.
-
-\begin{center}
-\begin{tikzpicture}[>=stealth',very thick,auto,
-                    every state/.style={minimum size=0pt,
-                    inner sep=2pt,draw=blue!50,very thick,
-                    fill=blue!20}]
-\node[state,initial]  (Q_0)  {$Q_0$};
-\node[state] (Q_1) [right=of Q_0] {$Q_1$};
-\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
-\node[state] (Q_3) [right=of Q_2] {$Q_3$};
-\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
-\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
-\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
-\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
-\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
-\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
-\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
-\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
-\path[->] (Q_2) edge [loop left] node  {$b$} ();
-\path[->] (Q_3) edge [bend left=95, looseness=1.3] node 
-  [below]  {$b$} (Q_0);
-\end{tikzpicture}
-\end{center}
-
-\noindent In Step 1 and 2 we consider essentially a triangle
-of the form
-
-\begin{center}
-\begin{tikzpicture}[scale=0.6,line width=0.8mm]
-\draw (0,0) -- (4,0);
-\draw (0,1) -- (4,1);
-\draw (0,2) -- (3,2);
-\draw (0,3) -- (2,3);
-\draw (0,4) -- (1,4);
-
-\draw (0,0) -- (0, 4);
-\draw (1,0) -- (1, 4);
-\draw (2,0) -- (2, 3);
-\draw (3,0) -- (3, 2);
-\draw (4,0) -- (4, 1);
-
-\draw (0.5,-0.5) node {$Q_0$}; 
-\draw (1.5,-0.5) node {$Q_1$}; 
-\draw (2.5,-0.5) node {$Q_2$}; 
-\draw (3.5,-0.5) node {$Q_3$};
- 
-\draw (-0.5, 3.5) node {$Q_1$}; 
-\draw (-0.5, 2.5) node {$Q_2$}; 
-\draw (-0.5, 1.5) node {$Q_3$}; 
-\draw (-0.5, 0.5) node {$Q_4$}; 
-
-\draw (0.5,0.5) node {\large$\star$}; 
-\draw (1.5,0.5) node {\large$\star$}; 
-\draw (2.5,0.5) node {\large$\star$}; 
-\draw (3.5,0.5) node {\large$\star$};
-\end{tikzpicture}
-\end{center}
-
-\noindent where the lower row is filled with stars, because in
-the corresponding pairs there is always one state that is
-accepting ($Q_4$) and a state that is non-accepting (the other
-states).
-
-Now in Step 3 we need to fill in more stars according whether 
-one of the next-state pairs are marked. We have to do this 
-for every unmarked field until there is no change anymore.
-This gives the triangle
-
-\begin{center}
-\begin{tikzpicture}[scale=0.6,line width=0.8mm]
-\draw (0,0) -- (4,0);
-\draw (0,1) -- (4,1);
-\draw (0,2) -- (3,2);
-\draw (0,3) -- (2,3);
-\draw (0,4) -- (1,4);
-
-\draw (0,0) -- (0, 4);
-\draw (1,0) -- (1, 4);
-\draw (2,0) -- (2, 3);
-\draw (3,0) -- (3, 2);
-\draw (4,0) -- (4, 1);
-
-\draw (0.5,-0.5) node {$Q_0$}; 
-\draw (1.5,-0.5) node {$Q_1$}; 
-\draw (2.5,-0.5) node {$Q_2$}; 
-\draw (3.5,-0.5) node {$Q_3$};
- 
-\draw (-0.5, 3.5) node {$Q_1$}; 
-\draw (-0.5, 2.5) node {$Q_2$}; 
-\draw (-0.5, 1.5) node {$Q_3$}; 
-\draw (-0.5, 0.5) node {$Q_4$}; 
-
-\draw (0.5,0.5) node {\large$\star$}; 
-\draw (1.5,0.5) node {\large$\star$}; 
-\draw (2.5,0.5) node {\large$\star$}; 
-\draw (3.5,0.5) node {\large$\star$};
-\draw (0.5,1.5) node {\large$\star$}; 
-\draw (2.5,1.5) node {\large$\star$}; 
-\draw (0.5,3.5) node {\large$\star$}; 
-\draw (1.5,2.5) node {\large$\star$}; 
-\end{tikzpicture}
-\end{center}
-
-\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$
-and $Q_3$ can be merged. This gives the following minimal DFA
-
-\begin{center}
-\begin{tikzpicture}[>=stealth',very thick,auto,
-                    every state/.style={minimum size=0pt,
-                    inner sep=2pt,draw=blue!50,very thick,
-                    fill=blue!20}]
-\node[state,initial]  (Q_02)  {$Q_{0, 2}$};
-\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};
-\node[state, accepting] (Q_4) [right=of Q_13] 
-  {$Q_{4\phantom{,0}}$};
-\path[->] (Q_02) edge [bend left] node [above]  {$a$} (Q_13);
-\path[->] (Q_13) edge [bend left] node [below]  {$b$} (Q_02);
-\path[->] (Q_02) edge [loop below] node  {$b$} ();
-\path[->] (Q_13) edge node [above]  {$a$} (Q_4);
-\path[->] (Q_4) edge [loop above] node  {$a, b$} ();
-\end{tikzpicture}
-\end{center}
-
-\subsubsection*{Regular Languages}
+\subsection*{Regular Languages}
 
 Given the constructions in the previous sections we obtain 
 the following overall picture:
@@ -1381,7 +1390,7 @@
 derivatives or NFAs or DFAs. But let us quickly look at what
 the differences mean in computational terms. Translating a
 regular expression into a NFA gives us an automaton that has
-$O(n)$ nodes---that means the size of the NFA grows linearly
+$O(n)$ states---that means the size of the NFA grows linearly
 with the size of the regular expression. The problem with NFAs
 is that the problem of deciding whether a string is accepted
 or not is computationally not cheap. Remember with NFAs we
@@ -1441,11 +1450,11 @@
 automaton for this language, but again that would lead us too
 far afield for what we want to do in this module.
 
-\section*{Further Reading}
+%\section*{Further Reading}
 
-Compare what a ``human expert'' would create as an automaton for the
-regular expression $a (b + c)^*$ and what the Thomson
-algorithm generates.
+%Compare what a ``human expert'' would create as an automaton for the
+%regular expression $a\cdot (b + c)^*$ and what the Thomson
+%algorithm generates.
 
 %http://www.inf.ed.ac.uk/teaching/courses/ct/
 \end{document}
Binary file handouts/ho04.pdf has changed
--- a/handouts/ho04.tex	Sun May 07 03:01:29 2017 +0100
+++ b/handouts/ho04.tex	Tue May 09 12:31:55 2017 +0100
@@ -91,11 +91,13 @@
 letters, while values just start with a single upper-case
 character and the rest are lower-case letters.
  
-{\small\lstinputlisting[language=Scala,numbers=none]
+{\small\lstinputlisting[language=Scala,numbers=none,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
 {../progs/app01.scala}}
 
  
-{\small\lstinputlisting[language=Scala,numbers=none]
+{\small\lstinputlisting[language=Scala,numbers=none,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
 {../progs/app02.scala}}
 
 
@@ -580,7 +582,8 @@
 \end{figure}
 
 \begin{figure}[p]
-\lstinputlisting{../progs/app61.scala}
+\lstinputlisting[numbers=left,linebackgroundcolor=
+   {\ifodd\value{lstnumber}\color{capri!3}\fi}]{../progs/app61.scala}
 
 \caption{The Scala code for the simplification function. The
 first part defines some auxillary functions for the rectification.
@@ -639,7 +642,8 @@
 be regarded as a regular expression. The extended definition
 in Scala therefore looks as follows:
 
-{\small\lstinputlisting[language=Scala]
+{\small\lstinputlisting[language=Scala, numbers=none,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
 {../progs/app03.scala}}
 
 \noindent Since we regard records as regular expressions we
@@ -648,7 +652,8 @@
 to extend the definition of values, which in Scala looks as
 follows:
 
-{\small\lstinputlisting[language=Scala]
+{\small\lstinputlisting[language=Scala, numbers=none,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
 {../progs/app04.scala}}
 
 \noindent Let us now look at the purpose of records more
--- a/langs.sty	Sun May 07 03:01:29 2017 +0100
+++ b/langs.sty	Tue May 09 12:31:55 2017 +0100
@@ -39,27 +39,6 @@
   otherkeywords={=,!=,:=,<,>,\%;*,/},
 }[keywords,comments,strings]
 
-\lstdefinestyle{mystyle}
-       {basicstyle=\ttfamily,
-	keywordstyle=\color{codepurple}\bfseries,
-	stringstyle=\color{codegreen},
-	commentstyle=\color{codegreen},
-	morecomment=[s][\color{codedocblue}]{/**}{*/},
-	numbers=left,
-	numberstyle=\tiny\color{black},
-	stepnumber=1,
-	numbersep=10pt,
-	tabsize=2,
-	showspaces=false,
-	showstringspaces=false,
-        xleftmargin=8mm,
-        emphstyle=\color{codeblue}\bfseries,
-        keepspaces
-}
-
-\lstset{language=Scala,
-        style=mystyle}
-
 
 \newcommand{\code}[1]{{\lstinline{#1}}}
 \newcommand{\pcode}[1]{\mbox{\lstset{language={},keywordstyle=\color{black}}\lstinline!#1!}}
@@ -69,8 +48,29 @@
 %%\lstset{escapeinside={(*@}{@*)}}
 \lstset{escapeinside={/*@}{@*/}}
 
-
-
 %% stripy code
 \usepackage{lstlinebgrd}
 \definecolor{capri}{rgb}{0.0, 0.75, 1.0}
+
+
+\lstdefinestyle{mystyle}
+       {basicstyle=\ttfamily,
+	keywordstyle=\color{codepurple}\bfseries,
+	stringstyle=\color{codegreen},
+	commentstyle=\color{codegreen},
+	morecomment=[s][\color{codedocblue}]{/**}{*/},
+	numbers=none,
+	numberstyle=\tiny\color{black},
+	stepnumber=1,
+	numbersep=10pt,
+	tabsize=2,
+	showspaces=false,
+	showstringspaces=false,
+        xleftmargin=8mm,
+        emphstyle=\color{codeblue}\bfseries,
+        keepspaces,
+        linebackgroundcolor={\ifodd\value{lstnumber}\color{capri!3}\fi}
+}
+
+\lstset{language=Scala,
+        style=mystyle}
--- a/progs/nfa.scala	Sun May 07 03:01:29 2017 +0100
+++ b/progs/nfa.scala	Tue May 09 12:31:55 2017 +0100
@@ -85,4 +85,15 @@
 
 // subset constructions
 
-//def subset(nfa: NFA[A, C]) : DFA[Set[A], C] =
+def subset[A, C](nfa: NFA[A, C]) : DFA[Set[A], C] = {
+  DFA(nfa.starts, 
+      { case (qs, c) => nfa.nexts(qs, c) }, 
+      _.exists(nfa.fins))
+}
+
+subset(nfa1).accepts("aa".toList)             // false
+subset(nfa1).accepts("aaaaa".toList)          // false
+subset(nfa1).accepts("aaaaab".toList)         // true
+subset(nfa1).accepts("aaaaabbb".toList)       // true
+subset(nfa1).accepts("aaaaabbbaaa".toList)    // false
+subset(nfa1).accepts("ac".toList)             // false
--- a/progs/re0.scala	Sun May 07 03:01:29 2017 +0100
+++ b/progs/re0.scala	Tue May 09 12:31:55 2017 +0100
@@ -1,4 +1,5 @@
 import scala.annotation.tailrec
+import scala.language.implicitConversions
 
 abstract class Rexp
 
@@ -72,6 +73,7 @@
     if (i == 0) NULL else SEQ(der(c, r), REP(r, i - 1))
 }
 
+
 // derivative w.r.t. a string (iterates der)
 @tailrec
 def ders (s: List[Char], r: Rexp) : Rexp = s match {