--- a/handouts/ho03.tex	Mon Apr 03 01:10:54 2017 +0800
+++ b/handouts/ho03.tex	Tue Apr 11 06:22:46 2017 +0800
@@ -29,13 +29,14 @@
 
 \subsection*{Deterministic Finite Automata}
 
-Their definition is as follows:\medskip
+The central definition is:\medskip
 
 \noindent 
 A \emph{deterministic finite automaton} (DFA), say $A$, is
-given by a four-tuple written $A(Qs, Q_0, F, \delta)$ where
+given by a five-tuple written $A(\Sigma, Qs, Q_0, F, \delta)$ where
 
 \begin{itemize}
+\item $\Sigma$ is an alphabet,  
 \item $Qs$ is a finite set of states,
 \item $Q_0 \in Qs$ is the start state,
 \item $F \subseteq Qs$ are the accepting states, and
@@ -47,8 +48,8 @@
 the assumption that these transition functions do not need to be
 defined everywhere: so it can be the case that given a character there
 is no next state, in which case we need to raise a kind of ``failure
-exception''.  That means actually we have partial functions as
-transitions---see our implementation later on.  A typical example of a
+exception''.  That means actually we have \emph{partial} functions as
+transitions---see the implementation later on.  A typical example of a
 DFA is
 
 \begin{center}
@@ -115,7 +116,7 @@
 the way we hit the case where the transition function $\delta$ is not
 defined, we need to raise an error. In our implementation we will deal
 with this case elegantly by using Scala's \texttt{Try}. So a string
-$s$ is in the \emph{language accepted by the automaton} $A(Q, Q_0, F,
+$s$ is in the \emph{language accepted by the automaton} $A(\Sigma, Q, Q_0, F,
 \delta)$ iff
 
 \[
@@ -123,7 +124,7 @@
 \]
 
 \noindent I let you think about a definition that describes
-the set of strings accepted by an automaton. 
+the set of all strings accepted by an automaton. 
 
 \begin{figure}[p]
 \small
@@ -133,8 +134,8 @@
 \caption{A Scala implementation of DFAs using partial functions.
   Notice some subtleties: \texttt{deltas} implements the delta-hat
   construction by lifting the transition (partial) function to
-  lists of \texttt{C}haracters. Since \texttt{delta} is given
-  as partial function, it can obviously go ``wrong'' in which
+  lists of characters. Since \texttt{delta} is given
+  as a partial function, it can obviously go ``wrong'' in which
   case the \texttt{Try} in \texttt{accepts} catches the error and
   returns \texttt{false}---that means the string is not accepted.
   The example \texttt{delta} implements the DFA example shown
@@ -142,19 +143,25 @@
 \end{figure}
 
 A simple Scala implementation for DFAs is given in
-Figure~\ref{dfa}. As you can see, there many features of the
+Figure~\ref{dfa}. As you can see, there are many features of the
 mathematical definition that are quite closely reflected in the
 code. In the DFA-class, there is a starting state, called
-\texttt{start}, with the polymorphic type \texttt{A}. (The reason for
-having a polymorphic types for the states and the input of DFAs will
-become clearer later.)  There is a partial function \texttt{delta} for
-specifying the transitions. I used partial functions for representing
-the transitions in Scala; alternatives would have been \texttt{Maps}
-or even \texttt{Lists}. One of the main advantages of using partial
-functions is that transitions can be quite nicely defined by a series
-of \texttt{case} statements (see Lines 28 -- 38). If you need to
-represent an automaton with a sink state (catch-all-state), you can
-write something like
+\texttt{start}, with the polymorphic type \texttt{A}.  There is a
+partial function \texttt{delta} for specifying the transitions---these
+partial functions take a state (of polymorphic type \texttt{A}) and an
+input (of polymorphic type \texttt{C}) and produce a new state (of
+type \texttt{A}). For the moment it is OK to assume that \texttt{A} is
+some arbitrary type for states and the input is just characters.  (The
+reason for having a polymorphic types for the states and the input of
+DFAs will become clearer later on.)
+
+I used partial functions for representing the transitions in Scala;
+alternatives would have been \texttt{Maps} or even \texttt{Lists}. One
+of the main advantages of using partial functions is that transitions
+can be quite nicely defined by a series of \texttt{case} statements
+(see Lines 28 -- 38 for an example). If you need to represent an
+automaton with a sink state (catch-all-state), you can use Scala's
+pattern matching and write something like
 
 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
@@ -170,11 +177,13 @@
 \end{lstlisting}}  
 
 \noindent The DFA-class has also an argument for specifying final
-states. In the implementation it is a function from states to booleans
-(returns true whenever a state is meant to be final; false
-otherwise). While this boolean function is different from the sets of
-states used in the mathematical definition, Scala allows me to use
-sets for such functions (see Line 40 where I initialise the DFA).
+states. In the implementation it not a set of states, as in the
+matemathical definition, but is a function from states to booleans
+(this function is supposed to return true whenever a state is meant to
+be final; false otherwise). While this boolean function is different
+from the sets of states, Scala allows to use sets for such functions
+(see Line 40 where the DFA is initialised). Again it will become clear
+later on why I use functions for final states, rather than sets.
 
 I let you ponder whether this is a good implementation of DFAs. In
 doing so I hope you notice that the $Qs$ component (the set of finite
@@ -191,10 +200,11 @@
 to relax this restriction. That means we allow to have several
 potential successor states. We even allow more than one starting
 state. The resulting construction is called a \emph{non-deterministic
-  finite automaton} (NFA) given also as a four-tuple $A(Qs, Q_{0s}, F,
+  finite automaton} (NFA) given also as a five-tuple $A(\Sigma, Qs, Q_{0s}, F,
 \rho)$ where
 
 \begin{itemize}
+\item $\Sigma$ is an alphabet,  
 \item $Qs$ is a finite set of states
 \item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)
 \item $F$ are some accepting states with $F \subseteq Qs$, and
@@ -202,12 +212,13 @@
 \end{itemize}
 
 \noindent
-A typical example of an NFA is
+A typical example of a NFA is
 
 % A NFA for (ab* + b)*a
 \begin{center}
-\begin{tikzpicture}[scale=0.8,>=stealth',very thick,
-every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
+\begin{tikzpicture}[>=stealth',very thick, auto,
+    every state/.style={minimum size=0pt,inner sep=3pt,
+      draw=blue!50,very thick,fill=blue!20},scale=2]
 \node[state,initial]  (Q_0)  {$Q_0$};
 \node[state] (Q_1) [right=of Q_0] {$Q_1$};
 \node[state, accepting] (Q_2) [right=of Q_1] {$Q_2$};
@@ -223,40 +234,45 @@
 \noindent
 This NFA happens to have only one starting state, but in general there
 could be more.  Notice that in state $Q_0$ we might go to state $Q_1$
-\emph{or} to state $Q_2$ when receiving an $a$. Similarly in state $Q_1$
-and receiving an $a$, we can stay in state $Q_1$ or go to $Q_2$. This
-kind of choice is not allowed with DFAs. When it comes to deciding
-whether a string is accepted by an NFA we potentially need to explore
-all possibilities. I let you think which kind of strings this NFA
-accepts.
+\emph{or} to state $Q_2$ when receiving an $a$. Similarly in state
+$Q_1$ and receiving an $a$, we can stay in state $Q_1$ or go to
+$Q_2$. This kind of choice is not allowed with DFAs. The downside is
+that when it comes to deciding whether a string is accepted by a NFA
+we potentially have to explore all possibilities. I let you think
+which kind of strings the above NFA accepts.
 
 
-There are however a number of additional points you should note. Every
-DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a transition
-\emph{relation} (DFAs have a transition function). The difference
-between a function and a relation is that a function has always a
-single output, while a relation gives, roughly speaking, several
-outputs. Look again at the NFA above: if you are currently in the
-state $Q_1$ and you read a character $b$, then you can transition to
-either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take is not
-determined.  This non-determinism can be represented by a relation.
+There are a number of additional points you should note with
+NFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
+transition \emph{relation} (DFAs have a transition function). The
+difference between a function and a relation is that a function has
+always a single output, while a relation gives, roughly speaking,
+several outputs. Look again at the NFA above: if you are currently in
+the state $Q_1$ and you read a character $b$, then you can transition
+to either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take is
+not determined.  This non-determinism can be represented by a
+relation.
 
-My implementation of NFAs is shown in Figure~\ref{nfa}.  Perhaps
-interestingly, I do not use relations for my implementation of NFAs in
-Scala, and I also do not use transition functions which return a set
-of states (another popular choice for implementing NFAs).  For reasons
-that become clear in a moment, I use sets of partial functions---see
-Line 7 in Figure~\ref{nfa}. \texttt{Starts} is now a set of states;
-\texttt{fins} is again a function from states to booleans. The
-\texttt{next} function calculates the set of next states reachable
-from a state \texttt{q} by a character \texttt{c}---we go through all
-partial functions in the \texttt{delta}-set, lift it (this transformes
-the partial function into the corresponding \texttt{Option}-function
-and then apply \texttt{q} and \texttt{c}. This gives us a set of
-\texttt{Some}s and \texttt{None}s, where the \texttt{Some}s are
-filtered out by the \texttt{flatMap}. Teh function \texttt{nexts} just
-lifts this to sets of states. \texttt{Deltas} and \texttt{accept} are
-similar to the DFA definitions.
+My implementation of NFAs in Scala is shown in Figure~\ref{nfa}.
+Perhaps interestingly, I do not actually use relations for my NFAs,
+and I also do not use transition functions that return sets of states
+(another popular choice for implementing NFAs).  For reasons that
+become clear in a moment, I use sets of partial functions
+instead---see Line 7 in Figure~\ref{nfa}. DFAs have only one such
+partial function; my NFAs have a set.  Another parameter,
+\texttt{Starts}, is in NFAs a set of states; \texttt{fins} is again a
+function from states to booleans. The \texttt{next} function
+calculates the set of next states reachable from a single state
+\texttt{q} by a character \texttt{c}---this is calculated by going
+through all the partial functions in the \texttt{delta}-set and apply
+\texttt{q} and \texttt{c} (see Line 13). This gives us a set of
+\texttt{Some}s (in case the application succeeded) and possibly some
+\texttt{None}s (in case the partial function is not defined or produces an
+error).  The \texttt{None}s are filtered out by the \texttt{flatMap},
+leaving the values inside the \texttt{Some}s. The function
+\texttt{nexts} just lifts this function to sets of
+states. \texttt{Deltas} and \texttt{accept} are similar to the DFA
+definitions.
 
 \begin{figure}[t]
 \small
@@ -264,17 +280,108 @@
                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
                   {../progs/nfa.scala}
 \caption{A Scala implementation of NFAs using sets of partial functions.
-  Notice some subtleties: \texttt{deltas} implements the delta-hat
-  construction by lifting the transition (partial) function to
-  lists of \texttt{C}haracters. Since \texttt{delta} is given
-  as partial function, it can obviously go ``wrong'' in which
-  case the \texttt{Try} in \texttt{accepts} catches the error and
-  returns \texttt{false}---that means the string is not accepted.
-  The example \texttt{delta} implements the DFA example shown
-  earlier in the handout.\label{nfa}}
+  Notice some subtleties: Since \texttt{delta} is given
+  as a set of partial functions, each of them can obviously go ``wrong'' in which
+  case the \texttt{Try}. The function \texttt{accepts} implements the
+  acceptance of a string in a breath-first fashion. This can be costly
+  way of deciding whether a string is accepted in practical contexts.\label{nfa}}
 \end{figure}
 
+The reason for using sets of partial functions for specifying the
+transitions in NFAs has to do with examples like this one: a
+popular benchmark regular expression is $(.)^*\cdot a\cdot
+(.)^{\{n\}}\cdot b\cdot c$. A NFA that accepts the same strings
+(for $n=3$) is as follows:
 
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick, auto, node distance=7mm,
+    every state/.style={minimum size=0pt,inner sep=1pt,
+      draw=blue!50,very thick,fill=blue!20},scale=0.5]
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state] (Q_2) [right=of Q_1] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
+\node[state] (Q_4) [right=of Q_3] {$Q_4$};
+\node[state] (Q_5) [right=of Q_4] {$Q_5$};
+\node[state,accepting] (Q_6) [right=of Q_5] {$Q_6$};
+\path[->] (Q_0) edge [loop above] node  {$.$} ();
+\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
+\path[->] (Q_1) edge node [above]  {$.$} (Q_2);
+\path[->] (Q_2) edge node [above]  {$.$} (Q_3);
+\path[->] (Q_3) edge node [above]  {$.$} (Q_4);
+\path[->] (Q_4) edge node [above]  {$b$} (Q_5);
+\path[->] (Q_5) edge node [above]  {$c$} (Q_6);
+\end{tikzpicture}
+\end{center}
+
+\noindent
+The $.$ stands for accepting any single character: for example if we
+are in $Q_0$ and read an $a$ we can either stay in $Q_0$ (since any
+character will do for this) or advance to $Q_1$ (but only if it is an
+$a$). Why this is a good benchmark regular expression is irrelevant
+here. The point is that this NFA can be conveniently represented by
+the code:
+
+{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+val delta = Set[(State, Char) :=> State](
+  { case (Q0, 'a') => Q1 },
+  { case (Q0, _)   => Q0 },
+  { case (Q1, _)   => Q2 },
+  { case (Q2, _)   => Q3 },
+  { case (Q3, _)   => Q4 },
+  { case (Q4, 'b') => Q5 },
+  { case (Q5, 'c') => Q6 }
+)
+
+NFA(Set[State](Q0), delta, Set[State](Q6))
+\end{lstlisting}}
+
+\noindent
+where the $.$-transitions translate into a
+underscore-pattern-matching. Recall that in $Q_0$ if we read an $a$ we
+can go to $Q_1$ (by the first partial function in the set) and also
+stay in $Q_0$ (by the second partial function). Representing such
+transitions in any other way is somehow awkward; the set of partial
+function representation makes them easy to implement.
+
+Look very careful again at the \texttt{accepts} and \texttt{deltas}
+functions in NFAs and remember that when accepting a string by an NFA
+we might have to explore all possible transitions (which state to go
+to is not unique anymore). The shown implementation achieves this
+exploration in a \emph{breath-first search}. This is fine for very
+small NFAs, but can lead to problems when the NFAs are bigger. Take
+for example the regular expression $(.)^*\cdot a\cdot (.)^{\{n\}}\cdot
+b\cdot c$ from above. If $n$ is large, say 100 or 1000, then the
+corresponding NFA will have 104, respectively 1004, nodes. The problem
+is that with certain strings this can lead to 1000 ``active'' nodes
+which we need to analyse when determining the next states. This can be
+a real memory strain in practical applications. As a result, some
+regular expression matching engines resort to a \emph{depth-first
+  search} with \emph{backtracking} in unsuccessful cases. In our
+implementation we could implement a depth-first version of
+\texttt{accepts} using Scala's \texttt{exists} as follows:
+
+
+{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+def search(q: A, s: List[C]) : Boolean = s match {
+  case Nil => fins(q)
+  case c::cs =>
+    delta.exists((d) => Try(search(d(q, c), cs)) getOrElse false)
+}
+
+def accepts(s: List[C]) : Boolean = 
+  starts.exists(search(_, s))
+\end{lstlisting}}
+
+\noindent
+This depth-first way of exploration seems to work efficiently in many
+examples and is much less of strain on memory. The problem is that the
+backtracking can get catastrophic in some examples---remember the
+catastrophic backtracking from earlier lectures. This depth-first
+search with backtracking is the reason for the abysmal performance of
+regular expression macthing in Java, Ruby and Python.
 
 %This means if
 %we need to decide whether a string is accepted by a NFA, we might have
@@ -780,7 +887,7 @@
 \subsubsection*{Automata Minimization}
 
 As seen in the subset construction, the translation 
-of an NFA to a DFA can result in a rather ``inefficient'' 
+of a NFA to a DFA can result in a rather ``inefficient'' 
 DFA. Meaning there are states that are not needed. A
 DFA can be \emph{minimised} by the following algorithm:
 
@@ -974,7 +1081,7 @@
 or not is computationally not cheap. Remember with NFAs we
 have potentially many next states even for the same input and
 also have the silent $\epsilon$-transitions. If we want to
-find a path from the starting state of an NFA to an accepting
+find a path from the starting state of a NFA to an accepting
 state, we need to consider all possibilities. In Ruby and
 Python this is done by a depth-first search, which in turn
 means that if a ``wrong'' choice is made, the algorithm has to
@@ -1008,7 +1115,7 @@
 for this is as follows: take a regular expression, translate
 it into a NFA and then a DFA that both recognise the same
 language. Once you have the DFA it is very easy to construct
-the automaton for the language not recognised by an DFA. If
+the automaton for the language not recognised by a DFA. If
 the DFA is completed (this is important!), then you just need
 to exchange the accepting and non-accepting states. You can
 then translate this DFA back into a regular expression and