--- a/handouts/ho03.tex Mon Apr 03 01:10:54 2017 +0800
+++ b/handouts/ho03.tex Tue Apr 11 06:22:46 2017 +0800
@@ -29,13 +29,14 @@
\subsection*{Deterministic Finite Automata}
-Their definition is as follows:\medskip
+The central definition is:\medskip
\noindent
A \emph{deterministic finite automaton} (DFA), say $A$, is
-given by a four-tuple written $A(Qs, Q_0, F, \delta)$ where
+given by a five-tuple written $A(\Sigma, Qs, Q_0, F, \delta)$ where
\begin{itemize}
+\item $\Sigma$ is an alphabet,
\item $Qs$ is a finite set of states,
\item $Q_0 \in Qs$ is the start state,
\item $F \subseteq Qs$ are the accepting states, and
@@ -47,8 +48,8 @@
the assumption that these transition functions do not need to be
defined everywhere: so it can be the case that given a character there
is no next state, in which case we need to raise a kind of ``failure
-exception''. That means actually we have partial functions as
-transitions---see our implementation later on. A typical example of a
+exception''. That means actually we have \emph{partial} functions as
+transitions---see the implementation later on. A typical example of a
DFA is
\begin{center}
@@ -115,7 +116,7 @@
the way we hit the case where the transition function $\delta$ is not
defined, we need to raise an error. In our implementation we will deal
with this case elegantly by using Scala's \texttt{Try}. So a string
-$s$ is in the \emph{language accepted by the automaton} $A(Q, Q_0, F,
+$s$ is in the \emph{language accepted by the automaton} $A(\Sigma, Q, Q_0, F,
\delta)$ iff
\[
@@ -123,7 +124,7 @@
\]
\noindent I let you think about a definition that describes
-the set of strings accepted by an automaton.
+the set of all strings accepted by an automaton.
\begin{figure}[p]
\small
@@ -133,8 +134,8 @@
\caption{A Scala implementation of DFAs using partial functions.
Notice some subtleties: \texttt{deltas} implements the delta-hat
construction by lifting the transition (partial) function to
- lists of \texttt{C}haracters. Since \texttt{delta} is given
- as partial function, it can obviously go ``wrong'' in which
+ lists of characters. Since \texttt{delta} is given
+ as a partial function, it can obviously go ``wrong'' in which
case the \texttt{Try} in \texttt{accepts} catches the error and
returns \texttt{false}---that means the string is not accepted.
The example \texttt{delta} implements the DFA example shown
@@ -142,19 +143,25 @@
\end{figure}
A simple Scala implementation for DFAs is given in
-Figure~\ref{dfa}. As you can see, there many features of the
+Figure~\ref{dfa}. As you can see, there are many features of the
mathematical definition that are quite closely reflected in the
code. In the DFA-class, there is a starting state, called
-\texttt{start}, with the polymorphic type \texttt{A}. (The reason for
-having a polymorphic types for the states and the input of DFAs will
-become clearer later.) There is a partial function \texttt{delta} for
-specifying the transitions. I used partial functions for representing
-the transitions in Scala; alternatives would have been \texttt{Maps}
-or even \texttt{Lists}. One of the main advantages of using partial
-functions is that transitions can be quite nicely defined by a series
-of \texttt{case} statements (see Lines 28 -- 38). If you need to
-represent an automaton with a sink state (catch-all-state), you can
-write something like
+\texttt{start}, with the polymorphic type \texttt{A}. There is a
+partial function \texttt{delta} for specifying the transitions---these
+partial functions take a state (of polymorphic type \texttt{A}) and an
+input (of polymorphic type \texttt{C}) and produce a new state (of
+type \texttt{A}). For the moment it is OK to assume that \texttt{A} is
+some arbitrary type for states and the input is just characters. (The
+reason for having a polymorphic types for the states and the input of
+DFAs will become clearer later on.)
+
+I used partial functions for representing the transitions in Scala;
+alternatives would have been \texttt{Maps} or even \texttt{Lists}. One
+of the main advantages of using partial functions is that transitions
+can be quite nicely defined by a series of \texttt{case} statements
+(see Lines 28 -- 38 for an example). If you need to represent an
+automaton with a sink state (catch-all-state), you can use Scala's
+pattern matching and write something like
{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
{\ifodd\value{lstnumber}\color{capri!3}\fi}]
@@ -170,11 +177,13 @@
\end{lstlisting}}
\noindent The DFA-class has also an argument for specifying final
-states. In the implementation it is a function from states to booleans
-(returns true whenever a state is meant to be final; false
-otherwise). While this boolean function is different from the sets of
-states used in the mathematical definition, Scala allows me to use
-sets for such functions (see Line 40 where I initialise the DFA).
+states. In the implementation it not a set of states, as in the
+matemathical definition, but is a function from states to booleans
+(this function is supposed to return true whenever a state is meant to
+be final; false otherwise). While this boolean function is different
+from the sets of states, Scala allows to use sets for such functions
+(see Line 40 where the DFA is initialised). Again it will become clear
+later on why I use functions for final states, rather than sets.
I let you ponder whether this is a good implementation of DFAs. In
doing so I hope you notice that the $Qs$ component (the set of finite
@@ -191,10 +200,11 @@
to relax this restriction. That means we allow to have several
potential successor states. We even allow more than one starting
state. The resulting construction is called a \emph{non-deterministic
- finite automaton} (NFA) given also as a four-tuple $A(Qs, Q_{0s}, F,
+ finite automaton} (NFA) given also as a five-tuple $A(\Sigma, Qs, Q_{0s}, F,
\rho)$ where
\begin{itemize}
+\item $\Sigma$ is an alphabet,
\item $Qs$ is a finite set of states
\item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)
\item $F$ are some accepting states with $F \subseteq Qs$, and
@@ -202,12 +212,13 @@
\end{itemize}
\noindent
-A typical example of an NFA is
+A typical example of a NFA is
% A NFA for (ab* + b)*a
\begin{center}
-\begin{tikzpicture}[scale=0.8,>=stealth',very thick,
-every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
+\begin{tikzpicture}[>=stealth',very thick, auto,
+ every state/.style={minimum size=0pt,inner sep=3pt,
+ draw=blue!50,very thick,fill=blue!20},scale=2]
\node[state,initial] (Q_0) {$Q_0$};
\node[state] (Q_1) [right=of Q_0] {$Q_1$};
\node[state, accepting] (Q_2) [right=of Q_1] {$Q_2$};
@@ -223,40 +234,45 @@
\noindent
This NFA happens to have only one starting state, but in general there
could be more. Notice that in state $Q_0$ we might go to state $Q_1$
-\emph{or} to state $Q_2$ when receiving an $a$. Similarly in state $Q_1$
-and receiving an $a$, we can stay in state $Q_1$ or go to $Q_2$. This
-kind of choice is not allowed with DFAs. When it comes to deciding
-whether a string is accepted by an NFA we potentially need to explore
-all possibilities. I let you think which kind of strings this NFA
-accepts.
+\emph{or} to state $Q_2$ when receiving an $a$. Similarly in state
+$Q_1$ and receiving an $a$, we can stay in state $Q_1$ or go to
+$Q_2$. This kind of choice is not allowed with DFAs. The downside is
+that when it comes to deciding whether a string is accepted by a NFA
+we potentially have to explore all possibilities. I let you think
+which kind of strings the above NFA accepts.
-There are however a number of additional points you should note. Every
-DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a transition
-\emph{relation} (DFAs have a transition function). The difference
-between a function and a relation is that a function has always a
-single output, while a relation gives, roughly speaking, several
-outputs. Look again at the NFA above: if you are currently in the
-state $Q_1$ and you read a character $b$, then you can transition to
-either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take is not
-determined. This non-determinism can be represented by a relation.
+There are a number of additional points you should note with
+NFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
+transition \emph{relation} (DFAs have a transition function). The
+difference between a function and a relation is that a function has
+always a single output, while a relation gives, roughly speaking,
+several outputs. Look again at the NFA above: if you are currently in
+the state $Q_1$ and you read a character $b$, then you can transition
+to either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take is
+not determined. This non-determinism can be represented by a
+relation.
-My implementation of NFAs is shown in Figure~\ref{nfa}. Perhaps
-interestingly, I do not use relations for my implementation of NFAs in
-Scala, and I also do not use transition functions which return a set
-of states (another popular choice for implementing NFAs). For reasons
-that become clear in a moment, I use sets of partial functions---see
-Line 7 in Figure~\ref{nfa}. \texttt{Starts} is now a set of states;
-\texttt{fins} is again a function from states to booleans. The
-\texttt{next} function calculates the set of next states reachable
-from a state \texttt{q} by a character \texttt{c}---we go through all
-partial functions in the \texttt{delta}-set, lift it (this transformes
-the partial function into the corresponding \texttt{Option}-function
-and then apply \texttt{q} and \texttt{c}. This gives us a set of
-\texttt{Some}s and \texttt{None}s, where the \texttt{Some}s are
-filtered out by the \texttt{flatMap}. Teh function \texttt{nexts} just
-lifts this to sets of states. \texttt{Deltas} and \texttt{accept} are
-similar to the DFA definitions.
+My implementation of NFAs in Scala is shown in Figure~\ref{nfa}.
+Perhaps interestingly, I do not actually use relations for my NFAs,
+and I also do not use transition functions that return sets of states
+(another popular choice for implementing NFAs). For reasons that
+become clear in a moment, I use sets of partial functions
+instead---see Line 7 in Figure~\ref{nfa}. DFAs have only one such
+partial function; my NFAs have a set. Another parameter,
+\texttt{Starts}, is in NFAs a set of states; \texttt{fins} is again a
+function from states to booleans. The \texttt{next} function
+calculates the set of next states reachable from a single state
+\texttt{q} by a character \texttt{c}---this is calculated by going
+through all the partial functions in the \texttt{delta}-set and apply
+\texttt{q} and \texttt{c} (see Line 13). This gives us a set of
+\texttt{Some}s (in case the application succeeded) and possibly some
+\texttt{None}s (in case the partial function is not defined or produces an
+error). The \texttt{None}s are filtered out by the \texttt{flatMap},
+leaving the values inside the \texttt{Some}s. The function
+\texttt{nexts} just lifts this function to sets of
+states. \texttt{Deltas} and \texttt{accept} are similar to the DFA
+definitions.
\begin{figure}[t]
\small
@@ -264,17 +280,108 @@
{\ifodd\value{lstnumber}\color{capri!3}\fi}]
{../progs/nfa.scala}
\caption{A Scala implementation of NFAs using sets of partial functions.
- Notice some subtleties: \texttt{deltas} implements the delta-hat
- construction by lifting the transition (partial) function to
- lists of \texttt{C}haracters. Since \texttt{delta} is given
- as partial function, it can obviously go ``wrong'' in which
- case the \texttt{Try} in \texttt{accepts} catches the error and
- returns \texttt{false}---that means the string is not accepted.
- The example \texttt{delta} implements the DFA example shown
- earlier in the handout.\label{nfa}}
+ Notice some subtleties: Since \texttt{delta} is given
+ as a set of partial functions, each of them can obviously go ``wrong'' in which
+ case the \texttt{Try}. The function \texttt{accepts} implements the
+ acceptance of a string in a breath-first fashion. This can be costly
+ way of deciding whether a string is accepted in practical contexts.\label{nfa}}
\end{figure}
+The reason for using sets of partial functions for specifying the
+transitions in NFAs has to do with examples like this one: a
+popular benchmark regular expression is $(.)^*\cdot a\cdot
+(.)^{\{n\}}\cdot b\cdot c$. A NFA that accepts the same strings
+(for $n=3$) is as follows:
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick, auto, node distance=7mm,
+ every state/.style={minimum size=0pt,inner sep=1pt,
+ draw=blue!50,very thick,fill=blue!20},scale=0.5]
+\node[state,initial] (Q_0) {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state] (Q_2) [right=of Q_1] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
+\node[state] (Q_4) [right=of Q_3] {$Q_4$};
+\node[state] (Q_5) [right=of Q_4] {$Q_5$};
+\node[state,accepting] (Q_6) [right=of Q_5] {$Q_6$};
+\path[->] (Q_0) edge [loop above] node {$.$} ();
+\path[->] (Q_0) edge node [above] {$a$} (Q_1);
+\path[->] (Q_1) edge node [above] {$.$} (Q_2);
+\path[->] (Q_2) edge node [above] {$.$} (Q_3);
+\path[->] (Q_3) edge node [above] {$.$} (Q_4);
+\path[->] (Q_4) edge node [above] {$b$} (Q_5);
+\path[->] (Q_5) edge node [above] {$c$} (Q_6);
+\end{tikzpicture}
+\end{center}
+
+\noindent
+The $.$ stands for accepting any single character: for example if we
+are in $Q_0$ and read an $a$ we can either stay in $Q_0$ (since any
+character will do for this) or advance to $Q_1$ (but only if it is an
+$a$). Why this is a good benchmark regular expression is irrelevant
+here. The point is that this NFA can be conveniently represented by
+the code:
+
+{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+ {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+val delta = Set[(State, Char) :=> State](
+ { case (Q0, 'a') => Q1 },
+ { case (Q0, _) => Q0 },
+ { case (Q1, _) => Q2 },
+ { case (Q2, _) => Q3 },
+ { case (Q3, _) => Q4 },
+ { case (Q4, 'b') => Q5 },
+ { case (Q5, 'c') => Q6 }
+)
+
+NFA(Set[State](Q0), delta, Set[State](Q6))
+\end{lstlisting}}
+
+\noindent
+where the $.$-transitions translate into a
+underscore-pattern-matching. Recall that in $Q_0$ if we read an $a$ we
+can go to $Q_1$ (by the first partial function in the set) and also
+stay in $Q_0$ (by the second partial function). Representing such
+transitions in any other way is somehow awkward; the set of partial
+function representation makes them easy to implement.
+
+Look very careful again at the \texttt{accepts} and \texttt{deltas}
+functions in NFAs and remember that when accepting a string by an NFA
+we might have to explore all possible transitions (which state to go
+to is not unique anymore). The shown implementation achieves this
+exploration in a \emph{breath-first search}. This is fine for very
+small NFAs, but can lead to problems when the NFAs are bigger. Take
+for example the regular expression $(.)^*\cdot a\cdot (.)^{\{n\}}\cdot
+b\cdot c$ from above. If $n$ is large, say 100 or 1000, then the
+corresponding NFA will have 104, respectively 1004, nodes. The problem
+is that with certain strings this can lead to 1000 ``active'' nodes
+which we need to analyse when determining the next states. This can be
+a real memory strain in practical applications. As a result, some
+regular expression matching engines resort to a \emph{depth-first
+ search} with \emph{backtracking} in unsuccessful cases. In our
+implementation we could implement a depth-first version of
+\texttt{accepts} using Scala's \texttt{exists} as follows:
+
+
+{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+ {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+def search(q: A, s: List[C]) : Boolean = s match {
+ case Nil => fins(q)
+ case c::cs =>
+ delta.exists((d) => Try(search(d(q, c), cs)) getOrElse false)
+}
+
+def accepts(s: List[C]) : Boolean =
+ starts.exists(search(_, s))
+\end{lstlisting}}
+
+\noindent
+This depth-first way of exploration seems to work efficiently in many
+examples and is much less of strain on memory. The problem is that the
+backtracking can get catastrophic in some examples---remember the
+catastrophic backtracking from earlier lectures. This depth-first
+search with backtracking is the reason for the abysmal performance of
+regular expression macthing in Java, Ruby and Python.
%This means if
%we need to decide whether a string is accepted by a NFA, we might have
@@ -780,7 +887,7 @@
\subsubsection*{Automata Minimization}
As seen in the subset construction, the translation
-of an NFA to a DFA can result in a rather ``inefficient''
+of a NFA to a DFA can result in a rather ``inefficient''
DFA. Meaning there are states that are not needed. A
DFA can be \emph{minimised} by the following algorithm:
@@ -974,7 +1081,7 @@
or not is computationally not cheap. Remember with NFAs we
have potentially many next states even for the same input and
also have the silent $\epsilon$-transitions. If we want to
-find a path from the starting state of an NFA to an accepting
+find a path from the starting state of a NFA to an accepting
state, we need to consider all possibilities. In Ruby and
Python this is done by a depth-first search, which in turn
means that if a ``wrong'' choice is made, the algorithm has to
@@ -1008,7 +1115,7 @@
for this is as follows: take a regular expression, translate
it into a NFA and then a DFA that both recognise the same
language. Once you have the DFA it is very easy to construct
-the automaton for the language not recognised by an DFA. If
+the automaton for the language not recognised by a DFA. If
the DFA is completed (this is important!), then you just need
to exchange the accepting and non-accepting states. You can
then translate this DFA back into a regular expression and