afl-material: comparison handouts/ho03.tex

equal deleted inserted replaced

-:e28d7a327870
+:4fee50f38305
 \item $Q_0 \in Qs$ is the start state,
 \item $F \subseteq Qs$ are the accepting states, and
 \item $\delta$ is the transition function.
 \end{itemize}
-\noindent I am sure you have seen this defininition already
+\noindent I am sure you have seen this definition already
 before. The transition function determines how to ``transition'' from
 one state to the next state with respect to a character. We have the
 assumption that these transition functions do not need to be defined
 everywhere: so it can be the case that given a character there is no
 next state, in which case we need to raise a kind of ``failure
 \[
 \widehat{\delta}(Q_0, s) \in F
 \]
 \noindent I let you think about a definition that describes the set of
-all strings accepted by a determinsitic finite automaton.
+all strings accepted by a deterministic finite automaton.
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
+\lstinputlisting[numbers=left]{../progs/display/dfa.scala}
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
-{../progs/display/dfa.scala}
 \caption{A Scala implementation of DFAs using partial functions.
 Note some subtleties: \texttt{deltas} implements the delta-hat
 construction by lifting the (partial) transition  function to lists
 of characters. Since \texttt{delta} is given as a partial function,
 it can obviously go ``wrong'' in which case the \texttt{Try} in
 nicely defined by a series of \texttt{case} statements (see Lines 28
 -- 38 for an example). If you need to represent an automaton with a
 sink state (catch-all-state), you can use Scala's pattern matching and
 write something like
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+{\small\begin{lstlisting}[language=Scala]
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
 abstract class State
 ...
 case object Sink extends State
 val delta : (State, Char) :=> State =
 return a set of states. I let you think about this representation for
 NFA-transitions and how it corresponds to the relations used in the
 mathematical definition of NFAs. An example of a transition function
 in Scala for the NFA shown above is
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+{\small\begin{lstlisting}[language=Scala]
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
 val nfa_delta : (State, Char) :=> Set[State] =
 { case (Q0, 'a') => Set(Q1, Q2)
 case (Q0, 'b') => Set(Q0)
 case (Q1, 'a') => Set(Q1, Q2)
 case (Q1, 'b') => Set(Q0, Q1) }
 \end{lstlisting}}
-\noindent Like in the mathematical definition, \texttt{starts} is in
+Like in the mathematical definition, \texttt{starts} is in
 NFAs a set of states; \texttt{fins} is again a function from states to
 booleans. The \texttt{next} function calculates the set of next states
 reachable from a single state \texttt{q} by a character~\texttt{c}. In
 case there is no such state---the partial transition function is
 undefined---the empty set is returned (see function
 \texttt{applyOrElse} in Lines 9 and 10). The function \texttt{nexts}
 just lifts this function to sets of states.
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
+\lstinputlisting[numbers=left]{../progs/display/nfa.scala}
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
-{../progs/display/nfa.scala}
 \caption{A Scala implementation of NFAs using partial functions.
 Notice that the function \texttt{accepts} implements the
 acceptance of a string in a breath-first search fashion. This can be a costly
 way of deciding whether a string is accepted or not in applications that need to handle
 large NFAs and large inputs.\label{nfa}}
 search} with \emph{backtracking} in unsuccessful cases. In our
 implementation we can implement a depth-first version of
 \texttt{accepts} using Scala's \texttt{exists}-function as follows:
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+{\small\begin{lstlisting}[language=Scala]
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
 def search(q: A, s: List[C]) : Boolean = s match {
 case Nil => fins(q)
 case c::cs => next(q, c).exists(search(_, cs))
 }
 lectures. This depth-first search with backtracking is the reason for
 the abysmal performance of some regular expression matchings in Java,
 Ruby and Python. I like to show you this in the next two sections.
-\subsubsection*{Epsilon NFAs}
+\subsection*{Epsilon NFAs}
 In order to get an idea what calculations are performed by Java \&
 friends, we need a method for transforming a regular expression into
 an automaton. This automaton should accept exactly those strings that
 are accepted by the regular expression.  The simplest and most
 $\epsilon$-transitions mean you do not have to ``consume'' any part of
 the input string, but ``silently'' change to a different state. In
 this example, if you are in the starting state $Q_0$, you can silently
 move either to $Q_1$ or $Q_2$. You can see that once you are in $Q_1$,
 respectively $Q_2$, you cannot ``go back'' to the other states. So it
-seems allowing $\epsilon$-transitions is a rather substancial
+seems allowing $\epsilon$-transitions is a rather substantial
 extension to NFAs. On first appearances, $\epsilon$-transitions might
 even look rather strange, or even silly. To start with, silent
 transitions make the decision whether a string is accepted by an
 automaton even harder: with $\epsilon$NFAs we have to look whether we
 can do first some $\epsilon$-transitions and then do a
 The obvious question is: Do we get anything in return for this hassle
 with silent transitions?  Well, we still have to work for it\ldots
 unfortunately.  If we were to follow the many textbooks on the
 subject, we would now start with defining what $\epsilon$NFAs
 are---that would require extending the transition relation of
-NFAs. Next, we woudl show that the $\epsilon$NFAs are equivalent to
+NFAs. Next, we would show that the $\epsilon$NFAs are equivalent to
 NFAs and so on. Once we have done all this on paper, we would need to
 implement $\epsilon$NFAs. Lets try to take a shortcut instead. We are
 not really interested in $\epsilon$NFAs; they are only a convenient
 tool for translating regular expressions into automata. So we are not
 going to implementing them explicitly, but translate them immediately
 by $\epsilon$-transitions.
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
+\lstinputlisting[numbers=left]{../progs/display/enfa.scala}
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
-{../progs/display/enfa.scala}
 \caption{A Scala function that translates $\epsilon$NFA into NFAs. The
-transtion function of $\epsilon$NFA takes as input an \texttt{Option[C]}.
+transition function of $\epsilon$NFA takes as input an \texttt{Option[C]}.
 \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
 for a ``proper'' transition consuming a character. The functions in
 Lines 18--26 calculate
 all states reachable by one or more $\epsilon$-transition for a given
 set of states. The NFA is constructed in Lines 36--38.\label{enfa}}
 \end{figure}
 Also look carefully how the transitions of $\epsilon$NFAs are
 implemented. The additional possibility of performing silent
 transitions is encoded by using \texttt{Option[C]} as the type for the
-``input''. The \texttt{Some}s stand for ``propper'' transitions where
+``input''. The \texttt{Some}s stand for ``proper'' transitions where
 a character is consumed; \texttt{None} stands for
 $\epsilon$-transitions. The transition functions for the two
 $\epsilon$NFAs from the beginning of this section can be defined as
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+{\small\begin{lstlisting}[language=Scala]
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
 val enfa_trans1 : (State, Option[Char]) :=> Set[State] =
 { case (Q0, Some('a')) => Set(Q0)
 case (Q0, None) => Set(Q1, Q2)
 case (Q1, Some('a')) => Set(Q1)
 case (Q2, Some('b')) => Set(Q2) }
 \noindent
 I hope you agree now with my earlier statement that the $\epsilon$NFAs
 are just an API for NFAs.
-\subsubsection*{Thompson Construction}
+\subsection*{Thompson Construction}
 Having the translation of $\epsilon$NFAs to NFAs in place, we can
 finally return to the problem of translating regular expressions into
 equivalent NFAs. Recall that by equivalent we mean that the NFAs
 recognise the same language. Consider the simple regular expressions
 increases this counter whenever a new state is created.\footnote{You might
 have to read up what \emph{companion objects} do in Scala.}
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
+\lstinputlisting[numbers=left]{../progs/display/thompson1.scala}
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
-{../progs/display/thompson1.scala}
 \caption{The first part of the Thompson Construction. Lines 7--16
 implement a way of how to create new states that are all
 distinct by virtue of a counter. This counter is
 increased in the companion object of \texttt{TState}
 whenever a new state is created. The code in Lines 24--40
 \label{thompson1}}
 \end{figure}
 \begin{figure}[p]
 \small
-\lstinputlisting[numbers=left,linebackgroundcolor=
+\lstinputlisting[numbers=left]{../progs/display/thompson2.scala}
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
-{../progs/display/thompson2.scala}
 \caption{The second part of the Thompson Construction implementing
-the composition of NFAs according to $\cdot$, $+$ and $\_^*$.
+the composition of NFAs according to $\cdot$, $+$ and ${}^*$.
 The implicit class about rich partial functions
 implements the infix operation \texttt{+++} which
 combines an $\epsilon$NFA transition with a NFA transition
 (both given as partial functions).\label{thompson2}}
 \end{figure}
 \noindent The idea behind this construction is that the start of any
 string is first recognised by the first NFA, then we silently change
 to the second NFA; the ending of the string is recognised by the
 second NFA...just like matching of a string by the regular expression
-$r_1\cdot r_2$. The Scala code for this constrction is given in
+$r_1\cdot r_2$. The Scala code for this construction is given in
 Figure~\ref{thompson2} in Lines 16--23. The starting states of the
 $\epsilon$NFA are the starting states of the first NFA (corresponding
 to $r_1$); the accepting function is the accepting function of the
 second NFA (corresponding to $r_2$). The new transition function is
 all the ``old'' transitions plus the $\epsilon$-transitions connecting
 the accepting states of the first NFA to the starting states of the
-first NFA (Lines 18 and 19). The $\epsilon$NFA is then immedately
+first NFA (Lines 18 and 19). The $\epsilon$NFA is then immediately
 translated in a NFA.
-The case for the choice regular expression $r_1 + r_2$ is slightly
+The case for the alternative regular expression $r_1 + r_2$ is
-different: We are given by recursion two NFAs representing $r_1$ and
+slightly different: We are given by recursion two NFAs representing
-$r_2$ respectively.
+$r_1$ and $r_2$ respectively. Each NFA has some starting states and
+some accepting states. We obtain a NFA for the regular expression $r_1
-\begin{center}
++ r_2$ by composing the transition functions (this crucially depends
+on knowing that the states of each component NFA are distinct); and
+also combine the starting states and accepting functions.
+\begin{center}
+\begin{tabular}[t]{ccc}
 \begin{tikzpicture}[node distance=3mm,
 >=stealth',very thick,
-every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+baseline=(current bounding box.center)]
 \node at (0,0)  (1)  {$\mbox{}$};
 \node  (2)  [above=10mm of 1] {};
 \node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
 \node[state, initial]  (5)  [below=1mm of 2] {$\mbox{}$};
 \node[state, initial]  (3)  [below=10mm of 1] {$\mbox{}$};
 \node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (3) (b1) (b2) (b3)] {};
 \node [yshift=3mm] at (1.north) {$r_1$};
 \node [yshift=3mm] at (2.north) {$r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
+&
+\mbox{}\qquad\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}\quad\mbox{}
-\noindent Each NFA has some starting states and some accepting
+&
-states. We obtain a NFA for the regular expression $r_1 + r_2$
-by composing the transition functions (this crucially depends
-on knowing that the states of each component NFA are distinct);
-and also combine the starting states and accepting functions:
-\begin{center}
 \begin{tikzpicture}[node distance=3mm,
 >=stealth',very thick,
-every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+baseline=(current bounding box.center)]
 \node at (0,0) (1)  {$\mbox{}$};
 \node (2)  [above=10mm of 1] {$$};
 \node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
 \node[state, initial]  (5)  [below=1mm of 2] {$\mbox{}$};
 \node[state, initial]  (3)  [below=10mm of 1] {$\mbox{}$};
 \begin{pgfonlayer}{background}
 \node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3) (b2) (b3)] {};
 \node [yshift=3mm] at (3.north) {$r_1+ r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
+\end{tabular}
 \end{center}
 \noindent The code for this construction is in Figure~\ref{thompson2}
-in Lines 25--33. Finally for the $*$-case we have a NFA for $r$
+in Lines 25--33.
-\begin{center}
+Finally for the $*$-case we have a NFA for $r$ and connect its
+accepting states to a new starting state via
+$\epsilon$-transitions. This new starting state is also an accepting
+state, because $r^*$ can recognise the empty string.
+\begin{center}
+\begin{tabular}[b]{@{\hspace{-4mm}}ccc@{}}
 \begin{tikzpicture}[node distance=3mm,
->=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+>=stealth',very thick,
+every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+baseline=(current bounding box.north)]
 \node at (0,0)  (1)  {$\mbox{}$};
 \node[state, initial]  (2)  [right=16mm of 1] {$\mbox{}$};
 \node (a)  [right=of 2] {$\ldots$};
 \node[state, accepting]  (a1)  [right=of a] {$\mbox{}$};
 \node[state, accepting]  (a2)  [above=of a1] {$\mbox{}$};
 \begin{pgfonlayer}{background}
 \node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (2) (a1) (a2) (a3)] {};
 \node [yshift=3mm] at (1.north) {$r$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
+&
+\raisebox{-16mm}{\;\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}}
-\noindent and connect its accepting states to a new starting state via
+&
-$\epsilon$-transitions. This new starting state is also an accepting
-state, because $r^*$ can recognise the empty string. This gives the
-following $\epsilon$NFA for $r^*$ (the corresponding code is in
-Figure~\ref{thompson2} in Lines 35--43:
-\begin{center}
 \begin{tikzpicture}[node distance=3mm,
 >=stealth',very thick,
-every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+baseline=(current bounding box.north)]
 \node at (0,0) [state, initial,accepting]  (1)  {$\mbox{}$};
 \node[state]  (2)  [right=16mm of 1] {$\mbox{}$};
 \node (a)  [right=of 2] {$\ldots$};
 \node[state]  (a1)  [right=of a] {$\mbox{}$};
 \node[state]  (a2)  [above=of a1] {$\mbox{}$};
 \path[->] (a3) edge [bend left=45] node [below]  {$\epsilon$} (1);
 \begin{pgfonlayer}{background}
 \node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3)] {};
 \node [yshift=3mm] at (2.north) {$r^*$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
+\end{tabular}
+\end{center}
-To sum ap, you can see in the sequence and star cases the need of
+\noindent
+The corresponding code is in Figure~\ref{thompson2} in Lines 35--43)
+To sum up, you can see in the sequence and star cases the need of
 having silent $\epsilon$-transitions. Similarly the alternative case
-shows the need of the NFA-nondeterminsim. It seems awkward to form the
+shows the need of the NFA-nondeterminism. It seems awkward to form the
 `alternative' composition of two DFAs, because DFA do not allow
 several starting and successor states. All these constructions can now
 be put together in the following recursive function:
-{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+{\small\begin{lstlisting}[language=Scala]
-{\ifodd\value{lstnumber}\color{capri!3}\fi}]
+def thompson(r: Rexp) : NFAt = r match {
-def thompson (r: Rexp) : NFAt = r match {
 case ZERO => NFA_ZERO()
 case ONE => NFA_ONE()
 case CHAR(c) => NFA_CHAR(c)
 case ALT(r1, r2) => NFA_ALT(thompson(r1), thompson(r2))
 case SEQ(r1, r2) => NFA_SEQ(thompson(r1), thompson(r2))
 case STAR(r1) => NFA_STAR(thompson(r1))
 }
 \end{lstlisting}}
 \noindent
-It calculates a NFA from a regular expressions. At last we can run a
+It calculates a NFA from a regular expressions. At last we can run
-NFA for the our evil regular expression examples.
+NFAs for the our evil regular expression examples.  The graph on the
+left shows that when translating a regular expression such as
+$a^{\{n\}}$ into a NFA, the size can blow up and then even the
+relative fast (on small examples) breadth-first search can be
+slow. The graph on the right shows that with `evil' regular
+expressions the depth-first search can be abysmally slow. Even if the
+graphs not completely overlap with the curves of Python, Ruby and
+Java, they are similar enough. OK\ldots now you know why regular
+expression matchers in those languages are so slow.
 \begin{center}
 \begin{tabular}{@{\hspace{-1mm}}c@{\hspace{1mm}}c@{}}
 \begin{tikzpicture}
 \begin{axis}[
-title={Graph: $\texttt{a?\{n\}\,a{\{n\}}}$ and strings
+title={Graph: $a^{?\{n\}} \cdot a^{\{n\}}$ and strings
 $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+title style={yshift=-2ex},
 xlabel={$n$},
 x label style={at={(1.05,0.0)}},
 ylabel={time in secs},
 enlargelimits=false,
 xtick={0,5,...,30},
 ymax=35,
 ytick={0,5,...,30},
 scaled ticks=false,
 axis lines=left,
 width=5.5cm,
-height=4.5cm,
+height=4cm,
-legend entries={Python,Ruby},
+legend entries={Python,Ruby, breadth-first NFA},
-legend pos=south east,
+legend style={at={(0.5,-0.25)},anchor=north,font=\small},
 legend cell align=left]
 \addplot[blue,mark=*, mark options={fill=white}] table {re-python.data};
 \addplot[brown,mark=triangle*, mark options={fill=white}] table {re-ruby.data};
 % breath-first search in NFAs
 \addplot[red,mark=*, mark options={fill=white}] table {
 \end{axis}
 \end{tikzpicture}
 &
 \begin{tikzpicture}
 \begin{axis}[
-title={Graph: $\texttt{(a*)*\,b}$ and strings
+title={Graph: $(a^*)^* \cdot b$ and strings
 $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+title style={yshift=-2ex},
 xlabel={$n$},
 x label style={at={(1.05,0.0)}},
 ylabel={time in secs},
 enlargelimits=false,
 xtick={0,5,...,30},
 ymax=35,
 ytick={0,5,...,30},
 scaled ticks=false,
 axis lines=left,
 width=5.5cm,
-height=4.5cm,
+height=4cm,
-legend entries={Python, Java},
+legend entries={Python, Java, depth-first NFA},
-legend pos=outer north east,
+legend style={at={(0.5,-0.25)},anchor=north,font=\small},
 legend cell align=left]
 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};
 % depth-first search in NFAs
 \addplot[red,mark=*, mark options={fill=white}] table {
 \end{tabular}
 \end{center}
-\subsubsection*{Subset Construction}
+\subsection*{Subset Construction}
-Remember that we did not bother with defining and implementing
+Of course, some developers of regular expression matchers are aware
-$\epsilon$NFA; we immediately translated them into equivalent
+of these problems with sluggish NFAs and try to address them. One
-NFAs. Equivalent in the sense of accepting the same language (though
+common technique for this I like to show you in this section. It will
-we only claimed this and did not prove it rigorously). Remember also
+also explain why we insisted on polymorphic types in our DFA code
-that NFAs have a non-deterministic transitions, given as a relation.
+(remember I used \texttt{A} and \texttt{C} for the types of states and
-This non-determinism makes it harder to decide when a string is
+the input, see Figure~\ref{dfa} on Page~\pageref{dfa}).\bigskip
-accepted or not; such a decision is rather straightforward with DFAs
-(remember their transition function).
+\noindent
+To start, remember that we did not bother with defining and
-What is interesting is that for every NFA we can find a DFA that also
+implementing $\epsilon$NFA; we immediately translated them into
-recognises the same language. This might sound like a bit paradoxical,
+equivalent NFAs. Equivalent in the sense of accepting the same
-but I litke to show you this next. There are a number of ways of
+language (though we only claimed this and did not prove it
-transforming a NFA into an equivalent DFA, but the most famous is
+rigorously). Remember also that NFAs have non-deterministic
-\emph{subset construction}. Consider again the NFA below on the left,
+transitions defined as a relation or implemented as function returning
-consisting of nodes labelled, say, with $0$, $1$ and $2$.
+sets of states.  This non-determinism is crucial for the Thompson
+Construction to work (recall the cases for $\cdot$, $+$ and
+${}^*$). But this non-determinism makes it harder with NFAs to decide
+when a string is accepted or not; such a decision is rather
+straightforward with DFAs: recall their transition function is a
+\emph{function} that returns a single state. So we do not have to
+search at all.  What is perhaps interesting is the fact that for every
+NFA we can find a DFA that also recognises the same language. This
+might sound a bit paradoxical: NFA $\rightarrow$ decision of
+acceptance hard; DFA $\rightarrow$ decision easy. But this \emph{is}
+true\ldots but of course there is always a caveat---nothing is ever
+for free in life.
+There are a number of techniques for transforming a NFA into an
+equivalent DFA, but the most famous one is the \emph{subset
+construction}. Consider the following NFA where the states are
+labelled with, say, $0$, $1$ and $2$.
 \begin{center}
 \begin{tabular}{c@{\hspace{10mm}}c}
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick,
 every state/.style={minimum size=0pt,
 draw=blue!50,very thick,fill=blue!20},
-baseline=0mm]
+baseline=(current bounding box.center)]
 \node[state,initial]  (Q_0)  {$0$};
-\node[state] (Q_1) [above=of Q_0] {$1$};
+\node[state] (Q_1) [below=of Q_0] {$1$};
-\node[state, accepting] (Q_2) [below=of Q_0] {$2$};
+\node[state, accepting] (Q_2) [below=of Q_1] {$2$};
-\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_1);
-\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_2);
+\path[->] (Q_0) edge node [right]  {$b$} (Q_1);
-\path[->] (Q_0) edge [loop right] node  {$a$} ();
+\path[->] (Q_1) edge node [right]  {$a,b$} (Q_2);
-\path[->] (Q_1) edge [loop above] node  {$a$} ();
+\path[->] (Q_0) edge [loop above] node  {$a, b$} ();
-\path[->] (Q_2) edge [loop below] node  {$b$} ();
 \end{tikzpicture}
 &
-\begin{tabular}{r|cl}
+\begin{tabular}{r|ll}
-nodes & $a$ & $b$\\
+states & $a$ & $b$\\
 \hline
 $\{\}\phantom{\star}$ & $\{\}$ & $\{\}$\\
-$\{0\}\phantom{\star}$       & $\{0,1,2\}$   & $\{2\}$\\
+start: $\{0\}\phantom{\star}$       & $\{0\}$   & $\{0,1\}$\\
-$\{1\}\phantom{\star}$       & $\{1\}$       & $\{\}$\\
+$\{1\}\phantom{\star}$       & $\{2\}$       & $\{2\}$\\
-$\{2\}\star$  & $\{\}$ & $\{2\}$\\
+$\{2\}\star$  & $\{\}$ & $\{\}$\\
-$\{0,1\}\phantom{\star}$     & $\{0,1,2\}$   & $\{2\}$\\
+$\{0,1\}\phantom{\star}$     & $\{0,2\}$   & $\{0,1,2\}$\\
-$\{0,2\}\star$ & $\{0,1,2\}$   & $\{2\}$\\
+$\{0,2\}\star$ & $\{0\}$   & $\{0,1\}$\\
-$\{1,2\}\star$ & $\{1\}$       & $\{2\}$\\
+$\{1,2\}\star$ & $\{2\}$       & $\{2\}$\\
-s: $\{0,1,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\
+$\{0,1,2\}\star$ & $\{0,2\}$ & $\{0,1,2\}$\\
 \end{tabular}
 \end{tabular}
 \end{center}
-\noindent The nodes of the DFA are given by calculating all
+\noindent The states of the corresponding DFA are given by generating
-subsets of the set of nodes of the NFA (seen in the nodes
+all subsets of the set of states of the NFA (seen in the states column
-column on the right). The table shows the transition function
+in the table on the right). The other columns define the transition
-for the DFA. The first row states that $\{\}$ is the
+function for the DFA for input $a$ and $b$. The first row states that
-sink node which has transitions for $a$ and $b$ to itself.
+$\{\}$ is the sink state which has transitions for $a$ and $b$ to
-The next three lines are calculated as follows:
+itself.  The next three lines are calculated as follows:
 \begin{itemize}
-\item suppose you calculate the entry for the transition for
+\item Suppose you calculate the entry for the $a$-transition for state
-$a$ and the node $\{0\}$
+$\{0\}$. Look for all states in the NFA that can be reached by such
-\item start from the node $0$ in the NFA
+a transition from this state; this is only state $0$; therefore from
-\item do as many $\epsilon$-transition as you can obtaining a
+state $\{0\}$ we can go to state $\{0\}$ via an $a$-transition.
-set of nodes, in this case $\{0,1,2\}$
+\item Do the same for the $b$-transition; you can reach states $0$ and
-\item filter out all notes that do not allow an $a$-transition
+$1$ in the NFA; therefore in the DFA we can go from state $\{0\}$ to
-from this set, this excludes $2$ which does not permit a
+state $\{0,1\}$ via an $b$-transition.
-$a$-transition
+\item Continue with the states $\{1\}$ and $\{2\}$.
-\item from the remaining set, do as many $\epsilon$-transition
+\item Once you filled in the transitions for `simple' state, you only
-as you can, this yields again $\{0,1,2\}$
+have to build the union for the compound states $\{0,1\}$, $\{0,2\}$
-\item the resulting set specifies the transition from $\{0\}$
+and so on. For example for $\{0,1\}$ you take the union of line
-when given an $a$
+$\{0\}$ and line $\{1\}$, which gives $\{0,2\}$ for $a$, and
+$\{0,1,2\}$ for $b$. And so on.
+\item The starting state of the DFA can be calculated from the
+starting states of the NFA, that is in this case $0$. But in general
+there can be many starting states in the NFA and you would take the
+corresponding subset as \emph{the} starting state of the DFA.
+\item The accepting states in the DFA are given by all sets that
+contain a $2$, which is the only accpting state in this NFA. But
+again in general if the subset contains an accepting state from the
+NFA, then the corresponding state in the DFA is accepting as well.
 \end{itemize}
-\noindent So the transition from the state $\{0\}$ reading an
+\noindent This completes the subset construction. The corresponding
-$a$ goes to the state $\{0,1,2\}$. Similarly for the other
+DFA for the NFA shown above is:
-entries in the rows for $\{0\}$, $\{1\}$ and $\{2\}$. The
-other rows are calculated by just taking the union of the
+\begin{center}
-single node entries. For example for $a$ and $\{0,1\}$ we need
+\begin{tikzpicture}[scale=0.8,>=stealth',very thick,
-to take the union of $\{0,1,2\}$ (for $0$) and $\{1\}$ (for
-$1$). The starting state of the DFA can be calculated from the
-starting state of the NFA, that is $0$, and then do as many
-$\epsilon$-transitions as possible. This gives $\{0,1,2\}$
-which is the starting state of the DFA. The terminal states in
-the DFA are given by all sets that contain a $2$, which is the
-terminal state of the NFA. This completes the subset
-construction. So the corresponding DFA to the NFA from
-above is
-\begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
 every state/.style={minimum size=0pt,
 draw=blue!50,very thick,fill=blue!20},
 baseline=0mm]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
+\node[state,initial]  (q0)  {$0$};
-\node[state,accepting] (q02) [right=of q012] {$0,2$};
+\node[state] (q01) [right=of q0] {$0,1$};
-\node[state] (q01) [above=of q02] {$0,1$};
+\node[state,accepting] (q02) [below=of q01] {$0,2$};
-\node[state,accepting] (q12) [below=of q02] {$1,2$};
+\node[state,accepting] (q012) [right=of q02] {$0,1,2$};
-\node[state] (q0) [right=2cm of q01] {$0$};
+\node[state] (q1) [below=0.5cm of q0] {$1$};
-\node[state] (q1) [right=2.5cm of q02] {$1$};
+\node[state,accepting] (q2) [below=1cm of q1] {$2$};
-\node[state,accepting] (q2) [right=1.5cm of q12] {$2$};
+\node[state] (qn) [below left=1cm of q2] {$\{\}$};
-\node[state] (qn) [right=of q1] {$\{\}$};
+\node[state,accepting] (q12) [below right=1cm of q2] {$1,2$};
-\path[->] (q012) edge [loop below] node  {$a$} ();
+\path[->] (q0) edge node [above] {$b$} (q01);
-\path[->] (q012) edge node [above]  {$b$} (q2);
+\path[->] (q01) edge node [above] {$b$} (q012);
-\path[->] (q12) edge [bend left] node [below,pos=0.4]  {$a$} (q1);
+\path[->] (q0) edge [loop above] node  {$a$} ();
-\path[->] (q12) edge node [below]  {$b$} (q2);
+\path[->] (q012) edge [loop right] node  {$b$} ();
-\path[->] (q02) edge node [above]  {$a$} (q012);
+\path[->] (q012) edge node [below] {$a$} (q02);
-\path[->] (q02) edge [bend left] node [above, pos=0.8]  {$b$} (q2);
+\path[->] (q02) edge node [below] {$a$} (q0);
-\path[->] (q01) edge node [below]  {$a$} (q012);
+\path[->] (q01) edge [bend left] node [left]  {$a$} (q02);
-\path[->] (q01) edge [bend left] node [above]  {$b$} (q2);
+\path[->] (q02) edge [bend left] node [right]  {$b$} (q01);
-\path[->] (q0) edge node [below]  {$a$} (q012);
+\path[->] (q1) edge node [left] {$a,b$} (q2);
-\path[->] (q0) edge node [right, pos=0.2]  {$b$} (q2);
+\path[->] (q12) edge node [right] {$a, b$} (q2);
-\path[->] (q1) edge [loop above] node  {$a$} ();
+\path[->] (q2) edge node [right] {$a, b$} (qn);
-\path[->] (q1) edge node [above]  {$b$} (qn);
+\path[->] (qn) edge [loop left] node  {$a,b$} ();
-\path[->] (q2) edge [loop right] node  {$b$} ();
+\end{tikzpicture}
-\path[->] (q2) edge node [below]  {$a$} (qn);
+\end{center}
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
-\end{tikzpicture}
+\noindent
-\end{center}
+Please check that this is indeed a DFA. The big question is whether
+this DFA can recognise the same language as the NFA we started with.
+I let you ponder about this question.
-There are two points to note: One is that very often the
-resulting DFA contains a number of ``dead'' nodes that are
+There are also two points to note: One is that very often the
-never reachable from the starting state. For example
+resulting DFA contains a number of ``dead'' states that are never
-there is no way to reach node $\{0,2\}$ from the starting
+reachable from the starting state. This is obvious in this case, where
-state $\{0,1,2\}$. I let you find the other dead states.
+state $\{1\}$, $\{2\}$, $\{1,2\}$ and $\{\}$ can never be reached from
-In effect the DFA in this example is not a minimal DFA. Such
+the starting state.  In effect the DFA in this example is not a
-dead nodes can be safely removed without changing the language
+\emph{minimal} DFA (more about this in a minute). Such dead states can
-that is recognised by the DFA. Another point is that in some
+be safely removed without changing the language that is recognised by
-cases, however, the subset construction produces a DFA that
+the DFA. Another point is that in some cases, however, the subset
-does \emph{not} contain any dead nodes\ldots{}that means it
+construction produces a DFA that does \emph{not} contain any dead
-calculates a minimal DFA. Which in turn means that in some
+states\ldots{}and further calculates a minimal DFA. Which in turn
-cases the number of nodes by going from NFAs to DFAs
+means that in some cases the number of states can by going from NFAs
-exponentially increases, namely by $2^n$ (which is the number
+to DFAs exponentially increase, namely by $2^n$ (which is the number
-of subsets you can form for $n$ nodes).
+of subsets you can form for $n$ states).  This blow up the number of
+states in the DFA is again bad news for how quickly you can decide
-Removing all the dead states in the automaton above,
+whether a string is accepted by a DFA or not. So the caveat with DFAs
-gives a much more legible automaton, namely
+is that they might make the task of finding the next state trival, but
+might require $2^n$ times as many states as a NFA.\bigskip
-\begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
+Lastly, can we
+{\small\begin{lstlisting}[language=Scala]
+def subset[A, C](nfa: NFA[A, C]) : DFA[Set[A], C] = {
+DFA(nfa.starts,
+{ case (qs, c) => nfa.nexts(qs, c) },
+_.exists(nfa.fins))
+}
+\end{lstlisting}}
+\subsection*{DFA Minimisation}
+As seen in the subset construction, the translation
+of a NFA to a DFA can result in a rather ``inefficient''
+DFA. Meaning there are states that are not needed. A
+DFA can be \emph{minimised} by the following algorithm:
+\begin{enumerate}
+\item Take all pairs $(q, p)$ with $q \not= p$
+\item Mark all pairs that accepting and non-accepting states
+\item For all unmarked pairs $(q, p)$ and all characters $c$
+test whether
+\begin{center}
+$(\delta(q, c), \delta(p,c))$
+\end{center}
+are marked. If there is one, then also mark $(q, p)$.
+\item Repeat last step until no change.
+\item All unmarked pairs can be merged.
+\end{enumerate}
+\noindent To illustrate this algorithm, consider the following
+DFA.
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick,auto,
 every state/.style={minimum size=0pt,
-draw=blue!50,very thick,fill=blue!20},
+inner sep=2pt,draw=blue!50,very thick,
-baseline=0mm]
+fill=blue!20}]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
+\node[state,initial]  (Q_0)  {$Q_0$};
-\node[state,accepting] (q2) [right=of q012] {$2$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
-\node[state] (qn) [right=of q2] {$\{\}$};
+\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
-\path[->] (q012) edge [loop below] node  {$a$} ();
+\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
-\path[->] (q012) edge node [above]  {$b$} (q2);
+\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
-\path[->] (q2) edge [loop below] node  {$b$} ();
+\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
-\path[->] (q2) edge node [below]  {$a$} (qn);
+\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
+\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
-\end{tikzpicture}
+\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
-\end{center}
+\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
+\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
-\noindent Now the big question is whether this DFA
+\path[->] (Q_2) edge [loop left] node  {$b$} ();
-can recognise the same language as the NFA we started with.
+\path[->] (Q_3) edge [bend left=95, looseness=1.3] node
-I let you ponder about this question.
+[below]  {$b$} (Q_0);
+\end{tikzpicture}
-\subsubsection*{Brzozowski's Method}
+\end{center}
+\noindent In Step 1 and 2 we consider essentially a triangle
+of the form
+\begin{center}
+\begin{tikzpicture}[scale=0.6,line width=0.8mm]
+\draw (0,0) -- (4,0);
+\draw (0,1) -- (4,1);
+\draw (0,2) -- (3,2);
+\draw (0,3) -- (2,3);
+\draw (0,4) -- (1,4);
+\draw (0,0) -- (0, 4);
+\draw (1,0) -- (1, 4);
+\draw (2,0) -- (2, 3);
+\draw (3,0) -- (3, 2);
+\draw (4,0) -- (4, 1);
+\draw (0.5,-0.5) node {$Q_0$};
+\draw (1.5,-0.5) node {$Q_1$};
+\draw (2.5,-0.5) node {$Q_2$};
+\draw (3.5,-0.5) node {$Q_3$};
+\draw (-0.5, 3.5) node {$Q_1$};
+\draw (-0.5, 2.5) node {$Q_2$};
+\draw (-0.5, 1.5) node {$Q_3$};
+\draw (-0.5, 0.5) node {$Q_4$};
+\draw (0.5,0.5) node {\large$\star$};
+\draw (1.5,0.5) node {\large$\star$};
+\draw (2.5,0.5) node {\large$\star$};
+\draw (3.5,0.5) node {\large$\star$};
+\end{tikzpicture}
+\end{center}
+\noindent where the lower row is filled with stars, because in
+the corresponding pairs there is always one state that is
+accepting ($Q_4$) and a state that is non-accepting (the other
+states).
+Now in Step 3 we need to fill in more stars according whether
+one of the next-state pairs are marked. We have to do this
+for every unmarked field until there is no change anymore.
+This gives the triangle
+\begin{center}
+\begin{tikzpicture}[scale=0.6,line width=0.8mm]
+\draw (0,0) -- (4,0);
+\draw (0,1) -- (4,1);
+\draw (0,2) -- (3,2);
+\draw (0,3) -- (2,3);
+\draw (0,4) -- (1,4);
+\draw (0,0) -- (0, 4);
+\draw (1,0) -- (1, 4);
+\draw (2,0) -- (2, 3);
+\draw (3,0) -- (3, 2);
+\draw (4,0) -- (4, 1);
+\draw (0.5,-0.5) node {$Q_0$};
+\draw (1.5,-0.5) node {$Q_1$};
+\draw (2.5,-0.5) node {$Q_2$};
+\draw (3.5,-0.5) node {$Q_3$};
+\draw (-0.5, 3.5) node {$Q_1$};
+\draw (-0.5, 2.5) node {$Q_2$};
+\draw (-0.5, 1.5) node {$Q_3$};
+\draw (-0.5, 0.5) node {$Q_4$};
+\draw (0.5,0.5) node {\large$\star$};
+\draw (1.5,0.5) node {\large$\star$};
+\draw (2.5,0.5) node {\large$\star$};
+\draw (3.5,0.5) node {\large$\star$};
+\draw (0.5,1.5) node {\large$\star$};
+\draw (2.5,1.5) node {\large$\star$};
+\draw (0.5,3.5) node {\large$\star$};
+\draw (1.5,2.5) node {\large$\star$};
+\end{tikzpicture}
+\end{center}
+\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$
+and $Q_3$ can be merged. This gives the following minimal DFA
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick,auto,
+every state/.style={minimum size=0pt,
+inner sep=2pt,draw=blue!50,very thick,
+fill=blue!20}]
+\node[state,initial]  (Q_02)  {$Q_{0, 2}$};
+\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};
+\node[state, accepting] (Q_4) [right=of Q_13]
+{$Q_{4\phantom{,0}}$};
+\path[->] (Q_02) edge [bend left] node [above]  {$a$} (Q_13);
+\path[->] (Q_13) edge [bend left] node [below]  {$b$} (Q_02);
+\path[->] (Q_02) edge [loop below] node  {$b$} ();
+\path[->] (Q_13) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop above] node  {$a, b$} ();
+\end{tikzpicture}
+\end{center}
+\subsection*{Brzozowski's Method}
 As said before, we can also go into the other direction---from
 DFAs to regular expressions. Brzozowski's method calculates
 a regular expression using familiar transformations for
 solving equational systems. Consider the DFA:
 We should prove that Brzozowski's method really produces
 an equivalent  regular expression for the automaton. But
 for the purposes of this module, we omit this.
-\subsubsection*{Automata Minimization}
+\subsection*{Regular Languages}
-As seen in the subset construction, the translation
-of a NFA to a DFA can result in a rather ``inefficient''
-DFA. Meaning there are states that are not needed. A
-DFA can be \emph{minimised} by the following algorithm:
-\begin{enumerate}
-\item Take all pairs $(q, p)$ with $q \not= p$
-\item Mark all pairs that accepting and non-accepting states
-\item For all unmarked pairs $(q, p)$ and all characters $c$
-test whether
-\begin{center}
-$(\delta(q, c), \delta(p,c))$
-\end{center}
-are marked. If there is one, then also mark $(q, p)$.
-\item Repeat last step until no change.
-\item All unmarked pairs can be merged.
-\end{enumerate}
-\noindent To illustrate this algorithm, consider the following
-DFA.
-\begin{center}
-\begin{tikzpicture}[>=stealth',very thick,auto,
-every state/.style={minimum size=0pt,
-inner sep=2pt,draw=blue!50,very thick,
-fill=blue!20}]
-\node[state,initial]  (Q_0)  {$Q_0$};
-\node[state] (Q_1) [right=of Q_0] {$Q_1$};
-\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
-\node[state] (Q_3) [right=of Q_2] {$Q_3$};
-\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
-\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
-\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
-\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
-\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
-\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
-\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
-\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
-\path[->] (Q_2) edge [loop left] node  {$b$} ();
-\path[->] (Q_3) edge [bend left=95, looseness=1.3] node
-[below]  {$b$} (Q_0);
-\end{tikzpicture}
-\end{center}
-\noindent In Step 1 and 2 we consider essentially a triangle
-of the form
-\begin{center}
-\begin{tikzpicture}[scale=0.6,line width=0.8mm]
-\draw (0,0) -- (4,0);
-\draw (0,1) -- (4,1);
-\draw (0,2) -- (3,2);
-\draw (0,3) -- (2,3);
-\draw (0,4) -- (1,4);
-\draw (0,0) -- (0, 4);
-\draw (1,0) -- (1, 4);
-\draw (2,0) -- (2, 3);
-\draw (3,0) -- (3, 2);
-\draw (4,0) -- (4, 1);
-\draw (0.5,-0.5) node {$Q_0$};
-\draw (1.5,-0.5) node {$Q_1$};
-\draw (2.5,-0.5) node {$Q_2$};
-\draw (3.5,-0.5) node {$Q_3$};
-\draw (-0.5, 3.5) node {$Q_1$};
-\draw (-0.5, 2.5) node {$Q_2$};
-\draw (-0.5, 1.5) node {$Q_3$};
-\draw (-0.5, 0.5) node {$Q_4$};
-\draw (0.5,0.5) node {\large$\star$};
-\draw (1.5,0.5) node {\large$\star$};
-\draw (2.5,0.5) node {\large$\star$};
-\draw (3.5,0.5) node {\large$\star$};
-\end{tikzpicture}
-\end{center}
-\noindent where the lower row is filled with stars, because in
-the corresponding pairs there is always one state that is
-accepting ($Q_4$) and a state that is non-accepting (the other
-states).
-Now in Step 3 we need to fill in more stars according whether
-one of the next-state pairs are marked. We have to do this
-for every unmarked field until there is no change anymore.
-This gives the triangle
-\begin{center}
-\begin{tikzpicture}[scale=0.6,line width=0.8mm]
-\draw (0,0) -- (4,0);
-\draw (0,1) -- (4,1);
-\draw (0,2) -- (3,2);
-\draw (0,3) -- (2,3);
-\draw (0,4) -- (1,4);
-\draw (0,0) -- (0, 4);
-\draw (1,0) -- (1, 4);
-\draw (2,0) -- (2, 3);
-\draw (3,0) -- (3, 2);
-\draw (4,0) -- (4, 1);
-\draw (0.5,-0.5) node {$Q_0$};
-\draw (1.5,-0.5) node {$Q_1$};
-\draw (2.5,-0.5) node {$Q_2$};
-\draw (3.5,-0.5) node {$Q_3$};
-\draw (-0.5, 3.5) node {$Q_1$};
-\draw (-0.5, 2.5) node {$Q_2$};
-\draw (-0.5, 1.5) node {$Q_3$};
-\draw (-0.5, 0.5) node {$Q_4$};
-\draw (0.5,0.5) node {\large$\star$};
-\draw (1.5,0.5) node {\large$\star$};
-\draw (2.5,0.5) node {\large$\star$};
-\draw (3.5,0.5) node {\large$\star$};
-\draw (0.5,1.5) node {\large$\star$};
-\draw (2.5,1.5) node {\large$\star$};
-\draw (0.5,3.5) node {\large$\star$};
-\draw (1.5,2.5) node {\large$\star$};
-\end{tikzpicture}
-\end{center}
-\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$
-and $Q_3$ can be merged. This gives the following minimal DFA
-\begin{center}
-\begin{tikzpicture}[>=stealth',very thick,auto,
-every state/.style={minimum size=0pt,
-inner sep=2pt,draw=blue!50,very thick,
-fill=blue!20}]
-\node[state,initial]  (Q_02)  {$Q_{0, 2}$};
-\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};
-\node[state, accepting] (Q_4) [right=of Q_13]
-{$Q_{4\phantom{,0}}$};
-\path[->] (Q_02) edge [bend left] node [above]  {$a$} (Q_13);
-\path[->] (Q_13) edge [bend left] node [below]  {$b$} (Q_02);
-\path[->] (Q_02) edge [loop below] node  {$b$} ();
-\path[->] (Q_13) edge node [above]  {$a$} (Q_4);
-\path[->] (Q_4) edge [loop above] node  {$a, b$} ();
-\end{tikzpicture}
-\end{center}
-\subsubsection*{Regular Languages}
 Given the constructions in the previous sections we obtain
 the following overall picture:
 \begin{center}
 \noindent So for deciding whether a string is recognised by a
 regular expression, we could use our algorithm based on
 derivatives or NFAs or DFAs. But let us quickly look at what
 the differences mean in computational terms. Translating a
 regular expression into a NFA gives us an automaton that has
-$O(n)$ nodes---that means the size of the NFA grows linearly
+$O(n)$ states---that means the size of the NFA grows linearly
 with the size of the regular expression. The problem with NFAs
 is that the problem of deciding whether a string is accepted
 or not is computationally not cheap. Remember with NFAs we
 have potentially many next states even for the same input and
 also have the silent $\epsilon$-transitions. If we want to
 expression for this language and also not an automaton. One
 can actually prove that there is no regular expression nor
 automaton for this language, but again that would lead us too
 far afield for what we want to do in this module.
-\section*{Further Reading}
+%\section*{Further Reading}
-Compare what a ``human expert'' would create as an automaton for the
+%Compare what a ``human expert'' would create as an automaton for the
-regular expression $a (b + c)^*$ and what the Thomson
+%regular expression $a\cdot (b + c)^*$ and what the Thomson
-algorithm generates.
+%algorithm generates.
 %http://www.inf.ed.ac.uk/teaching/courses/ct/
 \end{document}
 %%% Local Variables:

changeset 490	4fee50f38305
parent 489	e28d7a327870
child 491	d5776c6018f0