handouts/ho03.tex
changeset 488 598741d39d21
parent 487 a697421eaa04
child 489 e28d7a327870
--- a/handouts/ho03.tex	Fri Apr 28 11:01:25 2017 +0100
+++ b/handouts/ho03.tex	Sun May 07 00:20:58 2017 +0100
@@ -9,17 +9,19 @@
 
 \section*{Handout 3 (Finite Automata)}
 
+
 Every formal language and compiler course I know of bombards you first
 with automata and then to a much, much smaller extend with regular
 expressions. As you can see, this course is turned upside down:
 regular expressions come first. The reason is that regular expressions
 are easier to reason about and the notion of derivatives, although
 already quite old, only became more widely known rather
-recently. Still let us in this lecture have a closer look at automata
+recently. Still, let us in this lecture have a closer look at automata
 and their relation to regular expressions. This will help us with
 understanding why the regular expression matchers in Python, Ruby and
 Java are so slow with certain regular expressions. On the way we will
-also see what are the limitations of regular expressions.
+also see what are the limitations of regular
+expressions. Unfortunately, they cannot be used for \emph{everything}.
 
 
 \subsection*{Deterministic Finite Automata}
@@ -38,11 +40,12 @@
 \item $\delta$ is the transition function.
 \end{itemize}
 
-\noindent The transition function determines how to ``transition''
-from one state to the next state with respect to a character. We have
-the assumption that these transition functions do not need to be
-defined everywhere: so it can be the case that given a character there
-is no next state, in which case we need to raise a kind of ``failure
+\noindent I am sure you have seen this defininition already
+before. The transition function determines how to ``transition'' from
+one state to the next state with respect to a character. We have the
+assumption that these transition functions do not need to be defined
+everywhere: so it can be the case that given a character there is no
+next state, in which case we need to raise a kind of ``failure
 exception''.  That means actually we have \emph{partial} functions as
 transitions---see the Scala implementation of DFAs later on.  A
 typical example of a DFA is
@@ -118,8 +121,8 @@
 \widehat{\delta}(Q_0, s) \in F 
 \]
 
-\noindent I let you think about a definition that describes
-the set of all strings accepted by an automaton. 
+\noindent I let you think about a definition that describes the set of
+all strings accepted by a determinsitic finite automaton.
 
 \begin{figure}[p]
 \small
@@ -127,7 +130,7 @@
                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
                   {../progs/display/dfa.scala}
 \caption{A Scala implementation of DFAs using partial functions.
-  Notice some subtleties: \texttt{deltas} implements the delta-hat
+  Note some subtleties: \texttt{deltas} implements the delta-hat
   construction by lifting the (partial) transition  function to lists
   of characters. Since \texttt{delta} is given as a partial function,
   it can obviously go ``wrong'' in which case the \texttt{Try} in
@@ -239,9 +242,9 @@
 \emph{or} to state $Q_2$ when receiving an $a$. Similarly in state
 $Q_1$ and receiving an $a$, we can stay in state $Q_1$ \emph{or} go to
 $Q_2$. This kind of choice is not allowed with DFAs. The downside of
-this choice is that when it comes to deciding whether a string is
+this choice in NFAs is that when it comes to deciding whether a string is
 accepted by a NFA we potentially have to explore all possibilities. I
-let you think which kind of strings the above NFA accepts.
+let you think which strings the above NFA accepts.
 
 
 There are a number of additional points you should note about
@@ -259,10 +262,10 @@
 Perhaps interestingly, I do not actually use relations for my NFAs,
 but use transition functions that return sets of states.  DFAs have
 partial transition functions that return a single state; my NFAs
-return a set. I let you think about this representation for
+return a set of states. I let you think about this representation for
 NFA-transitions and how it corresponds to the relations used in the
 mathematical definition of NFAs. An example of a transition function
-in Scala for the NFA above is
+in Scala for the NFA shown above is
 
 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
@@ -338,7 +341,7 @@
 well-known method for this is called \emph{Thompson Construction},
 after the Turing Award winner Ken Thompson. This method is by
 recursion over regular expressions and depends on the non-determinism
-in NFAs described in the earlier section. You will see shortly why
+in NFAs described in the previous section. You will see shortly why
 this construction works well with NFAs, but is not so straightforward
 with DFAs.
 
@@ -404,15 +407,15 @@
 unfortunately.  If we were to follow the many textbooks on the
 subject, we would now start with defining what $\epsilon$NFAs
 are---that would require extending the transition relation of
-NFAs. Next, show that the $\epsilon$NFAs are equivalent to NFAs and so
-on. Once we have done all this on paper, we would need to implement
-$\epsilon$NFAs. Lets try to take a shortcut instead. We are not really
-interested in $\epsilon$NFAs; they are only a convenient tool for
-translating regular expressions into automata. So we are not going to
-implementing them explicitly, but translate them immediately into NFAs
-(in a sense $\epsilon$NFAs are just a convenient API for lazy people ;o).
-How does the translation work? Well we have to find all transitions of
-the form
+NFAs. Next, we woudl show that the $\epsilon$NFAs are equivalent to
+NFAs and so on. Once we have done all this on paper, we would need to
+implement $\epsilon$NFAs. Lets try to take a shortcut instead. We are
+not really interested in $\epsilon$NFAs; they are only a convenient
+tool for translating regular expressions into automata. So we are not
+going to implementing them explicitly, but translate them immediately
+into NFAs (in a sense $\epsilon$NFAs are just a convenient API for
+lazy people ;o).  How does the translation work? Well we have to find
+all transitions of the form
 
 \[
 q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow}
@@ -444,10 +447,10 @@
 and verify that I did not forget any transition.
 
 So in what follows, whenever we are given an $\epsilon$NFA we will
-replace it by an equivalent NFA. The code for this is given in
-Figure~\ref{enfa}. The main workhorse in this code is a function that
-calculates a fixpoint of function (Lines 5--10). This function is used
-for ``discovering'' which states are reachable by
+replace it by an equivalent NFA. The Scala code for this translation
+is given in Figure~\ref{enfa}. The main workhorse in this code is a
+function that calculates a fixpoint of function (Lines 5--10). This
+function is used for ``discovering'' which states are reachable by
 $\epsilon$-transitions. Once no new state is discovered, a fixpoint is
 reached. This is used for example when calculating the starting states
 of an equivalent NFA (see Line 36): we start with all starting states
@@ -466,11 +469,12 @@
                   {../progs/display/enfa.scala}
 
 \caption{A Scala function that translates $\epsilon$NFA into NFAs. The
-  transtions of $\epsilon$NFA take as input an \texttt{Option[C]}.
+  transtion function of $\epsilon$NFA takes as input an \texttt{Option[C]}.
   \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
-  for a ``proper'' transition. The functions in Lines 18--26 calculate
+  for a ``proper'' transition consuming a character. The functions in
+  Lines 18--26 calculate
   all states reachable by one or more $\epsilon$-transition for a given
-  set of states. The NFA is constructed in in Lines 36--38.\label{enfa}}
+  set of states. The NFA is constructed in Lines 36--38.\label{enfa}}
 \end{figure}
 
 Also look carefully how the transitions of $\epsilon$NFAs are
@@ -508,7 +512,7 @@
 $\ZERO$, $\ONE$ and $c$. They can be translated into equivalent NFAs
 as follows:
 
-\begin{center}
+\begin{equation}\mbox{
 \begin{tabular}[t]{l@{\hspace{10mm}}l}
 \raisebox{1mm}{$\ZERO$} & 
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
@@ -524,18 +528,18 @@
 \node[state, accepting]  (Q_1)  [right=of Q_0] {$\mbox{}$};
 \path[->] (Q_0) edge node [below]  {$c$} (Q_1);
 \end{tikzpicture}\\
-\end{tabular}
-\end{center}
+\end{tabular}}\label{simplecases}
+\end{equation}
 
 \noindent
 I let you think whether the NFAs can match exactly those strings the
 regular expressions can match. To do this translation in code we need
 a way to construct states programatically...and as an additional
-constrain Scala needs to recognise these states as being distinct.
+constrain Scala needs to recognise that these states are being distinct.
 For this I implemented in Figure~\ref{thompson1} a class
 \texttt{TState} that includes a counter and a companion object that
-increases this counter whenever a state is created.\footnote{You might
-  have to read up what \emph{companion objects} are in Scala.}
+increases this counter whenever a new state is created.\footnote{You might
+  have to read up what \emph{companion objects} do in Scala.}
 
 \begin{figure}[p]
 \small
@@ -543,11 +547,12 @@
                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
                   {../progs/display/thompson1.scala}
 \caption{The first part of the Thompson Construction. Lines 7--16
-  implement a way how to create states that are all
+  implement a way of how to create new states that are all
   distinct by virtue of a counter. This counter is
   increased in the companion object of \texttt{TState}
   whenever a new state is created. The code in Lines 24--40
-  constructs NFAs for the simple regular expressions.
+  constructs NFAs for the simple regular expressions $\ZERO$, $\ONE$ and $c$.
+  Compare the pictures given in \eqref{simplecases}.
   \label{thompson1}}
 \end{figure}
 
@@ -564,57 +569,77 @@
   (both given as partial functions).\label{thompson2}}
 \end{figure}
 
-The case for the sequence regular expression $r_1 \cdot r_2$ is as
-follows: We are given by recursion two automata representing $r_1$ and
-$r_2$ respectively.
+The case for the sequence regular expression $r_1 \cdot r_2$ is a bit more
+complicated: We are given by recursion two NFAs representing the regular
+expressions $r_1$ and $r_2$ respectively.
 
 \begin{center}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
 \node[state, initial]  (Q_0)  {$\mbox{}$};
-\node (r_1)  [right=of Q_0] {$\ldots$};
-\node[state, accepting]  (t_1)  [right=of r_1] {$\mbox{}$};
-\node[state, accepting]  (t_2)  [above=of t_1] {$\mbox{}$};
-\node[state, accepting]  (t_3)  [below=of t_1] {$\mbox{}$};
-\node[state, initial]  (a_0)  [right=2.5cm of t_1] {$\mbox{}$};
-\node (b_1)  [right=of a_0] {$\ldots$};
+\node[state, initial]  (Q_01) [below=1mm of Q_0] {$\mbox{}$};  
+\node[state, initial]  (Q_02) [above=1mm of Q_0] {$\mbox{}$};  
+\node (R_1)  [right=of Q_0] {$\ldots$};
+\node[state, accepting]  (T_1)  [right=of R_1] {$\mbox{}$};
+\node[state, accepting]  (T_2)  [above=of T_1] {$\mbox{}$};
+\node[state, accepting]  (T_3)  [below=of T_1] {$\mbox{}$};
+
+\node (A_0)  [right=2.5cm of T_1] {$\mbox{}$};
+\node[state, initial]  (A_01)  [above=1mm of A_0] {$\mbox{}$};
+\node[state, initial]  (A_02)  [below=1mm of A_0] {$\mbox{}$};
+
+\node (b_1)  [right=of A_0] {$\ldots$};
 \node[state, accepting]  (c_1)  [right=of b_1] {$\mbox{}$};
 \node[state, accepting]  (c_2)  [above=of c_1] {$\mbox{}$};
 \node[state, accepting]  (c_3)  [below=of c_1] {$\mbox{}$};
 \begin{pgfonlayer}{background}
-\node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (Q_0) (r_1) (t_1) (t_2) (t_3)] {};
-\node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (a_0) (b_1) (c_1) (c_2) (c_3)] {};
+  \node (1) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (Q_0) (R_1) (T_1) (T_2) (T_3)] {};
+  \node (2) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (A_0) (b_1) (c_1) (c_2) (c_3)] {};
 \node [yshift=2mm] at (1.north) {$r_1$};
 \node [yshift=2mm] at (2.north) {$r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
 \end{center}
 
-\noindent The first automaton has some accepting states. We
-obtain an automaton for $r_1\cdot r_2$ by connecting these
-accepting states with $\epsilon$-transitions to the starting
-state of the second automaton. By doing so we make them
+\noindent The first NFA has some accepting states and the second some
+starting states. We obtain $\epsilon$NFA for $r_1\cdot r_2$ by
+connecting these accepting states with $\epsilon$-transitions to the
+starting states of the second automaton. By doing so we make them
 non-accepting like so:
 
 \begin{center}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
 \node[state, initial]  (Q_0)  {$\mbox{}$};
+\node[state, initial]  (Q_01) [below=1mm of Q_0] {$\mbox{}$};  
+\node[state, initial]  (Q_02) [above=1mm of Q_0] {$\mbox{}$};  
 \node (r_1)  [right=of Q_0] {$\ldots$};
 \node[state]  (t_1)  [right=of r_1] {$\mbox{}$};
 \node[state]  (t_2)  [above=of t_1] {$\mbox{}$};
 \node[state]  (t_3)  [below=of t_1] {$\mbox{}$};
-\node[state]  (a_0)  [right=2.5cm of t_1] {$\mbox{}$};
-\node (b_1)  [right=of a_0] {$\ldots$};
+
+\node  (A_0)  [right=2.5cm of t_1] {$\mbox{}$};
+\node[state]  (A_01)  [above=1mm of A_0] {$\mbox{}$};
+\node[state]  (A_02)  [below=1mm of A_0] {$\mbox{}$};
+
+\node (b_1)  [right=of A_0] {$\ldots$};
 \node[state, accepting]  (c_1)  [right=of b_1] {$\mbox{}$};
 \node[state, accepting]  (c_2)  [above=of c_1] {$\mbox{}$};
 \node[state, accepting]  (c_3)  [below=of c_1] {$\mbox{}$};
-\path[->] (t_1) edge node [above, pos=0.3]  {$\epsilon$} (a_0);
-\path[->] (t_2) edge node [above]  {$\epsilon$} (a_0);
-\path[->] (t_3) edge node [below]  {$\epsilon$} (a_0);
-
+\path[->] (t_1) edge (A_01);
+\path[->] (t_2) edge node [above]  {$\epsilon$} (A_01);
+\path[->] (t_3) edge (A_01);
+\path[->] (t_1) edge (A_02);
+\path[->] (t_2) edge (A_02);
+\path[->] (t_3) edge node [below]  {$\epsilon$} (A_02);
+ 
 \begin{pgfonlayer}{background}
-\node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {};
+  \node (3) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {};
 \node [yshift=2mm] at (3.north) {$r_1\cdot r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
@@ -626,7 +651,8 @@
 
 \begin{center}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
 \node at (0,0)  (1)  {$\mbox{}$};
 \node[state, initial]  (2)  [above right=16mm of 1] {$\mbox{}$};
 \node[state, initial]  (3)  [below right=16mm of 1] {$\mbox{}$};
@@ -706,7 +732,7 @@
 also an accepting state, because $r^*$ can recognise the
 empty string. This gives the following automaton for $r^*$:
 
-\begin{center}
+\begin{center}  
 \begin{tikzpicture}[node distance=3mm,
                              >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
 \node at (0,0) [state, initial,accepting]  (1)  {$\mbox{}$};
@@ -726,16 +752,116 @@
 \end{tikzpicture}
 \end{center}
 
+{\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
+                  {\ifodd\value{lstnumber}\color{capri!3}\fi}]
+def thompson (r: Rexp) : NFAt = r match {
+  case ZERO => NFA_ZERO()
+  case ONE => NFA_ONE()
+  case CHAR(c) => NFA_CHAR(c)  
+  case ALT(r1, r2) => NFA_ALT(thompson(r1), thompson(r2))
+  case SEQ(r1, r2) => NFA_SEQ(thompson(r1), thompson(r2))
+  case STAR(r1) => NFA_STAR(thompson(r1))
+}
+\end{lstlisting}}
+
+
+\begin{center}
+\begin{tabular}{@{\hspace{-1mm}}c@{\hspace{1mm}}c@{}}  
+\begin{tikzpicture}
+\begin{axis}[
+    title={Graph: $\texttt{(a*)*\,b}$ and strings 
+           $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    xlabel={$n$},
+    x label style={at={(1.05,0.0)}},
+    ylabel={time in secs},
+    enlargelimits=false,
+    xtick={0,5,...,30},
+    xmax=33,
+    ymax=35,
+    ytick={0,5,...,30},
+    scaled ticks=false,
+    axis lines=left,
+    width=5.5cm,
+    height=4.5cm, 
+    legend entries={Python, Java},  
+    legend pos=outer north east,
+    legend cell align=left]
+  \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
+  \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};  
+  % depth-first search in NFAs
+  \addplot[red,mark=*, mark options={fill=white}] table {
+    1 0.00605
+    2 0.03086
+    3 0.11994
+    4 0.45389
+    5 2.06192
+    6 8.04894
+    7 32.63549
+  };
+\end{axis}
+\end{tikzpicture}
+&
+\begin{tikzpicture}
+\begin{axis}[
+    title={Graph: $\texttt{a?\{n\}\,a{\{n\}}}$ and strings 
+           $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    xlabel={$n$},
+    x label style={at={(1.05,0.0)}},
+    ylabel={time in secs},
+    enlargelimits=false,
+    xtick={0,5,...,30},
+    xmax=33,
+    ymax=35,
+    ytick={0,5,...,30},
+    scaled ticks=false,
+    axis lines=left,
+    width=5.5cm,
+    height=4.5cm, 
+    legend entries={Python,Ruby},  
+    legend pos=south east,
+    legend cell align=left]
+  \addplot[blue,mark=*, mark options={fill=white}] table {re-python.data};
+  \addplot[brown,mark=triangle*, mark options={fill=white}] table {re-ruby.data};
+  % breath-first search in NFAs
+  \addplot[red,mark=*, mark options={fill=white}] table {
+    1  0.00741
+    2  0.02657
+    3  0.08317
+    4  0.22133
+    5  0.54463
+    6  1.42062
+    7  4.04926
+    8  12.70395
+    9  39.21555
+    10  112.05466
+  };
+\end{axis}
+\end{tikzpicture}
+\end{tabular}
+\end{center}
+
+
 \noindent This construction of a NFA from a regular expression
 was invented by Ken Thompson in 1968.
 
 
 \subsubsection*{Subset Construction}
 
-What is interesting is that for every NFA we can find a DFA
-which recognises the same language. This can, for example, be
-done by the \emph{subset construction}. Consider again the NFA
-below on the left, consisting of nodes labelled $0$, $1$ and $2$. 
+Remember that we did not bother with defining and implementing
+$\epsilon$NFA; we immediately translated them into equivalent
+NFAs. Equivalent in the sense of accepting the same language (though
+we only claimed this and did not prove it rigorously). Remember also
+that NFAs have a non-deterministic transitions, given as a relation.
+This non-determinism makes it harder to decide when a string is
+accepted or not; such a decision is rather straightforward with DFAs
+(remember their transition function).
+
+What is interesting is that for every NFA we can find a DFA that also
+recognises the same language. This might sound like a bit paradoxical,
+but I litke to show you this next. There are a number of ways of
+transforming a NFA into an equivalent DFA, but the most famous is
+\emph{subset construction}. Consider again the NFA below on the left,
+consisting of nodes labelled, say, with $0$, $1$ and $2$.
 
 \begin{center}
 \begin{tabular}{c@{\hspace{10mm}}c}