handouts/ho03.tex
changeset 487 a697421eaa04
parent 485 bd6d999ab7b6
child 488 598741d39d21
equal deleted inserted replaced
486:8178fcf377dc 487:a697421eaa04
     1 \documentclass{article}
     1 \documentclass{article}
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 \usepackage{../langs}
     3 \usepackage{../langs}
     4 \usepackage{../graphics}
     4 \usepackage{../graphics}
     5 
     5 
     6 %We can even allow ``silent
       
     7 %transitions'', also called epsilon-transitions. They allow us
       
     8 %to go from one state to the next without having a character
       
     9 %consumed. We label such silent transition with the letter
       
    10 %$\epsilon$.
       
    11 
     6 
    12 \begin{document}
     7 \begin{document}
    13 \fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017}
     8 \fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017}
    14 
     9 
    15 \section*{Handout 3 (Finite Automata)}
    10 \section*{Handout 3 (Finite Automata)}
   128 
   123 
   129 \begin{figure}[p]
   124 \begin{figure}[p]
   130 \small
   125 \small
   131 \lstinputlisting[numbers=left,linebackgroundcolor=
   126 \lstinputlisting[numbers=left,linebackgroundcolor=
   132                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   127                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   133                   {../progs/dfa.scala}
   128                   {../progs/display/dfa.scala}
   134 \caption{A Scala implementation of DFAs using partial functions.
   129 \caption{A Scala implementation of DFAs using partial functions.
   135   Notice some subtleties: \texttt{deltas} implements the delta-hat
   130   Notice some subtleties: \texttt{deltas} implements the delta-hat
   136   construction by lifting the transition (partial) function to lists
   131   construction by lifting the (partial) transition  function to lists
   137   of characters. Since \texttt{delta} is given as a partial function,
   132   of characters. Since \texttt{delta} is given as a partial function,
   138   it can obviously go ``wrong'' in which case the \texttt{Try} in
   133   it can obviously go ``wrong'' in which case the \texttt{Try} in
   139   \texttt{accepts} catches the error and returns \texttt{false}---that
   134   \texttt{accepts} catches the error and returns \texttt{false}---that
   140   means the string is not accepted.  The example \texttt{delta} in
   135   means the string is not accepted.  The example \texttt{delta} in
   141   Line 28--38 implements the DFA example shown earlier in the
   136   Line 28--38 implements the DFA example shown earlier in the
   264 Perhaps interestingly, I do not actually use relations for my NFAs,
   259 Perhaps interestingly, I do not actually use relations for my NFAs,
   265 but use transition functions that return sets of states.  DFAs have
   260 but use transition functions that return sets of states.  DFAs have
   266 partial transition functions that return a single state; my NFAs
   261 partial transition functions that return a single state; my NFAs
   267 return a set. I let you think about this representation for
   262 return a set. I let you think about this representation for
   268 NFA-transitions and how it corresponds to the relations used in the
   263 NFA-transitions and how it corresponds to the relations used in the
   269 mathematical definition of NFAs.
   264 mathematical definition of NFAs. An example of a transition function
   270 
   265 in Scala for the NFA above is
   271 Like in the mathematical definition, \texttt{starts} is in NFAs a set
   266 
   272 of states; \texttt{fins} is again a function from states to
   267 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
       
   268                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
       
   269 val nfa_delta : (State, Char) :=> Set[State] = 
       
   270   { case (Q0, 'a') => Set(Q1, Q2)
       
   271     case (Q0, 'b') => Set(Q0)
       
   272     case (Q1, 'a') => Set(Q1, Q2)
       
   273     case (Q1, 'b') => Set(Q0, Q1) }
       
   274 \end{lstlisting}}  
       
   275 
       
   276 \noindent Like in the mathematical definition, \texttt{starts} is in
       
   277 NFAs a set of states; \texttt{fins} is again a function from states to
   273 booleans. The \texttt{next} function calculates the set of next states
   278 booleans. The \texttt{next} function calculates the set of next states
   274 reachable from a single state \texttt{q} by a character~\texttt{c}. In
   279 reachable from a single state \texttt{q} by a character~\texttt{c}. In
   275 case there is no such state---the partial transition function is
   280 case there is no such state---the partial transition function is
   276 undefined---the empty set is returned (see function
   281 undefined---the empty set is returned (see function
   277 \texttt{applyOrElse} in Lines 9 and 10). The function \texttt{nexts}
   282 \texttt{applyOrElse} in Lines 9 and 10). The function \texttt{nexts}
   279  
   284  
   280 \begin{figure}[p]
   285 \begin{figure}[p]
   281 \small
   286 \small
   282 \lstinputlisting[numbers=left,linebackgroundcolor=
   287 \lstinputlisting[numbers=left,linebackgroundcolor=
   283                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   288                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   284                   {../progs/nfa.scala}
   289                   {../progs/display/nfa.scala}
   285 \caption{A Scala implementation of NFAs using partial functions.
   290 \caption{A Scala implementation of NFAs using partial functions.
   286   Notice that the function \texttt{accepts} implements the
   291   Notice that the function \texttt{accepts} implements the
   287   acceptance of a string in a breath-first search fashion. This can be a costly
   292   acceptance of a string in a breath-first search fashion. This can be a costly
   288   way of deciding whether a string is accepted or not in applications that need to handle
   293   way of deciding whether a string is accepted or not in applications that need to handle
   289   large NFAs and large inputs.\label{nfa}}
   294   large NFAs and large inputs.\label{nfa}}
   292 Look very careful at the \texttt{accepts} and \texttt{deltas}
   297 Look very careful at the \texttt{accepts} and \texttt{deltas}
   293 functions in NFAs and remember that when accepting a string by a NFA
   298 functions in NFAs and remember that when accepting a string by a NFA
   294 we might have to explore all possible transitions (recall which state
   299 we might have to explore all possible transitions (recall which state
   295 to go to is not unique anymore with NFAs\ldots{}we need to explore
   300 to go to is not unique anymore with NFAs\ldots{}we need to explore
   296 potentially all next states). The implementation achieves this
   301 potentially all next states). The implementation achieves this
   297 exploration in a \emph{breadth-first search} manner. This is fine for
   302 exploration through a \emph{breadth-first search}. This is fine for
   298 small NFAs, but can lead to real memory problems when the NFAs are
   303 small NFAs, but can lead to real memory problems when the NFAs are
   299 bigger and larger strings need to be processed. As result, some
   304 bigger and larger strings need to be processed. As result, some
   300 regular expression matching engines resort to a \emph{depth-first
   305 regular expression matching engines resort to a \emph{depth-first
   301   search} with \emph{backtracking} in unsuccessful cases. In our
   306   search} with \emph{backtracking} in unsuccessful cases. In our
   302 implementation we can implement a depth-first version of
   307 implementation we can implement a depth-first version of
   313 def accepts2(s: List[C]) : Boolean = 
   318 def accepts2(s: List[C]) : Boolean = 
   314   starts.exists(search(_, s))
   319   starts.exists(search(_, s))
   315 \end{lstlisting}}
   320 \end{lstlisting}}
   316 
   321 
   317 \noindent
   322 \noindent
   318 This depth-first way of exploration seems to work efficiently in many
   323 This depth-first way of exploration seems to work quite efficiently in
   319 examples and is much less of a strain on memory. The problem is that
   324 many examples and is much less of a strain on memory. The problem is
   320 the backtracking can get ``catastrophic'' in some examples---remember
   325 that the backtracking can get ``catastrophic'' in some
   321 the catastrophic backtracking from earlier lectures. This depth-first
   326 examples---remember the catastrophic backtracking from earlier
   322 search with backtracking is the reason for the abysmal performance of
   327 lectures. This depth-first search with backtracking is the reason for
   323 some regular expression matchings in Java, Ruby and Python. I like to
   328 the abysmal performance of some regular expression matchings in Java,
   324 show you this next.
   329 Ruby and Python. I like to show you this in the next two sections.
   325 
   330 
   326 
   331 
   327 \subsubsection*{Thompson Construction}
   332 \subsubsection*{Epsilon NFAs}
   328 
   333 
   329 In order to get an idea what calculations are performed by Java \&
   334 In order to get an idea what calculations are performed by Java \&
   330 friends, we need a method for transforming a regular expression into
   335 friends, we need a method for transforming a regular expression into
   331 an automaton. This automaton should accept exactly those strings that
   336 an automaton. This automaton should accept exactly those strings that
   332 are accepted by the regular expression.  The simplest and most
   337 are accepted by the regular expression.  The simplest and most
   333 well-known method for this is called \emph{Thompson Construction},
   338 well-known method for this is called \emph{Thompson Construction},
   334 after the Turing Award winner Ken Thompson. This method is by
   339 after the Turing Award winner Ken Thompson. This method is by
   335 recursion over regular expressions and uses the non-determinism in
   340 recursion over regular expressions and depends on the non-determinism
   336 NFAs. You will see shortly why this construction works well with NFAs,
   341 in NFAs described in the earlier section. You will see shortly why
   337 but is not so straightforward with DFAs. Unfortunately we are still
   342 this construction works well with NFAs, but is not so straightforward
   338 one step away from our intended target though---because this
   343 with DFAs.
   339 construction uses a version of NFAs that allows ``silent
   344 
   340 transitions''. The idea behind silent transitions is that they
   345 Unfortunately we are still one step away from our intended target
   341 allow us to go from one state to the next without having to consume a
   346 though---because this construction uses a version of NFAs that allows
   342 character. We label such silent transition with the letter
   347 ``silent transitions''. The idea behind silent transitions is that
   343 $\epsilon$ and call the automata $\epsilon$NFAs. Two
   348 they allow us to go from one state to the next without having to
   344 typical examples of $\epsilon$NFAs are
   349 consume a character. We label such silent transition with the letter
       
   350 $\epsilon$ and call the automata $\epsilon$NFAs. Two typical examples
       
   351 of $\epsilon$NFAs are:
   345 
   352 
   346 
   353 
   347 \begin{center}
   354 \begin{center}
   348 \begin{tabular}[t]{c@{\hspace{9mm}}c}
   355 \begin{tabular}[t]{c@{\hspace{9mm}}c}
   349 \begin{tikzpicture}[>=stealth',very thick,
   356 \begin{tikzpicture}[>=stealth',very thick,
   371 \end{tikzpicture}}
   378 \end{tikzpicture}}
   372 \end{tabular}
   379 \end{tabular}
   373 \end{center}
   380 \end{center}
   374 
   381 
   375 \noindent
   382 \noindent
   376 Consider the $\epsilon$NFA on the left-hand side: an
   383 Consider the $\epsilon$NFA on the left-hand side: the
   377 $\epsilon$-transition means you do not have to ``consume'' any part of
   384 $\epsilon$-transitions mean you do not have to ``consume'' any part of
   378 the input string, but ``silently'' change to a different state. For
   385 the input string, but ``silently'' change to a different state. In
   379 example, if you are in the starting state $Q_0$, you can silently move
   386 this example, if you are in the starting state $Q_0$, you can silently
   380 either to $Q_1$ or $Q_2$. In this example you can see that once you
   387 move either to $Q_1$ or $Q_2$. You can see that once you are in $Q_1$,
   381 are in $Q_1$, respectively $Q_2$, you cannot ``go back'' to the other
   388 respectively $Q_2$, you cannot ``go back'' to the other states. So it
   382 states.
   389 seems allowing $\epsilon$-transitions is a rather substancial
   383 
   390 extension to NFAs. On first appearances, $\epsilon$-transitions might
   384 On first appearances, $\epsilon$-transitions might look rather
   391 even look rather strange, or even silly. To start with, silent
   385 strange, or even silly. To start with, silent transitions make the
   392 transitions make the decision whether a string is accepted by an
   386 decision whether a string is accepted by an automaton even harder:
   393 automaton even harder: with $\epsilon$NFAs we have to look whether we
   387 with $\epsilon$NFAs we have to look whether we can do first some
   394 can do first some $\epsilon$-transitions and then do a
   388 $\epsilon$-transitions and then do a ``proper''-transition; and after
   395 ``proper''-transition; and after any ``proper''-transition we again
   389 any ``proper''-transition we again have to check whether we can do
   396 have to check whether we can do again some silent transitions. Even
   390 again some silent transitions. Even worse, if there is a silent
   397 worse, if there is a silent transition pointing back to the same
   391 transition pointing back to the same state, then we have to be careful
   398 state, then we have to be careful our decision procedure for strings
   392 our decision procedure for strings does not loop (remember the
   399 does not loop (remember the depth-first search for exploring all
   393 depth-first search for exploring all states).
   400 states).
   394 
   401 
   395 The obvious question is: Do we get anything in return for this hassle
   402 The obvious question is: Do we get anything in return for this hassle
   396 with silent transitions?  Well, we still have to work for it\ldots
   403 with silent transitions?  Well, we still have to work for it\ldots
   397 unfortunately.  If we were to follow the many textbooks on the
   404 unfortunately.  If we were to follow the many textbooks on the
   398 subject, we would now start with defining what $\epsilon$NFAs
   405 subject, we would now start with defining what $\epsilon$NFAs
   399 are---that would require extending the transition relation of
   406 are---that would require extending the transition relation of
   400 NFAs. Next, show that the $\epsilon$NFAs are equivalent to NFAs and so
   407 NFAs. Next, show that the $\epsilon$NFAs are equivalent to NFAs and so
   401 on. Once we have done all this on paper, we would need to implement
   408 on. Once we have done all this on paper, we would need to implement
   402 $\epsilon$NFAs. Lets try to take a shortcut---we are not really
   409 $\epsilon$NFAs. Lets try to take a shortcut instead. We are not really
   403 interested in $\epsilon$NFAs; they are only a convenient tool for
   410 interested in $\epsilon$NFAs; they are only a convenient tool for
   404 translating regular expressions into automata. We are not going to
   411 translating regular expressions into automata. So we are not going to
   405 implementing them explicitly, but translate them directly into NFAs
   412 implementing them explicitly, but translate them immediately into NFAs
   406 (in a sense $\epsilon$NFAs are just a convenient API for lazy people).
   413 (in a sense $\epsilon$NFAs are just a convenient API for lazy people ;o).
   407 How does the translation work: well we have to find all transitions of
   414 How does the translation work? Well we have to find all transitions of
   408 the form
   415 the form
   409 
   416 
   410 \[
   417 \[
   411 q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow}
   418 q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow}
   412 \;\stackrel{a}{\longrightarrow}\;
   419 \;\stackrel{a}{\longrightarrow}\;
   413 \stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q'
   420 \stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q'
   414 \]
   421 \]
   415 
   422 
   416 \noindent and replace them with $q \stackrel{a}{\longrightarrow}
   423 \noindent and replace them with $q \stackrel{a}{\longrightarrow}
   417 q'$. Doing this to the $\epsilon$NFA on the left-hand side above gives
   424 q'$. Doing this to the $\epsilon$NFA on the right-hand side above gives
   418 the NFA
   425 the NFA
   419 
   426 
   420 \begin{center}
   427 \begin{center}
   421 \begin{tikzpicture}[>=stealth',very thick,
   428 \begin{tikzpicture}[>=stealth',very thick,
   422     every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
   429     every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
   430 \path[->] (r_1) edge [loop below] node  {$a$} ();
   437 \path[->] (r_1) edge [loop below] node  {$a$} ();
   431 \path[->] (r_1) edge [bend right] node [below]  {$a$} (r_3);
   438 \path[->] (r_1) edge [bend right] node [below]  {$a$} (r_3);
   432 \end{tikzpicture}
   439 \end{tikzpicture}
   433 \end{center}
   440 \end{center}
   434 
   441 
   435 \noindent where the single the $\epsilon$-transition is replaced by
   442 \noindent where the single $\epsilon$-transition is replaced by
   436 three more $a$-transitions. So whenever we are given $\epsilon$NFA we
   443 three additional $a$-transitions. Please do the calculations yourself
   437 will, similarly, replace it by an equivalent NFA. The code for this is
   444 and verify that I did not forget any transition.
   438 given in Figure~\ref{enfa}. At the core is a function that calculates
   445 
   439 a fixpoint of function (Lines 5--10). This is used for ``discovering''
   446 So in what follows, whenever we are given an $\epsilon$NFA we will
   440 states reachable by $\epsilon$-transitions. Once no new state is
   447 replace it by an equivalent NFA. The code for this is given in
   441 reachable state, a fixpoint is reached. This is used for example when
   448 Figure~\ref{enfa}. The main workhorse in this code is a function that
   442 calculating the starting states of an equivalent NFA---see Line 36.
   449 calculates a fixpoint of function (Lines 5--10). This function is used
   443 We starts with all starting states of the $\epsilon$NFA and then look
   450 for ``discovering'' which states are reachable by
   444 for all other states that can be reached by
   451 $\epsilon$-transitions. Once no new state is discovered, a fixpoint is
   445 $\epsilon$-transition. This is what the $\epsilon$-closure, called
   452 reached. This is used for example when calculating the starting states
   446 \texttt{ecl}, calculates.
   453 of an equivalent NFA (see Line 36): we start with all starting states
       
   454 of the $\epsilon$NFA and then look for all additional states that can
       
   455 be reached by $\epsilon$-transitions. We keep on doing this until no
       
   456 new state can be reached. This is what the $\epsilon$-closure, named
       
   457 in the code \texttt{ecl}, calculates. Similarly, an accepting state of
       
   458 the NFA is when we can reach an accepting state of the $\epsilon$NFA
       
   459 by $\epsilon$-transitions.
       
   460 
   447 
   461 
   448 \begin{figure}[p]
   462 \begin{figure}[p]
   449 \small
   463 \small
   450 \lstinputlisting[numbers=left,linebackgroundcolor=
   464 \lstinputlisting[numbers=left,linebackgroundcolor=
   451                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   465                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   452                   {../progs/enfa.scala}
   466                   {../progs/display/enfa.scala}
   453 
   467 
   454 \caption{A Scala function that translates $\epsilon$NFA into NFAs. The
   468 \caption{A Scala function that translates $\epsilon$NFA into NFAs. The
   455   transtions of $\epsilon$NFA take as input an \texttt{Option[C]}.
   469   transtions of $\epsilon$NFA take as input an \texttt{Option[C]}.
   456   \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
   470   \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
   457   for a ``proper'' transition. The functions in Lines 18--26 calculate
   471   for a ``proper'' transition. The functions in Lines 18--26 calculate
   458   all states reachable by one or more $\epsilon$-transition for a given
   472   all states reachable by one or more $\epsilon$-transition for a given
   459   set of states. The NFA is constructed in in Lines 36--38.\label{enfa}}
   473   set of states. The NFA is constructed in in Lines 36--38.\label{enfa}}
   460 \end{figure}
   474 \end{figure}
   461 
   475 
   462 
   476 Also look carefully how the transitions of $\epsilon$NFAs are
   463 Now having a translation of $\epsilon$NFAs to NFAs in place, we can
   477 implemented. The additional possibility of performing silent
   464 finally make headway with the problem of translating regular
   478 transitions is encoded by using \texttt{Option[C]} as the type for the
   465 expressions into equivalent NFAs. By equivalent we mean that the NFAs
   479 ``input''. The \texttt{Some}s stand for ``propper'' transitions where
       
   480 a character is consumed; \texttt{None} stands for
       
   481 $\epsilon$-transitions. The transition functions for the two
       
   482 $\epsilon$NFAs from the beginning of this section can be defined as
       
   483 
       
   484 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
       
   485                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
       
   486 val enfa_trans1 : (State, Option[Char]) :=> Set[State] = 
       
   487   { case (Q0, Some('a')) => Set(Q0)
       
   488     case (Q0, None) => Set(Q1, Q2)
       
   489     case (Q1, Some('a')) => Set(Q1)
       
   490     case (Q2, Some('b')) => Set(Q2) }
       
   491 
       
   492 val enfa_trans2 : (State, Option[Char]) :=> Set[State] = 
       
   493   { case (R1, Some('b')) => Set(R3)
       
   494     case (R1, None) => Set(R2)
       
   495     case (R2, Some('a')) => Set(R1, R3) }
       
   496 \end{lstlisting}}
       
   497 
       
   498 \noindent
       
   499 I hope you agree now with my earlier statement that the $\epsilon$NFAs
       
   500 are just an API for NFAs.
       
   501 
       
   502 \subsubsection*{Thompson Construction}
       
   503 
       
   504 Having the translation of $\epsilon$NFAs to NFAs in place, we can
       
   505 finally return to the problem of translating regular expressions into
       
   506 equivalent NFAs. Recall that by equivalent we mean that the NFAs
   466 recognise the same language. Consider the simple regular expressions
   507 recognise the same language. Consider the simple regular expressions
   467 $\ZERO$, $\ONE$ and $c$. They can be translated into equivalent NFAs
   508 $\ZERO$, $\ONE$ and $c$. They can be translated into equivalent NFAs
   468 as follows:
   509 as follows:
   469 
   510 
   470 \begin{center}
   511 \begin{center}
   475 \end{tikzpicture}\\\\
   516 \end{tikzpicture}\\\\
   476 \raisebox{1mm}{$\ONE$} & 
   517 \raisebox{1mm}{$\ONE$} & 
   477 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   518 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   478 \node[state, initial, accepting]  (Q_0)  {$\mbox{}$};
   519 \node[state, initial, accepting]  (Q_0)  {$\mbox{}$};
   479 \end{tikzpicture}\\\\
   520 \end{tikzpicture}\\\\
   480 \raisebox{2mm}{$c$} & 
   521 \raisebox{3mm}{$c$} & 
   481 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   522 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   482 \node[state, initial]  (Q_0)  {$\mbox{}$};
   523 \node[state, initial]  (Q_0)  {$\mbox{}$};
   483 \node[state, accepting]  (Q_1)  [right=of Q_0] {$\mbox{}$};
   524 \node[state, accepting]  (Q_1)  [right=of Q_0] {$\mbox{}$};
   484 \path[->] (Q_0) edge node [below]  {$c$} (Q_1);
   525 \path[->] (Q_0) edge node [below]  {$c$} (Q_1);
   485 \end{tikzpicture}\\\\
   526 \end{tikzpicture}\\
   486 \end{tabular}
   527 \end{tabular}
   487 \end{center}
   528 \end{center}
       
   529 
       
   530 \noindent
       
   531 I let you think whether the NFAs can match exactly those strings the
       
   532 regular expressions can match. To do this translation in code we need
       
   533 a way to construct states programatically...and as an additional
       
   534 constrain Scala needs to recognise these states as being distinct.
       
   535 For this I implemented in Figure~\ref{thompson1} a class
       
   536 \texttt{TState} that includes a counter and a companion object that
       
   537 increases this counter whenever a state is created.\footnote{You might
       
   538   have to read up what \emph{companion objects} are in Scala.}
   488 
   539 
   489 \begin{figure}[p]
   540 \begin{figure}[p]
   490 \small
   541 \small
   491 \lstinputlisting[numbers=left,linebackgroundcolor=
   542 \lstinputlisting[numbers=left,linebackgroundcolor=
   492                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   543                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   493                   {../progs/thompson.scala}
   544                   {../progs/display/thompson1.scala}
   494 \caption{A Scala XXX\label{thompson}}
   545 \caption{The first part of the Thompson Construction. Lines 7--16
       
   546   implement a way how to create states that are all
       
   547   distinct by virtue of a counter. This counter is
       
   548   increased in the companion object of \texttt{TState}
       
   549   whenever a new state is created. The code in Lines 24--40
       
   550   constructs NFAs for the simple regular expressions.
       
   551   \label{thompson1}}
   495 \end{figure}
   552 \end{figure}
   496 
   553 
   497 
   554 \begin{figure}[p]
   498 \noindent The case for the sequence regular expression $r_1
   555 \small
   499 \cdot r_2$ is as follows: We are given by recursion two
   556 \lstinputlisting[numbers=left,linebackgroundcolor=
   500 automata representing $r_1$ and $r_2$ respectively. 
   557                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
       
   558                   {../progs/display/thompson2.scala}
       
   559 \caption{The second part of the Thompson Construction implementing
       
   560   the composition of NFAs according to $\cdot$, $+$ and $\_^*$.
       
   561   The implicit class about rich partial functions
       
   562   implements the infix operation \texttt{+++} which
       
   563   combines an $\epsilon$NFA transition with a NFA transition
       
   564   (both given as partial functions).\label{thompson2}}
       
   565 \end{figure}
       
   566 
       
   567 The case for the sequence regular expression $r_1 \cdot r_2$ is as
       
   568 follows: We are given by recursion two automata representing $r_1$ and
       
   569 $r_2$ respectively.
   501 
   570 
   502 \begin{center}
   571 \begin{center}
   503 \begin{tikzpicture}[node distance=3mm,
   572 \begin{tikzpicture}[node distance=3mm,
   504                              >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   573                              >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
   505 \node[state, initial]  (Q_0)  {$\mbox{}$};
   574 \node[state, initial]  (Q_0)  {$\mbox{}$};