handouts/ho03.tex
changeset 484 e61ffb28994d
parent 483 6f508bcdaa30
child 485 bd6d999ab7b6
equal deleted inserted replaced
483:6f508bcdaa30 484:e61ffb28994d
     1 \documentclass{article}
     1 \documentclass{article}
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 \usepackage{../langs}
     3 \usepackage{../langs}
     4 \usepackage{../graphics}
     4 \usepackage{../graphics}
     5 
       
     6 
     5 
     7 %We can even allow ``silent
     6 %We can even allow ``silent
     8 %transitions'', also called epsilon-transitions. They allow us
     7 %transitions'', also called epsilon-transitions. They allow us
     9 %to go from one state to the next without having a character
     8 %to go from one state to the next without having a character
    10 %consumed. We label such silent transition with the letter
     9 %consumed. We label such silent transition with the letter
    31 
    30 
    32 The central definition is:\medskip
    31 The central definition is:\medskip
    33 
    32 
    34 \noindent 
    33 \noindent 
    35 A \emph{deterministic finite automaton} (DFA), say $A$, is
    34 A \emph{deterministic finite automaton} (DFA), say $A$, is
    36 given by a five-tuple written $A(\Sigma, Qs, Q_0, F, \delta)$ where
    35 given by a five-tuple written ${\cal A}(\varSigma, Qs, Q_0, F, \delta)$ where
    37 
    36 
    38 \begin{itemize}
    37 \begin{itemize}
    39 \item $\Sigma$ is an alphabet,  
    38 \item $\varSigma$ is an alphabet,  
    40 \item $Qs$ is a finite set of states,
    39 \item $Qs$ is a finite set of states,
    41 \item $Q_0 \in Qs$ is the start state,
    40 \item $Q_0 \in Qs$ is the start state,
    42 \item $F \subseteq Qs$ are the accepting states, and
    41 \item $F \subseteq Qs$ are the accepting states, and
    43 \item $\delta$ is the transition function.
    42 \item $\delta$ is the transition function.
    44 \end{itemize}
    43 \end{itemize}
    47 from one state to the next state with respect to a character. We have
    46 from one state to the next state with respect to a character. We have
    48 the assumption that these transition functions do not need to be
    47 the assumption that these transition functions do not need to be
    49 defined everywhere: so it can be the case that given a character there
    48 defined everywhere: so it can be the case that given a character there
    50 is no next state, in which case we need to raise a kind of ``failure
    49 is no next state, in which case we need to raise a kind of ``failure
    51 exception''.  That means actually we have \emph{partial} functions as
    50 exception''.  That means actually we have \emph{partial} functions as
    52 transitions---see the implementation later on.  A typical example of a
    51 transitions---see the Scala implementation of DFAs later on.  A
    53 DFA is
    52 typical example of a DFA is
    54 
    53 
    55 \begin{center}
    54 \begin{center}
    56 \begin{tikzpicture}[>=stealth',very thick,auto,
    55 \begin{tikzpicture}[>=stealth',very thick,auto,
    57                     every state/.style={minimum size=0pt,
    56                     every state/.style={minimum size=0pt,
    58                     inner sep=2pt,draw=blue!50,very thick,
    57                     inner sep=2pt,draw=blue!50,very thick,
   100 an automaton. For this we lift the transition function
    99 an automaton. For this we lift the transition function
   101 $\delta$ from characters to strings as follows:
   100 $\delta$ from characters to strings as follows:
   102 
   101 
   103 \[
   102 \[
   104 \begin{array}{lcl}
   103 \begin{array}{lcl}
   105 \hat{\delta}(q, [])        & \dn & q\\
   104 \widehat{\delta}(q, [])        & \dn & q\\
   106 \hat{\delta}(q, c\!::\!s) & \dn & \hat{\delta}(\delta(q, c), s)\\
   105 \widehat{\delta}(q, c\!::\!s) & \dn & \widehat{\delta}(\delta(q, c), s)\\
   107 \end{array}
   106 \end{array}
   108 \]
   107 \]
   109 
   108 
   110 \noindent This lifted transition function is often called
   109 \noindent This lifted transition function is often called
   111 ``delta-hat''. Given a string, we start in the starting state and take
   110 ``delta-hat''. Given a string, we start in the starting state and take
   112 the first character of the string, follow to the next sate, then take
   111 the first character of the string, follow to the next state, then take
   113 the second character and so on. Once the string is exhausted and we
   112 the second character and so on. Once the string is exhausted and we
   114 end up in an accepting state, then this string is accepted by the
   113 end up in an accepting state, then this string is accepted by the
   115 automaton. Otherwise it is not accepted. This also means that if along
   114 automaton. Otherwise it is not accepted. This also means that if along
   116 the way we hit the case where the transition function $\delta$ is not
   115 the way we hit the case where the transition function $\delta$ is not
   117 defined, we need to raise an error. In our implementation we will deal
   116 defined, we need to raise an error. In our implementation we will deal
   118 with this case elegantly by using Scala's \texttt{Try}. So a string
   117 with this case elegantly by using Scala's \texttt{Try}. So a string
   119 $s$ is in the \emph{language accepted by the automaton} $A(\Sigma, Q, Q_0, F,
   118 $s$ is in the \emph{language accepted by the automaton} ${\cal
   120 \delta)$ iff
   119   A}(\varSigma, Q, Q_0, F, \delta)$ iff
   121 
   120 
   122 \[
   121 \[
   123 \hat{\delta}(Q_0, s) \in F 
   122 \widehat{\delta}(Q_0, s) \in F 
   124 \]
   123 \]
   125 
   124 
   126 \noindent I let you think about a definition that describes
   125 \noindent I let you think about a definition that describes
   127 the set of all strings accepted by an automaton. 
   126 the set of all strings accepted by an automaton. 
   128 
   127 
   140   returns \texttt{false}---that means the string is not accepted.
   139   returns \texttt{false}---that means the string is not accepted.
   141   The example \texttt{delta} implements the DFA example shown
   140   The example \texttt{delta} implements the DFA example shown
   142   earlier in the handout.\label{dfa}}
   141   earlier in the handout.\label{dfa}}
   143 \end{figure}
   142 \end{figure}
   144 
   143 
   145 A simple Scala implementation for DFAs is given in
   144 My take of a simple Scala implementation for DFAs is given in
   146 Figure~\ref{dfa}. As you can see, there are many features of the
   145 Figure~\ref{dfa}. As you can see, there are many features of the
   147 mathematical definition that are quite closely reflected in the
   146 mathematical definition that are quite closely reflected in the
   148 code. In the DFA-class, there is a starting state, called
   147 code. In the DFA-class, there is a starting state, called
   149 \texttt{start}, with the polymorphic type \texttt{A}.  There is a
   148 \texttt{start}, with the polymorphic type \texttt{A}.  There is a
   150 partial function \texttt{delta} for specifying the transitions---these
   149 partial function \texttt{delta} for specifying the transitions---these
   151 partial functions take a state (of polymorphic type \texttt{A}) and an
   150 partial functions take a state (of polymorphic type \texttt{A}) and an
   152 input (of polymorphic type \texttt{C}) and produce a new state (of
   151 input (of polymorphic type \texttt{C}) and produce a new state (of
   153 type \texttt{A}). For the moment it is OK to assume that \texttt{A} is
   152 type \texttt{A}). For the moment it is OK to assume that \texttt{A} is
   154 some arbitrary type for states and the input is just characters.  (The
   153 some arbitrary type for states and the input is just characters.  (The
   155 reason for having a polymorphic types for the states and the input of
   154 reason for having polymorphic types for the states and the input of
   156 DFAs will become clearer later on.)
   155 DFAs will become clearer later on.)
   157 
   156 
   158 I used partial functions for representing the transitions in Scala;
   157 The most important point in this implemnetation is that I use Scala's
   159 alternatives would have been \texttt{Maps} or even \texttt{Lists}. One
   158 partial functions for representing the transitions; alternatives would
   160 of the main advantages of using partial functions is that transitions
   159 have been \texttt{Maps} or even \texttt{Lists}. One of the main
   161 can be quite nicely defined by a series of \texttt{case} statements
   160 advantages of using partial functions is that transitions can be quite
   162 (see Lines 28 -- 38 for an example). If you need to represent an
   161 nicely defined by a series of \texttt{case} statements (see Lines 28
   163 automaton with a sink state (catch-all-state), you can use Scala's
   162 -- 38 for an example). If you need to represent an automaton with a
   164 pattern matching and write something like
   163 sink state (catch-all-state), you can use Scala's pattern matching and
       
   164 write something like
   165 
   165 
   166 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
   166 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
   167                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   167                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   168 abstract class State
   168 abstract class State
   169 ...
   169 ...
   174     case (S1, 'a') => S2
   174     case (S1, 'a') => S2
   175     case _ => Sink
   175     case _ => Sink
   176   }
   176   }
   177 \end{lstlisting}}  
   177 \end{lstlisting}}  
   178 
   178 
   179 \noindent The DFA-class has also an argument for specifying final
   179 \noindent I let you think what this DFA looks like in the graphical
   180 states. In the implementation it not a set of states, as in the
   180 notation.
   181 matemathical definition, but is a function from states to booleans
   181 
   182 (this function is supposed to return true whenever a state is meant to
   182 The DFA-class has also an argument for specifying final states. In the
   183 be final; false otherwise). While this boolean function is different
   183 implementation it not a set of states, as in the matemathical
   184 from the sets of states, Scala allows to use sets for such functions
   184 definition, but a function from states to booleans (this function is
   185 (see Line 40 where the DFA is initialised). Again it will become clear
   185 supposed to return true whenever a state is final; false
   186 later on why I use functions for final states, rather than sets.
   186 otherwise). While this boolean function is different from the sets of
       
   187 states, Scala allows to use sets for such functions (see Line 40 where
       
   188 the DFA is initialised). Again it will become clear later on why I use
       
   189 functions for final states, rather than sets.
   187 
   190 
   188 I let you ponder whether this is a good implementation of DFAs. In
   191 I let you ponder whether this is a good implementation of DFAs. In
   189 doing so I hope you notice that the $Qs$ component (the set of finite
   192 doing so I hope you notice that the $\varSigma$ and $Qs$ components (the
   190 states) is missing from the class definition. This means that the
   193 alphabet and the set of finite states, respectively) are missing from
   191 implementation allows you to do some fishy things you are not meant to
   194 the class definition. This means that the implementation allows you to
   192 do with DFAs. Which fishy things could that be?
   195 do some fishy things you are not meant to do with DFAs. Which fishy
       
   196 things could that be?
   193 
   197 
   194 
   198 
   195 
   199 
   196 \subsection*{Non-Deterministic Finite Automata}
   200 \subsection*{Non-Deterministic Finite Automata}
   197 
   201 
   198 While with DFAs it will always be clear that given a state and a
   202 While with DFAs it is always be clear that given a state and a
   199 character what the next state is (potentially none), it will be useful
   203 character what the next state is (potentially none), it will be useful
   200 to relax this restriction. That means we allow to have several
   204 to relax this restriction. That means we allow states to have several
   201 potential successor states. We even allow more than one starting
   205 potential successor states. We even allow more than one starting
   202 state. The resulting construction is called a \emph{non-deterministic
   206 state. The resulting construction is called a \emph{Non-Deterministic
   203   finite automaton} (NFA) given also as a five-tuple $A(\Sigma, Qs, Q_{0s}, F,
   207   Finite Automaton} (NFA) given also as a five-tuple ${\cal A}(\varSigma,
   204 \rho)$ where
   208 Qs, Q_{0s}, F, \rho)$ where
   205 
   209 
   206 \begin{itemize}
   210 \begin{itemize}
   207 \item $\Sigma$ is an alphabet,  
   211 \item $\varSigma$ is an alphabet,  
   208 \item $Qs$ is a finite set of states
   212 \item $Qs$ is a finite set of states
   209 \item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)
   213 \item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)
   210 \item $F$ are some accepting states with $F \subseteq Qs$, and
   214 \item $F$ are some accepting states with $F \subseteq Qs$, and
   211 \item $\rho$ is a transition relation.
   215 \item $\rho$ is a transition relation.
   212 \end{itemize}
   216 \end{itemize}
   233 
   237 
   234 \noindent
   238 \noindent
   235 This NFA happens to have only one starting state, but in general there
   239 This NFA happens to have only one starting state, but in general there
   236 could be more.  Notice that in state $Q_0$ we might go to state $Q_1$
   240 could be more.  Notice that in state $Q_0$ we might go to state $Q_1$
   237 \emph{or} to state $Q_2$ when receiving an $a$. Similarly in state
   241 \emph{or} to state $Q_2$ when receiving an $a$. Similarly in state
   238 $Q_1$ and receiving an $a$, we can stay in state $Q_1$ or go to
   242 $Q_1$ and receiving an $a$, we can stay in state $Q_1$ \emph{or} go to
   239 $Q_2$. This kind of choice is not allowed with DFAs. The downside is
   243 $Q_2$. This kind of choice is not allowed with DFAs. The downside of
   240 that when it comes to deciding whether a string is accepted by a NFA
   244 this choice is that when it comes to deciding whether a string is
   241 we potentially have to explore all possibilities. I let you think
   245 accepted by a NFA we potentially have to explore all possibilities. I
   242 which kind of strings the above NFA accepts.
   246 let you think which kind of strings the above NFA accepts.
   243 
   247 
   244 
   248 
   245 There are a number of additional points you should note with
   249 There are a number of additional points you should note with
   246 NFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
   250 NFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
   247 transition \emph{relation} (DFAs have a transition function). The
   251 transition \emph{relation} (DFAs have a transition function). The
   258 and I also do not use transition functions that return sets of states
   262 and I also do not use transition functions that return sets of states
   259 (another popular choice for implementing NFAs).  For reasons that
   263 (another popular choice for implementing NFAs).  For reasons that
   260 become clear in a moment, I use sets of partial functions
   264 become clear in a moment, I use sets of partial functions
   261 instead---see Line 7 in Figure~\ref{nfa}. DFAs have only one such
   265 instead---see Line 7 in Figure~\ref{nfa}. DFAs have only one such
   262 partial function; my NFAs have a set.  Another parameter,
   266 partial function; my NFAs have a set.  Another parameter,
   263 \texttt{Starts}, is in NFAs a set of states; \texttt{fins} is again a
   267 \texttt{starts}, is in NFAs a set of states; \texttt{fins} is again a
   264 function from states to booleans. The \texttt{next} function
   268 function from states to booleans. The \texttt{next} function
   265 calculates the set of next states reachable from a single state
   269 calculates the set of next states reachable from a single state
   266 \texttt{q} by a character \texttt{c}---this is calculated by going
   270 \texttt{q} by a character~\texttt{c}---this is calculated by going
   267 through all the partial functions in the \texttt{delta}-set and apply
   271 through all the partial functions in the \texttt{delta}-set and apply
   268 \texttt{q} and \texttt{c} (see Line 13). This gives us a set of
   272 \texttt{q} and \texttt{c} (see Line 13). This gives a set of
   269 \texttt{Some}s (in case the application succeeded) and possibly some
   273 \texttt{Some}s (in case the application succeeded) and possibly some
   270 \texttt{None}s (in case the partial function is not defined or produces an
   274 \texttt{None}s (in case the partial function is not defined or produces an
   271 error).  The \texttt{None}s are filtered out by the \texttt{flatMap},
   275 error).  The \texttt{None}s are filtered out by the \texttt{flatMap},
   272 leaving the values inside the \texttt{Some}s. The function
   276 leaving the values inside the \texttt{Some}s. The function
   273 \texttt{nexts} just lifts this function to sets of
   277 \texttt{nexts} just lifts this function to sets of
   274 states. \texttt{Deltas} and \texttt{accept} are similar to the DFA
   278 states. \texttt{Deltas} and \texttt{accept} are similar to the DFA
   275 definitions.
   279 definitions.
   276 
   280 
   277 \begin{figure}[t]
   281 \begin{figure}[p]
   278 \small
   282 \small
   279 \lstinputlisting[numbers=left,linebackgroundcolor=
   283 \lstinputlisting[numbers=left,linebackgroundcolor=
   280                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   284                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   281                   {../progs/nfa.scala}
   285                   {../progs/nfa.scala}
   282 \caption{A Scala implementation of NFAs using sets of partial functions.
   286 \caption{A Scala implementation of NFAs using sets of partial functions.
   286   acceptance of a string in a breath-first fashion. This can be costly
   290   acceptance of a string in a breath-first fashion. This can be costly
   287   way of deciding whether a string is accepted in practical contexts.\label{nfa}}
   291   way of deciding whether a string is accepted in practical contexts.\label{nfa}}
   288 \end{figure}
   292 \end{figure}
   289 
   293 
   290 The reason for using sets of partial functions for specifying the
   294 The reason for using sets of partial functions for specifying the
   291 transitions in NFAs has to do with examples like this one: a
   295 transitions in NFAs has to do with pattern matching. Consider the
   292 popular benchmark regular expression is $(.)^*\cdot a\cdot
   296 following example: a popular benchmark regular expression is
   293 (.)^{\{n\}}\cdot b\cdot c$. A NFA that accepts the same strings
   297 $(.)^*\cdot a\cdot (.)^{\{n\}}\cdot b\cdot c$. The important point to
   294 (for $n=3$) is as follows:
   298 note is that it uses $.$ in order to represent the regular expression
       
   299 that accepts any character. A NFA that accepts the same strings as
       
   300 this regular expression (for $n=3$) is as follows:
   295 
   301 
   296 \begin{center}
   302 \begin{center}
   297 \begin{tikzpicture}[>=stealth',very thick, auto, node distance=7mm,
   303 \begin{tikzpicture}[>=stealth',very thick, auto, node distance=7mm,
   298     every state/.style={minimum size=0pt,inner sep=1pt,
   304     every state/.style={minimum size=0pt,inner sep=1pt,
   299       draw=blue!50,very thick,fill=blue!20},scale=0.5]
   305       draw=blue!50,very thick,fill=blue!20},scale=0.5]
   313 \path[->] (Q_5) edge node [above]  {$c$} (Q_6);
   319 \path[->] (Q_5) edge node [above]  {$c$} (Q_6);
   314 \end{tikzpicture}
   320 \end{tikzpicture}
   315 \end{center}
   321 \end{center}
   316 
   322 
   317 \noindent
   323 \noindent
   318 The $.$ stands for accepting any single character: for example if we
   324 Also here the $.$ stands for accepting any single character: for example if we
   319 are in $Q_0$ and read an $a$ we can either stay in $Q_0$ (since any
   325 are in $Q_0$ and read an $a$ we can either stay in $Q_0$ (since any
   320 character will do for this) or advance to $Q_1$ (but only if it is an
   326 character will do for this) or advance to $Q_1$ (but only if it is an
   321 $a$). Why this is a good benchmark regular expression is irrelevant
   327 $a$). Why this is a good benchmark regular expression is irrelevant
   322 here. The point is that this NFA can be conveniently represented by
   328 here. The point is that this NFA can be conveniently represented by
   323 the code:
   329 the code:
   340 \noindent
   346 \noindent
   341 where the $.$-transitions translate into a
   347 where the $.$-transitions translate into a
   342 underscore-pattern-matching. Recall that in $Q_0$ if we read an $a$ we
   348 underscore-pattern-matching. Recall that in $Q_0$ if we read an $a$ we
   343 can go to $Q_1$ (by the first partial function in the set) and also
   349 can go to $Q_1$ (by the first partial function in the set) and also
   344 stay in $Q_0$ (by the second partial function). Representing such
   350 stay in $Q_0$ (by the second partial function). Representing such
   345 transitions in any other way is somehow awkward; the set of partial
   351 transitions in any other way in Scala seems to be somehow awkward; the
   346 function representation makes them easy to implement.
   352 set of partial function representation makes them easy to implement.
   347 
   353 
   348 Look very careful again at the \texttt{accepts} and \texttt{deltas}
   354 Look very careful again at the \texttt{accepts} and \texttt{deltas}
   349 functions in NFAs and remember that when accepting a string by an NFA
   355 functions in NFAs and remember that when accepting a string by an NFA
   350 we might have to explore all possible transitions (which state to go
   356 we might have to explore all possible transitions (recall which state
   351 to is not unique anymore). The shown implementation achieves this
   357 to go to is not unique anymore with NFAs). The implementation achieves
   352 exploration in a \emph{breath-first search}. This is fine for very
   358 this exploration in a \emph{breadth-first search} manner. This is fine
   353 small NFAs, but can lead to problems when the NFAs are bigger. Take
   359 for very small NFAs, but can lead to problems when the NFAs are
   354 for example the regular expression $(.)^*\cdot a\cdot (.)^{\{n\}}\cdot
   360 bigger. Take for example the regular expression $(.)^*\cdot a\cdot
   355 b\cdot c$ from above. If $n$ is large, say 100 or 1000, then the
   361 (.)^{\{n\}}\cdot b\cdot c$ from above. If $n$ is large, say 100 or
   356 corresponding NFA will have 104, respectively 1004, nodes. The problem
   362 1000, then the corresponding NFA will have 104, respectively 1004,
   357 is that with certain strings this can lead to 1000 ``active'' nodes
   363 nodes. The problem is that with certain strings this can lead to 1000
   358 which we need to analyse when determining the next states. This can be
   364 ``active'' nodes in the breadth-first search, all of which we need to
   359 a real memory strain in practical applications. As a result, some
   365 analyse when determining the next states. This can be a real memory
   360 regular expression matching engines resort to a \emph{depth-first
   366 strain in practical applications. As result, some regular expression
   361   search} with \emph{backtracking} in unsuccessful cases. In our
   367 matching engines resort to a \emph{depth-first search} with
   362 implementation we could implement a depth-first version of
   368 \emph{backtracking} in unsuccessful cases. In our implementation we
   363 \texttt{accepts} using Scala's \texttt{exists} as follows:
   369 can implement a depth-first version of \texttt{accepts} using Scala's
       
   370 \texttt{exists} as follows:
   364 
   371 
   365 
   372 
   366 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
   373 {\small\begin{lstlisting}[language=Scala,linebackgroundcolor=
   367                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   374                   {\ifodd\value{lstnumber}\color{capri!3}\fi}]
   368 def search(q: A, s: List[C]) : Boolean = s match {
   375 def search(q: A, s: List[C]) : Boolean = s match {
   369   case Nil => fins(q)
   376   case Nil => fins(q)
   370   case c::cs =>
   377   case c::cs =>
   371     delta.exists((d) => Try(search(d(q, c), cs)) getOrElse false)
   378     delta.exists(d => Try(search(d(q, c), cs)) getOrElse false)
   372 }
   379 }
   373 
   380 
   374 def accepts(s: List[C]) : Boolean = 
   381 def accepts(s: List[C]) : Boolean = 
   375   starts.exists(search(_, s))
   382   starts.exists(search(_, s))
   376 \end{lstlisting}}
   383 \end{lstlisting}}
   377 
   384 
   378 \noindent
   385 \noindent
   379 This depth-first way of exploration seems to work efficiently in many
   386 This depth-first way of exploration seems to work efficiently in many
   380 examples and is much less of strain on memory. The problem is that the
   387 examples and is much less of strain on memory. The problem is that the
   381 backtracking can get catastrophic in some examples---remember the
   388 backtracking can get ``catastrophic'' in some examples---remember the
   382 catastrophic backtracking from earlier lectures. This depth-first
   389 catastrophic backtracking from earlier lectures. This depth-first
   383 search with backtracking is the reason for the abysmal performance of
   390 search with backtracking is the reason for the abysmal performance of
   384 regular expression macthing in Java, Ruby and Python.
   391 some regular expression macthings in Java, Ruby and Python. I like to
       
   392 show you this next.
   385 
   393 
   386 %This means if
   394 %This means if
   387 %we need to decide whether a string is accepted by a NFA, we might have
   395 %we need to decide whether a string is accepted by a NFA, we might have
   388 %to explore all possibilities. Also there is the special silent
   396 %to explore all possibilities. Also there is the special silent
   389 %transition in NFAs. As mentioned already this transition means you do
   397 %transition in NFAs. As mentioned already this transition means you do
   393 %%$Q_2$. This silent transition is also often called
   401 %%$Q_2$. This silent transition is also often called
   394 %\emph{$\epsilon$-transition}.
   402 %\emph{$\epsilon$-transition}.
   395 
   403 
   396 
   404 
   397 \subsubsection*{Thompson Construction}
   405 \subsubsection*{Thompson Construction}
       
   406 
       
   407 In order to get an idea what calculations are done in Java \& friends,
       
   408 we need a method for translating regular expressions into
       
   409 automata. The simplest and most well-known method is called
       
   410 \emph{Thompson Construction}, after the Turing Award winner Ken
       
   411 Thompson who implemented this method in early versions of grep????
       
   412 
   398 
   413 
   399 The reason for introducing NFAs is that there is a relatively
   414 The reason for introducing NFAs is that there is a relatively
   400 simple (recursive) translation of regular expressions into
   415 simple (recursive) translation of regular expressions into
   401 NFAs. Consider the simple regular expressions $\ZERO$,
   416 NFAs. Consider the simple regular expressions $\ZERO$,
   402 $\ONE$ and $c$. They can be translated as follows:
   417 $\ONE$ and $c$. They can be translated as follows: