\documentclass{article}\usepackage{../style}\usepackage{../langs}\usepackage{../graphics}%We can even allow ``silent%transitions'', also called epsilon-transitions. They allow us%to go from one state to the next without having a character%consumed. We label such silent transition with the letter%$\epsilon$.\begin{document}\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017}\section*{Handout 3 (Finite Automata)}Every formal language and compiler course I know of bombards you firstwith automata and then to a much, much smaller extend with regularexpressions. As you can see, this course is turned upside down:regular expressions come first. The reason is that regular expressionsare easier to reason about and the notion of derivatives, althoughalready quite old, only became more widely known ratherrecently. Still let us in this lecture have a closer look at automataand their relation to regular expressions. This will help us withunderstanding why the regular expression matchers in Python, Ruby andJava are so slow with certain regular expressions. On the way we willalso see what are the limitations of regular expressions.\subsection*{Deterministic Finite Automata}Lets start\ldots the central definition is:\medskip\noindent A \emph{deterministic finite automaton} (DFA), say $A$, isgiven by a five-tuple written ${\cal A}(\varSigma, Qs, Q_0, F, \delta)$ where\begin{itemize}\item $\varSigma$ is an alphabet, \item $Qs$ is a finite set of states,\item $Q_0 \in Qs$ is the start state,\item $F \subseteq Qs$ are the accepting states, and\item $\delta$ is the transition function.\end{itemize}\noindent The transition function determines how to ``transition''from one state to the next state with respect to a character. We havethe assumption that these transition functions do not need to bedefined everywhere: so it can be the case that given a character thereis no next state, in which case we need to raise a kind of ``failureexception''. That means actually we have \emph{partial} functions astransitions---see the Scala implementation of DFAs later on. Atypical example of a DFA is\begin{center}\begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20},scale=2]\node[state,initial] (Q_0) {$Q_0$};\node[state] (Q_1) [right=of Q_0] {$Q_1$};\node[state] (Q_2) [below right=of Q_0] {$Q_2$};\node[state] (Q_3) [right=of Q_2] {$Q_3$};\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};\path[->] (Q_0) edge node [above] {$a$} (Q_1);\path[->] (Q_1) edge node [above] {$a$} (Q_4);\path[->] (Q_4) edge [loop right] node {$a, b$} ();\path[->] (Q_3) edge node [right] {$a$} (Q_4);\path[->] (Q_2) edge node [above] {$a$} (Q_3);\path[->] (Q_1) edge node [right] {$b$} (Q_2);\path[->] (Q_0) edge node [above] {$b$} (Q_2);\path[->] (Q_2) edge [loop left] node {$b$} ();\path[->] (Q_3) edge [bend left=95, looseness=1.3] node [below] {$b$} (Q_0);\end{tikzpicture}\end{center}\noindent In this graphical notation, the accepting state $Q_4$ isindicated with double circles. Note that there can be more than oneaccepting state. It is also possible that a DFA has no acceptingstate at all, or that the starting state is also an acceptingstate. In the case above the transition function is defined everywhereand can also be given as a table as follows:\[\begin{array}{lcl}(Q_0, a) &\rightarrow& Q_1\\(Q_0, b) &\rightarrow& Q_2\\(Q_1, a) &\rightarrow& Q_4\\(Q_1, b) &\rightarrow& Q_2\\(Q_2, a) &\rightarrow& Q_3\\(Q_2, b) &\rightarrow& Q_2\\(Q_3, a) &\rightarrow& Q_4\\(Q_3, b) &\rightarrow& Q_0\\(Q_4, a) &\rightarrow& Q_4\\(Q_4, b) &\rightarrow& Q_4\\\end{array}\]We need to define the notion of what language is accepted byan automaton. For this we lift the transition function$\delta$ from characters to strings as follows:\[\begin{array}{lcl}\widehat{\delta}(q, []) & \dn & q\\\widehat{\delta}(q, c\!::\!s) & \dn & \widehat{\delta}(\delta(q, c), s)\\\end{array}\]\noindent This lifted transition function is often called\emph{delta-hat}. Given a string, we start in the starting state andtake the first character of the string, follow to the next state, thentake the second character and so on. Once the string is exhausted andwe end up in an accepting state, then this string is accepted by theautomaton. Otherwise it is not accepted. This also means that if alongthe way we hit the case where the transition function $\delta$ is notdefined, we need to raise an error. In our implementation we will dealwith this case elegantly by using Scala's \texttt{Try}. Summing up: astring $s$ is in the \emph{language accepted by the automaton} ${\cal A}(\varSigma, Q, Q_0, F, \delta)$ iff\[\widehat{\delta}(Q_0, s) \in F \]\noindent I let you think about a definition that describesthe set of all strings accepted by an automaton. \begin{figure}[p]\small\lstinputlisting[numbers=left,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}] {../progs/dfa.scala}\caption{A Scala implementation of DFAs using partial functions. Notice some subtleties: \texttt{deltas} implements the delta-hat construction by lifting the transition (partial) function to lists of characters. Since \texttt{delta} is given as a partial function, it can obviously go ``wrong'' in which case the \texttt{Try} in \texttt{accepts} catches the error and returns \texttt{false}---that means the string is not accepted. The example \texttt{delta} in Line 28--38 implements the DFA example shown earlier in the handout.\label{dfa}}\end{figure}My take on an implementation of DFAs in Scala is given inFigure~\ref{dfa}. As you can see, there are many features of themathematical definition that are quite closely reflected in thecode. In the DFA-class, there is a starting state, called\texttt{start}, with the polymorphic type \texttt{A}. There is apartial function \texttt{delta} for specifying the transitions---thesepartial functions take a state (of polymorphic type \texttt{A}) and aninput (of polymorphic type \texttt{C}) and produce a new state (oftype \texttt{A}). For the moment it is OK to assume that \texttt{A} issome arbitrary type for states and the input is just characters. (Thereason for not having concrete types, but polymorphic types for thestates and the input of DFAs will become clearer later on.)The DFA-class has also an argument for specifying final states. In theimplementation it is not a set of states, as in the mathematicaldefinition, but a function from states to booleans (this function issupposed to return true whenever a state is final; falseotherwise). While this boolean function is different from the sets ofstates, Scala allows to use sets for such functions (see Line 40 wherethe DFA is initialised). Again it will become clear later on why I usefunctions for final states, rather than sets.The most important point in the implementation is that I use Scala'spartial functions for representing the transitions; alternatives wouldhave been \texttt{Maps} or even \texttt{Lists}. One of the mainadvantages of using partial functions is that transitions can be quitenicely defined by a series of \texttt{case} statements (see Lines 28-- 38 for an example). If you need to represent an automaton with asink state (catch-all-state), you can use Scala's pattern matching andwrite something like{\small\begin{lstlisting}[language=Scala,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}]abstract class State...case object Sink extends Stateval delta : (State, Char) :=> State = { case (S0, 'a') => S1 case (S1, 'a') => S2 case _ => Sink }\end{lstlisting}} \noindent I let you think what the corresponding DFA looks like in thegraph notation.Finally, I let you ponder whether this is a good implementation ofDFAs or not. In doing so I hope you notice that the $\varSigma$ and$Qs$ components (the alphabet and the set of finite states,respectively) are missing from the class definition. This means thatthe implementation allows you to do some fishy things you are notmeant to do with DFAs. Which fishy things could that be?\subsection*{Non-Deterministic Finite Automata}Remember we want to find out what the relation is between regularexpressions and automata. To do this with DFAs is a bit unwieldy.While with DFAs it is always clear that given a state and a characterwhat the next state is (potentially none), it will be convenient torelax this restriction. That means we allow states to have severalpotential successor states. We even allow more than one startingstate. The resulting construction is called a \emph{Non-Deterministic Finite Automaton} (NFA) given also as a five-tuple ${\cal A}(\varSigma, Qs, Q_{0s}, F, \rho)$ where\begin{itemize}\item $\varSigma$ is an alphabet, \item $Qs$ is a finite set of states\item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)\item $F$ are some accepting states with $F \subseteq Qs$, and\item $\rho$ is a transition relation.\end{itemize}\noindentA typical example of a NFA is% A NFA for (ab* + b)*a\begin{center}\begin{tikzpicture}[>=stealth',very thick, auto, every state/.style={minimum size=0pt,inner sep=3pt, draw=blue!50,very thick,fill=blue!20},scale=2]\node[state,initial] (Q_0) {$Q_0$};\node[state] (Q_1) [right=of Q_0] {$Q_1$};\node[state, accepting] (Q_2) [right=of Q_1] {$Q_2$};\path[->] (Q_0) edge [loop above] node {$b$} ();\path[<-] (Q_0) edge node [below] {$b$} (Q_1);\path[->] (Q_0) edge [bend left] node [above] {$a$} (Q_1);\path[->] (Q_0) edge [bend right] node [below] {$a$} (Q_2);\path[->] (Q_1) edge [loop above] node {$a,b$} ();\path[->] (Q_1) edge node [above] {$a$} (Q_2);\end{tikzpicture}\end{center}\noindentThis NFA happens to have only one starting state, but in general therecould be more. Notice that in state $Q_0$ we might go to state $Q_1$\emph{or} to state $Q_2$ when receiving an $a$. Similarly in state$Q_1$ and receiving an $a$, we can stay in state $Q_1$ \emph{or} go to$Q_2$. This kind of choice is not allowed with DFAs. The downside ofthis choice is that when it comes to deciding whether a string isaccepted by a NFA we potentially have to explore all possibilities. Ilet you think which kind of strings the above NFA accepts.There are a number of additional points you should note aboutNFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is atransition \emph{relation} (DFAs have a transition function). Thedifference between a function and a relation is that a function hasalways a single output, while a relation gives, roughly speaking,several outputs. Look again at the NFA above: if you are currently inthe state $Q_1$ and you read a character $b$, then you can transitionto either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take isnot determined. This non-determinism can be represented by arelation.My implementation of NFAs in Scala is shown in Figure~\ref{nfa}.Perhaps interestingly, I do not actually use relations for my NFAs,but use transition functions that return sets of states. DFAs havepartial transition functions that return a single state; my NFAsreturn a set. I let you think about this representation forNFA-transitions and how it corresponds to the relations used in themathematical definition of NFAs.Like in the mathematical definition, \texttt{starts} is in NFAs a setof states; \texttt{fins} is again a function from states tobooleans. The \texttt{next} function calculates the set of next statesreachable from a single state \texttt{q} by a character~\texttt{c}. Incase there is no such state---the partial transition function isundefined---the empty set is returned (see function\texttt{applyOrElse} in Lines 9 and 10). The function \texttt{nexts}just lifts this function to sets of states.\begin{figure}[p]\small\lstinputlisting[numbers=left,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}] {../progs/nfa.scala}\caption{A Scala implementation of NFAs using partial functions. Notice that the function \texttt{accepts} implements the acceptance of a string in a breath-first search fashion. This can be a costly way of deciding whether a string is accepted or not in applications that need to handle large NFAs and large inputs.\label{nfa}}\end{figure}Look very careful at the \texttt{accepts} and \texttt{deltas}functions in NFAs and remember that when accepting a string by a NFAwe might have to explore all possible transitions (recall which stateto go to is not unique anymore with NFAs\ldots{}we need to explorepotentially all next states). The implementation achieves thisexploration in a \emph{breadth-first search} manner. This is fine forsmall NFAs, but can lead to real memory problems when the NFAs arebigger and larger strings need to be processed. As result, someregular expression matching engines resort to a \emph{depth-first search} with \emph{backtracking} in unsuccessful cases. In ourimplementation we can implement a depth-first version of\texttt{accepts} using Scala's \texttt{exists}-function as follows:{\small\begin{lstlisting}[language=Scala,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}]def search(q: A, s: List[C]) : Boolean = s match { case Nil => fins(q) case c::cs => next(q, c).exists(search(_, cs)) }def accepts2(s: List[C]) : Boolean = starts.exists(search(_, s))\end{lstlisting}}\noindentThis depth-first way of exploration seems to work efficiently in manyexamples and is much less of a strain on memory. The problem is thatthe backtracking can get ``catastrophic'' in some examples---rememberthe catastrophic backtracking from earlier lectures. This depth-firstsearch with backtracking is the reason for the abysmal performance ofsome regular expression matchings in Java, Ruby and Python. I like toshow you this next.\subsubsection*{Thompson Construction}In order to get an idea what calculations are performed by Java \&friends, we need a method for transforming a regular expression intoan automaton. This automaton should accept exactly those strings thatare accepted by the regular expression. The simplest and mostwell-known method for this is called \emph{Thompson Construction},after the Turing Award winner Ken Thompson. This method is byrecursion over regular expressions and uses the non-determinism inNFAs. You will see shortly why this construction works well with NFAs,but is not so straightforward with DFAs. Unfortunately we are stillone step away from our intended target though---because thisconstruction uses a version of NFAs that allows ``silenttransitions''. The idea behind silent transitions is that theyallow us to go from one state to the next without having to consume acharacter. We label such silent transition with the letter$\epsilon$ and call the automata $\epsilon$NFAs. Twotypical examples of $\epsilon$NFAs are\begin{center}\begin{tabular}[t]{c@{\hspace{9mm}}c}\begin{tikzpicture}[>=stealth',very thick, every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]\node[state,initial] (Q_0) {$Q_0$};\node[state] (Q_1) [above=of Q_0] {$Q_1$};\node[state, accepting] (Q_2) [below=of Q_0] {$Q_2$};\path[->] (Q_0) edge node [left] {$\epsilon$} (Q_1);\path[->] (Q_0) edge node [left] {$\epsilon$} (Q_2);\path[->] (Q_0) edge [loop right] node {$a$} ();\path[->] (Q_1) edge [loop right] node {$a$} ();\path[->] (Q_2) edge [loop right] node {$b$} ();\end{tikzpicture} &\raisebox{20mm}{\begin{tikzpicture}[>=stealth',very thick, every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]\node[state,initial] (r_1) {$R_1$};\node[state] (r_2) [above=of r_1] {$R_2$};\node[state, accepting] (r_3) [right=of r_1] {$R_3$};\path[->] (r_1) edge node [below] {$b$} (r_3);\path[->] (r_2) edge [bend left] node [above] {$a$} (r_3);\path[->] (r_1) edge [bend left] node [left] {$\epsilon$} (r_2);\path[->] (r_2) edge [bend left] node [right] {$a$} (r_1);\end{tikzpicture}}\end{tabular}\end{center}\noindentConsider the $\epsilon$NFA on the left-hand side: an$\epsilon$-transition means you do not have to ``consume'' any part ofthe input string, but ``silently'' change to a different state. Forexample, if you are in the starting state $Q_0$, you can silently moveeither to $Q_1$ or $Q_2$. In this example you can see that once youare in $Q_1$, respectively $Q_2$, you cannot ``go back'' to the otherstates.On first appearances, $\epsilon$-transitions might look ratherstrange, or even silly. To start with, silent transitions make thedecision whether a string is accepted by an automaton even harder:with $\epsilon$NFAs we have to look whether we can do first some$\epsilon$-transitions and then do a ``proper''-transition; and afterany ``proper''-transition we again have to check whether we can doagain some silent transitions. Even worse, if there is a silenttransition pointing back to the same state, then we have to be carefulour decision procedure for strings does not loop (remember thedepth-first search for exploring all states).The obvious question is: Do we get anything in return for this hasslewith silent transitions? Well, we still have to work for it\ldotsunfortunately. If we were to follow the many textbooks on thesubject, we would now start with defining what $\epsilon$NFAsare---that would require extending the transition relation ofNFAs. Next, show that the $\epsilon$NFAs are equivalent to NFAs and soon. Once we have done all this on paper, we would need to implement$\epsilon$NFAs. Lets try to take a shortcut---we are not reallyinterested in $\epsilon$NFAs; they are only a convenient tool fortranslating regular expressions into automata. We are not going toimplementing them explicitly, but translate them directly into NFAs(in a sense $\epsilon$NFAs are just a convenient API for lazy people).How does the translation work: well we have to find all transitions ofthe form\[q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow}\;\stackrel{a}{\longrightarrow}\;\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q'\]\noindent and replace them with $q \stackrel{a}{\longrightarrow}q'$. Doing this to the $\epsilon$NFA on the left-hand side above givesthe NFA\begin{center}\begin{tikzpicture}[>=stealth',very thick, every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]\node[state,initial] (r_1) {$R_1$};\node[state] (r_2) [above=of r_1] {$R_2$};\node[state, accepting] (r_3) [right=of r_1] {$R_3$};\path[->] (r_1) edge node [above] {$b$} (r_3);\path[->] (r_2) edge [bend left] node [above] {$a$} (r_3);\path[->] (r_1) edge [bend left] node [left] {$a$} (r_2);\path[->] (r_2) edge [bend left] node [right] {$a$} (r_1);\path[->] (r_1) edge [loop below] node {$a$} ();\path[->] (r_1) edge [bend right] node [below] {$a$} (r_3);\end{tikzpicture}\end{center}\noindent where the single the $\epsilon$-transition is replaced bythree more $a$-transitions. So whenever we are given $\epsilon$NFA wewill, similarly, replace it by an equivalent NFA. The code for this isgiven in Figure~\ref{enfa}. At the core is a function that calculatesa fixpoint of function (Lines 5--10). This is used for ``discovering''states reachable by $\epsilon$-transitions. Once no new state isreachable state, a fixpoint is reached. This is used for example whencalculating the starting states of an equivalent NFA---see Line 36.We starts with all starting states of the $\epsilon$NFA and then lookfor all other states that can be reached by$\epsilon$-transition. This is what the $\epsilon$-closure, called\texttt{ecl}, calculates.\begin{figure}[p]\small\lstinputlisting[numbers=left,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}] {../progs/enfa.scala}\caption{A Scala function that translates $\epsilon$NFA into NFAs. The transtions of $\epsilon$NFA take as input an \texttt{Option[C]}. \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)} for a ``proper'' transition. The functions in Lines 18--26 calculate all states reachable by one or more $\epsilon$-transition for a given set of states. The NFA is constructed in in Lines 36--38.\label{enfa}}\end{figure}Now having a translation of $\epsilon$NFAs to NFAs in place, we canfinally make headway with the problem of translating regularexpressions into equivalent NFAs. By equivalent we mean that the NFAsrecognise the same language. Consider the simple regular expressions$\ZERO$, $\ONE$ and $c$. They can be translated into equivalent NFAsas follows:\begin{center}\begin{tabular}[t]{l@{\hspace{10mm}}l}\raisebox{1mm}{$\ZERO$} & \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node[state, initial] (Q_0) {$\mbox{}$};\end{tikzpicture}\\\\\raisebox{1mm}{$\ONE$} & \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node[state, initial, accepting] (Q_0) {$\mbox{}$};\end{tikzpicture}\\\\\raisebox{2mm}{$c$} & \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node[state, initial] (Q_0) {$\mbox{}$};\node[state, accepting] (Q_1) [right=of Q_0] {$\mbox{}$};\path[->] (Q_0) edge node [below] {$c$} (Q_1);\end{tikzpicture}\\\\\end{tabular}\end{center}\begin{figure}[p]\small\lstinputlisting[numbers=left,linebackgroundcolor= {\ifodd\value{lstnumber}\color{capri!3}\fi}] {../progs/thompson.scala}\caption{A Scala XXX\label{thompson}}\end{figure}\noindent The case for the sequence regular expression $r_1\cdot r_2$ is as follows: We are given by recursion twoautomata representing $r_1$ and $r_2$ respectively. \begin{center}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node[state, initial] (Q_0) {$\mbox{}$};\node (r_1) [right=of Q_0] {$\ldots$};\node[state, accepting] (t_1) [right=of r_1] {$\mbox{}$};\node[state, accepting] (t_2) [above=of t_1] {$\mbox{}$};\node[state, accepting] (t_3) [below=of t_1] {$\mbox{}$};\node[state, initial] (a_0) [right=2.5cm of t_1] {$\mbox{}$};\node (b_1) [right=of a_0] {$\ldots$};\node[state, accepting] (c_1) [right=of b_1] {$\mbox{}$};\node[state, accepting] (c_2) [above=of c_1] {$\mbox{}$};\node[state, accepting] (c_3) [below=of c_1] {$\mbox{}$};\begin{pgfonlayer}{background}\node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (Q_0) (r_1) (t_1) (t_2) (t_3)] {};\node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (a_0) (b_1) (c_1) (c_2) (c_3)] {};\node [yshift=2mm] at (1.north) {$r_1$};\node [yshift=2mm] at (2.north) {$r_2$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent The first automaton has some accepting states. Weobtain an automaton for $r_1\cdot r_2$ by connecting theseaccepting states with $\epsilon$-transitions to the startingstate of the second automaton. By doing so we make themnon-accepting like so:\begin{center}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node[state, initial] (Q_0) {$\mbox{}$};\node (r_1) [right=of Q_0] {$\ldots$};\node[state] (t_1) [right=of r_1] {$\mbox{}$};\node[state] (t_2) [above=of t_1] {$\mbox{}$};\node[state] (t_3) [below=of t_1] {$\mbox{}$};\node[state] (a_0) [right=2.5cm of t_1] {$\mbox{}$};\node (b_1) [right=of a_0] {$\ldots$};\node[state, accepting] (c_1) [right=of b_1] {$\mbox{}$};\node[state, accepting] (c_2) [above=of c_1] {$\mbox{}$};\node[state, accepting] (c_3) [below=of c_1] {$\mbox{}$};\path[->] (t_1) edge node [above, pos=0.3] {$\epsilon$} (a_0);\path[->] (t_2) edge node [above] {$\epsilon$} (a_0);\path[->] (t_3) edge node [below] {$\epsilon$} (a_0);\begin{pgfonlayer}{background}\node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {};\node [yshift=2mm] at (3.north) {$r_1\cdot r_2$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent The case for the choice regular expression $r_1 +r_2$ is slightly different: We are given by recursion twoautomata representing $r_1$ and $r_2$ respectively. \begin{center}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node at (0,0) (1) {$\mbox{}$};\node[state, initial] (2) [above right=16mm of 1] {$\mbox{}$};\node[state, initial] (3) [below right=16mm of 1] {$\mbox{}$};\node (a) [right=of 2] {$\ldots$};\node[state, accepting] (a1) [right=of a] {$\mbox{}$};\node[state, accepting] (a2) [above=of a1] {$\mbox{}$};\node[state, accepting] (a3) [below=of a1] {$\mbox{}$};\node (b) [right=of 3] {$\ldots$};\node[state, accepting] (b1) [right=of b] {$\mbox{}$};\node[state, accepting] (b2) [above=of b1] {$\mbox{}$};\node[state, accepting] (b3) [below=of b1] {$\mbox{}$};\begin{pgfonlayer}{background}\node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (2) (a1) (a2) (a3)] {};\node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (3) (b1) (b2) (b3)] {};\node [yshift=3mm] at (1.north) {$r_1$};\node [yshift=3mm] at (2.north) {$r_2$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent Each automaton has a single start state andpotentially several accepting states. We obtain a NFA for theregular expression $r_1 + r_2$ by introducing a new startingstate and connecting it with an $\epsilon$-transition to thetwo starting states above, like so\begin{center}\hspace{2cm}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node at (0,0) [state, initial] (1) {$\mbox{}$};\node[state] (2) [above right=16mm of 1] {$\mbox{}$};\node[state] (3) [below right=16mm of 1] {$\mbox{}$};\node (a) [right=of 2] {$\ldots$};\node[state, accepting] (a1) [right=of a] {$\mbox{}$};\node[state, accepting] (a2) [above=of a1] {$\mbox{}$};\node[state, accepting] (a3) [below=of a1] {$\mbox{}$};\node (b) [right=of 3] {$\ldots$};\node[state, accepting] (b1) [right=of b] {$\mbox{}$};\node[state, accepting] (b2) [above=of b1] {$\mbox{}$};\node[state, accepting] (b3) [below=of b1] {$\mbox{}$};\path[->] (1) edge node [above] {$\epsilon$} (2);\path[->] (1) edge node [below] {$\epsilon$} (3);\begin{pgfonlayer}{background}\node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3) (b2) (b3)] {};\node [yshift=3mm] at (3.north) {$r_1+ r_2$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent Finally for the $*$-case we have an automaton for $r$\begin{center}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node at (0,0) (1) {$\mbox{}$};\node[state, initial] (2) [right=16mm of 1] {$\mbox{}$};\node (a) [right=of 2] {$\ldots$};\node[state, accepting] (a1) [right=of a] {$\mbox{}$};\node[state, accepting] (a2) [above=of a1] {$\mbox{}$};\node[state, accepting] (a3) [below=of a1] {$\mbox{}$};\begin{pgfonlayer}{background}\node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (2) (a1) (a2) (a3)] {};\node [yshift=3mm] at (1.north) {$r$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent and connect its accepting states to a new startingstate via $\epsilon$-transitions. This new starting state isalso an accepting state, because $r^*$ can recognise theempty string. This gives the following automaton for $r^*$:\begin{center}\begin{tikzpicture}[node distance=3mm, >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]\node at (0,0) [state, initial,accepting] (1) {$\mbox{}$};\node[state] (2) [right=16mm of 1] {$\mbox{}$};\node (a) [right=of 2] {$\ldots$};\node[state] (a1) [right=of a] {$\mbox{}$};\node[state] (a2) [above=of a1] {$\mbox{}$};\node[state] (a3) [below=of a1] {$\mbox{}$};\path[->] (1) edge node [above] {$\epsilon$} (2);\path[->] (a1) edge [bend left=45] node [above] {$\epsilon$} (1);\path[->] (a2) edge [bend right] node [below] {$\epsilon$} (1);\path[->] (a3) edge [bend left=45] node [below] {$\epsilon$} (1);\begin{pgfonlayer}{background}\node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3)] {};\node [yshift=3mm] at (2.north) {$r^*$};\end{pgfonlayer}\end{tikzpicture}\end{center}\noindent This construction of a NFA from a regular expressionwas invented by Ken Thompson in 1968.\subsubsection*{Subset Construction}What is interesting is that for every NFA we can find a DFAwhich recognises the same language. This can, for example, bedone by the \emph{subset construction}. Consider again the NFAbelow on the left, consisting of nodes labelled $0$, $1$ and $2$. \begin{center}\begin{tabular}{c@{\hspace{10mm}}c}\begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=0pt, draw=blue!50,very thick,fill=blue!20}, baseline=0mm]\node[state,initial] (Q_0) {$0$};\node[state] (Q_1) [above=of Q_0] {$1$};\node[state, accepting] (Q_2) [below=of Q_0] {$2$};\path[->] (Q_0) edge node [left] {$\epsilon$} (Q_1);\path[->] (Q_0) edge node [left] {$\epsilon$} (Q_2);\path[->] (Q_0) edge [loop right] node {$a$} ();\path[->] (Q_1) edge [loop above] node {$a$} ();\path[->] (Q_2) edge [loop below] node {$b$} ();\end{tikzpicture}&\begin{tabular}{r|cl}nodes & $a$ & $b$\\\hline$\{\}\phantom{\star}$ & $\{\}$ & $\{\}$\\$\{0\}\phantom{\star}$ & $\{0,1,2\}$ & $\{2\}$\\$\{1\}\phantom{\star}$ & $\{1\}$ & $\{\}$\\$\{2\}\star$ & $\{\}$ & $\{2\}$\\$\{0,1\}\phantom{\star}$ & $\{0,1,2\}$ & $\{2\}$\\$\{0,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\$\{1,2\}\star$ & $\{1\}$ & $\{2\}$\\s: $\{0,1,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\\end{tabular}\end{tabular}\end{center}\noindent The nodes of the DFA are given by calculating allsubsets of the set of nodes of the NFA (seen in the nodescolumn on the right). The table shows the transition functionfor the DFA. The first row states that $\{\}$ is thesink node which has transitions for $a$ and $b$ to itself.The next three lines are calculated as follows: \begin{itemize}\item suppose you calculate the entry for the transition for $a$ and the node $\{0\}$\item start from the node $0$ in the NFA\item do as many $\epsilon$-transition as you can obtaining a set of nodes, in this case $\{0,1,2\}$\item filter out all notes that do not allow an $a$-transition from this set, this excludes $2$ which does not permit a $a$-transition\item from the remaining set, do as many $\epsilon$-transition as you can, this yields again $\{0,1,2\}$ \item the resulting set specifies the transition from $\{0\}$ when given an $a$ \end{itemize}\noindent So the transition from the state $\{0\}$ reading an$a$ goes to the state $\{0,1,2\}$. Similarly for the otherentries in the rows for $\{0\}$, $\{1\}$ and $\{2\}$. Theother rows are calculated by just taking the union of thesingle node entries. For example for $a$ and $\{0,1\}$ we needto take the union of $\{0,1,2\}$ (for $0$) and $\{1\}$ (for$1$). The starting state of the DFA can be calculated from thestarting state of the NFA, that is $0$, and then do as many$\epsilon$-transitions as possible. This gives $\{0,1,2\}$which is the starting state of the DFA. The terminal states inthe DFA are given by all sets that contain a $2$, which is theterminal state of the NFA. This completes the subsetconstruction. So the corresponding DFA to the NFA from above is\begin{center}\begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=0pt, draw=blue!50,very thick,fill=blue!20}, baseline=0mm]\node[state,initial,accepting] (q012) {$0,1,2$};\node[state,accepting] (q02) [right=of q012] {$0,2$};\node[state] (q01) [above=of q02] {$0,1$};\node[state,accepting] (q12) [below=of q02] {$1,2$};\node[state] (q0) [right=2cm of q01] {$0$};\node[state] (q1) [right=2.5cm of q02] {$1$};\node[state,accepting] (q2) [right=1.5cm of q12] {$2$};\node[state] (qn) [right=of q1] {$\{\}$};\path[->] (q012) edge [loop below] node {$a$} ();\path[->] (q012) edge node [above] {$b$} (q2);\path[->] (q12) edge [bend left] node [below,pos=0.4] {$a$} (q1);\path[->] (q12) edge node [below] {$b$} (q2);\path[->] (q02) edge node [above] {$a$} (q012);\path[->] (q02) edge [bend left] node [above, pos=0.8] {$b$} (q2);\path[->] (q01) edge node [below] {$a$} (q012);\path[->] (q01) edge [bend left] node [above] {$b$} (q2);\path[->] (q0) edge node [below] {$a$} (q012);\path[->] (q0) edge node [right, pos=0.2] {$b$} (q2);\path[->] (q1) edge [loop above] node {$a$} ();\path[->] (q1) edge node [above] {$b$} (qn);\path[->] (q2) edge [loop right] node {$b$} ();\path[->] (q2) edge node [below] {$a$} (qn);\path[->] (qn) edge [loop above] node {$a,b$} ();\end{tikzpicture}\end{center}There are two points to note: One is that very often theresulting DFA contains a number of ``dead'' nodes that arenever reachable from the starting state. For examplethere is no way to reach node $\{0,2\}$ from the startingstate $\{0,1,2\}$. I let you find the other dead states.In effect the DFA in this example is not a minimal DFA. Suchdead nodes can be safely removed without changing the languagethat is recognised by the DFA. Another point is that in somecases, however, the subset construction produces a DFA thatdoes \emph{not} contain any dead nodes\ldots{}that means itcalculates a minimal DFA. Which in turn means that in somecases the number of nodes by going from NFAs to DFAsexponentially increases, namely by $2^n$ (which is the numberof subsets you can form for $n$ nodes). Removing all the dead states in the automaton above,gives a much more legible automaton, namely\begin{center}\begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=0pt, draw=blue!50,very thick,fill=blue!20}, baseline=0mm]\node[state,initial,accepting] (q012) {$0,1,2$};\node[state,accepting] (q2) [right=of q012] {$2$};\node[state] (qn) [right=of q2] {$\{\}$};\path[->] (q012) edge [loop below] node {$a$} ();\path[->] (q012) edge node [above] {$b$} (q2);\path[->] (q2) edge [loop below] node {$b$} ();\path[->] (q2) edge node [below] {$a$} (qn);\path[->] (qn) edge [loop above] node {$a,b$} ();\end{tikzpicture}\end{center}\noindent Now the big question is whether this DFAcan recognise the same language as the NFA we started with.I let you ponder about this question.\subsubsection*{Brzozowski's Method}As said before, we can also go into the other direction---fromDFAs to regular expressions. Brzozowski's method calculatesa regular expression using familiar transformations forsolving equational systems. Consider the DFA:\begin{center}\begin{tikzpicture}[scale=1.5,>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20}] \node[state, initial] (q0) at ( 0,1) {$Q_0$}; \node[state] (q1) at ( 1,1) {$Q_1$}; \node[state, accepting] (q2) at ( 2,1) {$Q_2$}; \path[->] (q0) edge[bend left] node[above] {$a$} (q1) (q1) edge[bend left] node[above] {$b$} (q0) (q2) edge[bend left=50] node[below] {$b$} (q0) (q1) edge node[above] {$a$} (q2) (q2) edge [loop right] node {$a$} () (q0) edge [loop below] node {$b$} ();\end{tikzpicture}\end{center}\noindent for which we can set up the following equationalsystem\begin{eqnarray}Q_0 & = & \ONE + Q_0\,b + Q_1\,b + Q_2\,b\\Q_1 & = & Q_0\,a\\Q_2 & = & Q_1\,a + Q_2\,a\end{eqnarray}\noindent There is an equation for each node in the DFA. Letus have a look how the right-hand sides of the equations areconstructed. First have a look at the second equation: theleft-hand side is $Q_1$ and the right-hand side $Q_0\,a$. Theright-hand side is essentially all possible ways how to end upin node $Q_1$. There is only one incoming edge from $Q_0$ consumingan $a$. Therefore the right hand side is thisstate followed by character---in this case $Q_0\,a$. Now letshave a look at the third equation: there are two incomingedges for $Q_2$. Therefore we have two terms, namely $Q_1\,a$ and$Q_2\,a$. These terms are separated by $+$. The first statesthat if in state $Q_1$ consuming an $a$ will bring you to$Q_2$, and the second that being in $Q_2$ and consuming an $a$will make you stay in $Q_2$. The right-hand side of thefirst equation is constructed similarly: there are three incoming edges, therefore there are three terms. There isone exception in that we also ``add'' $\ONE$ to thefirst equation, because it corresponds to the starting statein the DFA.Having constructed the equational system, the question ishow to solve it? Remarkably the rules are very similar tosolving usual linear equational systems. For example thesecond equation does not contain the variable $Q_1$ on theright-hand side of the equation. We can therefore eliminate $Q_1$ from the system by just substituting this equationinto the other two. This gives\begin{eqnarray}Q_0 & = & \ONE + Q_0\,b + Q_0\,a\,b + Q_2\,b\\Q_2 & = & Q_0\,a\,a + Q_2\,a\end{eqnarray}\noindent where in Equation (4) we have two occurrencesof $Q_0$. Like the laws about $+$ and $\cdot$, we can simplify Equation (4) to obtain the following two equations:\begin{eqnarray}Q_0 & = & \ONE + Q_0\,(b + a\,b) + Q_2\,b\\Q_2 & = & Q_0\,a\,a + Q_2\,a\end{eqnarray}\noindent Unfortunately we cannot make any more progress withsubstituting equations, because both (6) and (7) contain thevariable on the left-hand side also on the right-hand side.Here we need to now use a law that is different from the usuallaws about linear equations. It is called \emph{Arden's rule}.It states that if an equation is of the form $q = q\,r + s$then it can be transformed to $q = s\, r^*$. Since we canassume $+$ is symmetric, Equation (7) is of that form: $s$ is$Q_0\,a\,a$ and $r$ is $a$. That means we can transform(7) to obtain the two new equations\begin{eqnarray}Q_0 & = & \ONE + Q_0\,(b + a\,b) + Q_2\,b\\Q_2 & = & Q_0\,a\,a\,(a^*)\end{eqnarray}\noindent Now again we can substitute the second equation intothe first in order to eliminate the variable $Q_2$.\begin{eqnarray}Q_0 & = & \ONE + Q_0\,(b + a\,b) + Q_0\,a\,a\,(a^*)\,b\end{eqnarray}\noindent Pulling $Q_0$ out as a single factor gives:\begin{eqnarray}Q_0 & = & \ONE + Q_0\,(b + a\,b + a\,a\,(a^*)\,b)\end{eqnarray}\noindent This equation is again of the form so that we canapply Arden's rule ($r$ is $b + a\,b + a\,a\,(a^*)\,b$ and $s$is $\ONE$). This gives as solution for $Q_0$ the followingregular expression:\begin{eqnarray}Q_0 & = & \ONE\,(b + a\,b + a\,a\,(a^*)\,b)^*\end{eqnarray}\noindent Since this is a regular expression, we can simplifyaway the $\ONE$ to obtain the slightly simpler regularexpression\begin{eqnarray}Q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\end{eqnarray}\noindent Now we can unwind this process and obtain the solutionsfor the other equations. This gives:\begin{eqnarray}Q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\\Q_1 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\\Q_2 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*\end{eqnarray}\noindent Finally, we only need to ``add'' up the equationswhich correspond to a terminal state. In our running example,this is just $Q_2$. Consequently, a regular expressionthat recognises the same language as the automaton is\[(b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*\]\noindent You can somewhat crosscheck your solutionby taking a string the regular expression can match andand see whether it can be matched by the automaton.One string for example is $aaa$ and \emph{voila} this string is also matched by the automaton.We should prove that Brzozowski's method really producesan equivalent regular expression for the automaton. Butfor the purposes of this module, we omit this.\subsubsection*{Automata Minimization}As seen in the subset construction, the translation of a NFA to a DFA can result in a rather ``inefficient'' DFA. Meaning there are states that are not needed. ADFA can be \emph{minimised} by the following algorithm:\begin{enumerate}\item Take all pairs $(q, p)$ with $q \not= p$\item Mark all pairs that accepting and non-accepting states\item For all unmarked pairs $(q, p)$ and all characters $c$ test whether \begin{center} $(\delta(q, c), \delta(p,c))$ \end{center} are marked. If there is one, then also mark $(q, p)$.\item Repeat last step until no change.\item All unmarked pairs can be merged.\end{enumerate}\noindent To illustrate this algorithm, consider the following DFA.\begin{center}\begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20}]\node[state,initial] (Q_0) {$Q_0$};\node[state] (Q_1) [right=of Q_0] {$Q_1$};\node[state] (Q_2) [below right=of Q_0] {$Q_2$};\node[state] (Q_3) [right=of Q_2] {$Q_3$};\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};\path[->] (Q_0) edge node [above] {$a$} (Q_1);\path[->] (Q_1) edge node [above] {$a$} (Q_4);\path[->] (Q_4) edge [loop right] node {$a, b$} ();\path[->] (Q_3) edge node [right] {$a$} (Q_4);\path[->] (Q_2) edge node [above] {$a$} (Q_3);\path[->] (Q_1) edge node [right] {$b$} (Q_2);\path[->] (Q_0) edge node [above] {$b$} (Q_2);\path[->] (Q_2) edge [loop left] node {$b$} ();\path[->] (Q_3) edge [bend left=95, looseness=1.3] node [below] {$b$} (Q_0);\end{tikzpicture}\end{center}\noindent In Step 1 and 2 we consider essentially a triangleof the form\begin{center}\begin{tikzpicture}[scale=0.6,line width=0.8mm]\draw (0,0) -- (4,0);\draw (0,1) -- (4,1);\draw (0,2) -- (3,2);\draw (0,3) -- (2,3);\draw (0,4) -- (1,4);\draw (0,0) -- (0, 4);\draw (1,0) -- (1, 4);\draw (2,0) -- (2, 3);\draw (3,0) -- (3, 2);\draw (4,0) -- (4, 1);\draw (0.5,-0.5) node {$Q_0$}; \draw (1.5,-0.5) node {$Q_1$}; \draw (2.5,-0.5) node {$Q_2$}; \draw (3.5,-0.5) node {$Q_3$};\draw (-0.5, 3.5) node {$Q_1$}; \draw (-0.5, 2.5) node {$Q_2$}; \draw (-0.5, 1.5) node {$Q_3$}; \draw (-0.5, 0.5) node {$Q_4$}; \draw (0.5,0.5) node {\large$\star$}; \draw (1.5,0.5) node {\large$\star$}; \draw (2.5,0.5) node {\large$\star$}; \draw (3.5,0.5) node {\large$\star$};\end{tikzpicture}\end{center}\noindent where the lower row is filled with stars, because inthe corresponding pairs there is always one state that isaccepting ($Q_4$) and a state that is non-accepting (the otherstates).Now in Step 3 we need to fill in more stars according whether one of the next-state pairs are marked. We have to do this for every unmarked field until there is no change anymore.This gives the triangle\begin{center}\begin{tikzpicture}[scale=0.6,line width=0.8mm]\draw (0,0) -- (4,0);\draw (0,1) -- (4,1);\draw (0,2) -- (3,2);\draw (0,3) -- (2,3);\draw (0,4) -- (1,4);\draw (0,0) -- (0, 4);\draw (1,0) -- (1, 4);\draw (2,0) -- (2, 3);\draw (3,0) -- (3, 2);\draw (4,0) -- (4, 1);\draw (0.5,-0.5) node {$Q_0$}; \draw (1.5,-0.5) node {$Q_1$}; \draw (2.5,-0.5) node {$Q_2$}; \draw (3.5,-0.5) node {$Q_3$};\draw (-0.5, 3.5) node {$Q_1$}; \draw (-0.5, 2.5) node {$Q_2$}; \draw (-0.5, 1.5) node {$Q_3$}; \draw (-0.5, 0.5) node {$Q_4$}; \draw (0.5,0.5) node {\large$\star$}; \draw (1.5,0.5) node {\large$\star$}; \draw (2.5,0.5) node {\large$\star$}; \draw (3.5,0.5) node {\large$\star$};\draw (0.5,1.5) node {\large$\star$}; \draw (2.5,1.5) node {\large$\star$}; \draw (0.5,3.5) node {\large$\star$}; \draw (1.5,2.5) node {\large$\star$}; \end{tikzpicture}\end{center}\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$and $Q_3$ can be merged. This gives the following minimal DFA\begin{center}\begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20}]\node[state,initial] (Q_02) {$Q_{0, 2}$};\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};\node[state, accepting] (Q_4) [right=of Q_13] {$Q_{4\phantom{,0}}$};\path[->] (Q_02) edge [bend left] node [above] {$a$} (Q_13);\path[->] (Q_13) edge [bend left] node [below] {$b$} (Q_02);\path[->] (Q_02) edge [loop below] node {$b$} ();\path[->] (Q_13) edge node [above] {$a$} (Q_4);\path[->] (Q_4) edge [loop above] node {$a, b$} ();\end{tikzpicture}\end{center}\subsubsection*{Regular Languages}Given the constructions in the previous sections we obtain the following overall picture:\begin{center}\begin{tikzpicture}\node (rexp) {\bf Regexps};\node (nfa) [right=of rexp] {\bf NFAs};\node (dfa) [right=of nfa] {\bf DFAs};\node (mdfa) [right=of dfa] {\bf\begin{tabular}{c}minimal\\ DFAs\end{tabular}};\path[->,line width=1mm] (rexp) edge node [above=4mm, black] {\begin{tabular}{c@{\hspace{9mm}}}Thompson's\\[-1mm] construction\end{tabular}} (nfa);\path[->,line width=1mm] (nfa) edge node [above=4mm, black] {\begin{tabular}{c}subset\\[-1mm] construction\end{tabular}}(dfa);\path[->,line width=1mm] (dfa) edge node [below=5mm, black] {minimisation} (mdfa);\path[->,line width=1mm] (dfa) edge [bend left=45] node [below] {\begin{tabular}{l}Brzozowski's\\ method\end{tabular}} (rexp);\end{tikzpicture}\end{center}\noindent By going from regular expressions over NFAs to DFAs,we can always ensure that for every regular expression thereexists a NFA and a DFA that can recognise the same language.Although we did not prove this fact. Similarly by going fromDFAs to regular expressions, we can make sure for every DFA there exists a regular expression that can recognise the samelanguage. Again we did not prove this fact. The interesting conclusion is that automata and regular expressions can recognise the same set of languages:\begin{quote} A language is \emph{regular} iff there exists aregular expression that recognises all its strings.\end{quote}\noindent or equivalently \begin{quote} A language is \emph{regular} iff there exists anautomaton that recognises all its strings.\end{quote}\noindent So for deciding whether a string is recognised by aregular expression, we could use our algorithm based onderivatives or NFAs or DFAs. But let us quickly look at whatthe differences mean in computational terms. Translating aregular expression into a NFA gives us an automaton that has$O(n)$ nodes---that means the size of the NFA grows linearlywith the size of the regular expression. The problem with NFAsis that the problem of deciding whether a string is acceptedor not is computationally not cheap. Remember with NFAs wehave potentially many next states even for the same input andalso have the silent $\epsilon$-transitions. If we want tofind a path from the starting state of a NFA to an acceptingstate, we need to consider all possibilities. In Ruby andPython this is done by a depth-first search, which in turnmeans that if a ``wrong'' choice is made, the algorithm has tobacktrack and thus explore all potential candidates. This isexactly the reason why Ruby and Python are so slow for evilregular expressions. An alternative to the potentially slowdepth-first search is to explore the search space in abreadth-first fashion, but this might incur a big memorypenalty. To avoid the problems with NFAs, we can translate them into DFAs. With DFAs the problem of deciding whether astring is recognised or not is much simpler, because ineach state it is completely determined what the nextstate will be for a given input. So no search is needed.The problem with this is that the translation to DFAscan explode exponentially the number of states. Therefore when this route is taken, we definitely need to minimise theresulting DFAs in order to have an acceptable memory and runtime behaviour. But remember the subset constructionin the worst case explodes the number of states by $2^n$.Effectively also the translation to DFAs can incur a bigruntime penalty.But this does not mean that everything is bad with automata.Recall the problem of finding a regular expressions for thelanguage that is \emph{not} recognised by a regularexpression. In our implementation we added explicitly such aregular expressions because they are useful for recognisingcomments. But in principle we did not need to. The argumentfor this is as follows: take a regular expression, translateit into a NFA and then a DFA that both recognise the samelanguage. Once you have the DFA it is very easy to constructthe automaton for the language not recognised by a DFA. Ifthe DFA is completed (this is important!), then you just needto exchange the accepting and non-accepting states. You canthen translate this DFA back into a regular expression andthat will be the regular expression that can match all stringsthe original regular expression could \emph{not} match.It is also interesting that not all languages are regular. Themost well-known example of a language that is not regularconsists of all the strings of the form\[a^n\,b^n\]\noindent meaning strings that have the same number of $a$sand $b$s. You can try, but you cannot find a regularexpression for this language and also not an automaton. Onecan actually prove that there is no regular expression norautomaton for this language, but again that would lead us toofar afield for what we want to do in this module.\section*{Further Reading}Compare what a ``human expert'' would create as an automaton for theregular expression $a (b + c)^*$ and what the Thomsonalgorithm generates.%http://www.inf.ed.ac.uk/teaching/courses/ct/\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: