afl-material: comparison handouts/ho03.tex

-:31e89128ccd2
+:434891622131
 \noindent Unfortunately we cannot make any more progress with
 substituting equations, because in both (6) and (7) the
 variable on the left-hand side also occurs on the right-hand
 side. Here we now need to use a law that is different from the usual
-laws. It is called \emph{Arden's rule}. It states that
+laws about linear equations. It is called \emph{Arden's rule}.
-if an equation is of the form $q = q\,r + s$ then it can be
+It states that if an equation is of the form $q = q\,r + s$
-transformed to $q = s\, r^*$. Since we can assume $+$ is
+then it can be transformed to $q = s\, r^*$. Since we can
-symmetric, equation (7) is of that form: $s$ is $q_0\,a\,a$
+assume $+$ is symmetric, Equation (7) is of that form: $s$ is
-and $r$ is $a$. That means we can transform Equation (7)
+$q_0\,a\,a$ and $r$ is $a$. That means we can transform
-to obtain the two new equations
+(7) to obtain the two new equations
 \begin{eqnarray}
 q_0 & = & \epsilon + q_0\,(b + a\,b) +  q_2\,b\\
 q_2 & = & q_0\,a\,a\,(a^*)
 \end{eqnarray}
 \begin{eqnarray}
 q_0 & = & \epsilon\,(b + a\,b + a\,a\,(a^*)\,b)^*
 \end{eqnarray}
-\noindent SInce this is a regular expression, we can simplify
+\noindent Since this is a regular expression, we can simplify
 away the $\epsilon$ to obtain the slightly simpler regular
 expression
 \begin{eqnarray}
 q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*
 \end{center}
 \subsubsection*{Regular Languages}
 Given the constructions in the previous sections we obtain
-the following picture:
+the following overall picture:
 \begin{center}
 \begin{tikzpicture}
 \node (rexp)  {\bf Regexps};
 \node (nfa) [right=of rexp] {\bf NFAs};
 \end{tikzpicture}
 \end{center}
 \noindent By going from regular expressions via NFAs to DFAs,
 we can always ensure that for every regular expression there
-exists a NFA and DFA that can recognise the same language.
+exists an NFA and a DFA that can recognise the same language,
 although we did not prove this fact. Similarly, by going from
 DFAs to regular expressions, we can make sure for every DFA
 there exists a regular expression that can recognise the same
 language. Again we did not prove this fact.
 the differences mean in computational terms. Translating a
 regular expression into an NFA gives us an automaton that has
 $O(n)$ nodes---that means the size of the NFA grows linearly
 with the size of the regular expression. The problem with NFAs
 is that the problem of deciding whether a string is accepted
-or not is computationally not cheap. Remember with NFAs we have
+or not is computationally not cheap. Remember with NFAs we
-potentially many next states even for the same input and also
+have potentially many next states even for the same input and
-have the silent $\epsilon$-transitions. If we want to find a
+also have the silent $\epsilon$-transitions. If we want to
-path from the starting state of an NFA to an accepting state,
+find a path from the starting state of an NFA to an accepting
-we need to consider all possibilities. In Ruby and Python this
+state, we need to consider all possibilities. In Ruby and
-is done by a depth-first search, which in turn means that if a
+Python this is done by a depth-first search, which in turn
-``wrong'' choice is made, the algorithm has to backtrack and
+means that if a ``wrong'' choice is made, the algorithm has to
-thus explore all potential candidates. This is exactly the
+backtrack and thus explore all potential candidates. This is
-reason why Ruby and Python are so slow for evil regular
+exactly the reason why Ruby and Python are so slow for evil
-expressions. The alternative is to explore the search space
+regular expressions. An alternative to the potentially slow
-in a breadth-first fashion, but this might incur a big memory
+depth-first search is to explore the search space in a
+breadth-first fashion, but this might incur a big memory
 penalty.
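\noindent To make the breadth-first idea concrete, here is a minimal Python sketch that tracks the set of all states the NFA could currently be in, instead of following one path at a time. The dictionary encoding of the transitions, the use of None for $\epsilon$-transitions and the function names are our own assumptions for illustration; they are not taken from any particular regular expression library.

```python
# Hypothetical NFA encoding (an assumption, not from the handout):
# a dict mapping (state, symbol) to a set of successor states;
# the symbol None stands for a silent epsilon-transition.

def eps_closure(delta, states):
    """All states reachable from `states` via epsilon-transitions only."""
    closure = set(states)
    todo = list(states)
    while todo:
        q = todo.pop()
        for p in delta.get((q, None), set()):
            if p not in closure:
                closure.add(p)
                todo.append(p)
    return closure

def nfa_accepts(delta, start, accepting, string):
    """Breadth-first simulation: advance all possible states together."""
    current = eps_closure(delta, {start})
    for c in string:
        step = set()
        for q in current:
            step |= delta.get((q, c), set())
        current = eps_closure(delta, step)
    return bool(current & accepting)

# Example NFA over {a, b} accepting the strings that end in "ab".
delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
print(nfa_accepts(delta, 0, {2}, "aab"))  # True
print(nfa_accepts(delta, 0, {2}, "aba"))  # False
```

Because all candidate states are advanced together, no backtracking is needed; the price is that a whole set of states has to be kept around at every step.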
 To avoid the problems with NFAs, we can translate them
 into DFAs. With DFAs the problem of deciding whether a
 string is recognised or not is much simpler, because in
 can explode exponentially the number of states. Therefore when
 this route is taken, we definitely need to minimise the
 resulting DFAs in order to have an acceptable memory
 and runtime behaviour. But remember the subset construction
 in the worst case blows up the number of states to $2^n$.
+In effect, the translation to DFAs can therefore also incur a
+big runtime penalty.
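\noindent The blow-up can be observed with a small Python sketch of the subset construction. The encoding (a dict from (state, symbol) to a set of states, no $\epsilon$-transitions) and all function names are assumptions made for this illustration. The example NFA recognises the strings over $a$ and $b$ whose $k$-th symbol from the end is an $a$; it has $k + 1$ states, while the DFA produced here has $2^k$ reachable states.

```python
def subset_construction(delta, start, accepting, alphabet):
    """Build a DFA whose states are frozensets of NFA states.

    Assumes an NFA without epsilon-transitions.
    """
    dfa_start = frozenset({start})
    dfa_delta = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        S = todo.pop()
        for c in alphabet:
            # Successor subset: everything reachable from S on symbol c.
            T = frozenset(p for q in S for p in delta.get((q, c), set()))
            dfa_delta[(S, c)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_accepting = {S for S in seen if S & accepting}
    return seen, dfa_delta, dfa_start, dfa_accepting

def kth_from_end_nfa(k):
    """Hypothetical example NFA: the k-th symbol from the end is an 'a'."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {k}

# Prints 4, 8 and 16 DFA states for NFAs with 3, 4 and 5 states.
for k in (2, 3, 4):
    delta, start, accepting = kth_from_end_nfa(k)
    states, _, _, _ = subset_construction(delta, start, accepting, "ab")
    print(k + 1, "NFA states ->", len(states), "DFA states")
```

Only the subsets that are actually reachable are generated, so for many NFAs the result is far smaller than $2^n$; the example is chosen to show that exponential growth can nevertheless occur.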
 But this does not mean that everything is bad with automata.
 Recall the problem of finding a regular expression for the
 language that is \emph{not} recognised by a regular
 expression. In our implementation we explicitly added such
 regular expressions because they are useful for recognising
 comments. But in principle we did not need to. The argument
 for this is as follows: take a regular expression, translate
-it into a NFA and DFA that recognise the same language. Once
+it into an NFA and then a DFA that both recognise the same
-you have the DFA it is very easy to construct the automaton
+language. Once you have the DFA it is very easy to construct
-for the language not recognised by an DFA. If the DFA is
+the automaton for the language not recognised by the DFA. If
-completed (this is important!), then you just need to exchange
+the DFA is completed (this is important!), then you just need
-the accepting and non-accepting states. You can then translate
+to exchange the accepting and non-accepting states. You can
-this DFA back into a regular expression.
+then translate this DFA back into a regular expression and
+that will be the regular expression that can match all strings
-Not all languages are regular. The most well-known example
+the original regular expression could \emph{not} match.
-of a language that is not regular consists of all the strings
-of the form
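\noindent The role of completion in this argument can be seen in a short Python sketch. The DFA encoding (a dict from (state, symbol) to a single state), the name of the added sink state and the function names are assumptions for illustration only. The example DFA recognises one or more $a$s and has no transitions for $b$, so it first has to be completed before the accepting states are exchanged.

```python
def complete(states, delta, alphabet, sink="sink"):
    """Add a non-accepting sink so every (state, symbol) has a successor.

    Assumes the (hypothetical) name `sink` is not already used as a state.
    """
    states = set(states) | {sink}
    total = dict(delta)
    for q in states:
        for c in alphabet:
            total.setdefault((q, c), sink)
    return states, total

def complement(states, delta, start, accepting, alphabet):
    """Complete the DFA, then swap accepting and non-accepting states."""
    states, total = complete(states, delta, alphabet)
    return states, total, start, states - set(accepting)

def dfa_accepts(delta, start, accepting, string):
    """Run a complete DFA on the string."""
    q = start
    for c in string:
        q = delta[(q, c)]
    return q in accepting

# Incomplete DFA over {a, b} for "one or more a's".
states, delta, start, accepting = {0, 1}, {(0, "a"): 1, (1, "a"): 1}, 0, {1}
_, cdelta, cstart, caccepting = complement(states, delta, start, accepting, "ab")

print(dfa_accepts(cdelta, cstart, caccepting, "aa"))  # False
print(dfa_accepts(cdelta, cstart, caccepting, "ab"))  # True
print(dfa_accepts(cdelta, cstart, caccepting, ""))    # True
```

Without the sink state, the string $ab$ would get stuck in the original automaton and would be rejected both before and after the exchange, which is why the completion step matters.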
+It is also interesting that not all languages are regular. The
+most well-known example of a language that is not regular
+consists of all the strings of the form
 \[a^n\,b^n\]
 \noindent meaning strings that consist of some number of $a$s
 followed by the same number of $b$s. You can try, but you cannot find a regular

changeset 349	434891622131
parent 344	408fd5994288
child 444	3056a4c071b0