handouts/ho02.tex
changeset 566 b153c04834eb
parent 550 71fc4a7a7039
child 571 499007a7bce2
equal deleted inserted replaced
565:2be8c4c77418 566:b153c04834eb
   276 $(r_1 \cdot r_2)^*$ & $\equiv$ & $1 + r_1\cdot (r_2 \cdot r_1)^* \cdot r_2$\\
   276 $(r_1 \cdot r_2)^*$ & $\equiv$ & $1 + r_1\cdot (r_2 \cdot r_1)^* \cdot r_2$\\
   277 \end{tabular}
   277 \end{tabular}
   278 \end{center}
   278 \end{center}
   279 
   279 
   280 \noindent
   280 \noindent
   281 We will not use them in our algorithm, but feel free to convince you
   281 We will not use them in our algorithm, but feel free to convince yourself
   282 that they hold. As an aside, there has been a lot of research about
   282 that they hold. As an aside, there has been a lot of research about
   283 questions like: Can one always decide when two regular expressions are
   283 questions like: Can one always decide when two regular expressions are
   284 equivalent or not? What does an algorithm look like to decide this
   284 equivalent or not? What does an algorithm look like to decide this
   285 efficiently? So in general it is not a trivial problem.
   285 efficiently? So in general it is not a trivial problem.
   286 
   286 
   476 \noindent holds, which means our algorithm satisfies the
   476 \noindent holds, which means our algorithm satisfies the
   477 specification. Of course we can claim many things\ldots
   477 specification. Of course we can claim many things\ldots
   478 whether the claim holds any water is a different question,
   478 whether the claim holds any water is a different question,
   479 which for example is the point of the Strand-2 Coursework.
   479 which for example is the point of the Strand-2 Coursework.
   480 
   480 
   481 This algorithm was introduced by Janus Brzozowski in 1964, but 
   481 This algorithm was introduced by Janusz Brzozowski in 1964, but 
   482 is more widely known only in the last 10 or so years. Its
   482 is more widely known only in the last 10 or so years. Its
   483 main attractions are simplicity and being fast, as well as
   483 main attractions are simplicity and being fast, as well as
   484 being easily extendable for other regular expressions such as
   484 being easily extendible for other regular expressions such as
   485 $r^{\{n\}}$, $r^?$, $\sim{}r$ and so on (this is subject of
   485 $r^{\{n\}}$, $r^?$, $\sim{}r$ and so on (this is subject of
   486 Strand-1 Coursework 1). 
   486 Strand-1 Coursework 1). 
   487 
   487 
   488 \subsection*{The Matching Algorithm in Scala}
   488 \subsection*{The Matching Algorithm in Scala}
   489 
   489 
   541 %This is not an error: it hardly takes more than half a second for
   541 %This is not an error: it hardly takes more than half a second for
   542 %strings up to the length of 6500. After that we receive a
   542 %strings up to the length of 6500. After that we receive a
   543 %StackOverflow exception, but still\ldots
   543 %StackOverflow exception, but still\ldots
   544 
   544 
   545 For running the algorithm with our first example, the evil
   545 For running the algorithm with our first example, the evil
   546 regular expression $a^?{}^{\{n\}}a^{\{n\}}$, we need to implement
   546 regular expression $a^?{}^{\{n\}}\cdot a^{\{n\}}$, we need to implement
   547 the optional regular expression and the `exactly $n$-times
   547 the optional regular expression and the `exactly $n$-times
   548 regular expression'. This can be done with the translations
   548 regular expression'. This can be done with the translations
   549 
   549 
   550 \lstinputlisting[numbers=none]{../progs/app51.scala}
   550 \lstinputlisting[numbers=none]{../progs/app51.scala}
   551 
   551 
   695 \caption{The simplification function and modified 
   695 \caption{The simplification function and modified 
   696 \texttt{ders}-function; this function now
   696 \texttt{ders}-function; this function now
   697 calls \texttt{der} first, but then simplifies
   697 calls \texttt{der} first, but then simplifies
   698 the resulting derivative regular expressions before
   698 the resulting derivative regular expressions before
   699 building the next derivative, see
   699 building the next derivative, see
   700 Line~\ref{simpline}.\label{scala2}}
   700 Line~24.\label{scala2}}
   701 \end{figure}
   701 \end{figure}
   702 
   702 
   703 \begin{center}
   703 \begin{center}
   704 \begin{tikzpicture}
   704 \begin{tikzpicture}
   705 \begin{axis}[
   705 \begin{axis}[
   727 
   727 
   728 \noindent
   728 \noindent
   729 To recap, Python and Ruby needed approximately 30 seconds to match a
   729 To recap, Python and Ruby needed approximately 30 seconds to match a
   730 string of 28 \texttt{a}s and the regular expression $a^{?\{n\}} \cdot
   730 string of 28 \texttt{a}s and the regular expression $a^{?\{n\}} \cdot
   731 a^{\{n\}}$.  We need a third of this time to do the same with strings
   731 a^{\{n\}}$.  We need a third of this time to do the same with strings
   732 up to 11,000 \texttt{a}s.  Similarly, Java and Python needed 30
   732 up to 11,000 \texttt{a}s.  Similarly, Java 8 and Python needed 30
   733 seconds to find out the regular expression $(a^*)^* \cdot b$ does not
   733 seconds to find out the regular expression $(a^*)^* \cdot b$ does not
   734 match the string of 28 \texttt{a}s. We can do the same in the same amount of time
   734 match the string of 28 \texttt{a}s. In Java 9 and later this has been 
   735 for strings composed of nearly 6,000,000 \texttt{a}s:
   735 cranked up to 39,000 \texttt{a}s, but we can do the same in the same 
       
   736 amount of time for strings composed of nearly 6,000,000 \texttt{a}s:
   736 
   737 
   737 
   738 
   738 \begin{center}
   739 \begin{center}
   739 \begin{tikzpicture}
   740 \begin{tikzpicture}
   740 \begin{axis}[
   741 \begin{axis}[
  1044 meets its specification. To have done so, requires a few induction
  1045 meets its specification. To have done so, requires a few induction
  1045 proofs about strings and regular expressions. Following the \emph{induction
  1046 proofs about strings and regular expressions. Following the \emph{induction
  1046 recipes} is already a big step in actually performing these proofs.
  1047 recipes} is already a big step in actually performing these proofs.
  1047 If you do not believe it, proofs have helped me to make sure my code
  1048 If you do not believe it, proofs have helped me to make sure my code
  1048 is correct and in several instances prevented me of letting slip
  1049 is correct and in several instances prevented me of letting slip
  1049 embarrassing mistakes into the `wild'.
  1050 embarrassing mistakes into the `wild'. 
  1050 
  1051 
  1051 \end{document}
  1052 \end{document}
  1052 
  1053 
  1053 
  1054 
  1054 
  1055 
  1055 
  1056 % !TeX program = latexmk -xelatex
  1056 %%% Local Variables: 
  1057 %%% Local Variables: 
  1057 %%% mode: latex
  1058 %%% mode: latex
  1058 %%% TeX-master: t
  1059 %%% TeX-master: t
  1059 %%% End: 
  1060 %%% End: