diff -r fa6d00968b0e -r bddf14e026b3 handouts/ho01.tex --- a/handouts/ho01.tex Tue Sep 25 21:57:14 2018 +0100 +++ b/handouts/ho01.tex Fri Sep 28 13:54:18 2018 +0100 @@ -248,7 +248,7 @@ \end{tabular} \end{center} -\noindent This first graph shows that Python and Java need +\noindent This first graph shows that Python and Java 8 need approximately 30 seconds to find out that the regular expression $\texttt{(a*)*\,b}$ does not match strings of 28 \texttt{a}s. Similarly, the second shows that Python needs approximately 29 seconds @@ -282,6 +282,7 @@ \url{http://davidvgalbraith.com/how-i-fixed-atom/} \end{center} +\noindent and when somebody tried to match web-addresses using regular expressions @@ -289,11 +290,11 @@ \url{https://www.tutorialdocs.com/article/regex-trap.html} \end{center} -\noindent + Such troublesome regular expressions are sometimes called \emph{evil regular expressions} because they have the potential to make regular expression matching engines to topple over, like in Python, Ruby and -Java. This ``toppling over'' is also sometimes called +Java 8. This ``toppling over'' is also sometimes called \emph{catastrophic backtracking}. I have also seen the term \emph{eternal matching} used for this. The problem with evil regular expressions is that they can have some serious consequences, for @@ -339,7 +340,7 @@ \end{center} \noindent And in the case of the regular expression $\texttt{(a*)*\,b}$ -and strings of \texttt{a}s we will beat Java by factor of +and strings of \texttt{a}s we will beat Java 8 by factor of approximately 1,000,000 (note the scale on the $x$-axis). \begin{center} @@ -571,7 +572,7 @@ \noindent That means two regular expressions are said to be equivalent if they match the same set of strings. That is -there meaning is the same. Therefore we +their meanings is the same. Therefore we do not really distinguish between the different regular expression $(r_1 + r_2) + r_3$ and $r_1 + (r_2 + r_3)$, because they are equivalent. I leave you to the question