246 \end{axis} |
246 \end{axis} |
247 \end{tikzpicture} |
247 \end{tikzpicture} |
248 \end{tabular} |
248 \end{tabular} |
249 \end{center} |
249 \end{center} |
250 |
250 |
251 \noindent This first graph shows that Python and Java need |
251 \noindent This first graph shows that Python and Java 8 need |
252 approximately 30 seconds to find out that the regular expression |
252 approximately 30 seconds to find out that the regular expression |
253 $\texttt{(a*)*\,b}$ does not match strings of 28 \texttt{a}s. |
253 $\texttt{(a*)*\,b}$ does not match strings of 28 \texttt{a}s. |
254 Similarly, the second shows that Python needs approximately 29 seconds |
254 Similarly, the second shows that Python needs approximately 29 seconds |
255 for finding out whether a string of 28 \texttt{a}s matches the regular |
255 for finding out whether a string of 28 \texttt{a}s matches the regular |
256 expression \texttt{a?\{28\}\,a\{28\}}. Ruby is even slightly |
256 expression \texttt{a?\{28\}\,a\{28\}}. Ruby is even slightly |
280 |
280 |
281 \begin{center} |
281 \begin{center} |
282 \url{http://davidvgalbraith.com/how-i-fixed-atom/} |
282 \url{http://davidvgalbraith.com/how-i-fixed-atom/} |
283 \end{center} |
283 \end{center} |
284 |
284 |
|
285 \noindent |
285 and when somebody tried to match web-addresses using regular |
286 and when somebody tried to match web-addresses using regular |
286 expressions |
287 expressions |
287 |
288 |
288 \begin{center} |
289 \begin{center} |
289 \url{https://www.tutorialdocs.com/article/regex-trap.html} |
290 \url{https://www.tutorialdocs.com/article/regex-trap.html} |
290 \end{center} |
291 \end{center} |
291 |
292 |
292 \noindent |
293 |
293 Such troublesome regular expressions are sometimes called \emph{evil |
294 Such troublesome regular expressions are sometimes called \emph{evil |
294 regular expressions} because they have the potential to make regular |
295 regular expressions} because they have the potential to make regular |
295 expression matching engines to topple over, like in Python, Ruby and |
296 expression matching engines to topple over, like in Python, Ruby and |
296 Java. This ``toppling over'' is also sometimes called |
297 Java 8. This ``toppling over'' is also sometimes called |
297 \emph{catastrophic backtracking}. I have also seen the term |
298 \emph{catastrophic backtracking}. I have also seen the term |
298 \emph{eternal matching} used for this. The problem with evil regular |
299 \emph{eternal matching} used for this. The problem with evil regular |
299 expressions is that they can have some serious consequences, for |
300 expressions is that they can have some serious consequences, for |
300 example, if you use them in your web-application. The reason is that |
301 example, if you use them in your web-application. The reason is that |
301 hackers can look for these instances where the matching engine behaves |
302 hackers can look for these instances where the matching engine behaves |
569 r_1 \equiv r_2 \;\dn\; L(r_1) = L(r_2) |
570 r_1 \equiv r_2 \;\dn\; L(r_1) = L(r_2) |
570 \] |
571 \] |
571 |
572 |
572 \noindent That means two regular expressions are said to be |
573 \noindent That means two regular expressions are said to be |
573 equivalent if they match the same set of strings. That is |
574 equivalent if they match the same set of strings. That is |
574 there meaning is the same. Therefore we |
575 their meanings is the same. Therefore we |
575 do not really distinguish between the different regular |
576 do not really distinguish between the different regular |
576 expression $(r_1 + r_2) + r_3$ and $r_1 + (r_2 + r_3)$, |
577 expression $(r_1 + r_2) + r_3$ and $r_1 + (r_2 + r_3)$, |
577 because they are equivalent. I leave you to the question |
578 because they are equivalent. I leave you to the question |
578 whether |
579 whether |
579 |
580 |