diff -r 5b5a68df6d16 -r e8ef8f38ca84 handouts/ho01.tex --- a/handouts/ho01.tex Mon Sep 15 09:36:02 2014 +0100 +++ b/handouts/ho01.tex Sun Sep 21 11:46:49 2014 +0100 @@ -149,7 +149,7 @@ regular expression matching in Python and in Ruby. \begin{center} -\begin{tikzpicture}[y=.1cm, x=.15cm] +\begin{tikzpicture}[y=.09cm, x=.15cm] %axis \draw (0,0) -- coordinate (x axis mid) (30,0); \draw (0,0) -- coordinate (y axis mid) (0,30); @@ -184,18 +184,21 @@ Ruby is even slightly worse.\footnote{In this example Ruby uses the slightly different regular expression \texttt{a?a?a?...a?a?aaa...aa}, where the \texttt{a?} and -\texttt{a} each occur $n$ times.} Admittedly, this regular -expression is carefully chosen to exhibit this exponential -behaviour, but similar ones occur more often than one wants in -``real life''. They are sometimes called \emph{evil regular -expressions} because they have the potential to make regular -expression matching engines to topple over, like in Python and -Ruby. The problem with evil regular expressions is that they -can have some serious consequences, for example, if you use -them in your web-application. The reason is that hackers can -look for these instances where the matching engine behaves -badly and then mount a nice DoS-attack against your -application. +\texttt{a} each occur $n$ times. More test results can be +found at \url{http://www.computerbytesman.com/redos/}.} +Admittedly, this regular expression is carefully chosen to +exhibit this exponential behaviour, but similar ones occur +more often than one wants in ``real life''. They are sometimes +called \emph{evil regular expressions} because they have the +potential to make regular expression matching engines to +topple over, like in Python and Ruby. The problem with evil +regular expressions is that they can have some serious +consequences, for example, if you use them in your +web-application. The reason is that hackers can look for these +instances where the matching engine behaves badly and then +mount a nice DoS-attack against your application. These +attacks are already have their own name: +\emph{Regular Expression Denial of Servive Attack (ReDoS)}. It will be instructive to look behind the ``scenes'' to find out why Python and Ruby (and others) behave so badly when @@ -208,7 +211,7 @@ up to 12,000 in less than 10(!) seconds, see the graph below: \begin{center} -\begin{tikzpicture}[y=.1cm, x=.0006cm] +\begin{tikzpicture}[y=.09cm, x=.0006cm] %axis \draw (0,0) -- coordinate (x axis mid) (12000,0); \draw (0,0) -- coordinate (y axis mid) (0,30);