ninems/ninems.tex
changeset 75 24d9d64c2a95
parent 72 83b021fc7d29
child 76 f575cf219377
--- a/ninems/ninems.tex	Sat Jul 13 22:56:31 2019 +0100
+++ b/ninems/ninems.tex	Tue Jul 16 22:18:18 2019 +0100
@@ -198,10 +198,31 @@
 massive amounts of CPU-resources---causing web servers to
 grind to a halt. This happened when a post with 20,000 white spaces was
 submitted, but importantly the white spaces were neither at the
-beginning nor at the end. As a result, the regular expression matching
-engine needed to backtrack over many choices. The underlying problem is
+beginning nor at the end. 
+As a result, the regular expression matching
+engine needed to backtrack over many choices.
+In this example, the time needed to process the string is not 
+exactly the classical exponential case, but rather $O(n^2)$
+with respect to the string length. But this is enough for the 
+home page of stackexchnge to respond not fast enough to
+the load balancer, which thought that there must be some
+attack and therefore stopped the servers from responding to 
+requests. This makes the whole site become unavailable.
+Another fresh example that just came out of the oven
+is the cloudfare outage incident. A poorly written
+regular expression exhibited exponential behaviour
+and exhausted CPUs that serve HTTP traffic on 2 July 2019,
+ resulting in a 27-minute outage. The regular expression contained 
+adjacent Kleene stars, which is the source of the exponential number
+of backtracking possibilities. Although this outage has a series of
+causes that came from different vulnerabilities within the system 
+of cloudfare, the heart of the problem lies within regular expression.
+Such outages could be excluded from happening if the matching 
+algorithm is guaranteed to be fast.\footnote{\url{https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/}}
+ The underlying problem is
 that many ``real life'' regular expression matching engines do not use
-DFAs for matching. This is because they support regular expressions that
+DFAs for matching. 
+This is because they support regular expressions that
 are not covered by the classical automata theory, and in this more
 general setting there are quite a few research questions still
 unanswered and fast algorithms still need to be developed (for example