# HG changeset patch # User Chengsong # Date 1563312011 -3600 # Node ID f575cf21937743b426f7a1fc1e1aa58a7ef5df88 # Parent 24d9d64c2a95034bb08c000fe0bb23dddfcd0d38# Parent 9e791ef6022f1603f31e9ec58aaa3bb7f1a0a730 merged diff -r 9e791ef6022f -r f575cf219377 Spiral.scala --- a/Spiral.scala Mon Jul 15 10:46:50 2019 +0100 +++ b/Spiral.scala Tue Jul 16 22:20:11 2019 +0100 @@ -562,7 +562,7 @@ def christian_def2(){ val a = AALTS(List(Z), List(AZERO, ASEQ(List(), AALTS(List(),List(AONE(List()), ACHAR(Nil, 'b'))), ACHAR(Nil, 'b')) ) ) val unsimp = bsimp(bder('b',a)) - val simped = bsimp(bder('b', bsimp(a))) + val simped = bsimp(bder('b', bsimp(a)) ) println(bsimp(a)) if(unsimp == simped){ println(s"bsimp(bder c r) = ${unsimp}, whereas bsimp(bder c bsimp r) = ${simped}") diff -r 9e791ef6022f -r f575cf219377 ninems/ninems.tex --- a/ninems/ninems.tex Mon Jul 15 10:46:50 2019 +0100 +++ b/ninems/ninems.tex Tue Jul 16 22:20:11 2019 +0100 @@ -198,10 +198,31 @@ massive amounts of CPU-resources---causing web servers to grind to a halt. This happened when a post with 20,000 white spaces was submitted, but importantly the white spaces were neither at the -beginning nor at the end. As a result, the regular expression matching -engine needed to backtrack over many choices. The underlying problem is +beginning nor at the end. +As a result, the regular expression matching +engine needed to backtrack over many choices. +In this example, the time needed to process the string is not +exactly the classical exponential case, but rather $O(n^2)$ +with respect to the string length. But this is enough for the +home page of stackexchnge to respond not fast enough to +the load balancer, which thought that there must be some +attack and therefore stopped the servers from responding to +requests. This makes the whole site become unavailable. +Another fresh example that just came out of the oven +is the cloudfare outage incident. A poorly written +regular expression exhibited exponential behaviour +and exhausted CPUs that serve HTTP traffic on 2 July 2019, + resulting in a 27-minute outage. The regular expression contained +adjacent Kleene stars, which is the source of the exponential number +of backtracking possibilities. Although this outage has a series of +causes that came from different vulnerabilities within the system +of cloudfare, the heart of the problem lies within regular expression. +Such outages could be excluded from happening if the matching +algorithm is guaranteed to be fast.\footnote{\url{https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/}} + The underlying problem is that many ``real life'' regular expression matching engines do not use -DFAs for matching. This is because they support regular expressions that +DFAs for matching. +This is because they support regular expressions that are not covered by the classical automata theory, and in this more general setting there are quite a few research questions still unanswered and fast algorithms still need to be developed (for example