test.html
changeset 71 7717f20f0504
parent 70 e6868bd2942b
child 72 d65525aeca08
equal deleted inserted replaced
70:e6868bd2942b 71:7717f20f0504
     1 <b>MSc Projects</b>
       
     2 
       
     3 <p>
       
     4 start of paragraph. <cyan> a <red>cyan</red> word</cyan> normal again something longer.
       
     5 </p> 
       
     6 
       
     7 
       
     8  <p><b>Description:</b>  
       
     9   <a>Regular expressions</a> are extremely useful for many text-processing tasks such as finding patterns in texts,
       
    10   lexing programs, syntax highlighting and so on. Given that regular expressions were
       
    11   introduced in 1950 by <a>Stephen Kleene</a>, you might think 
       
    12   regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still
       
    13   an active research area. For example
       
    14   <a>this paper</a> 
       
    15   about regular expression matching and partial derivatives was presented this summer at the international 
       
    16   PPDP'12 conference. The task in this project is to implement the results from this paper.</p>
       
    17 
       
    18   <p>The background for this project is that some regular expressions are 
       
    19   <a>evil</a>
       
    20   and can stab you in the back; according to
       
    21   this <a>blog post</a>.
       
    22   For example, if you use in <a>Python</a> or 
       
    23   in <a>Ruby</a> (probably also in other mainstream programming languages) the 
       
    24   innocently looking regular expression a?{28}a{28} and match it, say, against the string 
       
    25   <red>aaaaaaaaaaaaaaaaaaaaaaaaaaaa</red> (that is 28 as), you will soon notice that your CPU usage goes to 100%. In fact,
       
    26   Python and Ruby need approximately 30 seconds of hard work for matching this string. You can try it for yourself:
       
    27   <a>re.py</a> (Python version) and 
       
    28   <a>re.rb</a> 
       
    29   (Ruby version). You can imagine an attacker
       
    30   mounting a nice <a>DoS attack</a> against 
       
    31   your program if it contains such an evil regular expression. Actually 
       
    32   <a>Scala</a> (and also Java) are almost immune from such
       
    33   attacks as they can deal with strings of up to 4,300 as in less than a second. But if you scale
       
    34   the regular expression and string further to, say, 4,600 as, then you get a 
       
    35   StackOverflowError 
       
    36   potentially crashing your program.
       
    37   </p>