54 case T_BTAG("<blink>")::rest => interpret(rest, c, Console.BLINK :: ctr) |
54 case T_BTAG("<blink>")::rest => interpret(rest, c, Console.BLINK :: ctr) |
55 case T_ETAG(_)::rest => interpret(rest, c, ctr.tail) |
55 case T_ETAG(_)::rest => interpret(rest, c, ctr.tail) |
56 case _::rest => interpret(rest, c, ctr) |
56 case _::rest => interpret(rest, c, ctr) |
57 } |
57 } |
58 |
58 |
59 interpret(T.fromFile("test.html"), 0, Nil) |
59 val test_string = """ |
|
60 <b>MSc Projects</b> |
|
61 |
|
62 <p> |
|
63 start of paragraph. <cyan> a <red>cyan</red> word</cyan> normal again something longer. |
|
64 </p> |
|
65 |
|
66 |
|
67 <p><b>Description:</b> |
|
68 <a>Regular expressions</a> are extremely useful for many text-processing tasks such as finding patterns in texts, |
|
69 lexing programs, syntax highlighting and so on. Given that regular expressions were |
|
70 introduced in 1950 by <a>Stephen Kleene</a>, you might think |
|
71 regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still |
|
72 an active research area. For example |
|
73 <a>this paper</a> |
|
74 about regular expression matching and partial derivatives was presented this summer at the international |
|
75 PPDP'12 conference. The task in this project is to implement the results from this paper.</p> |
|
76 |
|
77 <p>The background for this project is that some regular expressions are |
|
78 <a>evil</a> |
|
79 and can stab you in the back; according to |
|
80 this <a>blog post</a>. |
|
81 For example, if you use in <a>Python</a> or |
|
82 in <a>Ruby</a> (probably also in other mainstream programming languages) the |
|
83 innocently looking regular expression a?{28}a{28} and match it, say, against the string |
|
84 <red>aaaaaaaaaa<cyan>aaaaaaa</cyan>aaaaaaaaaaa</red> (that is 28 as), you will soon notice that your CPU usage goes to 100%. In fact, |
|
85 Python and Ruby need approximately 30 seconds of hard work for matching this string. You can try it for yourself: |
|
86 <a>re.py</a> (Python version) and |
|
87 <a>re.rb</a> |
|
88 (Ruby version). You can imagine an attacker |
|
89 mounting a nice <a>DoS attack</a> against |
|
90 your program if it contains such an evil regular expression. Actually |
|
91 <a>Scala</a> (and also Java) are almost immune from such |
|
92 attacks as they can deal with strings of up to 4,300 as in less than a second. But if you scale |
|
93 the regular expression and string further to, say, 4,600 as, then you get a |
|
94 StackOverflowError |
|
95 potentially crashing your program. |
|
96 </p> |
|
97 """ |
|
98 |
|
99 interpret(T.fromString(test_string), 0, Nil) |