63 mistaken: in fact they are still an active research area. Off the top of my head, I can give |
63 mistaken: in fact they are still an active research area. Off the top of my head, I can give |
64 you at least ten research papers that appeared in the last few years. |
64 you at least ten research papers that appeared in the last few years. |
65 For example, |
65 For example, |
66 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A> |
66 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A> |
67 about regular expression matching and derivatives was presented in 2014 at the international |
67 about regular expression matching and derivatives was presented in 2014 at the international |
68 FLOPS conference. Another <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">paper</A> by my PhD student and me was presented in 2016 |
68 FLOPS conference. Another <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">paper</A> by my PhD student and me was presented in 2016 |
69 at the international ITP conference. |
69 at the international ITP conference. |
70 The task in this project is to implement these results and use them for lexing.</p> |
70 The task in this project is to implement these results and use them for lexing.</p> |
71 |
71 |
72 <p>The background for this project is that some regular expressions are |
72 <p>The background for this project is that some regular expressions are |
73 “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>” |
73 “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>” |
124 (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought |
124 (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought |
125 about the problem much longer than a single afternoon. The task |
125 about the problem much longer than a single afternoon. The task |
126 in this project is to find out how good they actually are by implementing the results from their paper. |
126 in this project is to find out how good they actually are by implementing the results from their paper. |
127 Their approach to regular expression matching is also based on the concept of derivatives. |
127 Their approach to regular expression matching is also based on the concept of derivatives. |
128 I used derivatives very successfully once for something completely different in a |
128 I used derivatives very successfully once for something completely different in a |
129 <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/rexp.pdf">paper</A> |
129 <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/rexp.pdf">paper</A> |
130 about the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>. |
130 about the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>. |
131 So I know they are worth their money. Still, it would be interesting to actually compare their results |
131 So I know they are worth their money. Still, it would be interesting to actually compare their results |
132 with my simple rainy-afternoon matcher and potentially “blow away” the regular expression matchers |
132 with my simple rainy-afternoon matcher and potentially “blow away” the regular expression matchers |
133 in Python, Ruby and Java (and possibly in Scala too). The application would be to implement a fast lexer for |
133 in Python, Ruby and Java (and possibly in Scala too). The application would be to implement a fast lexer for |
134 programming languages, or improve the network traffic analysers in the tools <A HREF="https://www.snort.org">Snort</A> and |
134 programming languages, or improve the network traffic analysers in the tools <A HREF="https://www.snort.org">Snort</A> and |
137 |
137 |
138 <p> |
138 <p> |
139 <B>Literature:</B> |
139 <B>Literature:</B> |
140 The place to start with this project is obviously this |
140 The place to start with this project is obviously this |
141 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">paper</A> |
141 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">paper</A> |
142 and this <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">one</A>. |
142 and this <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">one</A>. |
143 Traditional methods for regular expression matching are explained |
143 Traditional methods for regular expression matching are explained |
144 in the Wikipedia articles |
144 in the Wikipedia articles |
145 <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">here</A> and |
145 <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">here</A> and |
146 <A HREF="http://en.wikipedia.org/wiki/Powerset_construction">here</A>. |
146 <A HREF="http://en.wikipedia.org/wiki/Powerset_construction">here</A>. |
147 The authoritative <A HREF="http://infolab.stanford.edu/~ullman/ialc.html">book</A> |
147 The authoritative <A HREF="http://infolab.stanford.edu/~ullman/ialc.html">book</A> |
432 heart rate and body temperature; the Raspberry Pi collects this data and makes it accessible via a simple |
432 heart rate and body temperature; the Raspberry Pi collects this data and makes it accessible via a simple |
433 web service. The picture on the right is another project that implements an airmouse using an Arduino. |
433 web service. The picture on the right is another project that implements an airmouse using an Arduino. |
434 |
434 |
435 <center> |
435 <center> |
436 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
436 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
437 src="http://nms.kcl.ac.uk/christian.urban/rpi-photo.jpg" |
437 src="https://nms.kcl.ac.uk/christian.urban/rpi-photo.jpg" |
438 alt="Raspberry Pi" |
438 alt="Raspberry Pi" |
439 width="209" height="313"> |
439 width="209" height="313"> |
440 |
440 |
441 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
441 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
442 src="http://nms.kcl.ac.uk/christian.urban/rpi-watch.jpg" |
442 src="https://nms.kcl.ac.uk/christian.urban/rpi-watch.jpg" |
443 alt="Raspberry Pi" |
443 alt="Raspberry Pi" |
444 width="450" height="254"> |
444 width="450" height="254"> |
445 |
445 |
446 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
446 <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;" |
447 src="http://nms.kcl.ac.uk/christian.urban/rpi-airmouse.jpg" |
447 src="https://nms.kcl.ac.uk/christian.urban/rpi-airmouse.jpg" |
448 alt="Raspberry Pi" |
448 alt="Raspberry Pi" |
449 width="250" height="254"> |
449 width="250" height="254"> |
450 </center> |
450 </center> |
451 |
451 |
452 |
452 |
495 algorithm switched on and it almost caused a catastrophic mission failure (see |
495 algorithm switched on and it almost caused a catastrophic mission failure (see |
496 this YouTube video <A HREF="http://www.youtube.com/watch?v=lyx7kARrGeM">here</A> |
496 this YouTube video <A HREF="http://www.youtube.com/watch?v=lyx7kARrGeM">here</A> |
497 for an explanation of what happened). |
497 for an explanation of what happened). |
498 We were able to prove the correctness of this algorithm, and were also able to |
498 We were able to prove the correctness of this algorithm, and were also able to |
499 establish the correctness of some optimisations in this |
499 establish the correctness of some optimisations in this |
500 <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/pip.pdf">paper</A>. |
500 <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/pip.pdf">paper</A>. |
501 </p> |
501 </p> |
502 |
502 |
503 <p>On a much smaller scale, there are a few small programs and underlying algorithms where it |
503 <p>On a much smaller scale, there are a few small programs and underlying algorithms where it |
504 is not really understood whether they always compute a correct result (for example the |
504 is not really understood whether they always compute a correct result (for example the |
505 regular expression matcher by Sulzmann and Lu in project [CU1]). The aim of this |
505 regular expression matcher by Sulzmann and Lu in project [CU1]). The aim of this |
546 |
546 |
547 |
547 |
548 <li> <H4>Earlier Projects</H4> |
548 <li> <H4>Earlier Projects</H4> |
549 |
549 |
550 I am also open to project suggestions from you. You might find some inspiration from my earlier projects: |
550 I am also open to project suggestions from you. You might find some inspiration from my earlier projects: |
551 <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-12.html">BSc 2012/13</A>, |
551 <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-12.html">BSc 2012/13</A>, |
552 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-12.html">MSc 2012/13</A>, |
552 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-12.html">MSc 2012/13</A>, |
553 <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-13.html">BSc 2013/14</A>, |
553 <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-13.html">BSc 2013/14</A>, |
554 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-13.html">MSc 2013/14</A>, |
554 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-13.html">MSc 2013/14</A>, |
555 <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-14.html">BSc 2014/15</A>, |
555 <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-14.html">BSc 2014/15</A>, |
556 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-14.html">MSc 2014/15</A>, |
556 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-14.html">MSc 2014/15</A>, |
557 <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-15.html">BSc 2015/16</A>, |
557 <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-15.html">BSc 2015/16</A>, |
558 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-15.html">MSc 2015/16</A>, |
558 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-15.html">MSc 2015/16</A>, |
559 <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-16.html">BSc 2016/17</A>, |
559 <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-16.html">BSc 2016/17</A>, |
560 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-16.html">MSc 2016/17</A>, |
560 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-16.html">MSc 2016/17</A>, |
561 <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-17.html">BSc 2017/18</A> |
561 <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-17.html">BSc 2017/18</A> |
562 </ul> |
562 </ul> |
563 </TD> |
563 </TD> |
564 </TR> |
564 </TR> |
565 </TABLE> |
565 </TABLE> |
566 |
566 |