msc-projects-17.html
changeset 522 9c6887fae011
parent 521 56b89dea12df
child 523 407c63117a98
521:56b89dea12df 522:9c6887fae011
    63   mistaken: in fact they are still an active research area. Off the top of my head, I can give
    63   mistaken: in fact they are still an active research area. Off the top of my head, I can give
    64   you at least ten research papers that appeared in the last few years.
    64   you at least ten research papers that appeared in the last few years.
    65   For example
    65   For example
    66   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A> 
    66   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A> 
    67   about regular expression matching and derivatives was presented in 2014 at the international 
    67   about regular expression matching and derivatives was presented in 2014 at the international 
    68   FLOPS conference. Another <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">paper</A> by my PhD student and me was presented in 2016
    68   FLOPS conference. Another <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">paper</A> by my PhD student and me was presented in 2016
    69   at the international ITP conference.
    69   at the international ITP conference.
    70   The task in this project is to implement these results and use them for lexing.</p>
    70   The task in this project is to implement these results and use them for lexing.</p>
    71 
    71 
    72   <p>The background for this project is that some regular expressions are 
    72   <p>The background for this project is that some regular expressions are 
    73   &ldquo;<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>&rdquo;
    73   &ldquo;<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>&rdquo;
   124   (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought
   124   (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought
   125   about the problem much longer than a single afternoon. The task 
   125   about the problem much longer than a single afternoon. The task 
   126   in this project is to find out how good they actually are by implementing the results from their paper. 
   126   in this project is to find out how good they actually are by implementing the results from their paper. 
   127   Their approach to regular expression matching is also based on the concept of derivatives.
   127   Their approach to regular expression matching is also based on the concept of derivatives.
   128   I used derivatives very successfully once for something completely different in a
   128   I used derivatives very successfully once for something completely different in a
   129   <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/rexp.pdf">paper</A> 
   129   <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/rexp.pdf">paper</A> 
   130   about the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.
   130   about the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.
   131   So I know they are worth their money. Still, it would be interesting to actually compare their results
   131   So I know they are worth their money. Still, it would be interesting to actually compare their results
   132   with my simple rainy-afternoon matcher and potentially &ldquo;blow away&rdquo; the regular expression matchers 
   132   with my simple rainy-afternoon matcher and potentially &ldquo;blow away&rdquo; the regular expression matchers 
   133   in Python, Ruby and Java (and possibly in Scala too). The application would be to implement a fast lexer for
   133   in Python, Ruby and Java (and possibly in Scala too). The application would be to implement a fast lexer for
   134   programming languages, or improve the network traffic analysers in the tools <A HREF="https://www.snort.org">Snort</A> and
   134   programming languages, or improve the network traffic analysers in the tools <A HREF="https://www.snort.org">Snort</A> and
   137 
   137 
   138   <p>
   138   <p>
   139   <B>Literature:</B> 
   139   <B>Literature:</B> 
   140   The place to start with this project is obviously this
   140   The place to start with this project is obviously this
   141   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">paper</A>
   141   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">paper</A>
   142   and this <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">one</A>.
   142   and this <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/posix.pdf">one</A>.
   143   Traditional methods for regular expression matching are explained
   143   Traditional methods for regular expression matching are explained
   144   in the Wikipedia articles 
   144   in the Wikipedia articles 
   145   <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">here</A> and 
   145   <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">here</A> and 
   146   <A HREF="http://en.wikipedia.org/wiki/Powerset_construction">here</A>.
   146   <A HREF="http://en.wikipedia.org/wiki/Powerset_construction">here</A>.
   147   The authoritative <A HREF="http://infolab.stanford.edu/~ullman/ialc.html">book</A>
   147   The authoritative <A HREF="http://infolab.stanford.edu/~ullman/ialc.html">book</A>
   432   heart rate and body temperature; the Raspberry Pi collects this data and makes it accessible via a simple
   432   heart rate and body temperature; the Raspberry Pi collects this data and makes it accessible via a simple
   433   web-service. The picture on the right is another project that implements an airmouse using an Arduino.
   433   web-service. The picture on the right is another project that implements an airmouse using an Arduino.
   434 
   434 
   435   <center>
   435   <center>
   436     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   436     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   437          src="http://nms.kcl.ac.uk/christian.urban/rpi-photo.jpg"
   437          src="https://nms.kcl.ac.uk/christian.urban/rpi-photo.jpg"
   438          alt="Raspberry Pi"
   438          alt="Raspberry Pi"
   439          width="209" height="313">
   439          width="209" height="313">
   440 
   440 
   441     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   441     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   442          src="http://nms.kcl.ac.uk/christian.urban/rpi-watch.jpg"
   442          src="https://nms.kcl.ac.uk/christian.urban/rpi-watch.jpg"
   443          alt="Raspberry Pi"
   443          alt="Raspberry Pi"
   444          width="450" height="254">
   444          width="450" height="254">
   445 
   445 
   446     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   446     <img style="-webkit-user-select: none; cursor: -webkit-zoom-in;"
   447          src="http://nms.kcl.ac.uk/christian.urban/rpi-airmouse.jpg"
   447          src="https://nms.kcl.ac.uk/christian.urban/rpi-airmouse.jpg"
   448          alt="Raspberry Pi"
   448          alt="Raspberry Pi"
   449          width="250" height="254">  
   449          width="250" height="254">  
   450   </center>
   450   </center>
   451  
   451  
   452 
   452 
   495  algorithm switched on and it almost caused a catastrophic mission failure (see
   495  algorithm switched on and it almost caused a catastrophic mission failure (see
   496  this YouTube video <A HREF="http://www.youtube.com/watch?v=lyx7kARrGeM">here</A>
   496  this YouTube video <A HREF="http://www.youtube.com/watch?v=lyx7kARrGeM">here</A>
   497  for an explanation of what happened).
   497  for an explanation of what happened).
   498  We were able to prove the correctness of this algorithm, but were also able to
   498  We were able to prove the correctness of this algorithm, but were also able to
   499  establish the correctness of some optimisations in this
   499  establish the correctness of some optimisations in this
   500  <A HREF="http://nms.kcl.ac.uk/christian.urban/Publications/pip.pdf">paper</A>.
   500  <A HREF="https://nms.kcl.ac.uk/christian.urban/Publications/pip.pdf">paper</A>.
   501  </p>
   501  </p>
   502 
   502 
   503  <p>On a much smaller scale, there are a few small programs and underlying algorithms where it
   503  <p>On a much smaller scale, there are a few small programs and underlying algorithms where it
   504  is not really understood whether they always compute a correct result (for example the
   504  is not really understood whether they always compute a correct result (for example the
   505  regular expression matcher by Sulzmann and Lu in project [CU1]). The aim of this
   505  regular expression matcher by Sulzmann and Lu in project [CU1]). The aim of this
   546 
   546 
   547 
   547 
   548 <li> <H4>Earlier Projects</H4>
   548 <li> <H4>Earlier Projects</H4>
   549 
   549 
   550  I am also open to project suggestions from you. You might find some inspiration from my earlier projects:
   550  I am also open to project suggestions from you. You might find some inspiration from my earlier projects:
   551  <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-12.html">BSc 2012/13</A>, 
   551  <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-12.html">BSc 2012/13</A>, 
   552  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-12.html">MSc 2012/13</A>, 
   552  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-12.html">MSc 2012/13</A>, 
   553  <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-13.html">BSc 2013/14</A>,
   553  <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-13.html">BSc 2013/14</A>,
   554  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-13.html">MSc 2013/14</A>, 
   554  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-13.html">MSc 2013/14</A>, 
   555  <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-14.html">BSc 2014/15</A>,
   555  <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-14.html">BSc 2014/15</A>,
   556  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-14.html">MSc 2014/15</A>, 
   556  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-14.html">MSc 2014/15</A>, 
   557  <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-15.html">BSc 2015/16</A>,
   557  <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-15.html">BSc 2015/16</A>,
   558  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-15.html">MSc 2015/16</A>, 
   558  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-15.html">MSc 2015/16</A>, 
   559  <A HREF="http://nms.kcl.ac.uk/christian.urban/bsc-projects-16.html">BSc 2016/17</A>,
   559  <A HREF="https://nms.kcl.ac.uk/christian.urban/bsc-projects-16.html">BSc 2016/17</A>,
   560  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-16.html">MSc 2016/17</A>,
   560  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-16.html">MSc 2016/17</A>,
   561  <A HREF="http://nms.kcl.ac.uk/christian.urban/msc-projects-17.html">BSc 2017/18</A>
   561  <A HREF="https://nms.kcl.ac.uk/christian.urban/msc-projects-17.html">BSc 2017/18</A>
   562 </ul>
   562 </ul>
   563 </TD>
   563 </TD>
   564 </TR>  
   564 </TR>  
   565 </TABLE>
   565 </TABLE>
   566         
   566