msc-projects-12.html
changeset 163 622c47857b69
parent 162 0752a5de9537
child 164 d99c0026ebaf
162:0752a5de9537 163:622c47857b69
    33 <H4>Email: christian dot urban at kcl dot ac dot uk,  Office: Strand Building S1.27</H4>
    33 <H4>Email: christian dot urban at kcl dot ac dot uk,  Office: Strand Building S1.27</H4>
    34 <H4>If you are interested in a project, please send me an email and we can discuss details. Please include
    34 <H4>If you are interested in a project, please send me an email and we can discuss details. Please include
    35 a short description about your programming skills and Computer Science background in your first email. 
    35 a short description about your programming skills and Computer Science background in your first email. 
    36 I will also need your King's username in order to book the project for you. Thanks.</H4> 
    36 I will also need your King's username in order to book the project for you. Thanks.</H4> 
    37 
    37 
    38 <H4>Note that besides being a lecturer at the theoretical end of Computer Science, I am also a passionate
    38 <H4>Note that besides being a lecturer in the theoretical part of Computer Science, I am also a passionate
    39     <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker</A> &hellip;
    39     <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker</A> &hellip;
    40     defined as &ldquo;a person who enjoys exploring the details of programmable systems and 
    40     defined as &ldquo;a person who enjoys exploring the details of programmable systems and 
    41     stretching their capabilities, as opposed to most users, who prefer to learn only the minimum 
    41     stretching their capabilities, as opposed to most users, who prefer to learn only the minimum 
    42     necessary.&rdquo; I am always happy to supervise like-minded students.</H4>  
    42     necessary.&rdquo; I am always happy to supervise like-minded students.</H4>  
    43 
    43 
    48   <B>Description:</b>  
    48   <B>Description:</b>  
    49   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A> 
    49   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A> 
    50   are extremely useful for many text-processing tasks...finding patterns in texts,
    50   are extremely useful for many text-processing tasks...finding patterns in texts,
    51   lexing programs, syntax highlighting and so on. Given that regular expressions were
    51   lexing programs, syntax highlighting and so on. Given that regular expressions were
    52   introduced in 1950 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, you might think 
    52   introduced in 1950 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, you might think 
    53   regular expressions have since been studied to death. But you would definitely be mistaken: in fact they are still
    53   regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still
    54   an active research area. For example
    54   an active research area. For example
    55   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> 
    55   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> 
    56   about regular expression matching and partial derivatives was presented this summer at the international 
    56   about regular expression matching and partial derivatives was presented this summer at the international 
    57   PPDP'12 conference.</p>
    57   PPDP'12 conference. The task in this project is to implement the results from this paper.</p>
    58 
    58 
    59   <p>The background for this project is that some regular expressions are 
    59   <p>The background for this project is that some regular expressions are 
    60   &quot;<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>&quot; 
    60   &quot;<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>&quot; 
    61   and can &quot;stab you in the back&quot; according to
    61   and can &quot;stab you in the back&quot; according to
    62   this recent <A HREF="http://tech.blog.cueup.com/regular-expressions-will-stab-you-in-the-back">blog post</A>.
    62   this <A HREF="http://tech.blog.cueup.com/regular-expressions-will-stab-you-in-the-back">blog post</A>.
    63   For example, if you use in <A HREF="http://www.python.org">Python</A> or 
    63   For example, if you use in <A HREF="http://www.python.org">Python</A> or 
    64   in <A HREF="http://www.ruby-lang.org/en/">Ruby</A> (probably also in other mainstream programming languages) the 
    64   in <A HREF="http://www.ruby-lang.org/en/">Ruby</A> (probably also in other mainstream programming languages) the 
    65   innocent-looking regular expression <code>a?{28}a{28}</code> and match it, say, against the string 
    65   innocent-looking regular expression <code>a?{28}a{28}</code> and match it, say, against the string 
    66   <code>aaaaaaaaaaaaaaaaaaaaaaaaaaaa</code> (that is 28 <code>a</code>s), you will soon notice that your CPU usage goes to 100%. In fact,
    66   <code>aaaaaaaaaaaaaaaaaaaaaaaaaaaa</code> (that is 28 <code>a</code>s), you will soon notice that your CPU usage goes to 100%. In fact,
    67   Python and Ruby need approximately 30 seconds for matching this string. You can try it for yourself:
    67   Python and Ruby need approximately 30 seconds of hard work for matching this string. You can try it for yourself:
    68   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re.py">re.py</A> (Python version) and 
    68   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re.py">re.py</A> (Python version) and 
    69   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re-internal.rb">re.rb</A> 
    69   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re-internal.rb">re.rb</A> 
    70   (Ruby version). You can imagine an attacker
    70   (Ruby version). You can imagine an attacker
    71   mounting a nice <A HREF="http://en.wikipedia.org/wiki/Denial-of-service_attack">DoS attack</A> against 
    71   mounting a nice <A HREF="http://en.wikipedia.org/wiki/Denial-of-service_attack">DoS attack</A> against 
    72   your program if it contains such an &quot;evil&quot; regular expression. Actually 
    72   your program if it contains such an &quot;evil&quot; regular expression. Actually 
    73   <A HREF="http://www.scala-lang.org/">Scala</A> (and also Java) are almost immune from such
    73   <A HREF="http://www.scala-lang.org/">Scala</A> (and also Java) are almost immune from such
    74   attacks as they can deal with strings of up to 4,300 <code>a</code>s in less than a second. But if you scale
    74   attacks as they can deal with strings of up to 4,300 <code>a</code>s in less than a second. But if you scale
    75   the regular expression and string further to, say, 4,600 <code>a</code>s, you get a <code>StackOverflowError</code> 
    75   the regular expression and string further to, say, 4,600 <code>a</code>s, then you get a <code>StackOverflowError</code> 
    76   exception crashing your program.
    76   potentially crashing your program.
    77   </p>
    77   </p>
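  <p>
  The blow-up described above can be reproduced on a small scale with the following sketch. It is not the
  linked <code>re.py</code>; the function name <code>evil_match_time</code> and the chosen sizes are assumptions
  made for illustration, and the pattern is written as <code>(?:a?){n}a{n}</code> so that Python accepts the
  nested repetition. The absolute timings depend on the machine and the Python version, so none are shown here.
  </p>

```python
import re
import time

def evil_match_time(n):
    """Match (a?){n}a{n} against n a's and report (matched, seconds)."""
    # Hypothetical helper: builds the scaled-down "evil" pattern for a given n.
    pattern = re.compile("(?:a?){%d}a{%d}" % (n, n))
    text = "a" * n
    start = time.perf_counter()
    matched = pattern.fullmatch(text) is not None
    return matched, time.perf_counter() - start

# Small sizes only: the backtracking matcher tries more and more ways of
# distributing the a's between the optional and the mandatory part.
for n in (5, 10, 15, 18):
    matched, seconds = evil_match_time(n)
    print(n, matched, round(seconds, 4))
```

  <p>
  Increasing <code>n</code> step by step shows how quickly the running time grows, even though the string
  always matches.
  </p>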
    78 
    78 
    79   <p>
    79   <p>
    80   On a rainy afternoon, I implemented 
    80   On a rainy afternoon, I implemented 
    81   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re3.scala">this</A> 
    81   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re3.scala">this</A> 
    96 
    96 
    97   <p>
    97   <p>
    98   Now the guys from the 
    98   Now the guys from the 
    99   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">PPDP'12-paper</A> mentioned 
    99   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">PPDP'12-paper</A> mentioned 
   100   above claim they are even faster than me and can deal with even more features of regular expressions
   100   above claim they are even faster than me and can deal with even more features of regular expressions
   101   (for example subexpression matching, which my rainy-afternoon matcher lacks). I am sure they thought
   101   (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought
   102   about the problem much longer than a single afternoon. The task 
   102   about the problem much longer than a single afternoon. The task 
   103   in this project is to find out how good they actually are by implementing the results from their paper. 
   103   in this project is to find out how good they actually are by implementing the results from their paper. 
   104   Their approach is based on the concept of partial derivatives introduced in 1994 by
   104   Their approach is based on the concept of partial derivatives introduced in 1994 by
   105   <A HREF="http://reference.kfupm.edu.sa/content/p/a/partial_derivatives_of_regular_expressio_1319383.pdf">Valentin Antimirov</A>.
   105   <A HREF="http://reference.kfupm.edu.sa/content/p/a/partial_derivatives_of_regular_expressio_1319383.pdf">Valentin Antimirov</A>.
   106   I used them once myself in a <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A> 
   106   I used them once myself in a <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A> 
   107   in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.
   107   in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.
   108   So I know they are worth their money. Still, it would be interesting to actually compare their results
   108   So I know they are worth their money. Still, it would be interesting to actually compare their results
   109   with my simple rainy-afternoon matcher and &quot;blow away&quot; the regular expression matchers in Python and Ruby (and possibly
   109   with my simple rainy-afternoon matcher and potentially &quot;blow away&quot; the regular expression matchers 
   110   in Scala too).
   110   in Python and Ruby (and possibly in Scala too).
   111   </p>
   111   </p>
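  <p>
  As a rough illustration of the idea behind partial derivatives, here is a minimal sketch of a matcher in the
  style of Antimirov. It is neither the algorithm from the PPDP'12 paper (it does no subexpression matching) nor
  the rainy-afternoon matcher; the tuple encoding of regular expressions and all function names are assumptions
  made for this example.
  </p>

```python
# Regular expressions as nested tuples (a hypothetical encoding for this sketch).
EMPTY = ("empty",)   # matches no string at all
EPS = ("eps",)       # matches only the empty string

def char(c):
    return ("chr", c)

def alt(r, s):
    return ("alt", r, s)

def star(r):
    return ("star", r)

def seq(r, s):
    # Simplify eps . s to s so that the derivative sets stay small.
    return s if r == EPS else ("seq", r, s)

def nullable(r):
    """Can r match the empty string?"""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # seq

def pderiv(c, r):
    """Set of partial derivatives of r with respect to the character c."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return set()
    if tag == "chr":
        return {EPS} if r[1] == c else set()
    if tag == "alt":
        return pderiv(c, r[1]) | pderiv(c, r[2])
    if tag == "star":
        return {seq(d, r) for d in pderiv(c, r[1])}
    # seq: derive the first part; if it is nullable, also derive the second.
    result = {seq(d, r[2]) for d in pderiv(c, r[1])}
    if nullable(r[1]):
        result |= pderiv(c, r[2])
    return result

def matches(r, text):
    """Track a set of regular expressions, much like the states of an NFA."""
    states = {r}
    for c in text:
        states = {d for q in states for d in pderiv(c, q)}
    return any(nullable(q) for q in states)

def evil(n):
    """Build (a?){n} a{n} in the tuple encoding."""
    r = EPS
    for _ in range(n):
        r = seq(char("a"), r)
    for _ in range(n):
        r = seq(alt(EPS, char("a")), r)
    return r

print(matches(evil(28), "a" * 28))
```

  <p>
  Because the matcher only ever keeps a set of partial derivatives, it never backtracks over the input string;
  how this compares in practice with the approach in the paper is exactly what the project should find out.
  </p>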
   112 
   112 
   113   <p>
   113   <p>
   114   <B>Literature:</B> 
   114   <B>Literature:</B> 
   115   The place to start with this project is obviously this
   115   The place to start with this project is obviously this
   141 
   141 
   142 
   142 
   143 <li> <H4>[CU2] Automata Theory in Your Web-Browser</H4>
   143 <li> <H4>[CU2] Automata Theory in Your Web-Browser</H4>
   144 
   144 
   145 <p>
   145 <p>
       
   146 There are a number of classic algorithms in automata theory (such as the transformation of regular
       
   147 expressions into NFAs and DFAs, automata minimisation, subset construction). All these algorithms involve a fair 
       
   148 amount of calculations, which cannot be easily done by hand. There are a few web applications that animate these 
       
   149 calculations, for example <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">this one</A>.
       
   150 </p>
       
   151 
       
   152 <p>
       
   153 There are now many useful libraries for Javascript, for example, this one for graphs. There are also
       
   154 a number of new programming languages targeting Javascript. This project is for someone who
       
   155 wants to get to know these languages by implementing and animating algorithms from automata
       
   156 theory or parsing.
       
   157 </p>
       
   158 
       
   159   <B>Literature:</B> 
       
   160   The same general literature as in [CU1].
       
   161   </p>  
       
   162 
       
   163 <p>
   146   <B>Skills:</B> 
   164   <B>Skills:</B> 
   147   This is a project for a student with good programming skills. 
   165   This is a project for a student with good programming skills. 
   148   JavaScript or a similar web-programming language seems to be best suited 
   166   JavaScript or a similar web-programming language seems to be best suited 
   149   for this project. Some knowledge in HTML and CSS cannot hurd either.
   167   for this project. Some knowledge in HTML and CSS cannot hurt either.
   150   </p>
   168   </p>
   151 
   169 
   152 
   170 
   153 <!--
   171 <!--
   154 <li> <H4>[CU2] Equivalence Checking of Regular Expressions</H4>
   172 <li> <H4>[CU2] Equivalence Checking of Regular Expressions</H4>
   194 
   212 
   195 <li> <H4>[CU3] Machine Code Generation for a Simple Compiler</H4>
   213 <li> <H4>[CU3] Machine Code Generation for a Simple Compiler</H4>
   196 
   214 
   197   <p>
   215   <p>
   198   <b>Description:</b> 
   216   <b>Description:</b> 
   199   Compilers translate high-level programs that humans can read and write into
   217   Compilers translate high-level programs that humans can read into
   200   efficient machine code that can be run on a CPU or virtual machine.
   218   efficient machine code that can be run on a CPU or virtual machine.
   201   I recently implemented a very simple compiler for a very simple functional
   219   I recently implemented a very simple compiler for a very simple functional
   202   programming language following this 
   220   programming language following this 
   203   <A HREF="http://www.cs.princeton.edu/~dpw/papers/tal-toplas.pdf">paper</A> 
   221   <A HREF="http://www.cs.princeton.edu/~dpw/papers/tal-toplas.pdf">paper</A> 
   204   (also described <A HREF="http://www.cs.princeton.edu/~dpw/papers/tal-tr.pdf">here</A>).
   222   (also described <A HREF="http://www.cs.princeton.edu/~dpw/papers/tal-tr.pdf">here</A>).
   205   My code, written in <A HREF="http://www.scala-lang.org/">Scala</A>, of this compiler is 
   223   My code, written in <A HREF="http://www.scala-lang.org/">Scala</A>, of this compiler is 
   206   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/compiler.scala">here</A>.
   224   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/compiler.scala">here</A>.
   207   The compiler can deal with simple programs involving natural numbers, such
   225   The compiler can deal with simple programs involving natural numbers, such
   208   as Fibonacci numbers
   226   as Fibonacci 
   209   or factorial (but it can be easily extended - that is not the point).
   227   or factorial (but it can be easily extended - that is not the point).
   210   </p>
   228   </p>
   211 
   229 
   212   <p>
   230   <p>
   213   While the hard work has been done (understanding the two papers above),
   231   While the hard work has been done (understanding the two papers above),
   244   <B>Skills:</B> 
   262   <B>Skills:</B> 
   245   This is a project for a student with a deep interest in programming languages and
   263   This is a project for a student with a deep interest in programming languages and
   246   compilers. Since my compiler is implemented in <A HREF="http://www.scala-lang.org/">Scala</A>,
   264   compilers. Since my compiler is implemented in <A HREF="http://www.scala-lang.org/">Scala</A>,
   247   it would make sense to continue this project in this language. I can be
   265   it would make sense to continue this project in this language. I can be
   248   of help with questions and books about <A HREF="http://www.scala-lang.org/">Scala</A>.
   266   of help with questions and books about <A HREF="http://www.scala-lang.org/">Scala</A>.
   249   But if Scala is a problem, my code can also be translated quickly into any other functional
   267   But if Scala is a problem, my code can also be translated quickly into any other
   250   language. 
   268   language. 
   251   </p>
   269   </p>
   252 
   270 
   253 <li> <H4>[CU4] Implementation of Register Spilling Algorithms</H4>
   271 <li> <H4>[CU4] Implementation of Register Spilling Algorithms</H4>
   254   
   272   
   337   <A HREF="http://en.wikipedia.org/wiki/Electronic_voting">electronic voting</A>, 
   355   <A HREF="http://en.wikipedia.org/wiki/Electronic_voting">electronic voting</A>, 
   338   which essentially is still an unsolved problem in Computer Science. The
   356   which essentially is still an unsolved problem in Computer Science. The
   339   students only need to be prevented from answering a question more than once, thus skewing
   357   students only need to be prevented from answering a question more than once, thus skewing
   340   any statistics. Unlike electronic voting, no audit trail needs to be kept
   358   any statistics. Unlike electronic voting, no audit trail needs to be kept
   341   for student polling. Restricting the number of answers can probably be solved 
   359   for student polling. Restricting the number of answers can probably be solved 
   342   by setting appropriate cookies on the students
   360   by setting appropriate cookies on the students'
   343   computers or smart phones.
   361   computers or smart phones.
       
   362   </p>
       
   363 
       
   364   <p>
       
   365   However, there is one restriction that makes this project harder than it seems
       
   366   at first sight. The department does not allow large server applications and databases
       
   367   to be run on calcium. So the problem should be solved with as few resources as possible
       
   368   on the &quot;back-end&quot; which collects the votes. 
   344   </p>
   369   </p>
   345 
   370 
   346   <p>
   371   <p>
   347   <B>Literature:</B> 
   372   <B>Literature:</B> 
   348   The project requires fluency in a web-programming language (for example 
   373   The project requires fluency in a web-programming language (for example 
   349   <A HREF="http://en.wikipedia.org/wiki/JavaScript">Javascript</A>,
   374   <A HREF="http://en.wikipedia.org/wiki/JavaScript">Javascript</A>,
   350   <A HREF="http://en.wikipedia.org/wiki/PHP">PHP</A>, 
   375   <A HREF="http://en.wikipedia.org/wiki/PHP">PHP</A>, 
   351   Java, <A HREF="http://www.python.org">Python</A>, 
   376   Java, <A HREF="http://www.python.org">Python</A>, 
   352   <A HREF="http://en.wikipedia.org/wiki/Go_(programming_language)">Go</A>, 
   377   <A HREF="http://en.wikipedia.org/wiki/Go_(programming_language)">Go</A>, 
   353   <A HREF="http://www.scala-lang.org/">Scala</A>,
   378   <A HREF="http://www.scala-lang.org/">Scala</A>,
   354   <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>) 
   379   <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>). 
   355   and possibly a cloud application platform (for example
       
   356   <A HREF="https://developers.google.com/appengine/">Google App Engine</a> or 
       
   357   <A HREF="http://www.heroku.com">Heroku</A>).
       
   358   For web-programming the 
   380   For web-programming the 
   359   <A HREF="http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012">Web Application Engineering</A>
   381   <A HREF="http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012">Web Application Engineering</A>
   360   course at <A HREF="http://www.udacity.com">Udacity</A> is a good starting point 
   382   course at <A HREF="http://www.udacity.com">Udacity</A> is a good starting point 
   361   to be aware of the issues involved. This course uses <A HREF="http://www.python.org">Python</A>.
   383   to be aware of the issues involved. This course uses <A HREF="http://www.python.org">Python</A>.
   362   To evaluate the answers from the student, Google's 
   384   To evaluate the answers from the student, Google's 
   365   <A HREF="http://www.youtube.com/watch?v=NZtgT4jgnE8">youtube</A> video.
   387   <A HREF="http://www.youtube.com/watch?v=NZtgT4jgnE8">youtube</A> video.
   366   </p>
   388   </p>
   367 
   389 
   368   <p>
   390   <p>
   369   <B>Skills:</B> 
   391   <B>Skills:</B> 
   370   In order to provide convenience for the lecturer, this project needs very good web-programming skills. A 
   392   This project needs very good web-programming skills. A 
   371   <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker mentality</A>
   393   <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker mentality</A>
   372   (see above) is probably very beneficial: web-programming is an area that only emerged recently and
   394   (see above) is probably very beneficial: web-programming is an area that only emerged recently and
   373   many tools still lack maturity. You probably have to experiment a lot with several different
   395   many tools still lack maturity. You probably have to experiment a lot with several different
   374   languages and tools.
   396   languages and tools.
   375   </p>
   397   </p>