projects.html
changeset 431 a1e588fe2f38
parent 430 9dae6e101cde
child 432 87c1ad539fc9
430:9dae6e101cde 431:a1e588fe2f38
     1 <?xml version="1.0" encoding="utf-8"?>
       
     2 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
       
     3 <HEAD>
       
     4 <TITLE>Christian Urban</TITLE>
       
     5 <BASE HREF="http://www.inf.kcl.ac.uk/staff/urbanc/">
       
     6 <script type="text/javascript" src="striper.js"></script>
       
     7 <link rel="stylesheet" href="nominal.css">
       
     8 </HEAD>
       
     9 <BODY TEXT="#000000" 
       
    10       BGCOLOR="#4169E1" 
       
    11       LINK="#0000EF" 
       
    12       VLINK="#51188E" 
       
    13       ALINK="#FF0000"
       
    14       ONLOAD="striper('ul','striped','li','first,second')">
       
    15 
       
    16 
       
    17 
       
    18 <TABLE WIDTH="100%" 
       
    19        BGCOLOR="#4169E1" 
       
    20        BORDER="0"   
       
    21        FRAME="border"  
       
    22        CELLPADDING="10"     
       
    23        CELLSPACING="2"
       
    24        RULES="all">
       
    25 
       
    26 <TR>
       
    27 <TD BGCOLOR="#FFFFFF" 
       
    28     WIDTH="75%" 
       
    29     VALIGN="TOP">
       
    30 
       
    31 <H2>2011/12 MSc Individual Projects</H2>
       
    32 <H4>Supervisor: Christian Urban</H4> 
       
    33 <H4>Email: christian dot urban at kcl dot ac dot uk,  Office: Strand Building S6.30</H4>
       
    34 <H4>If you are interested in a project, please send me an email and we can discuss details. Please include
       
    35 a short description of your programming skills and computer science background in your first email. 
       
    36 I will also need your King's username in order to book the project for you. Thanks.</H4> 
       
    37 
       
    38 <ul class="striped">
       
    39 <li> <H4>[CU1] Implementing a SAT-Solver in a Functional Programming Language</H4>
       
    40 
       
    41   <p><B>Description:</b>  
       
    42   SAT-solvers search for satisfying assignments of boolean formulas. Although this 
       
    43   is a computationally hard problem (<A HREF="http://en.wikipedia.org/wiki/NP-complete">NP-complete</A>), 
       
    44   modern SAT-solvers routinely solve boolean formulas with 100,000 and more variables. 
       
    45   Application areas of SAT-solvers are manifold: they range from hardware verification to 
       
    46   Sudoku solvers (see <a href="http://anytime.cs.umass.edu/aimath06/proceedings/P34.pdf">here</a>). 
       
    47   Every 2 years there is a competition of the best SAT-solvers in the world.</p> 
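  <p>
  To make the search problem concrete, here is a minimal sketch of the classic DPLL 
  procedure (unit propagation plus backtracking). It is not MiniSat's algorithm, which adds 
  clause learning, watched literals and restarts. The clause encoding (lists of non-zero 
  integers, as in the DIMACS format) and the names <code>simplify</code> and <code>dpll</code> 
  are assumptions made for this illustration only.</p>

```python
from typing import Dict, List, Optional

Clause = List[int]  # assumed encoding: 3 means x3, -3 means "not x3"


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Assume `lit` is true. Return None if some clause becomes empty."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied, drop it
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None  # every literal is false: conflict
        result.append(reduced)
    return result


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(len(c) == 0 for c in clauses):
        return None
    # Unit propagation: a clause with one literal forces that literal.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
        if clauses is None:
            return None
    if not clauses:
        return assignment  # all clauses satisfied
    # Branch on the first literal of the first remaining clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            model = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if model is not None:
                return model
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
```

  <p>
  Variables missing from the returned dictionary may take either value. The branching rule 
  here is deliberately naive; choosing good branching heuristics is one of the places where 
  real solvers differ.</p>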
       
    48 
       
    49   <p>
       
    50   Most SAT-solvers are written in C. The aim of this project is to design and implement 
       
    51   a SAT-solver in a functional programming language (preferably 
       
    52   <A HREF="http://en.wikipedia.org/wiki/Standard_ML">ML</A>, but 
       
    53   <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>, 
       
    54   <A HREF="http://www.scala-lang.org/">Scala</A>,
       
    55   <A HREF="http://caml.inria.fr/">OCaml</A>, ... are also OK). The starting point is 
       
    56   the open source SAT-solver MiniSat (available <A HREF="http://minisat.se/Main.html">here</A>). 
       
    57   The long-term hope is that your implementation becomes part of the interactive theorem prover 
       
    58   <A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>. For this
       
    59   the SAT-solver needs to be implemented in ML.</p> 
       
    60 
       
    61   <p>
       
    62   <B>Tasks:</B> Understand MiniSat, design and code a SAT-solver in ML, 
       
    63   and empirically evaluate and tune your code.</p>
       
    64 
       
    65   <p>
       
    66   <B>Literature:</B> A good starting point for reading about SAT-solving is the handbook
       
    67   article <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
       
    68   MiniSat is explained <A HREF="http://minisat.se/downloads/MiniSat.pdf">here</A> and
       
    69   <A HREF="http://minisat.se/Papers.html">here</A>. The standard reference for ML is
       
    70   <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
       
    71   of this book for the duration of the project). The best free implementation of ML is 
       
    72   <A HREF="http://www.polyml.org/">PolyML</A>.
       
    73   </p>
       
    74 
       
    75 <li> <H4>[CU2] A Compiler for System F</H4>
       
    76 
       
    77   <p><b>Description:</b> 
       
    78   <A HREF="http://en.wikipedia.org/wiki/System_F">System F</A> is a mini programming language, 
       
    79   which is often used to study the theory behind programming languages, but is also used as 
       
    80   a core-language of functional programming languages (for example 
       
    81   <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>). The language is small
       
    82   enough that a compiler for it can be written in a reasonable amount of time. The target
       
    83   can be an idealised assembly language (preferably 
       
    84   <A HREF="http://en.wikipedia.org/wiki/Typed_assembly_language">TAL</A>) or an abstract machine.
       
    85   This has been explained in full detail in a PhD thesis by Louis-Julien Guillemette
       
    86   (available in English <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">here</A>). He used <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>
       
    87   as his implementation language. Other choices are possible.
       
    88   </p>
       
    89 
       
    90   <p>
       
    91   <b>Tasks:</b>
       
    92   Read the relevant literature and implement the various components of a compiler
       
    93   (parser, intermediate languages, simulator for the idealised assembly language).
       
    94   This project is for a good student with an interest in programming languages,
       
    95   who can also translate abstract ideas into code. If it is too difficult, the project can
       
    96   be easily scaled down to the 
       
    97   <A HREF="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus">simply-typed 
       
    98   lambda calculus</A> (which is simpler than
       
    99   System F) or to cover only some components of the compiler.
       
   100   </p> 
       
   101 
       
   102   <p>
       
   103   <B>Literature:</B>
       
   104   The <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">PhD thesis</A> by Louis-Julien Guillemette is required reading. A shorter
       
   105   paper about this subject is available <A HREF="http://www.iro.umontreal.ca/~monnier/icfp08.pdf">here</A>.
       
   106   A good starting point for TAL is <A HREF="http://www.cs.cornell.edu/talc/papers/tal-tr.pdf">here</A>.
       
   107   There is a lot of literature about compilers 
       
   108   (for example <A HREF="http://www.cs.princeton.edu/~appel/papers/cwc.html">this book</A> -
       
   109   I can lend you my copy for the duration of the project). A very good overview article
       
   110   about implementing compilers by 
       
   111   <A HREF="http://tratt.net/laurie/">Laurie Tratt</A> is 
       
   112   <A HREF="http://tratt.net/laurie/tech_articles/articles/how_difficult_is_it_to_write_a_compiler">here</A>.
       
   113   </p>
       
   114 
       
   115   <li> <H4>[CU3] Sorting Suffixes</H4>
       
   116   
       
   117   <p><b>Description:</b> Given a string, take all its suffixes, and sort them.
       
   118   This is often called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix 
       
   119   array sorting</A>. It sounds simple, but there are some difficulties. 
       
   120   The naive algorithm would generate all suffix strings and sort them
       
   121   using a standard sorting algorithm, for example 
       
   122   <A HREF="http://en.wikipedia.org/wiki/Quicksort">quicksort</A>. 
       
   123   The problem is that
       
   124   this algorithm is not optimal for suffix sorting: it does not take into account that you sort
       
   125   suffixes and it also takes a quadratic amount of space. This is a 
       
   126   huge problem if you have to sort strings of several Megabytes or even Gigabytes,
       
   127   as happens often in biotech and DNA data mining. Suffix sorting is also a crucial operation for the 
       
   128   <A HREF="http://en.wikipedia.org/wiki/Burrows-Wheeler_transform">Burrows-Wheeler transform</A>
       
   129   on which the data compression algorithm of the popular 
       
   130   <A HREF="http://en.wikipedia.org/wiki/Bzip2">bzip2</A>
       
   131   program is based.
       
   132   </p>
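  <p>
  The naive algorithm described above can be sketched in a few lines. The function name 
  <code>suffix_array_naive</code> is an assumption made for this illustration. Note that the 
  sort key copies every suffix, which is exactly the quadratic space cost mentioned in the 
  text.</p>

```python
from typing import List


def suffix_array_naive(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Each key text[i:] is a fresh copy of a suffix, so the keys together
    need space quadratic in len(text). Fine for small inputs only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


text = "banana"
for start in suffix_array_naive(text):
    print(start, text[start:])
```

  <p>
  For <code>"banana"</code> the result is <code>[5, 3, 1, 0, 4, 2]</code>, corresponding to the 
  suffixes a, ana, anana, banana, na, nana.</p>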
       
   133 
       
   134   <p>
       
   135   There are more efficient algorithms for suffix sorting, for example 
       
   136   <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">here</A> and 
       
   137   <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. 
       
   138   However the most space efficient algorithm for suffix sorting  
       
   139   (<A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">here</A>) 
       
   140   is horrendously complicated. Your task would be to understand it, and then implement it.
       
   141   </p>
       
   142   
       
   143   <p>
       
   144   <B>Tasks:</B>
       
   145   Start by reading the literature about suffix sorting. Then work through the
       
   146   12-page <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">paper</A> 
       
   147   explaining the horrendously complicated algorithm and implement it.
       
   148   Time permitting, the work can include an implementation of the Burrows-Wheeler 
       
   149   data compression. This project is for a good student who likes to study 
       
   150   algorithms in depth. The project can be carried out in almost all programming languages,
       
   151   including C, Java, Scala, ML, Haskell and so on.
       
   152   </p>
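  <p>
  As a rough illustration of how suffix sorting feeds into the Burrows-Wheeler transform, 
  the sketch below computes the transform from a naively sorted suffix list and inverts it 
  again. The function names and the end marker <code>$</code> (assumed not to occur in the 
  input) are choices made for this example. Both functions are far too slow and 
  memory-hungry for real data; they only show what the transform computes.</p>

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via (naive) suffix sorting."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    order = sorted(range(len(s)), key=lambda i: s[i:])
    # The output collects the character preceding each sorted suffix;
    # s[-1] wraps around to the sentinel for the suffix starting at 0.
    return "".join(s[i - 1] for i in order)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Naive inverse: rebuild the sorted rotation table column by column."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The rotation ending in the sentinel is the original string.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))               # annb$aa
print(inverse_bwt(bwt("banana")))  # banana
```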
       
   153 
       
   154   <p>
       
   155   <B>Literature:</B> A good starting point for reading about suffix sorting is the 
       
   156   <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">book</A> by Crochemore. 
       
   157   Another good introduction is 
       
   158   <A HREF="http://people.unipmn.it/manzini/papers/esa02.pdf">here</A>, 
       
   159   which also gives good pointers to why efficient suffix sorting
       
   160   is practically relevant.
       
   161   Two simple algorithms are described
       
   162   <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. The main literature is the 12-page
       
   163   <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">article</A> about in-place
       
   164   suffix sorting. The Burrows-Wheeler data compression is described 
       
   165   <A HREF="http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf">here</A>.
       
   166   </p>
       
   167 
       
   168 <li> <H4>[CU4] Simplification with Equivalence Relations in the Isabelle Theorem Prover</H4>
       
   169   <p>
       
   170   <B>Description:</B>
       
   171   In this project you have to extend the simplifier of the 
       
   172   <A HREF="http://isabelle.in.tum.de/">Isabelle theorem prover</A>.  
       
   173   The simplifier is an important reasoning tool of this theorem prover: it 
       
   174   replaces a term by another term that can be proved to be equal to it. However, 
       
   175   currently the simplifier only rewrites terms according to equalities. 
       
   176   Assuming &asymp; is an equivalence relation, the simplifier should also be able 
       
   177   to rewrite terms according to &asymp;. Since equivalence relations occur 
       
   178   frequently in automated reasoning, this extension would make the simplifier 
       
   179   more powerful and useful. The hope is that your code can go into the
       
   180   code base of Isabelle.
       
   181   </p>
       
   182 
       
   183   <p>
       
   184   <B>Tasks:</B>	
       
   185   Read the <A HREF="http://www.springerlink.com/content/x7041m1807738832/">paper</A>
       
   186   about rewriting with equivalence relations. Get familiar with parts of the 
       
   187   implementation of Isabelle (I will be of as much help as I can). Implement
       
   188   the extension. This project is suitable for a student with a bit of math background.
       
   189   It requires knowledge of the functional programming language ML, which
       
   190   however can be learned quickly provided you have already written code
       
   191   in another functional programming language.
       
   192   </p>
       
   193 
       
   194   <p>
       
   195   <B>Literature:</B> A good starting point for reading about rewriting modulo equivalences 
       
   196   is the paper <A HREF="http://www.springerlink.com/content/x7041m1807738832/">here</A>, 
       
   197   which uses the ACL2 theorem prover. The implementation of the Isabelle theorem
       
   198   prover is described in much detail in this 
       
   199   <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Cookbook/">programming tutorial</A>.
       
   200   The standard reference for ML is
       
   201   <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
       
   202   of this book for the duration of the project).
       
   203   </p>
       
   204 
       
   205 
       
   206 <li><h4>[CU5] Lexing and Parsing with Derivatives</h4>
       
   207 
       
   208   <p>
       
   209   <B>Description:</B>
       
   210   Lexing and parsing are usually done using automated tools, like 
       
   211   <A HREF="http://en.wikipedia.org/wiki/Lex_programming_tool">lex</A> and 
       
   212   <A HREF="http://en.wikipedia.org/wiki/Yacc">yacc</A>. The problem 
       
   213   with them is that they "work when they work", but if they do not, then they are
       
   214   <A HREF="http://en.wikipedia.org/wiki/Black_box">black boxes</A>
       
   215   which are difficult to debug and change. They are really quite 
       
   216   clumsy to the point that Might and Darais wrote a paper titled 
       
   217   "<A HREF="http://arxiv.org/pdf/1010.5023v1">Yacc is dead</A>".</p>
       
   218  
       
   219   <p>
       
   220   There is a simple algorithm for regular expression matching (that is, lexing).
       
   221   This algorithm was introduced by 
       
   222   <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">Brzozowski</A> 
       
   223   in 1964. It is based on the notion of derivatives of regular expressions and 
       
   224   has proved <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">useful</A> 
       
   225   for practical lexing. Last year the notion of derivatives was extended by 
       
   226   <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">Might et al</A>
       
   227   to <A HREF="http://en.wikipedia.org/wiki/Context-free_grammar">context-free grammars</A> 
       
   228   and parsing.
       
   229   </p>		      
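  <p>
  The derivative-based matcher for regular expressions fits in a few dozen lines. The sketch 
  below is one possible rendering; the class names (<code>Zero</code>, <code>One</code>, 
  <code>Chr</code>, <code>Alt</code>, <code>Seq</code>, <code>Star</code>) and function names 
  are assumptions made for this illustration. It applies no simplification to the 
  derivatives, so the expressions grow with the input; a practical lexer would simplify 
  after each step.</p>

```python
from dataclasses import dataclass


class Rexp:
    """Base class for regular expressions."""


@dataclass(frozen=True)
class Zero(Rexp):   # matches no string
    pass


@dataclass(frozen=True)
class One(Rexp):    # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr(Rexp):    # matches a single character
    c: str


@dataclass(frozen=True)
class Alt(Rexp):    # r1 | r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Seq(Rexp):    # r1 followed by r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Star(Rexp):   # zero or more repetitions of r
    r: Rexp


def nullable(r: Rexp) -> bool:
    """Does r match the empty string?"""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, (Zero, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    raise TypeError(r)


def der(c: str, r: Rexp) -> Rexp:
    """Derivative of r with respect to c: matches s exactly when r matches c + s."""
    if isinstance(r, (Zero, One)):
        return Zero()
    if isinstance(r, Chr):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return Alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        left = Seq(der(c, r.r1), r.r2)
        return Alt(left, der(c, r.r2)) if nullable(r.r1) else left
    if isinstance(r, Star):
        return Seq(der(c, r.r), r)
    raise TypeError(r)


def matches(r: Rexp, s: str) -> bool:
    """Take the derivative for each character, then test for the empty string."""
    for c in s:
        r = der(c, r)
    return nullable(r)


# (a|b)*ab
r = Seq(Star(Alt(Chr("a"), Chr("b"))), Seq(Chr("a"), Chr("b")))
print(matches(r, "abab"), matches(r, "aba"))
```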
       
   230   
       
   231   <p>
       
   232   <B>Tasks:</B> Get familiar with the two algorithms and implement them. Regular
       
   233   expression matching is relatively simple; parsing with derivatives is the 
       
   234   harder part. Therefore you should empirically evaluate this part and
       
   235   tune your implementation. The project can be carried out in almost all programming 
       
   236   languages, including C, Java, Scala, ML, Haskell and so on.
       
   237   </p>
       
   238 
       
   239   <p>
       
   240   <B>Literature:</B> This 
       
   241   <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> 
       
   242   gives a modern introduction to derivative-based lexing. Derivative-based
       
   243   parsing is explained <A HREF="http://arxiv.org/pdf/1010.5023v1">here</A>
       
   244   and <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">here</A>.
       
   245   A proposal for derivative PEG-parsing is 
       
   246   <A HREF="http://fmota.eu/2011/01/07/PEG-derivatives.html">here</a>. The mailing
       
   247   list about PEGs is <A HREF="https://lists.csail.mit.edu/pipermail/peg/">here</A>.
       
   248   </p>  
       
   249 
       
   250 <li> <H4>[CU6] Equivalence Checking of Regular Expressions using the Method by Antimirov and Mosses</H4>
       
   251 
       
   252   <p>
       
   253   <B>Description:</B> 
       
   254   Deciding whether two regular expressions are equivalent can be used
       
   255   to solve a number of problems in automated reasoning. Therefore one likes to
       
   256   have a method for equivalence checking that is as fast as possible. There have
       
   257   been a number of algorithms proposed in the past, but one based on a method
       
   258   by Antimirov and Mosses seems relatively simple and easy to implement.
       
   259   </p>		      
       
   260   
       
   261   <p>
       
   262   <B>Tasks:</B>
       
   263   The task is to implement the algorithm by Antimirov and Mosses and compare it to
       
   264   other methods. Hopefully the algorithm can be tuned to be faster than other
       
   265   methods. The project can be carried out in almost all programming languages, but
       
   266   as usual functional programming languages such as Scala, ML and Haskell have an edge
       
   267   for this kind of problem.
       
   268   </p>
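  <p>
  To give a flavour of what an equivalence checker looks like, here is a small sketch based 
  on Antimirov's partial derivatives. It is a simplified relative of the approach, not the 
  method from the papers listed under Literature, and it is not tuned for speed. The tuple 
  encoding of regular expressions and all function names are assumptions made for this 
  example. Two expressions are equivalent if no reachable pair of derivative sets disagrees 
  on whether the empty string is accepted.</p>

```python
# Assumed encoding: ("0",) empty language, ("1",) empty string, ("c", ch) character,
# ("+", r, s) alternative, (".", r, s) sequence, ("*", r) star.


def nullable(r):
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag in ("0", "c"):
        return False
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # tag == "."


def seq(r, s):
    """Sequence constructor that drops a leading empty-string factor."""
    return s if r == ("1",) else (".", r, s)


def pder(c, r):
    """Partial derivatives of r w.r.t. c: a set whose union is the derivative."""
    tag = r[0]
    if tag in ("0", "1"):
        return frozenset()
    if tag == "c":
        return frozenset([("1",)]) if r[1] == c else frozenset()
    if tag == "+":
        return pder(c, r[1]) | pder(c, r[2])
    if tag == ".":
        left = frozenset(seq(p, r[2]) for p in pder(c, r[1]))
        return left | pder(c, r[2]) if nullable(r[1]) else left
    return frozenset(seq(p, r) for p in pder(c, r[1]))  # tag == "*"


def step(c, rs):
    """Partial derivatives of a whole set of expressions."""
    return frozenset().union(*(pder(c, r) for r in rs))


def equivalent(r1, r2, alphabet):
    """Explore all reachable pairs of derivative sets (finitely many)."""
    start = (frozenset([r1]), frozenset([r2]))
    seen, todo = {start}, [start]
    while todo:
        s1, s2 = todo.pop()
        if any(map(nullable, s1)) != any(map(nullable, s2)):
            return False  # one side accepts the word read so far, the other does not
        for c in alphabet:
            nxt = (step(c, s1), step(c, s2))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


a, b = ("c", "a"), ("c", "b")
# (a+b)* versus (a* b*)*
print(equivalent(("*", ("+", a, b)), ("*", (".", ("*", a), ("*", b))), "ab"))
```

  <p>
  The <code>alphabet</code> argument must contain every character that occurs in either 
  expression.</p>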
       
   269 
       
   270   <p>
       
   271   <B>Literature:</B>
       
   272   Central to this project are the papers <A HREF="http://www.dcc.fc.up.pt/~nam/publica/ijcs08.pdf">here</A>
       
   273   and <A HREF="http://www.dcc.fc.up.pt/~nam/publica/51480046.pdf">here</A>.
       
   274   Other methods have been described, for example, 
       
   275   <A HREF="http://www4.informatik.tu-muenchen.de/~krauss/papers/rexp.pdf">here</A>.
       
   276   A relatively complicated method, based on automata, is described 
       
   277   <A HREF="http://sardes.inrialpes.fr/~braibant/atbr/">here</A>.
       
   278   </p>  
       
   279 
       
   280 <li> <H4>[CU7] Game-Playing Engine for Five-In-A-Row on a Large Board</H4>
       
   281 
       
   282   <p>
       
   283   <B>Literature:</b>
       
   284   There is a web-page with various pointers to computer players
       
   285   <A HREF="http://webdocs.cs.ualberta.ca/~games/">here</A>. There are
       
   286   also some good books about computer players, for example:
       
   287   <table cellspacing="10">
       
   288   <tr><td><i>Artificial Intelligence: A Modern Approach</i> by S. Russell and P. Norvig, Prentice Hall, 2003 
       
   289   (a standard textbook about search strategies).
       
   290   </td></tr>
       
   291   <tr><td><i>Principles of Artificial Intelligence</i> by N. J. Nilsson, Springer Verlag, 1980 
       
   292   (a standard textbook about search strategies).
       
   293   </td></tr>
       
   294 
       
   295   <tr><td><i>Computer Game-Playing: Theory and Practice</i> by M. Bramer, Ellis Horwood Ltd, 1983
       
   296   (considers techniques used for programming a variety of games: Chess, Go, Scrabble, Billiards, 
       
   297    Othello, etc; includes theoretical issues about game searching).
       
   298   </td></tr>
       
   299   <tr><td><i>Chips Challenging Champions: Games, Computers and Artificial Intelligence</i> by
       
   300   J. Schaeffer and H.J. van den Herik, North Holland, 2002.
       
   301   </td></tr>
       
   302   <tr><td>
       
   303   <i>Artificial Intelligence for Games</i> by I. Millington and J. Funge, Morgan Kaufmann, 2009.
       
   304   </td></tr>
       
   305   <tr><td>
       
   306   <i>Computer Gamesmanship: The Complete Guide to Creating and Structuring Intelligent Games Programs</i> 
       
   307   by D.N.L. Levy, Simon and Schuster, 1983.
       
   308   </td></tr>
       
   309   </table>
       
   310   </p>
       
   311 
       
   312 <li><h4>[CU8] Webserver for a Revision Control System</h4>
       
   313 
       
   314   <p>
       
   315     Examples of modern revision control systems are
       
   316     <A HREF="http://mercurial.selenic.com/">mercurial</A> and
       
   317     <A HREF="http://git-scm.com/">git</A>.
       
   318   </p>
       
   319 
       
   320   <p>
       
   321     <b>Task:</b> Build a webserver for a revision control system 
       
   322     that allows user management. 
       
   323   </p>
       
   324 
       
   325 </ul>
       
   326 </TD>
       
   327 </TR>
       
   328 </TABLE>
       
   329 
       
   330 <P><!-- Created: Tue Mar  4 00:23:25 GMT 1997 -->
       
   331 <!-- hhmts start -->
       
   332 Last modified: Wed Jan 11 16:30:03 GMT 2012
       
   333 <!-- hhmts end -->
       
   334 <a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
       
   335 </BODY>
       
   336 </HTML>