projects.html
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Thu, 26 Sep 2013 12:29:25 +0100

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Christian Urban</TITLE>
<BASE HREF="http://www.inf.kcl.ac.uk/staff/urbanc/">
<script type="text/javascript" src="striper.js"></script>
<link rel="stylesheet" href="nominal.css">
</HEAD>
<BODY TEXT="#000000" 
      BGCOLOR="#4169E1" 
      LINK="#0000EF" 
      VLINK="#51188E" 
      ALINK="#FF0000"
      ONLOAD="striper('ul','striped','li','first,second')">



<TABLE WIDTH="100%" 
       BGCOLOR="#4169E1" 
       BORDER="0"   
       FRAME="border"  
       CELLPADDING="10"     
       CELLSPACING="2"
       RULES="all">

<TR>
<TD BGCOLOR="#FFFFFF" 
    WIDTH="75%" 
    VALIGN="TOP">

<H2>2011/12 MSc Individual Projects</H2>
<H4>Supervisor: Christian Urban</H4> 
<H4>Email: christian dot urban at kcl dot ac dot uk,  Office: Strand Building S6.30</H4>
<H4>If you are interested in a project, please send me an email and we can discuss the details. Please include
a short description of your programming skills and computer science background in your first email. 
I will also need your King's username in order to book the project for you. Thanks.</H4> 

<ul class="striped">
<li> <H4>[CU1] Implementing a SAT-Solver in a Functional Programming Language</H4>

  <p><B>Description:</b>  
  SAT-solvers search for satisfying assignments of Boolean formulas. Although this 
  is a computationally hard problem (<A HREF="http://en.wikipedia.org/wiki/NP-complete">NP-complete</A>), 
  modern SAT-solvers routinely solve Boolean formulas with 100,000 or more variables. 
  The application areas of SAT-solvers are manifold: they range from hardware verification to 
  Sudoku solvers (see <a href="http://anytime.cs.umass.edu/aimath06/proceedings/P34.pdf">here</a>). 
  Every two years there is a competition between the best SAT-solvers in the world.</p> 
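  <p>
  To give a flavour of what "searching for a satisfying assignment" means, here is a minimal 
  sketch of the classic DPLL backtracking search with unit propagation. It is only an illustration 
  and not MiniSat's algorithm, which adds clause learning, watched literals and restarts. 
  The encoding is an assumption of this sketch: a formula is a list of clauses, a clause is a list 
  of non-zero integers, and <code>-n</code> stands for the negation of variable <code>n</code>. 
  The function names <code>simplify</code> and <code>dpll</code> are hypothetical, and Python is 
  used here only for brevity.
  </p>

```python
def simplify(clauses, lit):
    """Assume lit is true: drop satisfied clauses, delete -lit from the others."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                      # clause is already satisfied
        result.append([l for l in clause if l != -lit])
    return result


def dpll(clauses, assignment=()):
    """Return a tuple of true literals satisfying all clauses, or None."""
    if not clauses:
        return assignment                 # no clauses left: satisfiable
    if any(len(c) == 0 for c in clauses):
        return None                       # empty clause: conflict
    # Unit propagation: a one-literal clause forces that literal.
    for clause in clauses:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplify(clauses, lit), assignment + (lit,))
    # Otherwise branch on the first literal of the first clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        model = dpll(simplify(clauses, choice), assignment + (choice,))
        if model is not None:
            return model
    return None


formula = [[1, 2], [-1, 2], [-2, 3]]      # (x1 or x2) and (not x1 or x2) and (not x2 or x3)
print(dpll(formula))                      # a satisfying assignment
print(dpll([[1], [-1]]))                  # None: x1 and not x1 is unsatisfiable
```

  <p>
  A solver of this shape takes exponential time in the worst case; most of the engineering in 
  a solver like MiniSat goes into choosing branching literals well and into learning from conflicts.
  </p>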

  <p>
  Most SAT-solvers are written in C. The aim of this project is to design and implement 
  a SAT-solver in a functional programming language (preferably 
  <A HREF="http://en.wikipedia.org/wiki/Standard_ML">ML</A>, but 
  <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>, 
  <A HREF="http://www.scala-lang.org/">Scala</A>,
  <A HREF="http://caml.inria.fr/">OCaml</A>, ... are also OK). The starting point is 
  the open source SAT-solver MiniSat (available <A HREF="http://minisat.se/Main.html">here</A>). 
  The long-term hope is that your implementation becomes part of the interactive theorem prover 
  <A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>. For this
  the SAT-solver needs to be implemented in ML.</p> 

  <p>
  <B>Tasks:</B> Understand MiniSat, design and code a SAT-solver in ML, 
  and empirically evaluate and tune your code.</p>

  <p>
  <B>Literature:</B> A good starting point for reading about SAT-solving is the handbook
  article <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
  MiniSat is explained <A HREF="http://minisat.se/downloads/MiniSat.pdf">here</A> and
  <A HREF="http://minisat.se/Papers.html">here</A>. The standard reference for ML is
  <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
  of this book for the duration of the project). The best free implementation of ML is 
  <A HREF="http://www.polyml.org/">PolyML</A>.
  </p>

<li> <H4>[CU2] A Compiler for System F</H4>

  <p><b>Description:</b> 
  <A HREF="http://en.wikipedia.org/wiki/System_F">System F</A> is a mini programming language 
  that is often used to study the theory behind programming languages, but is also used as 
  a core language of functional programming languages (for example 
  <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>). The language is small
  enough that a compiler to an idealised assembly language (preferably 
  <A HREF="http://en.wikipedia.org/wiki/Typed_assembly_language">TAL</A>) or to an abstract machine
  can be implemented in a reasonable amount of time.
  This has been explained in full detail in a PhD thesis by Louis-Julien Guillemette
  (available in English <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">here</A>). He used <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>
  as his implementation language, but other choices are possible.
  </p>

  <p>
  <b>Tasks:</b>
  Read the relevant literature and implement the various components of a compiler
  (parser, intermediate languages, simulator for the idealised assembly language).
  This project is for a good student with an interest in programming languages,
  who can also translate abstract ideas into code. If it is too difficult, the project can
  be easily scaled down to the 
  <A HREF="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus">simply-typed 
  lambda calculus</A> (which is simpler than
  System F) or to cover only some components of the compiler.
  </p> 

  <p>
  <B>Literature:</B>
  The <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">PhD thesis</A> by Louis-Julien Guillemette is required reading. A shorter
  paper about this subject is available <A HREF="http://www.iro.umontreal.ca/~monnier/icfp08.pdf">here</A>.
  A good starting point for TAL is <A HREF="http://www.cs.cornell.edu/talc/papers/tal-tr.pdf">here</A>.
  There is a lot of literature about compilers 
  (for example <A HREF="http://www.cs.princeton.edu/~appel/papers/cwc.html">this book</A> -
  I can lend you my copy for the duration of the project). A very good overview article
  about implementing compilers by 
  <A HREF="http://tratt.net/laurie/">Laurie Tratt</A> is 
  <A HREF="http://tratt.net/laurie/tech_articles/articles/how_difficult_is_it_to_write_a_compiler">here</A>.
  </p>

  <li> <H4>[CU3] Sorting Suffixes</H4>
  
  <p><b>Description:</b> Given a string, take all its suffixes, and sort them.
  This is often called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix 
  array sorting</A>. It sounds simple, but there are some difficulties. 
  The naive algorithm would generate all suffix strings and sort them
  using a standard sorting algorithm, for example 
  <A HREF="http://en.wikipedia.org/wiki/Quicksort">quicksort</A>. 
  The problem is that
  this algorithm is not optimal for suffix sorting: it does not take into account that the strings
  being sorted are all suffixes of the same string, and it also takes a quadratic amount of space. This is a 
  huge problem if you have to sort strings of several megabytes or even gigabytes,
  as often happens in biotech and DNA data mining. Suffix sorting is also a crucial operation for the 
  <A HREF="http://en.wikipedia.org/wiki/Burrows-Wheeler_transform">Burrows-Wheeler transform</A>
  on which the data compression algorithm of the popular 
  <A HREF="http://en.wikipedia.org/wiki/Bzip2">bzip2</A>
  program is based.
  </p>
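  <p>
  The naive algorithm described above can be sketched in a few lines. This is only an illustration 
  of the baseline that the project should improve on. The function names 
  <code>naive_suffix_array</code> and <code>bwt</code> are hypothetical, Python's built-in sort is 
  used in place of quicksort, and the Burrows-Wheeler sketch assumes that the sentinel character 
  <code>$</code> does not occur in the input and sorts before every other character.
  </p>

```python
def naive_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order."""
    # Materialising every suffix is what costs a quadratic amount of space.
    suffixes = [(text[i:], i) for i in range(len(text))]
    suffixes.sort()
    return [start for (_, start) in suffixes]


def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform, read off the suffix array of text + sentinel."""
    text = text + sentinel
    # The character before each sorted suffix; position 0 wraps to the sentinel.
    return "".join(text[start - 1] for start in naive_suffix_array(text))


print(naive_suffix_array("banana"))   # [5, 3, 1, 0, 4, 2]
print(bwt("banana"))                  # annb$aa
```

  <p>
  For a string of length n this builds n strings with a total length of about n*n/2 characters, 
  which is why the approach breaks down for inputs of several megabytes.
  </p>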

  <p>
  There are more efficient algorithms for suffix sorting, for example 
  <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">here</A> and 
  <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. 
  However, the most space-efficient algorithm for suffix sorting  
  (<A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">here</A>) 
  is horrendously complicated. Your task would be to understand it, and then implement it.
  </p>
  
  <p>
  <B>Tasks:</B>
  Start by reading the literature about suffix sorting. Then work through the
  12-page <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">paper</A> 
  explaining the horrendously complicated algorithm and implement it.
  Time permitting, the work can include an implementation of the Burrows-Wheeler 
  data compression. This project is for a good student who likes to study algorithms 
  in depth. The project can be carried out in almost all programming languages,
  including C, Java, Scala, ML, Haskell and so on.
  </p>

  <p>
  <B>Literature:</B> A good starting point for reading about suffix sorting is the 
  <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">book</A> by Crochemore. 
  Another good introduction is 
  <A HREF="http://people.unipmn.it/manzini/papers/esa02.pdf">here</A>, 
  which also gives good pointers to why efficient suffix sorting
  is practically relevant.
  Two simple algorithms are described
  <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. The main literature is the 12-page
  <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">article</A> about in-place
  suffix sorting. The Burrows-Wheeler data compression is described 
  <A HREF="http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf">here</A>.
  </p>

<li> <H4>[CU4] Simplification with Equivalence Relations in the Isabelle Theorem Prover</H4>
  <p>
  <B>Description:</B>
  In this project you have to extend the simplifier of the 
  <A HREF="http://isabelle.in.tum.de/">Isabelle theorem prover</A>.  
  The simplifier is an important reasoning tool of this theorem prover: it 
  replaces a term by another term that can be proved to be equal to it. However, 
  currently the simplifier only rewrites terms according to equalities. 
  Assuming &asymp; is an equivalence relation, the simplifier should also be able 
  to rewrite terms according to &asymp;. Since equivalence relations occur 
  frequently in automated reasoning, this extension would make the simplifier 
  more powerful and useful. The hope is that your code can go into the
  code base of Isabelle.
  </p>

  <p>
  <B>Tasks:</B>	
  Read the <A HREF="http://www.springerlink.com/content/x7041m1807738832/">paper</A>
  about rewriting with equivalence relations. Get familiar with parts of the 
  implementation of Isabelle (I will be of as much help as I can). Implement
  the extension. This project is suitable for a student with a bit of math background.
  It requires knowledge of the functional programming language ML, which
  however can be learned quickly provided you have already written code
  in another functional programming language.
  </p>

  <p>
  <B>Literature:</B> A good starting point for reading about rewriting modulo equivalences 
  is the paper <A HREF="http://www.springerlink.com/content/x7041m1807738832/">here</A>, 
  which uses the ACL2 theorem prover. The implementation of the Isabelle theorem
  prover is described in much detail in this 
  <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Cookbook/">programming tutorial</A>.
  The standard reference for ML is
  <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
  of this book for the duration of the project).
  </p>


<li><h4>[CU5] Lexing and Parsing with Derivatives</h4>

  <p>
  <B>Description:</B>
  Lexing and parsing are usually done using automated tools, like 
  <A HREF="http://en.wikipedia.org/wiki/Lex_programming_tool">lex</A> and 
  <A HREF="http://en.wikipedia.org/wiki/Yacc">yacc</A>. The problem 
  with them is that they "work when they work", but if they do not, then they are
  <A HREF="http://en.wikipedia.org/wiki/Black_box">black boxes</A>
  which are difficult to debug and change. They are clumsy enough 
  that Might and Darais wrote a paper titled 
  "<A HREF="http://arxiv.org/pdf/1010.5023v1">Yacc is dead</A>".</p>
 
  <p>
  There is a simple algorithm for regular expression matching (that is, lexing).
  This algorithm was introduced by 
  <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">Brzozowski</A> 
  in 1964. It is based on the notion of derivatives of regular expressions and 
  has proved <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">useful</A> 
  for practical lexing. In 2011 the notion of derivatives was extended by 
  <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">Might et al.</A>
  to <A HREF="http://en.wikipedia.org/wiki/Context-free_grammar">context-free grammars</A> 
  and parsing.
  </p>		      
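  <p>
  The idea behind Brzozowski's matcher can be sketched as follows: the derivative of a regular 
  expression r with respect to a character c matches a string s exactly when r matches c followed 
  by s. To match a whole string, take one derivative per character and then test whether the 
  result can match the empty string. The encoding of regular expressions as nested tuples and the 
  names <code>nullable</code>, <code>der</code> and <code>matches</code> are assumptions of this 
  sketch; it leaves out the simplification of intermediate expressions that a practical lexer needs.
  </p>

```python
# Regular expressions as nested tuples (a hypothetical encoding):
#   ("zero",)       matches nothing
#   ("one",)        matches only the empty string
#   ("chr", c)      matches the single character c
#   ("alt", r, s)   r or s
#   ("seq", r, s)   r followed by s
#   ("star", r)     zero or more repetitions of r
ZERO = ("zero",)
ONE = ("one",)


def nullable(r):
    """Can r match the empty string?"""
    tag = r[0]
    if tag in ("zero", "chr"):
        return False
    if tag in ("one", "star"):
        return True
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])          # seq


def der(c, r):
    """Brzozowski derivative of r with respect to the character c."""
    tag = r[0]
    if tag in ("zero", "one"):
        return ZERO
    if tag == "chr":
        return ONE if r[1] == c else ZERO
    if tag == "alt":
        return ("alt", der(c, r[1]), der(c, r[2]))
    if tag == "seq":
        left = ("seq", der(c, r[1]), r[2])
        # If the first part can be empty, c may also be consumed by the second.
        return ("alt", left, der(c, r[2])) if nullable(r[1]) else left
    return ("seq", der(c, r[1]), r)                   # star


def matches(r, s):
    """Take one derivative per character, then test nullability."""
    for c in s:
        r = der(c, r)
    return nullable(r)


# (a|b)*c
regex = ("seq", ("star", ("alt", ("chr", "a"), ("chr", "b"))), ("chr", "c"))
print(matches(regex, "abbac"))   # True
print(matches(regex, "abca"))    # False
```

  <p>
  Without simplification the derivatives grow with every character, so keeping them small is one 
  of the first things to tune in an implementation.
  </p>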
  
  <p>
  <B>Tasks:</B> Get familiar with the two algorithms and implement them. Regular
  expression matching is relatively simple; parsing with derivatives is the 
  harder part. Therefore you should empirically evaluate this part and
  tune your implementation. The project can be carried out in almost all programming 
  languages, including C, Java, Scala, ML, Haskell and so on.
  </p>

  <p>
  <B>Literature:</B> This 
  <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> 
  gives a modern introduction to derivative-based lexing. Derivative-based
  parsing is explained <A HREF="http://arxiv.org/pdf/1010.5023v1">here</A>
  and <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">here</A>.
  A proposal for derivative PEG-parsing is 
  <A HREF="http://fmota.eu/2011/01/07/PEG-derivatives.html">here</a>. The mailing
  list about PEGs is <A HREF="https://lists.csail.mit.edu/pipermail/peg/">here</A>.
  </p>  

<li> <H4>[CU6] Equivalence Checking of Regular Expressions using the Method by Antimirov and Mosses</H4>

  <p>
  <B>Description:</B> 
  A procedure for deciding the equivalence of regular expressions can be used
  to solve a number of problems in automated reasoning. It is therefore desirable to
  have a method for equivalence checking that is as fast as possible. A number of
  algorithms have been proposed in the past, but one based on a method
  by Antimirov and Mosses seems relatively simple and easy to implement.
  </p>		      
  
  <p>
  <B>Tasks:</B>
  The task is to implement the algorithm by Antimirov and Mosses and compare it to
  other methods. Hopefully the algorithm can be tuned to be faster than other
  methods. The project can be carried out in almost all programming languages, but
  as usual functional programming languages such as Scala, ML and Haskell have an edge
  for this kind of problem.
  </p>

  <p>
  <B>Literature:</B>
  Central to this project are the papers <A HREF="http://www.dcc.fc.up.pt/~nam/publica/ijcs08.pdf">here</A>
  and <A HREF="http://www.dcc.fc.up.pt/~nam/publica/51480046.pdf">here</A>.
  Other methods have been described, for example, 
  <A HREF="http://www4.informatik.tu-muenchen.de/~krauss/papers/rexp.pdf">here</A>.
  A relatively complicated method, based on automata, is described 
  <A HREF="http://sardes.inrialpes.fr/~braibant/atbr/">here</A>.
  </p>  

<li> <H4>[CU7] Game-Playing Engine for Five-In-A-Row on a Large Board</H4>

  <p>
  <B>Literature:</b>
  There is a web-page with various pointers to computer players
  <A HREF="http://webdocs.cs.ualberta.ca/~games/">here</A>. There are
  also some good books about computer players, for example:
  <table cellspacing="10">
  <tr><td><i>Artificial Intelligence: A Modern Approach</i> by S. Russell and P. Norvig, Prentice Hall, 2003 
  (a standard textbook about search strategies).
  </td></tr>
  <tr><td><i>Principles of Artificial Intelligence</i> by N. J. Nilsson, Springer Verlag, 1980 
  (a standard textbook about search strategies).
  </td></tr>

  <tr><td><i>Computer Game-Playing: Theory and Practice</i> by M. Bramer, Ellis Horwood Ltd, 1983
  (considers techniques used for programming a variety of games: Chess, Go, Scrabble, Billiards, 
   Othello, etc; includes theoretical issues about game searching).
  </td></tr>
  <tr><td><i>Chips Challenging Champions: Games, Computers and Artificial Intelligence</i> by
  J. Schaeffer and H.J. van den Herik, North Holland, 2002.
  </td></tr>
  <tr><td>
  <i>Artificial Intelligence for Games</i> by I. Millington and J. Funge, Morgan Kaufmann, 2009.
  </td></tr>
  <tr><td>
  <i>Computer Gamesmanship: The Complete Guide to Creating and Structuring Intelligent Games Programs</i> 
  by D.N.L. Levy, Simon and Schuster, 1983.
  </td></tr>
  </table>
  </p>

<li><h4>[CU8] Webserver for a Revision Control System</h4>

  <p>
    Examples of modern revision control systems are
    <A HREF="http://mercurial.selenic.com/">Mercurial</A> and
    <A HREF="http://git-scm.com/">Git</A>.
  </p>

  <p>
    <b>Task:</b> Build a webserver for a revision control system 
    that allows user management. 
  </p>

</ul>
</TD>
</TR>
</TABLE>

<P><!-- Created: Tue Mar  4 00:23:25 GMT 1997 -->
<!-- hhmts start -->
Last modified: Wed Jan 11 16:30:03 GMT 2012
<!-- hhmts end -->
<a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
</BODY>
</HTML>