<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Christian Urban</TITLE>
<BASE HREF="http://www.inf.kcl.ac.uk/staff/urbanc/">
<script type="text/javascript" src="striper.js"></script>
<link rel="stylesheet" href="nominal.css">
</HEAD>
<BODY TEXT="#000000"
BGCOLOR="#4169E1"
LINK="#0000EF"
VLINK="#51188E"
ALINK="#FF0000"
ONLOAD="striper('ul','striped','li','first,second')">
<TABLE WIDTH="100%"
BGCOLOR="#4169E1"
BORDER="0"
FRAME="border"
CELLPADDING="10"
CELLSPACING="2"
RULES="all">
<TR>
<TD BGCOLOR="#FFFFFF"
WIDTH="75%"
VALIGN="TOP">
<H2>2011/12 MSc Individual Projects</H2>
<H4>Supervisor: Christian Urban</H4>
<H4>Email: @kcl Office: Strand Building S6.30</H4>
<ul class="striped">
<li> <H4>[CU1] Implementing a SAT-Solver in a Functional Programming Language</H4>
<p><B>Description:</b>
SAT-solvers search for satisfying assignments of Boolean formulas. Although this
is a computationally hard problem (<A HREF="http://en.wikipedia.org/wiki/NP-complete">NP-complete</A>),
modern SAT-solvers routinely solve Boolean formulas with 100,000 or more variables.
SAT-solvers have many application areas, ranging from hardware verification to
Sudoku solvers. Every two years there is a competition of the best SAT-solvers in the world.</p>
<p>
Most SAT-solvers are written in C. The aim of this project is to design and implement
a SAT-solver in a functional programming language (preferably
<A HREF="http://en.wikipedia.org/wiki/Standard_ML">ML</A>, but
<A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>,
<A HREF="http://www.scala-lang.org/">Scala</A>,
<A HREF="http://caml.inria.fr/">OCaml</A>, ... are also OK). The starting point is
the open source SAT-solver MiniSat (available <A HREF="http://minisat.se/Main.html">here</A>).
The long-term hope is that your implementation becomes part of the interactive theorem prover
<A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>.</p>
<p>
<B>Tasks:</B> Understand MiniSat, design and code a SAT-solver in ML,
and empirically evaluate and tune your code.</p>
<p>
<B>Literature:</B> A good starting point for reading about SAT-solving is the handbook
article available <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
MiniSat is explained <A HREF="http://minisat.se/downloads/MiniSat.pdf">here</A> and
<A HREF="http://minisat.se/Papers.html">here</A>. The standard reference for ML is
<A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy
of this book for the duration of the project). The best free implementation of ML is
<A HREF="http://www.polyml.org/">Poly/ML</A>.
</p>
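<p>
To make the search problem concrete, here is a minimal sketch in Python of the classic
DPLL procedure (unit propagation plus case splitting). It is only an illustration and not
MiniSat's algorithm, which adds conflict-driven clause learning and many heuristics on top.
The clause encoding (lists of non-zero integers, where a negative number stands for a
negated variable) and the names <code>simplify</code> and <code>dpll</code> are assumptions
made for this example.</p>
<pre>
```python
def simplify(clauses, lit):
    """Assume literal lit is true: drop satisfied clauses, delete -lit elsewhere."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # this clause is already satisfied
        result.append([l for l in clause if l != -lit])
    return result


def dpll(clauses, assignment=None):
    """Return a dict {variable: bool} satisfying all clauses, or None if there is none."""
    assignment = dict(assignment or {})
    # Unit propagation: a clause with a single literal forces that literal.
    while True:
        if any(len(c) == 0 for c in clauses):
            return None  # an empty clause cannot be satisfied
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
    if not clauses:
        return assignment  # every clause is satisfied
    # Case split on the first literal of the first remaining clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        extended = dict(assignment)
        extended[abs(choice)] = choice > 0
        model = dpll(simplify(clauses, choice), extended)
        if model is not None:
            return model
    return None


# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
example_model = dpll([[1, 2], [-1, 2], [-2, 3]])
```
</pre>
<p>
Variables that never get forced or chosen are simply absent from the returned dictionary;
a real solver would also need a far better strategy for choosing the split literal.</p>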
<li> <H4>[CU2] A Compiler for System F</H4>
<p><b>Description:</b>
<A HREF="http://en.wikipedia.org/wiki/System_F">System F</A> is a mini programming language,
which is often used to study the theory behind programming languages, but is also used as
a core language of functional programming languages (for example
<A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>). The language is small
enough that a compiler targeting an idealised assembly language (preferably
<A HREF="http://en.wikipedia.org/wiki/Typed_assembly_language">TAL</A>) or an abstract machine
can be implemented in a reasonable amount of time.
This has been explained in full detail in a PhD thesis by Louis-Julien Guillemette
(available in English <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">here</A>). He used <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>
as his implementation language. Other choices are of course possible.
</p>
<p>
<b>Tasks:</b>
Read the relevant literature and implement the various components of a compiler
(parser, intermediate languages, simulator for the idealised assembly language).
This project is for a good student with an interest in programming languages,
who can also translate abstract ideas into code. If it is too difficult, the project can
easily be scaled back to the
<A HREF="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus">simply-typed
lambda calculus</A> (which is simpler than
System F), or to implementing only some components of the compiler.
</p>
<p>
<B>Literature:</B>
The <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">PhD-thesis</A> by Louis-Julien Guillemette is required reading. A shorter
paper about this subject is available <A HREF="http://www.iro.umontreal.ca/~monnier/icfp08.pdf">here</A>.
A good starting point for TAL is <A HREF="http://www.cs.cornell.edu/talc/papers/tal-tr.pdf">here</A>.
There is a lot of literature about compilers
(for example <A HREF="http://www.cs.princeton.edu/~appel/papers/cwc.html">this book</A> -
I can lend you my copy for the duration of the project).
</p>
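<p>
As a rough idea of the scaled-back variant, the following Python sketch type-checks terms of
the simply-typed lambda calculus. It is not taken from the thesis; the tuple encoding of types
and terms and the name <code>type_of</code> are assumptions made for this illustration, and a
System F checker would additionally need type abstraction and type application.</p>
<pre>
```python
# Types: "int" or ("fun", argument_type, result_type)
# Terms: ("lit", n), ("var", name), ("lam", name, arg_type, body), ("app", f, a)

def type_of(term, env=None):
    """Return the type of term under env (a dict from names to types).

    Raises TypeError if the term is ill-typed.
    """
    env = env or {}
    tag = term[0]
    if tag == "lit":
        return "int"
    if tag == "var":
        if term[1] not in env:
            raise TypeError("unbound variable " + term[1])
        return env[term[1]]
    if tag == "lam":
        _, name, arg_type, body = term
        # Check the body with the bound variable added to the environment.
        body_type = type_of(body, {**env, name: arg_type})
        return ("fun", arg_type, body_type)
    if tag == "app":
        fun_type = type_of(term[1], env)
        arg_type = type_of(term[2], env)
        # The function must have a function type whose argument type fits.
        if fun_type[0] != "fun" or fun_type[1] != arg_type:
            raise TypeError("ill-typed application")
        return fun_type[2]
    raise TypeError("unknown term")


identity = ("lam", "x", "int", ("var", "x"))
identity_type = type_of(identity)
```
</pre>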
<li> <H4>[CU3] Sorting Suffixes</H4>
<p><b>Description:</b> Given a string, take all its suffixes, and sort them.
This is often also called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix
array sorting</A>. It sounds simple, but there are some difficulties.
The naive algorithm would generate all suffixes as strings and sort them
using a standard sorting algorithm, for example quick-sort. Unfortunately,
this algorithm is not optimal (it does not take into account that the strings
being sorted are all suffixes of the same string) and it also takes a quadratic amount of space, which is a
problem if you have to sort strings of several megabytes or even gigabytes
(this happens often with DNA data in biotech).</p>
<p>
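The naive algorithm can be sketched in a few lines of Python (the name
<code>naive_suffix_array</code> is chosen for this illustration only). The sort key
materialises every suffix as a separate string, which is where the quadratic space
cost comes from.</p>
<pre>
```python
def naive_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order."""
    # text[i:] copies the suffix starting at position i, so all keys together
    # take space quadratic in the length of text.
    return sorted(range(len(text)), key=lambda i: text[i:])


# Suffixes of "banana" in sorted order: a, ana, anana, banana, na, nana
banana_suffixes = naive_suffix_array("banana")  # [5, 3, 1, 0, 4, 2]
```
</pre>
<p>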
<b>Aim:</b> The notion of an index on a text is central to many methods for text
processing and for the management of textual databases. Suffix arrays are one
such method, based on the sorted list of suffixes of the input text. The
project consists of implementing a linear-time sorting algorithm and other
elements related to suffix array construction and to Burrows-Wheeler text
compression.</p>
<p>
<b>Plan:</b> Study the sorting problem in the literature, starting
with the first reference below. Implement the sorting algorithm and the
longest-common-prefix (LCP) computation to obtain suffix array construction software. Then, using
this work, implement the algorithms described in the second
reference below.</p>
<p>
<b>Deliverables:</b> Report, suffix sorting and associated
software, and their documentation.</p>
<p>
<b>References:</b><br>
J. Kärkkäinen and P. Sanders, Simple linear work suffix array construction, in ICALP'03, LNCS 2719, Springer, 2003, pp. 943-955.<br>
M. Crochemore, J. Désarménien and D. Perrin, A note on the Burrows-Wheeler transformation, Theoret. Comput. Sci., 2005, to appear.</p>
<p>
There is a horrendously complicated algorithm for solving these problems.
Your task would be to understand it, and then implement it.</p>
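<p>
To show how suffix sorting connects to Burrows-Wheeler compression, here is a small Python
sketch that reads the transform off the sorted suffixes. It is not one of the algorithms from
the references: it uses a naive sort in place of a linear-time construction, and the
end-of-text sentinel <code>$</code> and the name <code>bwt</code> are assumptions made for
this example.</p>
<pre>
```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform of text, computed from its sorted suffixes."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Naive suffix sort, standing in for a linear-time construction.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The transform lists the character that precedes each sorted suffix
    # (s[-1] is the sentinel, which precedes the suffix at position 0).
    return "".join(s[i - 1] for i in suffix_array)


banana_bwt = bwt("banana")  # "annb$aa"
```
</pre>
<p>
Equal characters tend to end up next to each other in the output, which is what makes the
transformed text easier to compress.</p>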
<li> <H5>[CU4] Simplification modulo Equivalences in Isabelle</H5>
<p>
In this project you have to extend the simplifier of the Isabelle theorem
prover. Currently, the simplifier only rewrites terms according to equalities
l = r. Provided ~ is an equivalence relation, the simplifier should also
be able to rewrite terms according to equivalences of the form l ~ r.
This project requires knowledge of the functional programming language ML.</p>
<li><h5>[CU5] Parsing with Derivatives</h5>
<p>
Derivatives can be used to implement a regular expression matcher. In
this project you have to apply this technique to parsing. The starting
point for this project is the paper "Yacc is Dead" by Matthew Might and David Darais.</p>
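<p>
The regular expression matcher mentioned above can be sketched as follows in Python, using
Brzozowski derivatives. This covers only matching, not the parsing technique of the paper;
the tuple encoding of regular expressions and the names <code>nullable</code>,
<code>deriv</code> and <code>matches</code> are assumptions made for this illustration.</p>
<pre>
```python
# Regular expressions as tuples:
#   ("empty",) matches nothing, ("eps",) matches only the empty string,
#   ("chr", c), ("alt", r, s), ("seq", r, s), ("star", r)
EMPTY, EPS = ("empty",), ("eps",)


def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return False  # "empty" and "chr"


def deriv(r, c):
    """Derivative of r with respect to c: matches w exactly when r matches c + w."""
    tag = r[0]
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return ("alt", deriv(r[1], c), deriv(r[2], c))
    if tag == "seq":
        first = ("seq", deriv(r[1], c), r[2])
        if nullable(r[1]):
            # c may also be consumed by the second part.
            return ("alt", first, deriv(r[2], c))
        return first
    if tag == "star":
        return ("seq", deriv(r[1], c), r)
    return EMPTY  # "empty" and "eps"


def matches(r, s):
    """Take the derivative for each character of s, then test for nullability."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)


# (ab)*
ab_star = ("star", ("seq", ("chr", "a"), ("chr", "b")))
```
</pre>
<p>
Without simplifying the intermediate expressions, the derivatives grow with every character,
so a practical matcher would normalise them along the way.</p>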
<li> <H5>[CU6] Equivalence Checking of Regular Expressions using Antimirov's Method</H5>
</ul>
</TD>
</TR>
</TABLE>
<P><!-- Created: Tue Mar 4 00:23:25 GMT 1997 -->
<!-- hhmts start -->
Last modified: Thu Dec 1 18:10:37 GMT 2011
<!-- hhmts end -->
<a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
</BODY>
</HTML>