public_html: comparison projects.html

-:a6c077ba850a
+:790a40046dc8
 VALIGN="TOP">
 <H2>2011/12 MSc Individual Projects</H2>
 <H4>Supervisor: Christian Urban</H4>
 <H4>Email: @kcl   Office: Strand Building S6.30</H4>
+<H4>If you are interested in a project, please send me an email and we can discuss the details.</H4>
 <ul class="striped">
 <li> <H4>[CU1] Implementing a SAT-Solver in a Functional Programming Language</H4>
 <p><B>Description:</b>
 <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>,
 <A HREF="http://www.scala-lang.org/">Scala</A>,
 <A HREF="http://caml.inria.fr/">OCaml</A>, ... are also OK). Starting point is
 the open source SAT-solver MiniSat (available <A HREF="http://minisat.se/Main.html">here</A>).
 The long-term hope is that your implementation becomes part of the interactive theorem prover
-<A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>.</p>
+<A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>. For this
+the SAT-solver needs to be implemented in ML.</p>
 <p>
 <B>Tasks:</B> Understand MiniSat, design and code a SAT-solver in ML,
 then empirically evaluate and tune your code.</p>
 <p>
 <B>Literature:</B> A good starting point for reading about SAT-solving is the handbook
-article in <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
+article <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
 MiniSat is explained <A HREF="http://minisat.se/downloads/MiniSat.pdf">here</A> and
 <A HREF="http://minisat.se/Papers.html">here</A>. The standard reference for ML is
 <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy
 of this book for the duration of the project). The best free implementation of ML is
 <A HREF="http://www.polyml.org/">PolyML</A>.
 enough to implement in a reasonable amount of time a compiler to an
 idealised assembly language (preferably
 <A HREF="http://en.wikipedia.org/wiki/Typed_assembly_language">TAL</A>) or an abstract machine.
 This has been explained in full detail in a PhD-thesis by  Louis-Julien Guillemette
 (available in English <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">here</A>). He used <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>
-as his implementation language. Other choices are of course possible.
+as his implementation language. Other choices are possible.
 </p>
 <p>
 <b>Tasks:</b>
 Read the relevant literature and implement the various components of a compiler
 (parser, intermediate languages, simulator for the idealised assembly language).
 This project is for a good student with an interest in programming languages,
 who can also translate abstract ideas into code. If it is too difficult, the project can
-easily be scaled back to the
+be easily scaled down to the
 <A HREF="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus">simply-typed
 lambda calculus</A> (which is simpler than
-System F) or only some components of the compiler are implemented.
+System F) or to cover only some components of the compiler.
 </p>
 <p>
 <B>Literature:</B>
 The <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">PhD-thesis</A> by  Louis-Julien Guillemette is required reading. A shorter
 </p>
 <li> <H4>[CU3] Sorting Suffixes</H4>
 <p><b>Description:</b> Given a string, take all its suffixes, and sort them.
-This is often also called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix
+This is often called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix
 array sorting</A>. It sounds simple, but there are some difficulties.
-The naive algorithm would generate all (suffix) strings and sort them
+The naive algorithm would generate all suffix strings and sort them
-using a standard sorting algorithm, for example quick-sort. Unfortunately,
+using a standard sorting algorithm, for example
-this algorithm is not optimal (it does not take into account that you sort
+<A HREF="http://en.wikipedia.org/wiki/Quicksort">quicksort</A>.
-suffixes) and it also takes an quadratic amount of space, which is a
+The problem is that
-problem if you have to sort strings of several Mega-Bytes or even Giga-Bytes
+this algorithm is not optimal for suffix sorting: it does not take into account that you sort
-(happens often in biotech DNA information.<p>
+suffixes and it also takes a quadratic amount of space. This is a
+huge problem if you have to sort strings of several Megabytes or even Gigabytes,
-Aim: the notion of index on a text is central in many methods for text
+as happens often in biotech and DNA data mining. Suffix sorting is also a crucial operation for the
-processing and for the management of textual databases. Suffix Arrays is one
+<A HREF="http://en.wikipedia.org/wiki/Burrows–Wheeler_transform">Burrows-Wheeler transform</A>
-of these methods based on the sorted list of suffixes of the input text. The
+on which the data compression algorithm of the popular
-project consists in implementing a linear-time sorting algorithm and other
+<A HREF="http://en.wikipedia.org/wiki/Bzip2">bzip2</A>
-elements related to Suffix Array construction and to Burrows-Wheeler text
+program is based.
-compression. Plan: study of the sorting problem in the literature starting
+</p>
-with the reference below. Implementation of the sorting algorithm and the
-LCP computation to obtain a Suffix Array construction software. Then, using
+<p>
-this work, implementation of the algorithms described in the second
+There are more efficient algorithms for suffix sorting, for example
-reference below. Deliverables: report, suffix sorting and associated
+<A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">here</A> and
-software and their documentation.
+<A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>.
+However, the most space-efficient algorithm for suffix sorting
-References:
+(<A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">here</A>)
-J. Kärkkäinen and P. Sanders,  Simple linear work suffix array construction, in ICALP'03, LNCS 2719, Spinger, 2003, pp. 943--955.
+is horrendously complicated. Your task would be to understand it, and then implement it.
-M. Crochemore, J. Désarménien and D. Perrin,  A note on the Burrows-Wheeler transformation, Theoret. Comput. Sci., 2005, to appear.
+</p>
-There is a horrendously complicated algorithm for solving these problems.
+<p>
-Your task would be to understand it, and then implement it.
+<B>Tasks:</B>
+Start by reading the literature about suffix sorting. Then work through the
-<li> <H5>[CU 4] Simplification modulo Equivalences in Isabelle</H5>
+12-page <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">paper</A>
-In this project you have to extend the simplifier of the Isabelle theorem
+explaining the horrendously complicated algorithm and implement it.
-prover.  Currently, the simplifier only rewrites terms according to equalities
+Time permitting, the work can include an implementation of the Burrows-Wheeler
-l = r. Provided ~ is an equivalence relation, the simplifier should also
+data compression. This project is for a good student who likes to study
-be able to rewrite terms according to equivalences of the form l ~ r.
+algorithms in depth. The project can be carried out in almost all programming languages,
-This project requires knowledge of the functional programming language ML.
+including C, Java, Scala, ML, Haskell and so on.
+</p>
-<li><h5>[CU 5] Parsing with Derivatives</h5>
+<p>
-Derivatives can be used to implement a regular expression matcher. In
+<B>Literature:</B> A good starting point for reading about suffix sorting is the
-this project you have to apply this technique to parsing. The starting
+<A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">book</A> by Crochemore. Two simple algorithms are also described
-point for this project is the paper "Yacc is Dead" by Matthew Might.
+<A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. The main literature is the 12-page
+<A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">article</A> about in-place
-<li> <H5>[CU 6] Equivalence Checking of Regular Expression using Antimirov's Method<H5>
+suffix sorting. The Burrows-Wheeler data compression is described
+<A HREF="http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf">here</A>.
+</p>
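<p>
As a rough illustration of the naive approach described above, here is a minimal
Python sketch. The language and the names <code>naive_suffix_array</code> and
<code>bwt</code> are chosen only for illustration; they are not part of the project
description. The sketch materialises every suffix, which is exactly the quadratic
space cost mentioned above, and then reads off the Burrows-Wheeler transform,
assuming a sentinel character <code>$</code> that does not occur in the input and
sorts before all other characters.
</p>

```python
def naive_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order."""
    # Building every suffix as its own string needs quadratic space overall.
    suffixes = [(text[i:], i) for i in range(len(text))]
    suffixes.sort()
    return [start for _, start in suffixes]


def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform computed from the (naive) suffix array.

    Assumption: sentinel does not occur in text and sorts before every
    character of text.
    """
    s = text + sentinel
    # The transform takes, for each sorted suffix, the character just before it.
    # For the suffix starting at 0 this wraps around to the sentinel (s[-1]).
    return "".join(s[i - 1] for i in naive_suffix_array(s))


print(naive_suffix_array("banana"))
print(bwt("banana"))
```

<p>
This is fine for short strings, but it is not the in-place algorithm the project
asks for; it only shows what output a correct implementation should reproduce.
</p>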
+<li> <H4>[CU4] Simplification with Equivalence Relations in the Isabelle Theorem Prover</H4>
+<p>
+<B>Description:</B>
+In this project you have to extend the simplifier of the
+<A HREF="http://isabelle.in.tum.de/">Isabelle theorem prover</A>.
+The simplifier is an important reasoning tool of this theorem prover: it
+replaces a term by another term that can be proved to be equal to it. However,
+currently the simplifier only rewrites terms according to equalities.
+Assuming &asymp; is an equivalence relation, the simplifier should also be able
+to rewrite terms according to &asymp;. Since equivalence relations occur
+frequently in automated reasoning, this extension would make the simplifier
+more powerful and useful. The hope is that your code can go into the
+code base of Isabelle.
+</p>
+<p>
+<B>Tasks:</B>
+Read the <A HREF="http://www.springerlink.com/content/x7041m1807738832/">paper</A>
+about rewriting with equivalence relations. Get familiar with parts of the
+implementation of Isabelle (I will be of as much help as I can). Implement
+the extension. This project is suitable for a student with a bit of math background.
+It requires knowledge of the functional programming language ML, which
+however can be learned quickly provided you have already written code
+in another functional programming language.
+</p>
+<p>
+<B>Literature:</B> A good starting point for reading about rewriting modulo equivalences
+is the paper <A HREF="http://www.springerlink.com/content/x7041m1807738832/">here</A>,
+which uses the ACL2 theorem prover. The implementation of the Isabelle theorem
+prover is described in much detail in this
+<A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Cookbook/">programming tutorial</A>.
+The standard reference for ML is
+<A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy
+of this book for the duration of the project).
+</p>
+<li><h4>[CU5] Lexing and Parsing with Derivatives</h4>
+<p>
+<B>Description:</B>
+Lexing and parsing are usually done using automated tools, like
+<A HREF="http://en.wikipedia.org/wiki/Lex_programming_tool">lex</A> and
+<A HREF="http://en.wikipedia.org/wiki/Yacc">yacc</A>. The problem
+with them is that they "work when they work", but if not, they are
+<A HREF="http://en.wikipedia.org/wiki/Black_box">black boxes</A>
+which are difficult to debug and change. They are really quite
+clumsy, to the point that Might wrote a paper titled
+"<A HREF="http://arxiv.org/pdf/1010.5023v1">Yacc is dead</A>".</p>
+<p>
+There is a simple algorithm for regular expression matching (that is, lexing).
+This algorithm was introduced by
+<A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">Brzozowski</A>
+in 1964. It is based on the notion of derivatives of regular expressions and
+has proved <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">useful</A>
+for practical lexing. Last year the notion of derivatives was extended by
+<A HREF="http://matt.might.net/papers/might2011derivatives.pdf">Might et al.</A>
+to <A HREF="http://en.wikipedia.org/wiki/Context-free_grammar">context-free grammars</A>
+and parsing.
+</p>
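<p>
To give an idea of how small the derivative-based matcher is, here is a minimal
Python sketch. The tuple encoding of regular expressions and the function names
(<code>nullable</code>, <code>deriv</code>, <code>matches</code>) are assumptions made
for this illustration and are not taken from the papers. No simplification of the
intermediate expressions is done, so they can grow quickly; a serious
implementation would need to address that.
</p>

```python
# Regular expressions as tagged tuples (an encoding chosen for this sketch).
EMPTY = ("empty",)   # matches no string
EPS = ("eps",)       # matches only the empty string

def char(c): return ("chr", c)
def alt(r, s): return ("alt", r, s)
def seq(r, s): return ("seq", r, s)
def star(r): return ("star", r)


def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # seq


def deriv(c, r):
    """Derivative of r with respect to character c: it matches exactly the
    strings w such that r matches c followed by w."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(deriv(c, r[1]), deriv(c, r[2]))
    if tag == "seq":
        left = seq(deriv(c, r[1]), r[2])
        # If the first part can be empty, c may also start the second part.
        return alt(left, deriv(c, r[2])) if nullable(r[1]) else left
    return seq(deriv(c, r[1]), r)  # star


def matches(r, s):
    """Take the derivative for each character, then test for nullability."""
    for c in s:
        r = deriv(c, r)
    return nullable(r)


# (a|b)*c
example = seq(star(alt(char("a"), char("b"))), char("c"))
print(matches(example, "abac"), matches(example, "abca"))
```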
+<p>
+<B>Tasks:</B> Get familiar with the two algorithms and implement them. Regular
+expression matching is relatively simple; parsing with derivatives is the
+harder part. Therefore you should empirically evaluate this part and
+tune your implementation. The project can be carried out in almost all programming
+languages, including C, Java, Scala, ML, Haskell and so on.
+</p>
+<p>
+<B>Literature:</B> This
+<A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A>
+gives a modern introduction to derivative-based lexing. Derivative-based
+parsing is explained <A HREF="http://arxiv.org/pdf/1010.5023v1">here</A>
+and <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">here</A>.
+</p>
+<li> <H4>[CU6] Equivalence Checking of Regular Expressions using the Method by Antimirov and Mosses</H4>
+<p>
+<B>Description:</B>
+Solving the problem of deciding equivalence of regular expressions can be used
+to decide a number of problems in automated reasoning. Therefore one would like to
+have a method for equivalence checking that is as fast as possible.
+</p>
+<p>
+<B>Tasks:</B>
+The task is to implement the algorithm by Antimirov and Mosses and compare it to
+other methods. Hopefully the algorithm can be tuned to be faster than other
+methods.
+</p>
+<p>
+<B>Literature:</B>
+Central to this project is the paper <A HREF="http://www.dcc.fc.up.pt/~nam/publica/ijcs08.pdf">here</A>.
+Other methods have been described, for example,
+<A HREF="http://www4.informatik.tu-muenchen.de/~krauss/papers/rexp.pdf">here</A>.
+</p>
 </ul>
 </TD>
 </TR>
 </TABLE>
 <P><!-- Created: Tue Mar  4 00:23:25 GMT 1997 -->
 <!-- hhmts start -->
-Last modified: Thu Dec  1 18:10:37 GMT 2011
+Last modified: Fri Dec  2 03:26:32 GMT 2011
 <!-- hhmts end -->
 <a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
 </BODY>
 </HTML>

changeset 44	790a40046dc8
parent 43	a6c077ba850a
child 47	e0d36fd0a8fd