projects.html
author Christian Urban <urbanc@in.tum.de>
Sat, 10 Dec 2011 16:07:00 +0000
changeset 50 37e9d1eb8004
parent 48 04797dfb3198
child 51 f14fa1742f36
permissions -rw-r--r--
tuned
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HEAD>
<TITLE>Christian Urban</TITLE>
<BASE HREF="http://www.inf.kcl.ac.uk/staff/urbanc/">
<script type="text/javascript" src="striper.js"></script>
<link rel="stylesheet" href="nominal.css">
</HEAD>
<BODY TEXT="#000000" 
      BGCOLOR="#4169E1" 
      LINK="#0000EF" 
      VLINK="#51188E" 
      ALINK="#FF0000"
      ONLOAD="striper('ul','striped','li','first,second')">
<TABLE WIDTH="100%" 
       BGCOLOR="#4169E1" 
       BORDER="0"   
       FRAME="border"  
       CELLPADDING="10"     
       CELLSPACING="2"
       RULES="all">
<TR>
<TD BGCOLOR="#FFFFFF" 
    WIDTH="75%" 
    VALIGN="TOP">
<H2>2011/12 MSc Individual Projects</H2>
<H4>Supervisor: Christian Urban</H4> 
<H4>Email: christian dot urban at kcl dot ac dot uk,  Office: Strand Building S6.30</H4>
<H4>If you are interested in a project, please send me an email and we can discuss details. Please include
a short description of your programming skills and computer science background in your first email. 
I will also need your King's username in order to book the project for you. Thanks.</H4> 
<ul class="striped">
<li> <H4>[CU1] Implementing a SAT-Solver in a Functional Programming Language</H4>
  <p><B>Description:</b>  
  SAT-solvers search for satisfying assignments of boolean formulas. Although this 
  is a computationally hard problem (<A HREF="http://en.wikipedia.org/wiki/NP-complete">NP-complete</A>), 
  modern SAT-solvers routinely solve boolean formulas with 100,000 and more variables. 
  Application areas of SAT-solvers are manifold: they range from hardware verification to 
  Sudoku solvers. Every 2 years there is a competition of the best SAT-solvers in the world.</p> 
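  <p>
  To make the search problem concrete, the following is a minimal sketch of the classic DPLL 
  procedure (unit propagation plus case splitting), written in Python for brevity. It is only an 
  illustration and not how MiniSat works internally; MiniSat is a conflict-driven clause-learning 
  solver with many further optimisations. The clause encoding (lists of non-zero integers, where 
  <code>-v</code> stands for the negation of variable <code>v</code>) and the names 
  <code>simplify</code> and <code>dpll</code> are assumptions made for this example.</p>

```python
from typing import Dict, List, Optional

Clause = List[int]      # a literal is a non-zero int; -v means "not v"
Formula = List[Clause]  # a formula in CNF: a conjunction of clauses


def simplify(formula: Formula, literal: int) -> Optional[Formula]:
    """Assume `literal` is true; return the reduced formula, or None on a conflict."""
    reduced: Formula = []
    for clause in formula:
        if literal in clause:
            continue  # the clause is satisfied and can be dropped
        rest = [lit for lit in clause if lit != -literal]
        if not rest:
            return None  # every literal of the clause is false: conflict
        reduced.append(rest)
    return reduced


def dpll(formula: Formula,
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(len(clause) == 0 for clause in formula):
        return None  # an empty clause in the input can never be satisfied

    # Unit propagation: a clause with a single literal forces that literal.
    while True:
        unit = next((clause[0] for clause in formula if len(clause) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        formula = simplify(formula, unit)
        if formula is None:
            return None

    if not formula:
        return assignment  # no clauses left: everything is satisfied

    # Case split on the first literal of the first remaining clause.
    literal = formula[0][0]
    for choice in (literal, -literal):
        reduced = simplify(formula, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None
```

  <p>
  For example, <code>dpll([[1, 2], [-1, 3], [-2, -3]])</code> returns an assignment that makes 
  every clause true, while <code>dpll([[1], [-1]])</code> returns <code>None</code>. Variables 
  that never needed a value are left out of the returned dictionary.</p>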
  <p>
  Most SAT-solvers are written in C. The aim of this project is to design and implement 
  a SAT-solver in a functional programming language (preferably 
  <A HREF="http://en.wikipedia.org/wiki/Standard_ML">ML</A>, but 
  <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>, 
  <A HREF="http://www.scala-lang.org/">Scala</A>,
  <A HREF="http://caml.inria.fr/">OCaml</A>, ... are also OK). The starting point is 
  the open source SAT-solver MiniSat (available <A HREF="http://minisat.se/Main.html">here</A>). 
  The long-term hope is that your implementation becomes part of the interactive theorem prover 
  <A HREF="http://www.cl.cam.ac.uk/research/hvg/isabelle/">Isabelle</A>. For this,
  the SAT-solver needs to be implemented in ML.</p> 
  <p>
  <B>Tasks:</B> Understand MiniSat, design and code a SAT-solver in ML, 
  and empirically evaluate and tune your code.</p>
  <p>
  <B>Literature:</B> A good starting point for reading about SAT-solving is the handbook
  article <A HREF="http://www.cs.cornell.edu/gomes/papers/SATSolvers-KR-Handbook.pdf">here</A>.
  MiniSat is explained <A HREF="http://minisat.se/downloads/MiniSat.pdf">here</A> and
  <A HREF="http://minisat.se/Papers.html">here</A>. The standard reference for ML is
  <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
  of this book for the duration of the project). The best free implementation of ML is 
  <A HREF="http://www.polyml.org/">PolyML</A>.
  </p>
<li> <H4>[CU2] A Compiler for System F</H4>
  <p><b>Description:</b> 
  <A HREF="http://en.wikipedia.org/wiki/System_F">System F</A> is a mini programming language, 
  which is often used to study the theory behind programming languages, but is also used as 
  a core-language of functional programming languages (for example 
  <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>). The language is small
  enough that a compiler to an
  idealised assembly language (preferably 
  <A HREF="http://en.wikipedia.org/wiki/Typed_assembly_language">TAL</A>) or an abstract machine
  can be implemented in a reasonable amount of time.
  This has been explained in full detail in a PhD thesis by Louis-Julien Guillemette
  (available in English <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">here</A>). He used <A HREF="http://haskell.org/haskellwiki/Haskell">Haskell</A>
  as his implementation language. Other choices are possible.
  </p>
  <p>
  <b>Tasks:</b>
  Read the relevant literature and implement the various components of a compiler
  (parser, intermediate languages, simulator for the idealised assembly language).
  This project is for a good student with an interest in programming languages,
  who can also translate abstract ideas into code. If it is too difficult, the project can
  be easily scaled down to the 
  <A HREF="http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus">simply-typed 
  lambda calculus</A> (which is simpler than
  System F) or to cover only some components of the compiler.
  </p> 
  <p>
  <B>Literature:</B>
  The <A HREF="https://papyrus.bib.umontreal.ca/jspui/bitstream/1866/3454/6/Guillemette_Louis-Julien_2009_these.pdf">PhD thesis</A> by Louis-Julien Guillemette is required reading. A shorter
  paper about this subject is available <A HREF="http://www.iro.umontreal.ca/~monnier/icfp08.pdf">here</A>.
  A good starting point for TAL is <A HREF="http://www.cs.cornell.edu/talc/papers/tal-tr.pdf">here</A>.
  There is a lot of literature about compilers 
  (for example <A HREF="http://www.cs.princeton.edu/~appel/papers/cwc.html">this book</A> -
  I can lend you my copy for the duration of the project).
  </p>
  <li> <H4>[CU3] Sorting Suffixes</H4>
  <p><b>Description:</b> Given a string, take all its suffixes, and sort them.
  This is often called <A HREF="http://en.wikipedia.org/wiki/Suffix_array">suffix 
  array sorting</A>. It sounds simple, but there are some difficulties. 
  The naive algorithm would generate all suffix strings and sort them
  using a standard sorting algorithm, for example 
  <A HREF="http://en.wikipedia.org/wiki/Quicksort">quicksort</A>. 
  The problem is that
  this algorithm is not optimal for suffix sorting: it does not take into account that you sort
  suffixes and it also takes a quadratic amount of space. This is a 
  huge problem if you have to sort strings of several megabytes or even gigabytes,
  as often happens in biotech and DNA data mining. Suffix sorting is also a crucial operation for the 
  <A HREF="http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform">Burrows-Wheeler transform</A>
  on which the data compression algorithm of the popular 
  <A HREF="http://en.wikipedia.org/wiki/Bzip2">bzip2</A>
  program is based.
  </p>
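  <p>
  The naive algorithm described above can be sketched in a few lines of Python. This is only an 
  illustration of the baseline that the project is meant to improve on: the function name 
  <code>naive_suffix_array</code> is an assumption for this example, and Python's built-in sort 
  stands in for quicksort.</p>

```python
from typing import List


def naive_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text`, in sorted suffix order."""
    # Every suffix is copied into its own string, so the total space used is
    # quadratic in len(text). This is the weakness the description points out.
    suffixes = [(text[start:], start) for start in range(len(text))]
    suffixes.sort()  # plain comparison sort; ignores that the strings are suffixes
    return [start for _, start in suffixes]


print(naive_suffix_array("banana"))  # prints [5, 3, 1, 0, 4, 2]
```

  <p>
  For <code>"banana"</code> the sorted suffixes are <code>a</code>, <code>ana</code>, 
  <code>anana</code>, <code>banana</code>, <code>na</code> and <code>nana</code>, which start at 
  positions 5, 3, 1, 0, 4 and 2.</p>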
  <p>
  There are more efficient algorithms for suffix sorting, for example 
  <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">here</A> and 
  <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. 
  However, the most space-efficient algorithm for suffix sorting  
  (<A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">here</A>) 
  is horrendously complicated. Your task would be to understand it, and then implement it.
  </p>
  <p>
  <B>Tasks:</B>
  Start by reading the literature about suffix sorting. Then work through the
  12-page <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">paper</A> 
  explaining the horrendously complicated algorithm and implement it.
  Time permitting, the work can include an implementation of Burrows-Wheeler 
  data compression. This project is for a good student who likes to study algorithms 
  in depth. The project can be carried out in almost all programming languages,
  including C, Java, Scala, ML, Haskell and so on.
  </p>
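  <p>
  To show how suffix sorting feeds into the Burrows-Wheeler transform, here is a small Python 
  sketch. It is a hedged illustration only: it uses naive suffix sorting rather than the 
  in-place algorithm from the paper, and the function names <code>bwt</code> and 
  <code>inverse_bwt</code> as well as the choice of <code>$</code> as end marker are assumptions 
  for this example. The transform on its own does not compress anything; it tends to group equal 
  characters together, which later compression stages can exploit.</p>

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform of `text`, computed from its sorted suffixes."""
    if sentinel in text:
        raise ValueError("the sentinel must not occur in the text")
    if any(ch <= sentinel for ch in text):
        raise ValueError("the sentinel must sort before every character of the text")
    marked = text + sentinel
    # Naive suffix sorting; this is the step an efficient algorithm would replace.
    order = sorted(range(len(marked)), key=lambda start: marked[start:])
    # The output takes, for each sorted suffix, the character just before it
    # (index -1 wraps around to the sentinel for the suffix starting at 0).
    return "".join(marked[start - 1] for start in order)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Undo `bwt` with the simple (and slow) repeated-sorting method."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        # Prepend the transformed characters and re-sort to rebuild all rotations.
        table = sorted(ch + row for ch, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))               # prints annb$aa
print(inverse_bwt(bwt("banana")))  # prints banana
```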
  <p>
  <B>Literature:</B> A good starting point for reading about suffix sorting is the 
  <A HREF="http://books.google.co.uk/books?id=Pn1sHToYf9oC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false">book</A> by Crochemore. Two simple algorithms are also described
  <A HREF="http://ls11-www.cs.uni-dortmund.de/people/rahmann/teaching/ss2008/AlgorithmenAufSequenzen/09-walk-bwt.pdf">here</A>. The main literature is the 12-page
  <A HREF="http://www.cs.rutgers.edu/~muthu/fm072.pdf">article</A> about in-place
  suffix sorting. The Burrows-Wheeler data compression is described 
  <A HREF="http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf">here</A>.
  </p>
<li> <H4>[CU4] Simplification with Equivalence Relations in the Isabelle Theorem Prover</H4>
  <p>
  <B>Description:</B>
  In this project you have to extend the simplifier of the 
  <A HREF="http://isabelle.in.tum.de/">Isabelle theorem prover</A>.  
  The simplifier is an important reasoning tool of this theorem prover: it 
  replaces a term by another term that can be proved to be equal to it. However, 
  currently the simplifier only rewrites terms according to equalities. 
  Assuming &asymp; is an equivalence relation, the simplifier should also be able 
  to rewrite terms according to &asymp;. Since equivalence relations occur 
  frequently in automated reasoning, this extension would make the simplifier 
  more powerful and useful. The hope is that your code can go into the
  code base of Isabelle.
  </p>
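  <p>
  As a rough picture of what rewriting with equalities means, the following Python sketch 
  simplifies small terms bottom-up with two rewrite rules. It is a toy model only and has 
  nothing to do with Isabelle's actual implementation (which is written in ML): the term 
  encoding as nested tuples and the names <code>add_zero</code>, <code>mul_one</code> and 
  <code>simplify</code> are assumptions for this example. Rewriting with an equivalence relation 
  &asymp; instead of equality would additionally need to check in which contexts a subterm may 
  be replaced, which this sketch does not attempt.</p>

```python
from typing import Callable, Optional, Sequence, Tuple, Union

# A term is a variable name, a number, or an operator applied to arguments,
# for example ("+", ("*", "x", 1), 0) for (x * 1) + 0.
Term = Union[str, int, Tuple]
Rule = Callable[[Term], Optional[Term]]


def add_zero(term: Term) -> Optional[Term]:
    """Rewrite rule for the equality  t + 0 = t."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "+" and term[2] == 0:
        return term[1]
    return None


def mul_one(term: Term) -> Optional[Term]:
    """Rewrite rule for the equality  t * 1 = t."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "*" and term[2] == 1:
        return term[1]
    return None


def simplify(term: Term, rules: Sequence[Rule]) -> Term:
    """Rewrite `term` bottom-up until no rule applies any more."""
    if isinstance(term, tuple):
        # Simplify the arguments first, keeping the operator in place.
        term = (term[0],) + tuple(simplify(arg, rules) for arg in term[1:])
    for rule in rules:
        rewritten = rule(term)
        if rewritten is not None:
            return simplify(rewritten, rules)  # the result may be reducible again
    return term


print(simplify(("+", ("*", "x", 1), 0), [add_zero, mul_one]))  # prints x
```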
  <p>
  <B>Tasks:</B>
  Read the <A HREF="http://www.springerlink.com/content/x7041m1807738832/">paper</A>
  about rewriting with equivalence relations. Get familiar with parts of the 
  implementation of Isabelle (I will be of as much help as I can). Implement
  the extension. This project is suitable for a student with a bit of math background.
  It requires knowledge of the functional programming language ML, which,
  however, can be learned quickly provided you have already written code
  in another functional programming language.
  </p>
  <p>
  <B>Literature:</B> A good starting point for reading about rewriting modulo equivalences 
  is the paper <A HREF="http://www.springerlink.com/content/x7041m1807738832/">here</A>, 
  which uses the ACL2 theorem prover. The implementation of the Isabelle theorem
  prover is described in much detail in this 
  <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Cookbook/">programming tutorial</A>.
  The standard reference for ML is
  <A HREF="http://www.cl.cam.ac.uk/~lp15/MLbook/">here</A> (I can lend you my copy 
  of this book for the duration of the project).
  </p>
<li><h4>[CU5] Lexing and Parsing with Derivatives</h4>
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   198
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   199
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   200
  <B>Description:</B>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   201
  Lexing and parsing are usually done using automated tools, like 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   202
  <A HREF="http://en.wikipedia.org/wiki/Lex_programming_tool">lex</A> and 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   203
  <A HREF="http://en.wikipedia.org/wiki/Yacc">yacc</A>. The problem 
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   204
  with them is that they "work when they work", but if they do not, then they are
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   205
  <A HREF="http://en.wikipedia.org/wiki/Black_box">black boxes</A>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   206
  which are difficult to debug and change. They are
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   207
  clumsy enough that Might and Darais wrote a paper titled 
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   208
  "<A HREF="http://arxiv.org/pdf/1010.5023v1">Yacc is dead</A>".</p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   209
 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   210
  <p>
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   211
  There is a simple algorithm for regular expression matching (that is, lexing).
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   212
  This algorithm was introduced by 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   213
  <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">Brzozowski</A> 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   214
  in 1964. It is based on the notion of derivatives of regular expressions and 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   215
  has proved <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">useful</A> 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   216
  for practical lexing. Last year the notion of derivatives was extended by 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   217
  <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">Might et al.</A>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   218
  to <A HREF="http://en.wikipedia.org/wiki/Context-free_grammar">context-free grammars</A> 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   219
  and parsing.
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   220
  </p>		      
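  <p>
  As a rough illustration of the idea behind derivative-based matching, here is a 
  minimal sketch in Python. The class and function names (<code>Rexp</code>, 
  <code>nullable</code>, <code>der</code>, <code>matches</code>) are chosen for this 
  example only and are not taken from the papers; the sketch performs no 
  simplification of intermediate expressions, so it is meant to show the principle 
  rather than to be fast.
  </p>

```python
from dataclasses import dataclass


class Rexp:
    """Base class for regular expressions (names are illustrative only)."""


@dataclass(frozen=True)
class Zero(Rexp):      # matches no string at all
    pass


@dataclass(frozen=True)
class One(Rexp):       # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr(Rexp):       # matches the single character c
    c: str


@dataclass(frozen=True)
class Alt(Rexp):       # r1 + r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Seq(Rexp):       # r1 followed by r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Star(Rexp):      # zero or more repetitions of r
    r: Rexp


def nullable(r):
    """Return True if r can match the empty string."""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, (Zero, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    raise TypeError("unknown regular expression")


def der(c, r):
    """Derivative of r with respect to c: what remains to be matched after c."""
    if isinstance(r, (Zero, One)):
        return Zero()
    if isinstance(r, Chr):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return Alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        left = Seq(der(c, r.r1), r.r2)
        # if r1 can be empty, c may also be consumed by r2
        return Alt(left, der(c, r.r2)) if nullable(r.r1) else left
    if isinstance(r, Star):
        return Seq(der(c, r.r), r)
    raise TypeError("unknown regular expression")


def matches(r, s):
    """Take the derivative for each character, then test for the empty string."""
    for c in s:
        r = der(c, r)
    return nullable(r)


# Example: (a + bc)* matches "abca" but not "ab"
example = Star(Alt(Chr("a"), Seq(Chr("b"), Chr("c"))))
```

  <p>
  With this sketch, <code>matches(example, "abca")</code> evaluates to true and 
  <code>matches(example, "ab")</code> to false. A practical implementation would 
  simplify the derivatives as it goes, since otherwise they can grow with every 
  character read.
  </p>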
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   221
  
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   222
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   223
  <B>Tasks:</B> Get familiar with the two algorithms and implement them. Regular
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   224
  expression matching is relatively simple; parsing with derivatives is the 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   225
  harder part. Therefore you should empirically evaluate this part and
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   226
  tune your implementation. The project can be carried out in almost all programming 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   227
  languages, including C, Java, Scala, ML, Haskell and so on.
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   228
  </p>
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   229
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   230
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   231
  <B>Literature:</B> This 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   232
  <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   233
  gives a modern introduction to derivative-based lexing. Derivative-based
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   234
  parsing is explained <A HREF="http://arxiv.org/pdf/1010.5023v1">here</A>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   235
  and <A HREF="http://matt.might.net/papers/might2011derivatives.pdf">here</A>.
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   236
  </p>  
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   237
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   238
<li> <H4>[CU6] Equivalence Checking of Regular Expressions using the Method by Antimirov and Mosses</H4>
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   239
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   240
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   241
  <B>Description:</B> 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   242
  Solving the problem of deciding equivalence of regular expressions can be used
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   243
  to decide a number of problems in automated reasoning. Therefore one would like to
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   244
  have a method for equivalence checking that is as fast as possible. There have
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   245
  been a number of algorithms proposed in the past, but one based on a method
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   246
  by Antimirov and Mosses seems relatively simple and easy to implement.
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   247
  </p>		      
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   248
  
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   249
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   250
  <B>Tasks:</B>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   251
  The task is to implement the algorithm by Antimirov and Mosses and compare it to
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   252
  other methods. Hopefully the algorithm can be tuned to be faster than other
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   253
  methods. The project can be carried out in almost all programming languages, but
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   254
  as usual functional programming languages such as Scala, ML and Haskell have an edge
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   255
  for this kind of problem.
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   256
  </p>
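  <p>
  One possible baseline for such a comparison is sketched below in Python. Note that 
  this is <em>not</em> the algorithm by Antimirov and Mosses: it is a simpler check 
  that explores pairs of Brzozowski derivatives and compares whether both sides 
  accept the empty string. The tuple encoding and all names (<code>alt</code>, 
  <code>seq</code>, <code>star</code>, <code>equivalent</code>) are assumptions made 
  for this example. The constructors normalise alternatives (flattening, removing 
  duplicates, ignoring order), which is what keeps the set of derivatives finite.
  </p>

```python
# Regular expressions as hashable tuples; names are illustrative only.
ZERO = ("zero",)   # matches nothing
ONE = ("one",)     # matches only the empty string


def chr_(c):
    return ("chr", c)


def alt(r1, r2):
    """Alternative, normalised: nested alternatives are flattened,
    ZERO is dropped, duplicates and ordering are ignored."""
    parts = set()
    for r in (r1, r2):
        if r[0] == "alt":
            parts |= r[1]
        elif r != ZERO:
            parts.add(r)
    if not parts:
        return ZERO
    if len(parts) == 1:
        return next(iter(parts))
    return ("alt", frozenset(parts))


def seq(r1, r2):
    """Sequence, with ZERO and ONE simplified away."""
    if r1 == ZERO or r2 == ZERO:
        return ZERO
    if r1 == ONE:
        return r2
    if r2 == ONE:
        return r1
    return ("seq", r1, r2)


def star(r):
    if r in (ZERO, ONE):
        return ONE
    if r[0] == "star":
        return r
    return ("star", r)


def nullable(r):
    tag = r[0]
    if tag in ("one", "star"):
        return True
    if tag in ("zero", "chr"):
        return False
    if tag == "alt":
        return any(nullable(x) for x in r[1])
    return nullable(r[1]) and nullable(r[2])   # seq


def der(c, r):
    tag = r[0]
    if tag in ("zero", "one"):
        return ZERO
    if tag == "chr":
        return ONE if r[1] == c else ZERO
    if tag == "alt":
        out = ZERO
        for x in r[1]:
            out = alt(out, der(c, x))
        return out
    if tag == "seq":
        left = seq(der(c, r[1]), r[2])
        return alt(left, der(c, r[2])) if nullable(r[1]) else left
    return seq(der(c, r[1]), r)                # star


def equivalent(r1, r2, alphabet):
    """Explore all reachable pairs of derivatives; the expressions differ
    as soon as one side accepts the empty string and the other does not."""
    seen = set()
    todo = [(r1, r2)]
    while todo:
        pair = todo.pop()
        if pair in seen:
            continue
        seen.add(pair)
        a, b = pair
        if nullable(a) != nullable(b):
            return False
        for c in alphabet:
            todo.append((der(c, a), der(c, b)))
    return True


# Example: (a + b)* and (a* b*)* describe the same language
lhs = star(alt(chr_("a"), chr_("b")))
rhs = star(seq(star(chr_("a")), star(chr_("b"))))
```

  <p>
  Here <code>equivalent(lhs, rhs, "ab")</code> evaluates to true, while comparing 
  <code>star(chr_("a"))</code> with <code>seq(chr_("a"), star(chr_("a")))</code> 
  gives false, because only the first accepts the empty string.
  </p>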
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   257
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   258
  <p>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   259
  <B>Literature:</B>
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   260
  Central to this project is the paper <A HREF="http://www.dcc.fc.up.pt/~nam/publica/ijcs08.pdf">here</A>.
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   261
  Other methods have been described, for example, 
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   262
  <A HREF="http://www4.informatik.tu-muenchen.de/~krauss/papers/rexp.pdf">here</A>.
47
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   263
  A relatively complicated method, based on automata, is described 
Christian Urban <urbanc@in.tum.de>
parents: 44
diff changeset
   264
  <A HREF="http://sardes.inrialpes.fr/~braibant/atbr/">here</A>.
44
790a40046dc8 improved
Christian Urban <urbanc@in.tum.de>
parents: 43
diff changeset
   265
  </p>  
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   266
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   267
</ul>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   268
</TD>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   269
</TR>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   270
</TABLE>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   271
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   272
<P><!-- Created: Tue Mar  4 00:23:25 GMT 1997 -->
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   273
<!-- hhmts start -->
50
Christian Urban <urbanc@in.tum.de>
parents: 48
diff changeset
   274
Last modified: Sat Dec 10 16:06:44 GMT 2011
43
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   275
<!-- hhmts end -->
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   276
<a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   277
</BODY>
a6c077ba850a added initial version of projects
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   278
</HTML>