msc-projects-12.html
changeset 164 d99c0026ebaf
parent 163 622c47857b69
child 165 89c9fcb211a8
163:622c47857b69 164:d99c0026ebaf
    45 <li> <H4>[CU1] Regular Expression Matching and Partial Derivatives</H4>
    45 <li> <H4>[CU1] Regular Expression Matching and Partial Derivatives</H4>
    46 
    46 
    47   <p>
    47   <p>
    48   <B>Description:</b>  
    48   <B>Description:</b>  
    49   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A> 
    49   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A> 
    50   are extremely useful for many text-processing tasks...finding patterns in texts,
    50   are extremely useful for many text-processing tasks such as finding patterns in texts,
    51   lexing programs, syntax highlighting and so on. Given that regular expressions were
    51   lexing programs, syntax highlighting and so on. Given that regular expressions were
    52   introduced in 1950 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, you might think 
    52   introduced in 1950 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, you might think 
    53   regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still
    53   regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still
    54   an active research area. For example
    54   an active research area. For example
    55   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> 
    55   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> 
    71   mounting a nice <A HREF="http://en.wikipedia.org/wiki/Denial-of-service_attack">DoS attack</A> against 
    71   mounting a nice <A HREF="http://en.wikipedia.org/wiki/Denial-of-service_attack">DoS attack</A> against 
    72   your program if it contains such an &quot;evil&quot; regular expression. Actually 
    72   your program if it contains such an &quot;evil&quot; regular expression. Actually 
    73   <A HREF="http://www.scala-lang.org/">Scala</A> (and also Java) are almost immune from such
    73   <A HREF="http://www.scala-lang.org/">Scala</A> (and also Java) are almost immune from such
    74   attacks as they can deal with strings of up to 4,300 <code>a</code>s in less than a second. But if you scale
    74   attacks as they can deal with strings of up to 4,300 <code>a</code>s in less than a second. But if you scale
    75   the regular expression and string further to, say, 4,600 <code>a</code>s, then you get a <code>StackOverflowError</code> 
    75   the regular expression and string further to, say, 4,600 <code>a</code>s, then you get a <code>StackOverflowError</code> 
    76   potentially chrashing your program.
    76   potentially crashing your program.
    77   </p>
    77   </p>
    78 
    78 
    79   <p>
    79   <p>
    80   On a rainy afternoon, I implemented 
    80   On a rainy afternoon, I implemented 
    81   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re3.scala">this</A> 
    81   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re3.scala">this</A> 
    82   regular expression matcher in Scala. It is not as fast as the official one in Scala, but
    82   regular expression matcher in Scala. It is not as fast as the official one in Scala, but
    83   it can match up to 11,000 <code>a</code>s in less than 5 seconds  without raising any exception
    83   it can match up to 11,000 <code>a</code>s in less than 5 seconds  without raising any exception
    84   (remember Python and Ruby both need nearly 30 seconds to process 28(!) <code>a</code>s, and Scala's
    84   (remember Python and Ruby both need nearly 30 seconds to process 28(!) <code>a</code>s, and Scala's
    85   offical matcher maxes out at 4,600 <code>a</code>s). My matcher is approximately
    85   official matcher maxes out at 4,600 <code>a</code>s). My matcher is approximately
    86   85 lines of code and based on the concept of 
    86   85 lines of code and based on the concept of 
    87   <A HREF="http://lambda-the-ultimate.org/node/2293">derivatives of regular experssions</A>.
    87   <A HREF="http://lambda-the-ultimate.org/node/2293">derivatives of regular expressions</A>.
    88   These derivatives were introduced in 1964 by <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">
    88   These derivatives were introduced in 1964 by <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)">
    89   Janusz Brzozowski</A>, but according to this 
    89   Janusz Brzozowski</A>, but according to this 
    90   <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> had been lost in the &quot;sands of time&quot;.
    90   <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> had been lost in the &quot;sands of time&quot;.
    91   The advantage of derivatives is that they side-step completely the usual 
    91   The advantage of derivatives is that they side-step completely the usual 
    92   <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">translations</A> of regular expressions
    92   <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">translations</A> of regular expressions
   141 
   141 
   142 
   142 
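The derivative idea from [CU1] can be illustrated with a minimal sketch. This is not the linked re3.scala matcher; it is an assumed, simplified Python version in which all names (`Rexp`, `der`, `simp`, `matches`) are hypothetical. The derivative of a regular expression with respect to a character `c` describes what remains to be matched after `c` has been read, and a string matches if the expression left after the last character can match the empty string. The test expression `(a*)*b` is only a commonly cited "evil" pattern chosen for illustration.

```python
from dataclasses import dataclass

class Rexp:
    """Base class for regular expressions (hypothetical names throughout)."""

@dataclass(frozen=True)
class Zero(Rexp):       # matches no string at all
    pass

@dataclass(frozen=True)
class One(Rexp):        # matches only the empty string
    pass

@dataclass(frozen=True)
class Chr(Rexp):        # matches the single character c
    c: str

@dataclass(frozen=True)
class Alt(Rexp):        # r1 | r2
    r1: Rexp
    r2: Rexp

@dataclass(frozen=True)
class Seq(Rexp):        # r1 followed by r2
    r1: Rexp
    r2: Rexp

@dataclass(frozen=True)
class Star(Rexp):       # zero or more repetitions of r
    r: Rexp

def nullable(r):
    """True if r can match the empty string."""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    return False  # Zero and Chr

def der(c, r):
    """Brzozowski derivative of r with respect to the character c."""
    if isinstance(r, Chr):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return Alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        left = Seq(der(c, r.r1), r.r2)
        return Alt(left, der(c, r.r2)) if nullable(r.r1) else left
    if isinstance(r, Star):
        return Seq(der(c, r.r), r)
    return Zero()  # Zero and One

def simp(r):
    """Simplify r so that repeated derivatives do not grow needlessly."""
    if isinstance(r, Alt):
        a, b = simp(r.r1), simp(r.r2)
        if a == Zero():
            return b
        if b == Zero():
            return a
        return a if a == b else Alt(a, b)
    if isinstance(r, Seq):
        a, b = simp(r.r1), simp(r.r2)
        if a == Zero() or b == Zero():
            return Zero()
        if a == One():
            return b
        if b == One():
            return a
        return Seq(a, b)
    return r

def matches(r, s):
    """Take the derivative for each character, then test for nullability."""
    for c in s:
        r = simp(der(c, r))
    return nullable(r)

# (a*)*b, used here purely as an illustrative "evil" pattern
evil = Seq(Star(Star(Chr("a"))), Chr("b"))
```

Calling `matches(evil, "a" * 30)` involves no backtracking: the loop performs one derivative and one simplification step per input character. No automaton is built at any point, which is the sense in which derivatives side-step the usual translations.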
   143 <li> <H4>[CU2] Automata Theory in Your Web-Browser</H4>
   143 <li> <H4>[CU2] Automata Theory in Your Web-Browser</H4>
   144 
   144 
   145 <p>
   145 <p>
   146 There are a number of classic algorithms in automata theory (such as the transformation of regular
   146 This project is about web-programming (but not in Java):
   147 expressions into NFAs and DFAs, automata minimisation, subset construction). All these algorithms involve a fair 
   147 There are a number of classic algorithms in automata theory (such as the 
   148 amount of calculations, which cannot be easily done by hand. There are a few web applications that annimate these 
   148 <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">transformation</A> of regular
       
   149 expressions into NFAs and DFAs, 
       
   150 <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">automata minimisation</A>, 
       
   151 <A HREF="http://en.wikipedia.org/wiki/Powerset_construction">subset construction</A>). 
       
   152 All these algorithms involve a fair 
       
   153 amount of calculations, which cannot be easily done by hand. There are a few web applications, typically
       
   154 written in <A HREF="http://en.wikipedia.org/wiki/JavaScript">JavaScript</A>, that animate these 
   149 calculations, for example <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">this one</A>.
   155 calculations, for example <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">this one</A>.
       
   156 But they all have their deficiencies and can be improved with more modern technology.
   150 </p>
   157 </p>
   151 
   158 
   152 <p>
   159 <p>
   153 There now many useful libraries for Javascript, for example, this one for graphs. There are also
   160 There are now many useful libraries for JavaScript, for example this 
   154 a number of new programming languages targetting Javascript. This project is for someone who
   161 <A HREF="http://getspringy.com">one</A> for graphs or this 
   155 want to get to know these languges by implementing and animating algorithms from automata
   162 <A HREF="http://demos.bonsaijs.org/demos/star/index.html">one</A> for graphics. 
   156 theory or parsing.
   163 
       
   164 There are also
       
   165 a number of new programming languages targeting JavaScript, for example
       
   166 <A HREF="http://www.typescriptlang.org">TypeScript</A>,
       
   167 <A HREF="http://coffeescript.org">CoffeeScript</A>, 
       
   168 <A HREF="http://www.dartlang.org">Dart</A>,
       
   169 <A HREF="http://scriptsharp.com">Script#</A>,
       
   170 <A HREF="http://clojure.org">ClojureScript</A>
       
   171  and so on.
       
   172 The task in this project is to use a web-programming
       
   173 language and suitable library to animate algorithms from automata theory (and also parsing, if wanted).
       
   174 This project is for someone who
       
   175 wants to get to know these new languages.
   157 </p>
   176 </p>
   158 
   177 
   159   <B>Literature:</B> 
   178   <B>Literature:</B> 
   160   The same general literature as in [CU1].
   179   The same general literature as in [CU1].
   161   </p>  
   180   </p>  
   162 
   181 
   163 <p>
   182 <p>
   164   <B>Skills:</B> 
   183   <B>Skills:</B> 
   165   This is a project for a student with good programming skills. 
   184   This is a project for a student with very good programming 
   166   JavaScript or a similar web-programming language seems to be best suited 
   185   and <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacking</A> skills. 
   167   for this project. Some knowledge in HTML and CSS cannot hurt either.
   186   Some knowledge of JavaScript, HTML and CSS cannot hurt. The algorithms from automata
       
   187   theory are fairly standard material.
   168   </p>
   188   </p>
   169 
   189 
   170 
   190 
   171 <!--
   191 <!--
   172 <li> <H4>[CU2] Equivalence Checking of Regular Expressions</H4>
   192 <li> <H4>[CU2] Equivalence Checking of Regular Expressions</H4>
   362   </p>
   382   </p>
   363 
   383 
   364   <p>
   384   <p>
   365   However, there is one restriction that makes this project harder than it seems
   385   However, there is one restriction that makes this project harder than it seems
   366   at first sight. The department does not allow large server applications and databases
   386   at first sight. The department does not allow large server applications and databases
   367   to be run on calcium. So the problem should be solved with as few resources needed
   387   to be run on calcium, the central server in the department. So the problem should be solved with as few resources as possible
   368   on the &quot;back-end&quot; which collects the votes. 
   388   on the &quot;back-end&quot; which collects the votes. 
   369   </p>
   389   </p>
   370 
   390 
   371   <p>
   391   <p>
   372   <B>Literature:</B> 
   392   <B>Literature:</B> 
   373   The project requires fluency in a web-programming language (for example 
   393   The project requires fluency in a web-programming language (for example 
   374   <A HREF="http://en.wikipedia.org/wiki/JavaScript">Javascript</A>,
   394   <A HREF="http://en.wikipedia.org/wiki/JavaScript">JavaScript</A>,
   375   <A HREF="http://en.wikipedia.org/wiki/PHP">PHP</A>, 
   395   <A HREF="http://en.wikipedia.org/wiki/PHP">PHP</A>, 
   376   Java, <A HREF="http://www.python.org">Python</A>, 
   396   Java, <A HREF="http://www.python.org">Python</A>, 
   377   <A HREF="http://en.wikipedia.org/wiki/Go_(programming_language)">Go</A>, 
   397   <A HREF="http://en.wikipedia.org/wiki/Go_(programming_language)">Go</A>, 
   378   <A HREF="http://www.scala-lang.org/">Scala</A>,
   398   <A HREF="http://www.scala-lang.org/">Scala</A>,
   379   <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>). 
   399   <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>). 
   381   <A HREF="http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012">Web Application Engineering</A>
   401   <A HREF="http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012">Web Application Engineering</A>
   382   course at <A HREF="http://www.udacity.com">Udacity</A> is a good starting point 
   402   course at <A HREF="http://www.udacity.com">Udacity</A> is a good starting point 
   383   to be aware of the issues involved. This course uses <A HREF="http://www.python.org">Python</A>.
   403   to be aware of the issues involved. This course uses <A HREF="http://www.python.org">Python</A>.
   384   To evaluate the answers from the student, Google's 
   404   To evaluate the answers from the student, Google's 
   385   <A HREF="https://developers.google.com/chart/image/docs/making_charts">Chart Tools</A>
   405   <A HREF="https://developers.google.com/chart/image/docs/making_charts">Chart Tools</A>
   386   might be useful, which ar also described in this 
   406   might be useful, which are also described in this 
   387   <A HREF="http://www.youtube.com/watch?v=NZtgT4jgnE8">YouTube</A> video.
   407   <A HREF="http://www.youtube.com/watch?v=NZtgT4jgnE8">YouTube</A> video.
   388   </p>
   408   </p>
   389 
   409 
   390   <p>
   410   <p>
   391   <B>Skills:</B> 
   411   <B>Skills:</B> 
   400   
   420   
   401   <p>
   421   <p>
   402   <B>Description:</B>
   422   <B>Description:</B>
   403   There are many algorithms for synchronising clocks. This
   423   There are many algorithms for synchronising clocks. This
   404   <A HREF="http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20120000054_2011025573.pdf">paper</A> 
   424   <A HREF="http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20120000054_2011025573.pdf">paper</A> 
   405   describes a new algorithm for clocks that communicate by exchanging
   425   describes a new algorithm developed by NASA for clocks that communicate by exchanging
   406   messages and thereby reach a state in which (within some bound) all clocks are synchronised.
   426   messages and thereby reach a state in which (within some bound) all clocks are synchronised.
   407   A slightly longer and more detailed paper about the algorithm is 
   427   A slightly longer and more detailed paper about the algorithm is 
   408   <A HREF="http://hdl.handle.net/2060/20110020812">here</A>.
   428   <A HREF="http://hdl.handle.net/2060/20110020812">here</A>.
   409   The point of this project is to implement this algorithm and simulate networks of clocks.
   429   The point of this project is to implement this algorithm and simulate a network of clocks.
   410   </p>
   430   </p>
   411 
   431 
   412   <p>
   432   <p>
   413   <B>Literature:</B> 
   433   <B>Literature:</B> 
   414   There is a wide range of literature on clock syncronisation algorithms. 
   434   There is a wide range of literature on clock synchronisation algorithms. 
   415   Some pointers are given in this
   435   Some pointers are given in this
   416   <A HREF="http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20120000054_2011025573.pdf">paper</A>,
   436   <A HREF="http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20120000054_2011025573.pdf">paper</A>,
   417   which describes the algorithm to be implemented in this project. Pointers
   437   which describes the algorithm to be implemented in this project. Pointers
   418   are also given <A HREF="http://en.wikipedia.org/wiki/Clock_synchronization">here</A>.
   438   are also given <A HREF="http://en.wikipedia.org/wiki/Clock_synchronization">here</A>.
   419   </p>
   439   </p>