updated
author Christian Urban <urbanc@in.tum.de>
Sat, 17 Nov 2012 11:33:35 +0000
changeset 163 622c47857b69
parent 162 0752a5de9537
child 164 d99c0026ebaf
msc-projects-12.html
--- a/msc-projects-12.html	Tue Nov 06 20:40:13 2012 +0000
+++ b/msc-projects-12.html	Sat Nov 17 11:33:35 2012 +0000
@@ -35,7 +35,7 @@
 a short description about your programming skills and Computer Science background in your first email. 
 I will also need your King's username in order to book the project for you. Thanks.</H4> 
 
-<H4>Note that besides being a lecturer at the theoretical end of Computer Science, I am also a passionate
+<H4>Note that besides being a lecturer in the theoretical part of Computer Science, I am also a passionate
     <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker</A> &hellip;
     defined as &ldquo;a person who enjoys exploring the details of programmable systems and 
     stretching their capabilities, as opposed to most users, who prefer to learn only the minimum 
@@ -50,21 +50,21 @@
   are extremely useful for many text-processing tasks...finding patterns in texts,
   lexing programs, syntax highlighting and so on. Given that regular expressions were
   introduced in 1950 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, you might think 
-  regular expressions have since been studied to death. But you would definitely be mistaken: in fact they are still
+  regular expressions have since been studied and implemented to death. But you would definitely be mistaken: in fact they are still
   an active research area. For example
   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> 
   about regular expression matching and partial derivatives was presented this summer at the international 
-  PPDP'12 conference.</p>
+  PPDP'12 conference. The task in this project is to implement the results from this paper.</p>
 
   <p>The background for this project is that some regular expressions are 
   &quot;<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>&quot; 
   and can &quot;stab you in the back&quot; according to
-  this recent <A HREF="http://tech.blog.cueup.com/regular-expressions-will-stab-you-in-the-back">blog post</A>.
+  this <A HREF="http://tech.blog.cueup.com/regular-expressions-will-stab-you-in-the-back">blog post</A>.
   For example, if you use in <A HREF="http://www.python.org">Python</A> or 
   in <A HREF="http://www.ruby-lang.org/en/">Ruby</A> (probably also in other mainstream programming languages) the 
   innocent-looking regular expression <code>a?{28}a{28}</code> and match it, say, against the string 
   <code>aaaaaaaaaaaaaaaaaaaaaaaaaaaa</code> (that is 28 <code>a</code>s), you will soon notice that your CPU usage goes to 100%. In fact,
-  Python and Ruby need approximately 30 seconds for matching this string. You can try it for yourself:
+  Python and Ruby need approximately 30 seconds of hard work for matching this string. You can try it for yourself:
   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re.py">re.py</A> (Python version) and 
   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/afl-material/raw-file/tip/re-internal.rb">re.rb</A> 
   (Ruby version). You can imagine an attacker
@@ -72,8 +72,8 @@
   your program if it contains such an &quot;evil&quot; regular expression. Actually 
   <A HREF="http://www.scala-lang.org/">Scala</A> (and also Java) are almost immune from such
   attacks as they can deal with strings of up to 4,300 <code>a</code>s in less than a second. But if you scale
-  the regular expression and string further to, say, 4,600 <code>a</code>s, you get a <code>StackOverflowError</code> 
-  exception chrashing your program.
+  the regular expression and string further to, say, 4,600 <code>a</code>s, then you get a <code>StackOverflowError</code> 
+  potentially crashing your program.
   </p>
 
   <p>
@@ -98,7 +98,7 @@
   Now the guys from the 
   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">PPDP'12-paper</A> mentioned 
   above claim they are even faster than me and can deal with even more features of regular expressions
-  (for example subexpression matching, which my rainy-afternoon matcher lacks). I am sure they thought
+  (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought
   about the problem much longer than a single afternoon. The task 
   in this project is to find out how good they actually are by implementing the results from their paper. 
   Their approach is based on the concept of partial derivatives introduced in 1994 by
@@ -106,8 +106,8 @@
   I used them once myself in a <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A> 
   in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.
   So I know they are worth their money. Still, it would be interesting to actually compare their results
-  with my simple rainy-afternoon matcher and &quot;blow away&quot; the regular expression matchers in Python and Ruby (and possibly
-  in Scala too).
+  with my simple rainy-afternoon matcher and potentially &quot;blow away&quot; the regular expression matchers 
+  in Python and Ruby (and possibly in Scala too).
   </p>
 
   <p>
@@ -143,10 +143,28 @@
 <li> <H4>[CU2] Automata Theory in Your Web-Browser</H4>
 
 <p>
+There are a number of classic algorithms in automata theory (such as the transformation of regular
+expressions into NFAs and DFAs, automata minimisation, subset construction). All these algorithms involve a fair 
+amount of calculation, which cannot easily be done by hand. There are a few web applications that animate these 
+calculations, for example <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">this one</A>.
+</p>
+
+<p>
+There are now many useful libraries for JavaScript, for example, this one for graphs. There are also
+a number of new programming languages targeting JavaScript. This project is for someone who
+wants to get to know these languages by implementing and animating algorithms from automata
+theory or parsing.
+</p>
+
+  <p><B>Literature:</B> 
+  The same general literature as in [CU1].
+  </p>  
+
+<p>
   <B>Skills:</B> 
   This is a project for a student with good programming skills. 
   JavaScript or a similar web-programming language seems to be best suited 
-  for this project. Some knowledge in HTML and CSS cannot hurd either.
+  for this project. Some knowledge in HTML and CSS cannot hurt either.
   </p>
 
 
@@ -196,7 +214,7 @@
 
   <p>
   <b>Description:</b> 
-  Compilers translate high-level programs that humans can read and write into
+  Compilers translate high-level programs that humans can read into
   efficient machine code that can be run on a CPU or virtual machine.
   I recently implemented a very simple compiler for a very simple functional
   programming language following this 
@@ -205,7 +223,7 @@
   My code, written in <A HREF="http://www.scala-lang.org/">Scala</A>, of this compiler is 
   <A HREF="http://www.dcs.kcl.ac.uk/staff/urbanc/compiler.scala">here</A>.
   The compiler can deal with simple programs involving natural numbers, such
-  as Fibonacci numbers
+  as Fibonacci 
   or factorial (but it can be easily extended - that is not the point).
   </p>
 
@@ -246,7 +264,7 @@
   compilers. Since my compiler is implemented in <A HREF="http://www.scala-lang.org/">Scala</A>,
   it would make sense to continue this project in this language. I can be
   of help with questions and books about <A HREF="http://www.scala-lang.org/">Scala</A>.
-  But if Scala is a problem, my code can also be translated quickly into any other functional
+  But if Scala is a problem, my code can also be translated quickly into any other
   language. 
   </p>
 
@@ -339,11 +357,18 @@
   students only need to be prevented from answering a question more than once, thus skewing
   any statistics. Unlike electronic voting, no audit trail needs to be kept
   for student polling. Restricting the number of answers can probably be solved 
-  by setting appropriate cookies on the students
+  by setting appropriate cookies on the students'
   computers or smart phones.
   </p>
 
   <p>
+  However, there is one restriction that makes this project harder than it seems
+  at first sight. The department does not allow large server applications and databases
+  to be run on calcium. So the problem should be solved with as few resources as possible
+  on the &quot;back-end&quot;, which collects the votes. 
+  </p>
+
+  <p>
   <B>Literature:</B> 
   The project requires fluency in a web-programming language (for example 
   <A HREF="http://en.wikipedia.org/wiki/JavaScript">Javascript</A>,
@@ -351,10 +376,7 @@
   Java, <A HREF="http://www.python.org">Python</A>, 
   <A HREF="http://en.wikipedia.org/wiki/Go_(programming_language)">Go</A>, 
   <A HREF="http://www.scala-lang.org/">Scala</A>,
-  <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>) 
-  and possibly a cloud application platform (for example
-  <A HREF="https://developers.google.com/appengine/">Google App Engine</a> or 
-  <A HREF="http://www.heroku.com">Heroku</A>).
+  <A HREF="http://en.wikipedia.org/wiki/Ruby_(programming_language)">Ruby</A>). 
   For web-programming the 
   <A HREF="http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012">Web Application Engineering</A>
   course at <A HREF="http://www.udacity.com">Udacity</A> is a good starting point 
@@ -367,7 +389,7 @@
 
   <p>
   <B>Skills:</B> 
-  In order to provide convenience for the lecturer, this project needs very good web-programming skills. A 
+  This project needs very good web-programming skills. A 
   <A HREF="http://en.wikipedia.org/wiki/Hacker_(programmer_subculture)">hacker mentality</A>
   (see above) is probably very beneficial: web-programming is an area that only emerged recently and
   many tools still lack maturity. You probably have to experiment a lot with several different