updated
authorChristian Urban <urbanc@in.tum.de>
Tue, 12 Feb 2019 21:23:00 +0000
changeset 618 f4818c95a32e
parent 617 f7de0915fff2
child 619 136517d67d40
updated
LINKS
data.sty
handouts/ho01.pdf
handouts/ho01.tex
handouts/ho02.pdf
handouts/ho02.tex
handouts/ho05.tex
hws/hw01.pdf
progs/pow.scala
progs/re3.scala
style.sty
--- a/LINKS	Sat Dec 29 10:30:27 2018 +0000
+++ b/LINKS	Tue Feb 12 21:23:00 2019 +0000
@@ -1,7 +1,22 @@
 Thinking Forth (Book to read)
 =================
 
+=================
+opinionated functional programming language compiling to the JVM 
+https://flix.github.io/#/research/
+https://flix.github.io/#/principles/
 
+
+=================
+General tail call elimination on the JVM
+
+http://cs.au.dk/~magnusm/papers/cc18/paper.pdf
+
+=================
+incremental approach to compiler  construction
+http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf
+
+=================
 Compiler courses
 http://www.cse.chalmers.se/edu/year/2017/course/TDA283/lectures/
 http://www.cs.columbia.edu/~sedwards/classes/2017/4115-fall/index.html
--- a/data.sty	Sat Dec 29 10:30:27 2018 +0000
+++ b/data.sty	Tue Feb 12 21:23:00 2019 +0000
@@ -73,6 +73,21 @@
 28 34.79653
 \end{filecontents}
 
+% JavaScript, example (a*)*b  
+\begin{filecontents}{re-js.data}
+5   0.061
+10  0.061
+15  0.061
+20  0.070
+23  0.131
+25  0.308
+26  0.564
+28  1.994
+30  7.648
+31  15.881 
+32  32.190
+\end{filecontents}
+
 % Java 8, example (a*)*b  
 \begin{filecontents}{re-java.data}
 5  0.00298
@@ -117,6 +132,8 @@
 39000  26.15787
 \end{filecontents}
 
+
+
 %% re1.scala: example a?{n} a{n}
 \begin{filecontents}{re1.data}
 1 0.00179
Binary file handouts/ho01.pdf has changed
--- a/handouts/ho01.tex	Sat Dec 29 10:30:27 2018 +0000
+++ b/handouts/ho01.tex	Tue Feb 12 21:23:00 2019 +0000
@@ -36,29 +36,34 @@
 
 %https://www.youtube.com/watch?v=gmhMQfFQu20
 \begin{document}
-\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018}
+\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018, 2019}
 
 \section*{Handout 1}
 
 This module is about text processing, be it for web-crawlers,
-compilers, dictionaries, DNA-data, ad filters and so on.  When looking for a
-particular string, like $abc$ in a large text we can use the
-Knuth-Morris-Pratt algorithm, which is currently the most efficient
-general string search algorithm. But often we do \emph{not} just look
-for a particular string, but for string patterns. For example, in
-program code we need to identify what are the keywords (\texttt{if}, \texttt{then},
-\texttt{while}, \texttt{for}, etc), what are the identifiers (variable names). A pattern for
-identifiers could be stated as: they start with a letter, followed by
-zero or more letters, numbers and underscores.  Often we also face the
-problem that we are given a string (for example some user input) and
-want to know whether it matches a particular pattern---be it an email
-address, for example. In this way we can exclude user input that would
-otherwise have nasty effects on our program (crashing it or making it
-go into an infinite loop, if not worse). In tools like Snort, scanning
-for computer viruses or filtering out spam usually involves scanning
-for some signature (essentially a string pattern).  The point is that
-the fast Knuth-Morris-Pratt algorithm for strings is not good enough
-for such string \emph{patterns}.\smallskip
+compilers, dictionaries, DNA-data, ad filters and so on.  When looking
+for a particular string, like \pcode{"foobar"} in a large text we can
+use the Knuth-Morris-Pratt algorithm, which is currently the most
+efficient general string search algorithm. But often we do \emph{not}
+just look for a particular string, but for string patterns. For
+example, in program code we need to identify what are the keywords
+(\texttt{if}, \texttt{then}, \texttt{while}, \texttt{for}, etc), what
+are the identifiers (variable names). A pattern for identifiers could
+be stated as: they start with a letter, followed by zero or more
+letters, numbers and underscores. You might also be surprised what
+constraints programming languages impose about numbers: for example
+123 in JSON is OK, but 0123 is not.
+
+Often we also face the problem that we are given a string (for example
+some user input) and want to know whether it matches a particular
+pattern---be it an email address, for example. In this way we can
+exclude user input that would otherwise have nasty effects on our
+program (crashing it or making it go into an infinite loop, if not
+worse). In tools like Snort, scanning for computer viruses or
+filtering out spam usually involves scanning for some signature
+(essentially a string pattern).  The point is that the fast
+Knuth-Morris-Pratt algorithm for strings is not good enough for such
+string \emph{patterns}.\smallskip
 
 \defn{Regular expressions} help with conveniently specifying
 such patterns. The idea behind regular expressions is that
@@ -194,7 +199,7 @@
 before (remember the PRA module?). Why on earth then is there any
 interest in studying them again in depth in this module? Well, one
 answer is in the following two graphs about regular expression
-matching in Python, Ruby and Java (Version 8).
+matching in Python, Ruby, JavaScript and Java (Version 8).
 
 \begin{center}
 \begin{tabular}{@{\hspace{-1mm}}c@{\hspace{1mm}}c@{}}
@@ -214,11 +219,12 @@
     axis lines=left,
     width=5.5cm,
     height=4.5cm, 
-    legend entries={Python, Java 8},  
+    legend entries={Python, Java 8, JavaScript},  
     legend pos=north west,
     legend cell align=left]
 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};
+\addplot[red,mark=*, mark options={fill=white}] table {re-js.data};
 \end{axis}
 \end{tikzpicture}
 &
@@ -248,7 +254,7 @@
 \end{tabular}
 \end{center}
 
-\noindent This first graph shows that Python and Java 8 need
+\noindent This first graph shows that Python, JavaScript and Java 8 need
 approximately 30 seconds to find out that the regular expression
 $\texttt{(a*)*\,b}$ does not match strings of 28 \texttt{a}s.
 Similarly, the second shows that Python needs approximately 29 seconds
Binary file handouts/ho02.pdf has changed
--- a/handouts/ho02.tex	Sat Dec 29 10:30:27 2018 +0000
+++ b/handouts/ho02.tex	Tue Feb 12 21:23:00 2019 +0000
@@ -6,22 +6,23 @@
 
 
 \begin{document}
-\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018}
+\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018, 2019}
 
 
 \section*{Handout 2 (Regular Expression Matching)}
 
 This lecture is about implementing a more efficient regular expression
 matcher (the plots on the right below)---more efficient than the
-matchers from regular expression libraries in Ruby, Python and Java
-(the plots on the left). The first pair of plots shows the running time
-for the regular expression $(a^*)^*\cdot b$ and strings composed of
-$n$ \pcode{a}s (meaning this regular expression actually does not
-match the strings). The second pair of plots shows the running time for
-the regular expressions $a^?{}^{\{n\}}\cdot a^{\{n\}}$ and strings
-also composed of $n$ \pcode{a}s (this time the regular expressions
-match the strings).  To see the substantial differences in the left
-and right plots below, note the different scales of the $x$-axes.
+matchers from regular expression libraries in Ruby, Python, JavaScript
+and Java (the plots on the left). For this consider some experimental
+data: The first pair of plots shows the running time for the
+regular expression $(a^*)^*\cdot b$ and strings composed of $n$
+\pcode{a}s (meaning this regular expression actually does not match
+the strings). The second pair of plots shows the running time for the
+regular expressions $a^?{}^{\{n\}}\cdot a^{\{n\}}$ and strings also
+composed of $n$ \pcode{a}s (this time the regular expressions match
+the strings).  To see the substantial differences in the left and
+right plots below, note the different scales of the $x$-axes.
 
   
 \begin{center}
@@ -41,11 +42,12 @@
     axis lines=left,
     width=5cm,
     height=5cm, 
-    legend entries={Java 8, Python},  
+    legend entries={Java 8, Python, JavaScript},  
     legend pos=north west,
     legend cell align=left]
 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};
+\addplot[red,mark=*, mark options={fill=white}] table {re-js.data};
 \end{axis}
 \end{tikzpicture}
 &
--- a/handouts/ho05.tex	Sat Dec 29 10:30:27 2018 +0000
+++ b/handouts/ho05.tex	Tue Feb 12 21:23:00 2019 +0000
@@ -6,6 +6,9 @@
 % epsilon and left-recursion elimination
 % http://www.mollypages.org/page/grammar/index.mp
 
+%% parsing scala files
+%%https://scalameta.org/
+
 \begin{document}
 
 \section*{Handout 5 (Grammars \& Parser)}
Binary file hws/hw01.pdf has changed
--- a/progs/pow.scala	Sat Dec 29 10:30:27 2018 +0000
+++ b/progs/pow.scala	Tue Feb 12 21:23:00 2019 +0000
@@ -45,6 +45,8 @@
 
 val A = Set("a", "b", "c")
 pow(A, 3)
+pow(A, 3).size
 
 val B = Set("a", "b", "")
 pow(B, 4)
+pow(B, 4).size
--- a/progs/re3.scala	Sat Dec 29 10:30:27 2018 +0000
+++ b/progs/re3.scala	Tue Feb 12 21:23:00 2019 +0000
@@ -58,15 +58,26 @@
 
 
 // derivative w.r.t. a string (iterates der)
-def ders (s: List[Char], r: Rexp) : Rexp = s match {
+def ders(s: List[Char], r: Rexp) : Rexp = s match {
   case Nil => r
   case c::s => ders(s, simp(der(c, r)))
 }
 
+// derivative w.r.t. a string (iterates der)
+def dersp(s: List[Char], r: Rexp) : Rexp = s match {
+  case Nil => r
+  case c::s => dersp(s, der(c, r))
+}
+
 
 // main matcher function
 def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
 
+//tests
+val q = SEQ(SEQ(CHAR('x'), CHAR('y')), CHAR('z'))
+dersp("x".toList, q)
+dersp("xy".toList, q)
+dersp("xyz".toList, q)
 
 //one or zero
 def OPT(r: Rexp) = ALT(r, ONE)
--- a/style.sty	Sat Dec 29 10:30:27 2018 +0000
+++ b/style.sty	Tue Feb 12 21:23:00 2019 +0000
@@ -8,6 +8,7 @@
 \usepackage{menukeys}
 \definecolor{darkblue}{rgb}{0,0,0.6}
 \usepackage[colorlinks=true,urlcolor=darkblue,linkcolor=darkblue]{hyperref}
+\usepackage{soul}
 
 %%% for regular expressions and values
 \newcommand{\ZERO}{\mbox{\bf 0}}
@@ -76,4 +77,6 @@
 regular feedback to me: for example
 what were the most interesting, least interesting, or confusing 
 parts in this lecture? Any problems with my Scala code? Please 
-feel free to share any other questions or concerns.} 
+feel free to share any other questions or concerns. Also, all my
+material is \st{crap} imperfect. If you have any suggestions for
+improvement, I am very grateful to hear.}