# HG changeset patch
# User Christian Urban
# Date 1696284656 -3600
# Node ID 0b5f06539a84a9be567c9cfabe7cc89e384ba51c
# Parent 4e221cf587fac788fc1c86e41907a0b2dda64961
updated

diff -r 4e221cf587fa -r 0b5f06539a84 handouts/amm-ho.pdf
Binary file handouts/amm-ho.pdf has changed
diff -r 4e221cf587fa -r 0b5f06539a84 handouts/amm-ho.tex
--- a/handouts/amm-ho.tex	Sun Oct 01 15:25:22 2023 +0100
+++ b/handouts/amm-ho.tex	Mon Oct 02 23:10:56 2023 +0100
@@ -73,7 +73,18 @@
\noindent
This creates a file \code{amm} which before it can be run might
-need some adjustments of the permissions.
+need some adjustments of the permissions. Recent versions of
+Windows also have \texttt{curl}, but it needs a slightly different call:
+
+\begin{lstlisting}[numbers=none,language={},basicstyle=\ttfamily\small]
+$ curl -L https://github.com/com-lihaoyi/Ammonite/releases/\
+  download/2.5.9/3.2-2.5.9 --output amm.bat
+\end{lstlisting} %% $
+
+\noindent
+Then you need to run Ammonite with \texttt{.$\backslash$amm} and there
+is no need to change any permissions under Windows.
+
The big advantage of Ammonite is that it comes with some additional
libraries already built-in and also allows one to easily break up
code into smaller modules. For example reading and writing files in
@@ -158,8 +169,11 @@
To sum up, Ammonite is a really useful addition to the Scala
ecosystem. You can find more information about how to use it in the
first five chapters of the ``Hands-on Scala'' book by Li Haoyi.
These chapters are
-free and can be used as a reference, see \url{https://www.handsonscala.com/part-i-introduction-to-scala.html}
+free and can be used as a reference, see
+\begin{center}
+\url{https://www.handsonscala.com/part-i-introduction-to-scala.html}
+\end{center}

\end{document}

diff -r 4e221cf587fa -r 0b5f06539a84 handouts/ho03.pdf
Binary file handouts/ho03.pdf has changed
diff -r 4e221cf587fa -r 0b5f06539a84 handouts/ho03.tex
--- a/handouts/ho03.tex	Sun Oct 01 15:25:22 2023 +0100
+++ b/handouts/ho03.tex	Mon Oct 02 23:10:56 2023 +0100
@@ -562,11 +562,11 @@
\end{figure}

\begin{figure}[p]
-\small
+\small
\lstinputlisting[numbers=left,firstline=48,firstnumber=48,lastline=85]{../progs/automata/thompson.sc}
\caption{The second part of the Thompson Construction implementing
  the composition of NFAs according to $\cdot$, $+$ and ${}^*$.
-  The implicit class (Lines 48--54) about rich partial functions
+  The extension (Lines 48--54) for rich partial functions
  implements the infix operation \texttt{+++} which combines an
  $\epsilon$NFA transition with an NFA transition (both are given
  as partial functions---but with different type!).\label{thompson2}}
@@ -912,8 +912,11 @@
OK\ldots now you know why regular expression matchers in those
languages are sometimes so slow. A bit surprising, don't you agree?
Also it is still a mystery why Rust, which because of the reasons
-above avoids NFAs and uses DFAs instead cannot compete in all cases
+above avoids NFAs and uses DFAs instead, cannot compete in all cases
with our simple derivative-based regular expression matcher in Scala.
+There is an explanation for this as well\ldots{}remember that the
+offending examples are of the form $r^{\{n\}}$. Why could they be
+a problem in Rust?

\subsection*{Subset Construction}

@@ -1623,7 +1626,7 @@
\end{center}

\noindent
-I let you implement this ``automaton''.
+You might want to implement this ``automaton''.
What do you get?
While this makes
all sense (modulo the flaw with the infinite states), does this
automaton teach us anything new? The answer is no, because it
@@ -1634,7 +1637,7 @@
a way to restrict the number of states to something finite. Meaning
it would give us a real automaton. However, this would lead us far,
far away from what we want
-discuss here.
+to discuss here. The end.

diff -r 4e221cf587fa -r 0b5f06539a84 handouts/ho04.pdf
Binary file handouts/ho04.pdf has changed
diff -r 4e221cf587fa -r 0b5f06539a84 handouts/ho04.tex
--- a/handouts/ho04.tex	Sun Oct 01 15:25:22 2023 +0100
+++ b/handouts/ho04.tex	Mon Oct 02 23:10:56 2023 +0100
@@ -17,21 +17,21 @@
Answering this question will also help us with the problem we are
after, namely tokenising an input string.

-The algorithm we will be looking at in this lecture was designed by Sulzmann
-\& Lu in a rather recent research paper (from 2014). A link to it is
-provided on KEATS, in case you are interested.\footnote{In my
-humble opinion this is an interesting instance of the research
-literature: it contains a very neat idea, but its presentation
-is rather sloppy. In earlier versions of this paper, a King's
-undergraduate student and I found several rather annoying typos in the
-examples and definitions.} My former PhD student Fahad Ausaf and I even more recently
-wrote a paper where we build on their result: we provided a
-mathematical proof that their algorithm is really correct---the proof
-Sulzmann \& Lu had originally given contained major flaws. Such correctness
-proofs are important: Kuklewicz maintains a unit-test library
-for the kind of algorithms we are interested in here and he showed
-that many implementations in the ``wild'' are buggy, that is not
-satisfy his unit tests:
+The algorithm we will be looking at in this lecture was designed by
+Sulzmann \& Lu in a rather recent research paper (from 2014).
A link
+to it is provided on KEATS, in case you are interested.\footnote{In my
+  humble opinion this is an interesting instance of the research
+  literature: it contains very clever ideas, but its presentation is
+  rather sloppy. In earlier versions of this paper, students and I
+  found several rather annoying typos in the examples and definitions;
+  we even found downright errors in their work.} Together with my former PhD
+students Fahad Ausaf and Chengsong Tan, we wrote several papers where
+we build on their result: we provided a mathematical proof that their
+algorithm is really correct---the proof Sulzmann \& Lu had originally
+given contained major flaws. Such correctness proofs are important:
+Kuklewicz maintains a unit-test library for the kind of algorithms we
+are interested in here and he showed that many implementations in the
+``wild'' are buggy, that is, do not satisfy his unit tests:

\begin{center}
\url{http://www.haskell.org/haskellwiki/Regex_Posix}
\end{center}

@@ -433,7 +433,7 @@
a gigantic problem with the described algorithm so far: it is very
slow. To make it faster, we have to include in all this the
simplification from Lecture 2\ldots{}and what rotten luck:
simplification messes things
-up and we need to \emph{rectify} the mess. This is what we shall do next.
\subsubsection*{Simplification}

diff -r 4e221cf587fa -r 0b5f06539a84 langs.sty
--- a/langs.sty	Sun Oct 01 15:25:22 2023 +0100
+++ b/langs.sty	Mon Oct 02 23:10:56 2023 +0100
@@ -37,7 +37,7 @@
\lstdefinelanguage{Scala}{
  morekeywords={abstract,then,case,catch,class,def,%
-    do,else,enum,extends,false,final,finally,%
+    do,else,enum,extends,extension,false,final,finally,%
    for,given,if,implicit,import,lazy,match,mixin,%
    new,null,object,override,package,%
    private,protected,requires,return,sealed,%
diff -r 4e221cf587fa -r 0b5f06539a84 progs/lexer/lex.sc
--- a/progs/lexer/lex.sc	Sun Oct 01 15:25:22 2023 +0100
+++ b/progs/lexer/lex.sc	Mon Oct 02 23:10:56 2023 +0100
@@ -40,13 +40,13 @@
val HELLO : Rexp = "hello"

-implicit def RexpOps(r: Rexp) = new {
+extension (r: Rexp) {
  def | (s: Rexp) = ALT(r, s)
  def % = STAR(r)
  def ~ (s: Rexp) = SEQ(r, s)
}

-implicit def stringOps(s: String) = new {
+extension (s: String) {
  def | (r: Rexp) = ALT(s, r)
  def | (r: String) = ALT(s, r)
  def % = STAR(s)
diff -r 4e221cf587fa -r 0b5f06539a84 progs/lexer/lexer.sc
--- a/progs/lexer/lexer.sc	Sun Oct 01 15:25:22 2023 +0100
+++ b/progs/lexer/lexer.sc	Mon Oct 02 23:10:56 2023 +0100
@@ -42,13 +42,13 @@
implicit def string2rexp(s : String) : Rexp =
  charlist2rexp(s.toList)

-implicit def RexpOps(r: Rexp) = new {
+extension (r: Rexp) {
  def | (s: Rexp) = ALT(r, s)
  def % = STAR(r)
  def ~ (s: Rexp) = SEQ(r, s)
}

-implicit def stringOps(s: String) = new {
+extension (s: String) {
  def | (r: Rexp) = ALT(s, r)
  def | (r: String) = ALT(s, r)
  def % = STAR(s)
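The `implicit def ...Ops = new { ... }` pattern replaced in the hunks above is the Scala 2 way of bolting infix operators onto an existing type; the patch ports it to Scala 3 `extension` blocks. The following is a minimal, self-contained sketch of the same migration, using a simplified `Rexp` datatype assumed here for illustration (the real `lex.sc` and `lexer.sc` define more constructors plus string-to-`Rexp` conversions):

```scala
// Simplified stand-in for the Rexp datatype in progs/lexer/*.sc.
enum Rexp:
  case ZERO, ONE
  case CHAR(c: Char)
  case ALT(r1: Rexp, r2: Rexp)
  case SEQ(r1: Rexp, r2: Rexp)
  case STAR(r: Rexp)

import Rexp.*

// Scala 3 extension methods replacing the Scala 2 pattern
//   implicit def RexpOps(r: Rexp) = new { ... }
// No wrapper object is allocated and no implicit conversion is needed.
extension (r: Rexp)
  def | (s: Rexp): Rexp = ALT(r, s)
  def ~ (s: Rexp): Rexp = SEQ(r, s)
  def % : Rexp = STAR(r)

@main def demo(): Unit =
  val ab = CHAR('a') ~ CHAR('b')       // SEQ of two characters
  val r  = (ab | ONE).%                // star of an alternative
  assert(r == STAR(ALT(SEQ(CHAR('a'), CHAR('b')), ONE)))
  println(r)
```

One reason for the switch: the `implicit def ... = new { ... }` idiom relies on reflective structural-type calls and implicit conversions, both of which Scala 3 discourages, whereas `extension` resolves the operators statically.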