# HG changeset patch # User Christian Urban # Date 1471986775 -7200 # Node ID e74c696821a21c4c7aff51174f6625a675d42ef1 # Parent 357c395ae83887e4bf7ed7e1d8f22d8e90841472 updated diff -r 357c395ae838 -r e74c696821a2 handouts/ho01.pdf Binary file handouts/ho01.pdf has changed diff -r 357c395ae838 -r e74c696821a2 handouts/ho01.tex --- a/handouts/ho01.tex Tue Aug 23 15:38:20 2016 +0200 +++ b/handouts/ho01.tex Tue Aug 23 23:12:55 2016 +0200 @@ -256,8 +256,15 @@ \url{http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016} \end{center} +\noindent I can also highly recommend a cool webtalk from an engineer +from Stack Exchange on the same subject: + +\begin{center} +\url{https://vimeo.com/112065252} +\end{center} + \noindent -It was also a problem in the Atom editor +A similar problem also occured in the Atom editor: \begin{center} \url{http://davidvgalbraith.com/how-i-fixed-atom/} @@ -564,15 +571,15 @@ regular expressions. They have been studied for the last 60 years (by smarter people than me)---surely nothing new can be found out about them. I even have the vague recollection that -I did not quite understand them during my study. If I remember +I did not quite understand them during my undergraduate study. If I remember correctly,\footnote{That was really a long time ago.} I got utterly confused about $\ONE$ (which my lecturer wrote as $\epsilon$) and the empty string.\footnote{Obviously the -lecturer must have been bad.} Since my study, I have used +lecturer must have been bad.} Since my then, I have used regular expressions for implementing lexers and parsers as I have always been interested in all kinds of programming languages and compilers, which invariably need regular -expression in some form or shape. +expressions in some form or shape. To understand my fascination \emph{nowadays} with regular expressions, you need to know that my main scientific interest @@ -603,7 +610,7 @@ proposed by Sulzmann and Lu in 2014.\footnote{\url{http://goo.gl/bz0eHp}} Hopefully we can prove that the machine code(!) -that implements this code efficiently is correct. Writing +that implements this code efficiently is correct also. Writing programs in this way does not leave any room for potential errors or bugs. How nice! @@ -625,10 +632,10 @@ To conclude: Despite my early ignorance about regular expressions, I find them now very interesting. They have a beautiful mathematical theory behind them, which can be -sometimes quite deep and which contains hidden snares. They have +sometimes quite deep and which sometimes contains hidden snares. They have practical importance (remember the shocking runtime of the regular expression matchers in Python, Ruby and Java in some -instances) and the problems in Stack Exchange and the Atom editor. +instances and the problems in Stack Exchange and the Atom editor). People who are not very familiar with the mathematical background of regular expressions get them consistently wrong. The hope is that we can do better in the diff -r 357c395ae838 -r e74c696821a2 handouts/ho04.pdf Binary file handouts/ho04.pdf has changed diff -r 357c395ae838 -r e74c696821a2 handouts/ho04.tex --- a/handouts/ho04.tex Tue Aug 23 15:38:20 2016 +0200 +++ b/handouts/ho04.tex Tue Aug 23 23:12:55 2016 +0200 @@ -25,17 +25,11 @@ student and I found several rather annoying typos in their examples and definitions.} In order to give an answer for \emph{how} a regular expression matches a string, Sulzmann and -Lu introduce \emph{values}. A value will be the output of the +Lu use \emph{values}. A value will be the output of the algorithm whenever the regular expression matches the string. If the string does not match the string, an error will be -raised. Since the first phase of the algorithm by Sulzmann \& -Lu is identical to the derivative based matcher from the first -coursework, the function \textit{nullable} will be used to -decide whether as string is matched by a regular expression. -If \textit{nullable} says yes, then values are constructed -that reflect how the regular expression matched the string. - -The definitions for values is given below. They are shown +raised. +The definition for values is given below. They are shown together with the regular expressions $r$ to which they correspond: @@ -60,13 +54,12 @@ & $\mid$ & $Seq(v_1,v_2)$\\ & $\mid$ & $\Left(v)$ \\ & $\mid$ & $Right(v)$ \\ - & $\mid$ & $[v_1,\ldots\,v_n]$ \\ + & $\mid$ & $Stars [v_1,\ldots\,v_n]$ \\ \end{tabular} \end{tabular} \end{center} -\noindent The reason is that there is a very strong -correspondence between them. There is no value for the +\noindent There is no value for the $\ZERO$ regular expression, since it does not match any string. Otherwise there is exactly one value corresponding to each regular expression with the exception of $r_1 + r_2$ @@ -74,7 +67,7 @@ corresponding to the two alternatives. Note that $r^*$ is associated with a list of values, one for each copy of $r$ that was needed to match the string. This means we might also -return the empty list $[]$, if no copy was needed in case +return the empty list $Stars []$, if no copy was needed in case of $r^*$. For sequence, there is exactly one value, composed of two component values ($v_1$ and $v_2$).