Binary file handouts/ho01.pdf has changed
--- a/handouts/ho01.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/handouts/ho01.tex Fri Sep 26 14:06:55 2014 +0100
@@ -160,7 +160,7 @@
\draw (1pt,\y) -- (-3pt,\y) node[anchor=east] {\y};
%labels
\node[below=0.6cm] at (x axis mid) {number of \texttt{a}s};
- \node[rotate=90,above=0.8cm] at (y axis mid) {time in secs};
+ \node[rotate=90,left=0.9cm] at (y axis mid) {time in secs};
%plots
\draw[color=blue] plot[mark=*]
file {re-python.data};
@@ -222,7 +222,7 @@
\draw (1pt,\y) -- (-3pt,\y) node[anchor=east] {\y};
%labels
\node[below=0.6cm] at (x axis mid) {number of \texttt{a}s};
- \node[rotate=90, above=0.8cm] at (y axis mid) {time in secs};
+ \node[rotate=90,left=0.9cm] at (y axis mid) {time in secs};
%plots
\draw[color=green] plot[mark=square*, mark options={fill=white} ]
@@ -247,7 +247,7 @@
\begin{center}
\begin{tabular}{r@{\hspace{1mm}}r@{\hspace{1mm}}l@{\hspace{13mm}}l}
$r$ & $::=$ & $\varnothing$ & null\\
- & $\mid$ & $\epsilon$ & empty string / "" / []\\
+ & $\mid$ & $\epsilon$ & empty string / \texttt{""} / []\\
& $\mid$ & $c$ & single character\\
& $\mid$ & $r_1 + r_2$ & alternative / choice\\
& $\mid$ & $r_1 \cdot r_2$ & sequence\\
@@ -444,7 +444,7 @@
\varnothing^* \equiv \epsilon^*
\]
-\noindent holds. Such equivalences will be important for our
+\noindent holds? Such equivalences will be important for our
matching algorithm, because we can use them to simplify
regular expressions.
Binary file handouts/ho02.pdf has changed
--- a/handouts/ho02.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/handouts/ho02.tex Fri Sep 26 14:06:55 2014 +0100
@@ -7,10 +7,9 @@
\section*{Handout 2}
-Having specified what problem our matching algorithm,
-\pcode{match}, is supposed to solve, namely for a given
-regular expression $r$ and string $s$ answer \textit{true} if
-and only if
+Having specified what problem our matching algorithm, \pcode{match},
+is supposed to solve, namely for a given regular expression $r$ and
+string $s$ answer \textit{true} if and only if
\[
s \in L(r)
@@ -21,10 +20,18 @@
because in general the set of strings $L$ returns is infinite
(recall what $L(a^*)$ is). In such cases there is no way we
can implement an exhaustive test for whether a string is
-member of this set or not.
+member of this set or not. Before we come to the matching
+algorithm, lets have a closer look at what it means when
+two regular expressions are equivalent.
+
+\subsection*{Regular Expression Equivalences}
-The algorithm we will define below consists of two parts. One is the function $nullable$ which takes a
-regular expression as argument and decides whether it can match the empty string (this means it returns a
+
+\subsection*{Matching Algorithm}
+
+The algorithm we will define below consists of two parts. One is the
+function $nullable$ which takes a regular expression as argument and
+decides whether it can match the empty string (this means it returns a
boolean). This can be easily defined recursively as follows:
\begin{center}
@@ -46,13 +53,17 @@
\]
\noindent
-Note on the left-hand side we have a function we can implement; on the right we have its specification.
+Note on the left-hand side we have a function we can implement; on the
+right we have its specification.
-The other function of our matching algorithm calculates a \emph{derivative} of a regular expression. This is a function
-which will take a regular expression, say $r$, and a character, say $c$, as argument and return
-a new regular expression. Be careful that the intuition behind this function is not so easy to grasp on first
-reading. Essentially this function solves the following problem: if $r$ can match a string of the form
-$c\!::\!s$, what does the regular expression look like that can match just $s$. The definition of this
+The other function of our matching algorithm calculates a
+\emph{derivative} of a regular expression. This is a function which
+will take a regular expression, say $r$, and a character, say $c$, as
+argument and return a new regular expression. Be careful that the
+intuition behind this function is not so easy to grasp on first
+reading. Essentially this function solves the following problem: if
+$r$ can match a string of the form $c\!::\!s$, what does the regular
+expression look like that can match just $s$. The definition of this
function is as follows:
\begin{center}
@@ -69,33 +80,45 @@
\end{center}
\noindent
-The first two clauses can be rationalised as follows: recall that $der$ should calculate a regular
-expression, if the ``input'' regular expression can match a string of the form $c\!::\!s$. Since neither
-$\varnothing$ nor $\epsilon$ can match such a string we return $\varnothing$. In the third case
-we have to make a case-distinction: In case the regular expression is $c$, then clearly it can recognise
-a string of the form $c\!::\!s$, just that $s$ is the empty string. Therefore we return the $\epsilon$-regular
-expression. In the other case we again return $\varnothing$ since no string of the $c\!::\!s$ can be matched.
-The $+$-case is relatively straightforward: all strings of the form $c\!::\!s$ are either matched by the
-regular expression $r_1$ or $r_2$. So we just have to recursively call $der$ with these two regular
-expressions and compose the results again with $+$. The $\cdot$-case is more complicated:
-if $r_1\cdot r_2$ matches a string of the form $c\!::\!s$, then the first part must be matched by $r_1$.
-Consequently, it makes sense to construct the regular expression for $s$ by calling $der$ with $r_1$ and
-``appending'' $r_2$. There is however one exception to this simple rule: if $r_1$ can match the empty
-string, then all of $c\!::\!s$ is matched by $r_2$. So in case $r_1$ is nullable (that is can match the
-empty string) we have to allow the choice $der\,c\,r_2$ for calculating the regular expression that can match
-$s$. The $*$-case is again simple: if $r^*$ matches a string of the form $c\!::\!s$, then the first part must be
-``matched'' by a single copy of $r$. Therefore we call recursively $der\,c\,r$ and ``append'' $r^*$ in order to
-match the rest of $s$.
+The first two clauses can be rationalised as follows: recall that
+$der$ should calculate a regular expression, if the ``input'' regular
+expression can match a string of the form $c\!::\!s$. Since neither
+$\varnothing$ nor $\epsilon$ can match such a string we return
+$\varnothing$. In the third case we have to make a case-distinction:
+In case the regular expression is $c$, then clearly it can recognise a
+string of the form $c\!::\!s$, just that $s$ is the empty
+string. Therefore we return the $\epsilon$-regular expression. In the
+other case we again return $\varnothing$ since no string of the
+$c\!::\!s$ can be matched. The $+$-case is relatively
+straightforward: all strings of the form $c\!::\!s$ are either matched
+by the regular expression $r_1$ or $r_2$. So we just have to
+recursively call $der$ with these two regular expressions and compose
+the results again with $+$. The $\cdot$-case is more complicated: if
+$r_1\cdot r_2$ matches a string of the form $c\!::\!s$, then the first
+part must be matched by $r_1$. Consequently, it makes sense to
+construct the regular expression for $s$ by calling $der$ with $r_1$
+and ``appending'' $r_2$. There is however one exception to this simple
+rule: if $r_1$ can match the empty string, then all of $c\!::\!s$ is
+matched by $r_2$. So in case $r_1$ is nullable (that is can match the
+empty string) we have to allow the choice $der\,c\,r_2$ for
+calculating the regular expression that can match $s$. The $*$-case is
+again simple: if $r^*$ matches a string of the form $c\!::\!s$, then
+the first part must be ``matched'' by a single copy of $r$. Therefore
+we call recursively $der\,c\,r$ and ``append'' $r^*$ in order to match
+the rest of $s$.
-Another way to rationalise the definition of $der$ is to consider the following operation on sets:
+Another way to rationalise the definition of $der$ is to consider the
+following operation on sets:
\[
Der\,c\,A\;\dn\;\{s\,|\,c\!::\!s \in A\}
\]
\noindent
-which essentially transforms a set of strings $A$ by filtering out all strings that do not start with $c$ and then
-strips off the $c$ from all the remaining strings. For example suppose $A = \{"f\!oo", "bar", "f\!rak"\}$ then
+which essentially transforms a set of strings $A$ by filtering out all
+strings that do not start with $c$ and then strips off the $c$ from
+all the remaining strings. For example suppose $A = \{"f\!oo", "bar",
+"f\!rak"\}$ then
\[
Der\,f\,A = \{"oo", "rak"\}\quad,\quad
Der\,b\,A = \{"ar"\} \quad \text{and} \quad
@@ -103,20 +126,23 @@
\]
\noindent
-Note that in the last case $Der$ is empty, because no string in $A$ starts with $a$. With this operation we can
-state the following property about $der$:
+Note that in the last case $Der$ is empty, because no string in $A$
+starts with $a$. With this operation we can state the following
+property about $der$:
\[
L(der\,c\,r) = Der\,c\,(L(r))
\]
\noindent
-This property clarifies what regular expression $der$ calculates, namely take the set of strings
-that $r$ can match (that is $L(r)$), filter out all strings not starting with $c$ and strip off the $c$ from the
-remaining strings---this is exactly the language that $der\,c\,r$ can match.
+This property clarifies what regular expression $der$ calculates,
+namely take the set of strings that $r$ can match (that is $L(r)$),
+filter out all strings not starting with $c$ and strip off the $c$
+from the remaining strings---this is exactly the language that
+$der\,c\,r$ can match.
-If we want to find out whether the string $"abc"$ is matched by the regular expression $r$
-then we can iteratively apply $Der$ as follows
+If we want to find out whether the string $"abc"$ is matched by the
+regular expression $r$ then we can iteratively apply $Der$ as follows
\begin{enumerate}
\item $Der\,a\,(L(r))$
@@ -125,10 +151,12 @@
\end{enumerate}
\noindent
-In the last step we need to test whether the empty string is in the set. Our matching algorithm will work similarly,
-just using regular expression instead of sets. For this we need to lift the notion of derivatives from characters to strings. This can be
-done using the following function, taking a string and regular expression as input and a regular expression
-as output.
+In the last step we need to test whether the empty string is in the
+set. Our matching algorithm will work similarly, just using regular
+expression instead of sets. For this we need to lift the notion of
+derivatives from characters to strings. This can be done using the
+following function, taking a string and regular expression as input
+and a regular expression as output.
\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {\hspace{-10mm}}l@ {}}
@@ -152,9 +180,15 @@
\]
\noindent
-holds, which means our algorithm satisfies the specification. This algorithm was introduced by
-Janus Brzozowski in 1964. Its main attractions are simplicity and being fast, as well as
-being easily extendable for other regular expressions such as $r^{\{n\}}$, $r^?$, $\sim{}r$ and so on.
+holds, which means our algorithm satisfies the specification. This
+algorithm was introduced by Janus Brzozowski in 1964. Its main
+attractions are simplicity and being fast, as well as being easily
+extendable for other regular expressions such as $r^{\{n\}}$, $r^?$,
+$\sim{}r$ and so on.
+
+\subsection*{The Matching Algorithm in Scala}
+
+
\begin{figure}[p]
{\lstset{language=Scala}\texttt{\lstinputlisting{../progs/app5.scala}}}
Binary file hws/hw01.pdf has changed
--- a/hws/hw01.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw01.tex Fri Sep 26 14:06:55 2014 +0100
@@ -37,7 +37,8 @@
\item Assume the concatenation operation of two strings is
written as $s_1 @ s_2$. Define the operation of
- \emph{concatenating} two sets of strings.
+ \emph{concatenating}, written $\_ \,@\, \_$ two sets of
+ strings.
\item Assume a set $A$ contains 4 strings and a set $B$ 7
strings, how many strings are in $A @ B$?
@@ -47,14 +48,14 @@
$\_^{n+1}$.)
\item How many regular expressions are there to match the
- string $abc$? (How many if they cannot include
+ string $abc$? How many if they cannot include
$\epsilon$ and $\varnothing$? How many if they are also
not allowed to contain stars? How many if they are also
- not allowed to contain $\_ + \_$?)
+ not allowed to contain $\_ + \_$?
\item When are two regular expressions equivalent? Can you
think of instances where two regular expressions match
- teh same strings, but it is not so obvious that they do? For
+ the same strings, but it is not so obvious that they do? For
example $a + b$ and $b + a$ do not count\ldots they
obviously match the same strings, namely $[a]$ and $[b]$.
Binary file hws/hw02.pdf has changed
--- a/hws/hw02.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw02.tex Fri Sep 26 14:06:55 2014 +0100
@@ -9,40 +9,53 @@
\section*{Homework 2}
\begin{enumerate}
-\item Review the first handout about sets of strings and read the second handout.
-Assuming the alphabet is $\{a, b\}$, decide which of the following equations are true
-in general for arbitrary languages $A$, $B$ and $C$:
+\item Review the first handout about sets of strings and read
+ the second handout. Assuming the alphabet is $\{a, b\}$,
+ decide which of the following equations are true in
+ general for arbitrary languages $A$, $B$ and $C$:
+
\begin{eqnarray}
-(A \cup B) @ C & = & A @ C \cup B @ C\nonumber\\
-A^* \cup B^* & = & (A \cup B)^*\nonumber\\
-A^* @ A^* & = & A^*\nonumber\\
-(A \cap B)@ C & = & (A@C) \cap (B@C)\nonumber
+(A \cup B) @ C & =^? & A @ C \cup B @ C\nonumber\\
+A^* \cup B^* & =^? & (A \cup B)^*\nonumber\\
+A^* @ A^* & =^? & A^*\nonumber\\
+(A \cap B)@ C & =^? & (A@C) \cap (B@C)\nonumber
\end{eqnarray}
-\noindent
-In case an equation is true, give an explanation; otherwise give a counter-example.
+\noindent In case an equation is true, give an explanation;
+otherwise give a counter-example.
-\item What is the meaning of a regular expression? Give an inductive definition.
+\item What is the meaning of a regular expression? Give an
+ inductive definition.
-\item Given the regular expressions $r_1 = \epsilon$ and $r_2 = \varnothing$ and $r_3 = a$.
-How many strings can the regular expressions $r_1^*$, $r_2^*$ and $r_3^*$ each match?
+\item Given the regular expressions $r_1 = \epsilon$ and $r_2
+ = \varnothing$ and $r_3 = a$. How many strings can the
+ regular expressions $r_1^*$, $r_2^*$ and $r_3^*$ each
+ match?
-\item Give regular expressions for (a) decimal numbers and for (b) binary numbers.
-(Hint: Observe that the empty string is not a number. Also observe that leading 0s
-are normally not written.)
+\item Give regular expressions for (a) decimal numbers and for
+ (b) binary numbers. (Hint: Observe that the empty string
+ is not a number. Also observe that leading 0s are
+ normally not written.)
-\item Decide whether the following two regular expressions are equivalent $(\epsilon + a)^* \equiv^? a^*$ and
-$(a \cdot b)^* \cdot a \equiv^? a \cdot (b \cdot a)^*$.
+\item Decide whether the following two regular expressions are
+ equivalent $(\epsilon + a)^* \equiv^? a^*$ and $(a \cdot
+ b)^* \cdot a \equiv^? a \cdot (b \cdot a)^*$.
-\item Given the regular expression $r = (a \cdot b + b)^*$. Compute what the derivative of $r$ is with respect to
-$a$ and $b$. Is $r$ nullable?
+\item Given the regular expression $r = (a \cdot b + b)^*$.
+ Compute what the derivative of $r$ is with respect to
+ $a$, $b$ and $c$. Is $r$ nullable?
\item Prove that for all regular expressions $r$ we have
-\begin{center}
-$\text{nullable}(r)$ \quad if and only if \quad $\texttt{""} \in L(r)$
+
+\begin{center}
+ $\textit{nullable}(r) \quad \text{if and only if}
+ \quad [] \in L(r)$
\end{center}
+ Write down clearly in each case what you need to prove and
+ what are the assumptions.
+
\end{enumerate}
\end{document}
Binary file hws/hw03.pdf has changed
--- a/hws/hw03.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw03.tex Fri Sep 26 14:06:55 2014 +0100
@@ -11,34 +11,41 @@
\begin{enumerate}
\item What is a regular language?
-\item Assume you have an alphabet consisting of the letters $a$, $b$ and $c$ only.
-(1) Find a regular expression that recognises the two strings $ab$ and $ac$. (2)
-Find a regular expression that matches all strings \emph{except} these two strings.
-Note, you can only use regular expressions of the form
+\item Assume you have an alphabet consisting of the letters
+ $a$, $b$ and $c$ only. (1) Find a regular expression
+ that recognises the two strings $ab$ and $ac$. (2) Find
+ a regular expression that matches all strings
+ \emph{except} these two strings. Note, you can only use
+ regular expressions of the form
+
+ \begin{center} $r ::=
+ \varnothing \;|\; \epsilon \;|\; c \;|\; r_1 + r_2 \;|\;
+ r_1 \cdot r_2 \;|\; r^*$
+ \end{center}
+
+\item Define the function \textit{zeroable} which takes a
+ regular expression as argument and returns a boolean.
+ The function should satisfy the following property:
+
\begin{center}
-$r ::= \varnothing \;|\; \epsilon \;|\; c \;|\; r_1 + r_2 \;|\; r_1 \cdot r_2 \;|\; r^*$
-\end{center}
-
-\item Define the function $zeroable$ which takes a regular expression as argument
-and returns a boolean. The
-function should satisfy the following property:
-\begin{center}
-$zeroable(r)$ \;if and only if\; $L(r) = \varnothing$
+$\textit{zeroable(r)} \;\text{if and only if}\; L(r) = \varnothing$
\end{center}
\item Define the tokens and regular expressions for a language
-consisting of numbers, left-parenthesis (, right-parenthesis ),
-identifiers and the operations $+$, $-$ and $*$. Can the following strings
-in this language be lexed?
+ consisting of numbers, left-parenthesis $($,
+ right-parenthesis $)$, identifiers and the operations $+$,
+ $-$ and $*$. Can the following strings in this language
+ be lexed?
\begin{itemize}
-\item \texttt{"}$(a + 3) * b$\texttt{"}
-\item \texttt{"}$)()++ -33$\texttt{"}
-\item \texttt{"}$(a / 3) * 3$\texttt{"}
+ \item $(a + 3) * b$
+ \item $)()++ -33$
+ \item $(a / 3) * 3$
\end{itemize}
+In case they can, can you give the corresponding token
+sequences.
-In case they can, can you give the corresponding token sequences.
\end{enumerate}
--- a/progs/re1.scala Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re1.scala Fri Sep 26 14:06:55 2014 +0100
@@ -1,7 +1,4 @@
-import scala.language.implicitConversions
-
abstract class Rexp
-
case object NULL extends Rexp
case object EMPTY extends Rexp
case class CHAR(c: Char) extends Rexp
@@ -9,15 +6,6 @@
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
- case Nil => EMPTY
- case c::Nil => CHAR(c)
- case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
// nullable function: tests whether the regular
// expression can recognise the empty string
def nullable (r: Rexp) : Boolean = r match {
@@ -50,12 +38,13 @@
// main matcher function
def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
-//example
+//example from the homework
//val r = STAR(ALT(SEQ(CHAR('a'), CHAR('b')), CHAR('b')))
//der('a', r)
//der('b', r)
+//der('c', r)
-//one or zero
+//optional: one or zero times
def OPT(r: Rexp) = ALT(r, EMPTY)
//n-times
@@ -65,8 +54,10 @@
case n => SEQ(r, NTIMES(r, n - 1))
}
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+// the evil regular expression a?{n} a{n}
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
+//for measuring time
def time_needed[T](i: Int, code: => T) = {
val start = System.nanoTime()
for (j <- 1 to i) code
@@ -77,5 +68,3 @@
for (i <- 1 to 29) {
println(i + ": " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
}
-
-
--- a/progs/re2.scala Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re2.scala Fri Sep 26 14:06:55 2014 +0100
@@ -1,26 +1,12 @@
-import scala.language.implicitConversions
-
abstract class Rexp
-
case object NULL extends Rexp
case object EMPTY extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp
-case class NTIMES(r: Rexp, n: Int) extends Rexp
+case class NTIMES(r: Rexp, n: Int) extends Rexp //explicit constructor
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
- case Nil => EMPTY
- case c::Nil => CHAR(c)
- case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
-// nullable function: tests whether the regular
-// expression can recognise the empty string
def nullable (r: Rexp) : Boolean = r match {
case NULL => false
case EMPTY => true
@@ -31,7 +17,6 @@
case NTIMES(r, i) => if (i == 0) true else nullable(r)
}
-// derivative of a regular expression w.r.t. a character
def der (c: Char, r: Rexp) : Rexp = r match {
case NULL => NULL
case EMPTY => NULL
@@ -45,20 +30,18 @@
if (i == 0) NULL else SEQ(der(c, r), NTIMES(r, i - 1))
}
-// derivative w.r.t. a string (iterates der)
def ders (s: List[Char], r: Rexp) : Rexp = s match {
case Nil => r
case c::s => ders(s, der(c, r))
}
-// main matcher function
def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
-//one or zero
+//optional: one or zero times
def OPT(r: Rexp) = ALT(r, EMPTY)
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
def time_needed[T](i: Int, code: => T) = {
val start = System.nanoTime()
@@ -71,7 +54,7 @@
println(i + ": " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
}
-
+//a bit bolder test
for (i <- 1 to 1000 by 50) {
println(i + " " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
}
--- a/progs/re3.scala Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re3.scala Fri Sep 26 14:06:55 2014 +0100
@@ -1,57 +1,37 @@
-import scala.language.implicitConversions
-
-abstract class Rexp {
- def simp : Rexp = this
-}
-
+abstract class Rexp
case object NULL extends Rexp
case object EMPTY extends Rexp
case class CHAR(c: Char) extends Rexp
-case class ALT(r1: Rexp, r2: Rexp) extends Rexp {
- override def simp = (r1.simp, r2.simp) match {
- case (NULL, r) => r
- case (r, NULL) => r
- case (r, EMPTY) => if (nullable(r)) r else ALT(r, EMPTY)
- case (EMPTY, r) => if (nullable(r)) r else ALT(r, EMPTY)
- case (r1, r2) => if (r1 == r2) r1 else ALT(r1, r2)
+case class ALT(r1: Rexp, r2: Rexp) extends Rexp
+case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
+case class STAR(r: Rexp) extends Rexp
+case class NTIMES(r: Rexp, n: Int) extends Rexp
+
+def simp(r: Rexp): Rexp = r match {
+ case ALT(r1, r2) => {
+ val r1s = simp(r1)
+ val r2s = simp(r2)
+ (r1s, r2s) match {
+ case (NULL, _) => r2s
+ case (_, NULL) => r1s
+ case _ => if (r1s == r2s) r1s else ALT(r1s, r2s)
+ }
}
-}
-case class SEQ(r1: Rexp, r2: Rexp) extends Rexp {
- override def simp = (r1.simp, r2.simp) match {
- case (NULL, _) => NULL
- case (_, NULL) => NULL
- case (EMPTY, r) => r
- case (r, EMPTY) => r
- case (r1, r2) => SEQ(r1, r2)
+ case SEQ(r1, r2) => {
+ val r1s = simp(r1)
+ val r2s = simp(r2)
+ (r1s, r2s) match {
+ case (NULL, _) => NULL
+ case (_, NULL) => NULL
+ case (EMPTY, _) => r2s
+ case (_, EMPTY) => r1s
+ case _ => SEQ(r1s, r2s)
+ }
}
-}
-case class STAR(r: Rexp) extends Rexp {
- override def simp = r.simp match {
- case NULL => EMPTY
- case EMPTY => EMPTY
- case r => STAR(r)
- }
-}
-case class NTIMES(r: Rexp, n: Int) extends Rexp {
- override def simp = if (n == 0) EMPTY else
- r.simp match {
- case NULL => NULL
- case EMPTY => EMPTY
- case r => NTIMES(r, n)
- }
+ case NTIMES(r, n) => NTIMES(simp(r), n)
+ case r => r
}
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
- case Nil => EMPTY
- case c::Nil => CHAR(c)
- case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
-// nullable function: tests whether the regular
-// expression can recognise the empty string
def nullable (r: Rexp) : Boolean = r match {
case NULL => false
case EMPTY => true
@@ -62,7 +42,6 @@
case NTIMES(r, i) => if (i == 0) true else nullable(r)
}
-// derivative of a regular expression w.r.t. a character
def der (c: Char, r: Rexp) : Rexp = r match {
case NULL => NULL
case EMPTY => NULL
@@ -76,20 +55,18 @@
if (i == 0) NULL else SEQ(der(c, r), NTIMES(r, i - 1))
}
-// derivative w.r.t. a string (iterates der)
def ders (s: List[Char], r: Rexp) : Rexp = s match {
case Nil => r
- case c::s => ders(s, der(c, r).simp)
+ case c::s => ders(s, simp(der(c, r)))
}
-// main matcher function
def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
//one or zero
def OPT(r: Rexp) = ALT(r, EMPTY)
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
def time_needed[T](i: Int, code: => T) = {
val start = System.nanoTime()
--- a/progs/scraper.scala Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/scraper.scala Fri Sep 26 14:06:55 2014 +0100
@@ -24,7 +24,7 @@
val wr = new OutputStreamWriter(conn.getOutputStream())
//possible date ranges
-wr.write("Fdate=2012-8-24&Tdate=2012-09-25")
+wr.write("Fdate=2011-6-24&Tdate=2011-09-25")
//wr.write("Fdate=2011-8-24&Tdate=2011-09-25")
//wr.write("Fdate=2001-9-18&Tdate=2012-09-25")
wr.flush
Binary file slides/slides02.pdf has changed
--- a/slides/slides02.tex Mon Sep 22 13:42:14 2014 +0100
+++ b/slides/slides02.tex Fri Sep 26 14:06:55 2014 +0100
@@ -1,34 +1,48 @@
\documentclass[dvipsnames,14pt,t]{beamer}
-\usepackage{beamerthemeplaincu}
-\usepackage[absolute,overlay]{textpos}
-\usepackage{ifthen}
-\usepackage{tikz}
-\usepackage{pgf}
-\usepackage{calc}
-\usepackage{ulem}
-\usepackage{courier}
-\usepackage{listings}
-\renewcommand{\uline}[1]{#1}
-\usetikzlibrary{arrows}
-\usetikzlibrary{automata}
-\usetikzlibrary{shapes}
-\usetikzlibrary{shadows}
-\usetikzlibrary{positioning}
-\usetikzlibrary{plotmarks}
-\usetikzlibrary{calc}
-\usepackage{graphicx}
+\usepackage{../slides}
+\usepackage{../graphics}
\usepackage{../langs}
\usepackage{../data}
-\makeatletter
-\lst@CCPutMacro\lst@ProcessOther {"2D}{\lst@ttfamily{-{}}{-{}}}
-\@empty\z@\@empty
-\makeatother
+%\usepackage{beamerthemeplaincu}
+%\usepackage[absolute,overlay]{textpos}
+%\usepackage{ifthen}
+%\usepackage{tikz}
+%\usepackage{pgf}
+%\usepackage{calc}
+%\usepackage{ulem}
+%\usepackage{courier}
+%\usepackage{listings}
+%\renewcommand{\uline}[1]{#1}
+%\usetikzlibrary{arrows}
+%\usetikzlibrary{automata}
+%\usetikzlibrary{shapes}
+%\usetikzlibrary{shadows}
+%\usetikzlibrary{positioning}
+%\usetikzlibrary{plotmarks}
+%\usetikzlibrary{calc}
+%\usepackage{graphicx}
+%\usepackage{../langs}
+%\usepackage{../data}
+
+%\makeatletter
+%\lst@CCPutMacro\lst@ProcessOther {"2D}{\lst@ttfamily{-{}}{-{}}}
+%\@empty\z@\@empty
+%\makeatother
+
+\hfuzz=220pt
+
+\lstset{language=Scala,
+ style=mystyle,
+ numbersep=0pt,
+ numbers=none,
+ xleftmargin=0mm}
+
+\newcommand{\bl}[1]{\textcolor{blue}{#1}}
% beamer stuff
-\renewcommand{\slidecaption}{AFL 02, King's College London, 2.~October 2013}
-\newcommand{\bl}[1]{\textcolor{blue}{#1}}
-\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}% for definitions
+\renewcommand{\slidecaption}{AFL 02, King's College London}
+
\begin{document}