# HG changeset patch # User Christian Urban # Date 1411774622 -3600 # Node ID 65d1ea0e989fbd9412e38e0d4d4a94eb15a9f329 # Parent e5f4b8ff23b88fdd181664cb0dd848cb5153fd91 updated cws diff -r e5f4b8ff23b8 -r 65d1ea0e989f coursework/cw01.pdf Binary file coursework/cw01.pdf has changed diff -r e5f4b8ff23b8 -r 65d1ea0e989f coursework/cw01.tex --- a/coursework/cw01.tex Fri Sep 26 14:40:49 2014 +0100 +++ b/coursework/cw01.tex Sat Sep 27 00:37:02 2014 +0100 @@ -4,7 +4,7 @@ \begin{document} -\section*{Coursework 1} +\section*{Coursework 1 (Strand 1)} This coursework is worth 5\% and is due on 13 October at 16:00. You are asked to implement a regular expression matcher and submit a @@ -17,9 +17,9 @@ \subsubsection*{Disclaimer} -It should be understood that the work submitted represents your own effort. -You have not copied from anyone else. An exception is the Scala code I -showed during the lectures, which you can use.\bigskip +It should be understood that the work you submit represents your own +effort. You have not copied from anyone else. An exception is the +Scala code I showed during the lectures, which you can use.\bigskip \subsubsection*{Tasks} @@ -62,16 +62,10 @@ \noindent whereby in the last clause the set $\mathbb{A}$ stands for the set of \emph{all} strings. So $\sim{}r$ means `all the strings that $r$ -cannot match'. We assume ranges like $[a\mbox{-}z0\mbox{-}9]$ are a -shorthand for the regular expression +cannot match'. -\[ -[a b c d\ldots z 0 1\ldots 9]\;. -\] - -\noindent Be careful that your implementation of $nullable$ and $der$ satisfies -for every $r$ the following two properties: +for every $r$ the following two properties (see also Question 2): \begin{itemize} \item $nullable(r)$ if and only if $[]\in L(r)$ @@ -79,11 +73,13 @@ \end{itemize} \noindent -{\bf Important!} Your implementation should have explicit cases for the -basic regular expressions, but also for the extended regular expressions. -That means do not treat the extended regular expressions by just translating -them into the basic ones. See also Question 2, where you asked to give -the rules for \textit{nullable} and \textit{der}. +{\bf Important!} Your implementation should have explicit cases for +the basic regular expressions, but also explicit cases for the +extended regular expressions. That means do not treat the extended +regular expressions by just translating them into the basic ones. See +also Question 2, where you are asked to explicitly give the rules for +\textit{nullable} and \textit{der} for the extended regular +expressions. \subsection*{Question 1 (unmarked)} @@ -94,7 +90,8 @@ This question does not require any implementation. From the lectures you have seen the definitions for the functions \textit{nullable} and -\textit{der}. Give the rules for the extended regular expressions: +\textit{der} for the basic regular expressions. Give the rules for +the extended regular expressions: \begin{center} \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} @@ -103,11 +100,11 @@ $nullable(r^?)$ & $\dn$ & $?$\\ $nullable(r^{\{n,m\}})$ & $\dn$ & $?$\\ $nullable(\sim{}r)$ & $\dn$ & $?$\medskip\\ -$der c ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ -$der c (r^+)$ & $\dn$ & $?$\\ -$der c (r^?)$ & $\dn$ & $?$\\ -$der c (r^{\{n,m\}})$ & $\dn$ & $?$\\ -$der c (\sim{}r)$ & $\dn$ & $?$\\ +$der\, c\, ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ +$der\, c\, (r^+)$ & $\dn$ & $?$\\ +$der\, c\, (r^?)$ & $\dn$ & $?$\\ +$der\, c\, (r^{\{n,m\}})$ & $\dn$ & $?$\\ +$der\, c\, (\sim{}r)$ & $\dn$ & $?$\\ \end{tabular} \end{center} @@ -142,7 +139,8 @@ \subsection*{Question 4 (marked with 1\%)} -Consider the regular expression $/ \cdot * \cdot +Suppose \textit{[a-z]} stands for the range regular expression +$[a,b,c,\ldots,z]$. Consider the regular expression $/ \cdot * \cdot (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot * \cdot /$ and decide wether the following four strings are matched by this regular expression. Answer yes or no. diff -r e5f4b8ff23b8 -r 65d1ea0e989f coursework/cw04.pdf Binary file coursework/cw04.pdf has changed diff -r e5f4b8ff23b8 -r 65d1ea0e989f coursework/cw04.tex --- a/coursework/cw04.tex Fri Sep 26 14:40:49 2014 +0100 +++ b/coursework/cw04.tex Sat Sep 27 00:37:02 2014 +0100 @@ -4,20 +4,25 @@ \begin{document} -\section*{Coursework Alternative} +\section*{Coursework (Strand 2)} -\noindent -This coursework is worth 25\% and is due on 12 December at 16:00. You are -asked to prove the correctness of a regular expression matcher with the -Isabelle theorem prover. This theorem prover is available from +\noindent This coursework is worth 25\% and is due on 12 +December at 16:00. You are asked to prove the correctness of a +regular expression matcher from the lectures using the +Isabelle theorem prover. You need to submit a theory file +containing this proof. The Isabelle theorem prover is +available from \begin{center} \url{http://isabelle.in.tum.de} \end{center} -\noindent -There is a shorte user guide for Isabelle called -``Programming and Proving in Isabelle/HOL'' at +\noindent This is an interactive theorem prover, meaning that +you can make definitions and state properties, and then help +the system with proving these properties. Sometimes the proofs +are also automatic. There is a shortish user guide for +Isabelle, called ``Programming and Proving in Isabelle/HOL'' +at \begin{center} \url{http://isabelle.in.tum.de/documentation.html} @@ -30,13 +35,187 @@ \url{http://www.concrete-semantics.org} \end{center} -\noindent -If you need more help or you are stuck somewhere, please feel free -to contact me (christian.urban@kcl.ac.uk). I am a main developer of -Isabelle and have used it for approximately the last 14 years. +\noindent The Isabelle theorem prover is operated through the +jEdit IDE, which might not be an editor that is widely known. +JEdit is documented in + +\begin{center} +\url{http://isabelle.in.tum.de/dist/Isabelle2014/doc/jedit.pdf} +\end{center} + + +\noindent If you need more help or you are stuck somewhere, +please feel free to contact me (christian.urban@kcl.ac.uk). I +am a main developer of Isabelle and have used it for +approximately the 14 years. One of the success stories of +Isabelle is the recent verification of a microkernel operating +system by an Australian group, see \url{http://sel4.systems}. +Their operating system is the only one that has been proved +correct according to its specification and is used for +application where high assurance, security and reliability is +needed. -\subsection*{Problem} +\subsection*{The Task} + +In this coursework you are asked to prove the correctness of +the regular expression matcher from the lectures in Isabelle. +For this you need to first specify what the matcher is +supposed to do and then to implement the algorithm. Finally +you need to prove that the algorithm meets the specification. +The first two parts are relatively easy, because the +definitions in Isabelle will look very similar to the +mathematical definitions from the lectures or the Scala code +that is supplied at KEATS. For example very similar to Scala, +regular expressions are defined in Isabelle as an inductive +datatype: + +\begin{lstlisting}[language={},numbers=none] +datatype rexp = + NULL +| EMPTY +| CHAR char +| SEQ rexp rexp +| ALT rexp rexp +| STAR rexp +\end{lstlisting} + +\noindent The meaning of regular expressions is given as +usual: + +\begin{center} +\begin{tabular}{rcl@{\hspace{10mm}}l} +$L(\varnothing)$ & $\dn$ & $\varnothing$ & \pcode{NULL}\\ +$L(\epsilon)$ & $\dn$ & $\{[]\}$ & \pcode{EMPTY}\\ +$L(c)$ & $\dn$ & $\{[c]\}$ & \pcode{CHAR}\\ +$L(r_1 + r_2)$ & $\dn$ & $L(r_1) \cup L(r_2)$ & \pcode{ALT}\\ +$L(r_1 \cdot r_2)$ & $\dn$ & $L(r_1) \,@\, L(r_2)$ & \pcode{SEQ}\\ +$L(r^*)$ & $\dn$ & $(L(r))^*$ & \pcode{STAR}\\ +\end{tabular} +\end{center} + +\noindent You would need to implement this function in order +to state the theorem about the correctness of the algorithm. +The function $L$ should in Isabelle take a \pcode{rexp} as +input and return a set of strings. Its type is +therefore + +\begin{center} +\pcode{L} \pcode{::} \pcode{rexp} $\Rightarrow$ \pcode{string set} +\end{center} + +\noindent Isabelle treats strings as an abbreviation for lists +of characters. This means you can pattern-match strings like +lists. The union operation on sets (for the \pcode{ALT}-case) +is a standard definition in Isabelle, but not the +concatenation operation on sets and also not the +star-operation. You would have to supply these definitions. +The concatenation operation can be defined in terms of the +append function, written \code{_ @ _} in Isabelle, for lists. +The star-operation can be defined as a ``big-union'' of +powers, like in the lectures, or directly as an inductive set. + +The functions for the matcher are shown in +Figure~\ref{matcher}. The theorem that needs to be proved is + +\begin{lstlisting}[numbers=none,language={},keywordstyle=\color{black}\ttfamily,mathescape] +theorem + "matches r s $\longleftrightarrow$ s $\in$ L r" +\end{lstlisting} + +\noindent which states that the function \emph{matches} is +true if and only if the string is in the language of the +regular expression. A proof for this lemma will need +side-lemmas about \pcode{nullable} and \pcode{der}. An example +proof in Isabelle that will not be relevant for the theorem +above is given in Figure~\ref{proof}. + +\begin{figure}[p] +\begin{lstlisting}[language={},keywordstyle=\color{black}\ttfamily,mathescape] +fun + nullable :: "rexp $\Rightarrow$ bool" +where + "nullable NULL = False" +| "nullable EMPTY = True" +| "nullable (CHAR _) = False" +| "nullable (ALT r1 r2) = (nullable(r1) $\vee$ nullable(r2))" +| "nullable (SEQ r1 r2) = (nullable(r1) $\wedge$ nullable(r2))" +| "nullable (STAR _) = True" + +fun + der :: "char $\Rightarrow$ rexp $\Rightarrow$ rexp" +where + "der c NULL = NULL" +| "der c EMPTY = NULL" +| "der c (CHAR d) = (if c = d then EMPTY else NULL)" +| "der c (ALT r1 r2) = ALT (der c r1) (der c r2)" +| "der c (SEQ r1 r2) = + (if (nullable r1) then ALT (SEQ (der c r1) r2) (der c r2) + else SEQ (der c r1) r2)" +| "der c (STAR r) = SEQ (der c r) (STAR r)" + +fun + ders :: "rexp $\Rightarrow$ string $\Rightarrow$ rexp" +where + "ders r [] = r" +| "ders r (c # s) = ders (der c r) s" + +fun + matches :: "rexp $\Rightarrow$ string $\Rightarrow$ bool" +where + "matches r s = nullable (ders r s)" +\end{lstlisting} +\caption{The definition of the matcher algorithm in +Isabelle.\label{matcher}} +\end{figure} + +\begin{figure}[p] +\begin{lstlisting}[language={},keywordstyle=\color{black}\ttfamily,mathescape] +fun + zeroable :: "rexp $\Rightarrow$ bool" +where + "zeroable NULL = True" +| "zeroable EMPTY = False" +| "zeroable (CHAR _) = False" +| "zeroable (ALT r1 r2) = (zeroable(r1) $\wedge$ zeroable(r2))" +| "zeroable (SEQ r1 r2) = (zeroable(r1) $\vee$ zeroable(r2))" +| "zeroable (STAR _) = False" + +lemma + "zeroable r $\longleftrightarrow$ L r = {}" +proof (induct) + case (NULL) + have "zeroable NULL" "L NULL = {}" by simp_all + then show "zeroable NULL $\longleftrightarrow$ (L NULL = {})" by simp +next + case (EMPTY) + have "$\neg$ zeroable EMPTY" "L EMPTY = {[]}" by simp_all + then show "zeroable EMPTY $\longleftrightarrow$ (L EMPTY = {})" by simp +next + case (CHAR c) + have "$\neg$ zeroable (CHAR c)" "L (CHAR c) = {[c]}" by simp_all + then show "zeroable (CHAR c) $\longleftrightarrow$ (L (CHAR c) = {})" by simp +next + case (ALT r1 r2) + have ih1: "zeroable r1 $\longleftrightarrow$ L r1 = {}" by fact + have ih2: "zeroable r2 $\longleftrightarrow$ L r2 = {}" by fact + show "zeroable (ALT r1 r2) $\longleftrightarrow$ (L (ALT r1 r2) = {})" + using ih1 ih2 by simp +next + case (SEQ r1 r2) + have ih1: "zeroable r1 $\longleftrightarrow$ L r1 = {}" by fact + have ih2: "zeroable r2 $\longleftrightarrow$ L r2 = {}" by fact + show "zeroable (SEQ r1 r2) $\longleftrightarrow$ (L (SEQ r1 r2) = {})" + using ih1 ih2 by (auto simp add: Conc_def) +next + case (STAR r) + have "$\neg$ zeroable (STAR r)" "[] $\in$ L (r) ^ 0" by simp_all + then show "zeroable (STAR r) $\longleftrightarrow$ (L (STAR r) = {})" + by (simp (no_asm) add: Star_def) blast +qed +\end{lstlisting} +\caption{An Isabelle proof about the function \pcode{zeroable}.\label{proof}} +\end{figure} \end{document}