% !TEX program = xelatex\documentclass{article}\usepackage{../style}\usepackage{../langs}\begin{document}\section*{Coursework (Strand 2)}\noindent This coursework is worth 20\% and is due on \cwISABELLE{} at18:00. You are asked to prove the correctness of the regular expressionmatcher from the lectures using the Isabelle theorem prover. You need tosubmit a theory file containing this proof and also a documentdescribing your proof. The Isabelle theorem prover is available from\begin{center}\url{http://isabelle.in.tum.de}\end{center}\noindent This is an interactive theorem prover, meaning thatyou can make definitions and state properties, and then helpthe system with proving these properties. Sometimes the proofsare also completely automatic. There is a shortish user guide forIsabelle, called ``Programming and Proving in Isabelle/HOL''at\begin{center}\url{http://isabelle.in.tum.de/documentation.html}\end{center}\noindentand also a longer (free) book at\begin{center}\url{http://www.concrete-semantics.org}\end{center}\noindent The Isabelle theorem prover is operated through thejEdit IDE, which might not be an editor that is widely known.JEdit is documented in\begin{center}\url{http://isabelle.in.tum.de/dist/Isabelle2014/doc/jedit.pdf}\end{center}\noindent If you need more help or you are stuck somewhere,please feel free to contact me (christian.urban at kcl.ac.uk). Iam one of the main developers of Isabelle and have used it forapproximately 16 years. One of the success stories ofIsabelle is the recent verification of a microkernel operatingsystem by an Australian group, see \url{http://sel4.systems}.Their operating system is the only one that has been provedcorrect according to its specification and is used forapplication where high assurance, security and reliability isneeded (like in helicopters which fly over enemy territory).\subsection*{The Task}In this coursework you are asked to prove the correctness of theregular expression matcher from the lectures in Isabelle. The matchershould be able to deal with the usual (basic) regular expressions\[\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*\]\noindentbut also with the following extended regular expressions:\begin{center}\begin{tabular}{ll} $r^{\{n\}}$ & exactly $n$-times\\ $r^{\{..m\}}$ & zero or more times $r$ but no more than $m$-times\\ $r^{\{n..\}}$ & at least $n$-times $r$\\ $r^{\{n..m\}}$ & at least $n$-times $r$ but no more than $m$-times\\ $\sim{}r$ & not-regular-expression of $r$\\\end{tabular}\end{center}\noindentYou need to first specify what the matcher issupposed to do and then to implement the algorithm. Finally you needto prove that the algorithm meets the specification. The first twoparts are relatively easy, because the definitions in Isabelle willlook very similar to the mathematical definitions from the lectures orthe Scala code that is supplied at KEATS. For example very similar toScala, regular expressions are defined in Isabelle as an inductivedatatype:\begin{lstlisting}[language={},numbers=none]datatype rexp = ZERO| ONE| CHAR char| SEQ rexp rexp| ALT rexp rexp| STAR rexp\end{lstlisting}\noindent The meaning of regular expressions is given as usual:\begin{center}\begin{tabular}{rcl@{\hspace{10mm}}l}$L(\ZERO)$ & $\dn$ & $\varnothing$ & \pcode{ZERO}\\$L(\ONE)$ & $\dn$ & $\{[]\}$ & \pcode{ONE}\\ $L(c)$ & $\dn$ & $\{[c]\}$ & \pcode{CHAR}\\$L(r_1 + r_2)$ & $\dn$ & $L(r_1) \cup L(r_2)$ & \pcode{ALT}\\$L(r_1 \cdot r_2)$ & $\dn$ & $L(r_1) \,@\, L(r_2)$ & \pcode{SEQ}\\$L(r^*)$ & $\dn$ & $(L(r))^*$ & \pcode{STAR}\\\end{tabular}\end{center}\noindent You would need to implement this function in orderto state the theorem about the correctness of the algorithm.The function $L$ should in Isabelle take a \pcode{rexp} asinput and return a set of strings. Its type istherefore \begin{center}\pcode{L} \pcode{::} \pcode{rexp} $\Rightarrow$ \pcode{string set}\end{center}\noindent Isabelle treats strings as an abbreviation for listsof characters. This means you can pattern-match strings likelists. The union operation on sets (for the \pcode{ALT}-case)is a standard definition in Isabelle, but not theconcatenation operation on sets and also not thestar-operation. You would have to supply these definitions.The concatenation operation can be defined in terms of theappend function, written \code{_ @ _} in Isabelle, for lists.The star-operation can be defined as a ``big-union'' of powers, like in the lectures, or directly as an inductive set.The functions for the matcher are shown inFigure~\ref{matcher}. The theorem that needs to be proved is\begin{lstlisting}[numbers=none,language={},keywordstyle=\color{black}\ttfamily,mathescape]theorem "matches r s $\longleftrightarrow$ s $\in$ L r"\end{lstlisting}\noindent which states that the function \emph{matches} istrue if and only if the string is in the language of theregular expression. A proof for this lemma will needside-lemmas about \pcode{nullable} and \pcode{der}. An exampleproof in Isabelle that will not be relevant for the theoremabove is given in Figure~\ref{proof}.\begin{figure}[p]\begin{lstlisting}[language={},keywordstyle=\color{black}\ttfamily,mathescape]fun nullable :: "rexp $\Rightarrow$ bool"where "nullable ZERO = False"| "nullable ONE = True"| "nullable (CHAR _) = False"| "nullable (ALT r1 r2) = (nullable(r1) $\vee$ nullable(r2))"| "nullable (SEQ r1 r2) = (nullable(r1) $\wedge$ nullable(r2))"| "nullable (STAR _) = True"fun der :: "char $\Rightarrow$ rexp $\Rightarrow$ rexp"where "der c ZERO = ZERO"| "der c ONE = ZERO"| "der c (CHAR d) = (if c = d then ONE else ZERO)"| "der c (ALT r1 r2) = ALT (der c r1) (der c r2)"| "der c (SEQ r1 r2) = (if (nullable r1) then ALT (SEQ (der c r1) r2) (der c r2) else SEQ (der c r1) r2)"| "der c (STAR r) = SEQ (der c r) (STAR r)"fun ders :: "rexp $\Rightarrow$ string $\Rightarrow$ rexp"where "ders r [] = r"| "ders r (c # s) = ders (der c r) s"fun matches :: "rexp $\Rightarrow$ string $\Rightarrow$ bool"where "matches r s = nullable (ders r s)" \end{lstlisting}\caption{The definition of the matcher algorithm in Isabelle.\label{matcher}}\end{figure}\begin{figure}[p]\begin{lstlisting}[language={},keywordstyle=\color{black}\ttfamily,mathescape]fun zeroable :: "rexp $\Rightarrow$ bool"where "zeroable ZERO = True"| "zeroable ONE = False"| "zeroable (CHAR _) = False"| "zeroable (ALT r1 r2) = (zeroable(r1) $\wedge$ zeroable(r2))"| "zeroable (SEQ r1 r2) = (zeroable(r1) $\vee$ zeroable(r2))"| "zeroable (STAR _) = False"lemma "zeroable r $\longleftrightarrow$ L r = {}"proof (induct) case (ZERO) have "zeroable ZERO" "L ZERO = {}" by simp_all then show "zeroable ZERO $\longleftrightarrow$ (L ZERO = {})" by simpnext case (ONE) have "$\neg$ zeroable ONE" "L ONE = {[]}" by simp_all then show "zeroable ONE $\longleftrightarrow$ (L ONE = {})" by simpnext case (CHAR c) have "$\neg$ zeroable (CHAR c)" "L (CHAR c) = {[c]}" by simp_all then show "zeroable (CHAR c) $\longleftrightarrow$ (L (CHAR c) = {})" by simpnext case (ALT r1 r2) have ih1: "zeroable r1 $\longleftrightarrow$ L r1 = {}" by fact have ih2: "zeroable r2 $\longleftrightarrow$ L r2 = {}" by fact show "zeroable (ALT r1 r2) $\longleftrightarrow$ (L (ALT r1 r2) = {})" using ih1 ih2 by simpnext case (SEQ r1 r2) have ih1: "zeroable r1 $\longleftrightarrow$ L r1 = {}" by fact have ih2: "zeroable r2 $\longleftrightarrow$ L r2 = {}" by fact show "zeroable (SEQ r1 r2) $\longleftrightarrow$ (L (SEQ r1 r2) = {})" using ih1 ih2 by (auto simp add: Conc_def)next case (STAR r) have "$\neg$ zeroable (STAR r)" "[] $\in$ L (r) ^ 0" by simp_all then show "zeroable (STAR r) $\longleftrightarrow$ (L (STAR r) = {})" by (simp (no_asm) add: Star_def) blastqed\end{lstlisting}\caption{An Isabelle proof about the function \texttt{zeroable}.\label{proof}}\end{figure}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: