theory CookBookimports Mainuses "~/Theorem-Provers/Isabelle/Isabelle-CVS/Doc/antiquote_setup.ML"("comp_simproc")begin(*<*)ML {*local structure O = ThyOutputin fun check_exists f = if File.exists (Path.explode ("~~/src/" ^ f)) then () else error ("Source file " ^ quote f ^ " does not exist.") val _ = O.add_commands [("ML_file", O.args (Scan.lift Args.name) (O.output (fn _ => fn name => (check_exists name; Pretty.str name))))];end*}(*>*)section {* Introduction *}text {* The purpose of this document is to guide the reader through the first steps in Isabelle programming, and to provide recipes for solving common problems. *}subsection {* Intended Audience and Prior Knowledge *}text {* This cookbook targets an audience who already knows how to use the Isabelle system to write theories and proofs, but without using ML. You should also be familiar with the \emph{Standard ML} programming language, which is used for Isabelle programming. If you are unfamiliar with any of these two subjects, you should first work through the Isabelle/HOL tutorial \cite{isa-tutorial} and Paulson's book on Standard ML \cite{paulson-ml2}.*}subsection {* Primary Documentation *}text {* \begin{description} \item[The Implementation Manual \cite{isa-imp}] describes Isabelle from a programmer's perspective, documenting both the underlying concepts and the concrete interfaces. \item[The Isabelle Reference Manual \cite{isabelle-ref}] is an older document that used to be the main reference, when all reasoning happened on the ML level. Many parts of it are outdated now, but some parts, mainly the chapters on tactics, are still useful. \item[The code] is of course the ultimate reference for how things really work. Therefore you should not hesitate to look at the way things are actually implemented. More importantly, it is often good to look at code that does similar things as you want to do, to learn from other people's code. \end{description} Since Isabelle is not a finished product, these manuals, just like the implementation itself, are always under construction. This can be dificult and frustrating at times, when interfaces are changing frequently. But it is a reality that progress means changing things (FIXME: need some short and convincing comment that this is a strategy, not a problem that should be solved).*}section {* First Steps *}text {* Isabelle programming happens in an enhanced dialect of Standard ML, which adds antiquotations containing references to the logical context. Just like all lemmas or proofs, all ML code that you write lives in a theory, where it is embedded using the \isacommand{ML} command:*}ML {* 3 + 4*}text {* The \isacommand{ML} command takes an arbitrary ML expression, which is evaluated. It can also contain value or function bindings.*}subsection {* Antiquotations *}text {* The main advantage of embedding all code in a theory is that the code can contain references to entities that are defined in the theory. Let us for example, print out the name of the current theory:*}ML {* Context.theory_name @{theory} *}text {* The @{text "@{theory}"} antiquotation is substituted with the current theory, whose name can then be extracted using a the function @{ML "Context.theory_name"}. Note that antiquotations are statically scoped. The function*}ML {* fun current_thyname () = Context.theory_name @{theory}*}text {* does \emph{not} return the name of the current theory. Instead, we have defined the constant function that always returns the string @{ML "\"CookBook\""}, which is the name of @{text "@{theory}"} at the point where the code is embedded. Operationally speaking, @{text "@{theory}"} is \emph{not} replaced with code that will look up the current theory in some (destructive) data structure and return it. Instead, it is really replaced with the theory value. In the course of this introduction, we will learn about more of these antoquotations, which greatly simplify programming, since you can directly access all kinds of logical elements from ML.*}subsection {* Terms *}text {* We can simply quote Isabelle terms from ML using the @{text "@{term \<dots>}"} antiquotation:*}ML {* @{term "(a::nat) + b = c"} *}text {* This shows the term @{term "(a::nat) + b = c"} in the internal representation, with all gory details. Terms are just an ML datatype, and they are defined in @{ML_file "Pure/term.ML"}. The representation of terms uses deBruin indices: Bound variables are represented by the constructor @{ML Bound}, and the index refers to the number of lambdas we have to skip until we hit the lambda that binds the variable. The names of bound variables are kept at the abstractions, but they are just comments. See \ichcite{ch:logic} for more details. \begin{readmore} Terms are described in detail in \ichcite{ch:logic}. Their definition and many useful operations can be found in @{ML_file "Pure/term.ML"}. \end{readmore} In a similar way we can quote types and theorems:*}ML {* @{typ "(int * nat) list"} *}ML {* @{thm allI} *}text {* In the default setup, types and theorems are printed as strings. *}text {* Sometimes the internal representation can be surprisingly different from what you see at the user level, because the layer of parsing/type checking/pretty printing can be quite thick. \begin{exercise} Look at the internal term representation of the following terms, and find out why they are represented like this. \begin{itemize} \item @{term "case x of 0 \<Rightarrow> 0 | Suc y \<Rightarrow> y"} \item @{term "\<lambda>(x,y). P y x"} \item @{term "{ [x::int] | x. x \<le> -2 }"} \end{itemize} Hint: The third term is already quite big, and the pretty printer may omit parts of it by default. If you want to see all of it, you can use @{ML "print_depth 50"} to set the limit to a value high enough.\end{exercise}*}subsection {* Type checking *}text {* We can freely construct and manipulate terms, since they are just arbitrary unchecked trees. However, we eventually want to see if a term is wellformed in a certain context. Type checking is done via @{ML cterm_of}, which turns a @{ML_type term} into a @{ML_type cterm}, a \emph{certified} term. Unlike @{ML_type term}s, which are just trees, @{ML_type "cterm"}s are abstract objects that are guaranteed to be type-correct, and can only be constructed via the official interfaces. Type checking is always relative to a theory context. For now we can use the @{ML "@{theory}"} antiquotation to get hold of the theory at the current point:*}ML {* let val natT = @{typ "nat"} val zero = @{term "0::nat"}(*Const ("HOL.zero_class.zero", natT)*) in cterm_of @{theory} (Const ("HOL.plus_class.plus", natT --> natT --> natT) $ zero $ zero) end*}ML {* @{const_name plus}*}ML {* @{term "{ [x::int] | x. x \<le> -2 }"}*}text {* The internal names of constants like @{term "zero"} or @{text "+"} are often more complex than one first expects. Here, the extra prefixes @{text zero_class} and @{text plus_class} are present because the constants are defined within a type class. Guessing such internal names can be extremely hard, which is why the system provides another antiquotation: @{ML "@{const_name plus}"} gives just this name. \begin{exercise} Write a function @{ML_text "rev_sum : term -> term"} that takes a term of the form @{text "t\<^isub>1 + t\<^isub>2 + \<dots> + t\<^isub>n"} and returns the reversed sum @{text "t\<^isub>n + \<dots> + t\<^isub>2 + t\<^isub>1"}. Note that @{text "+"} associates to the left. Try your function on some examples, and see if the result typechecks. \end{exercise} \begin{exercise} Write a function which takes two terms representing natural numbers in unary (like @{term "Suc (Suc (Suc 0))"}), and produce the unary number representing their sum. \end{exercise} \begin{exercise} Look at the functions defined in @{ML_file "Pure/logic.ML"} and @{ML_file "HOL/hologic.ML"} and see if they can make your life easier. \end{exercise}*}subsection {* Theorems *}text {* Just like @{ML_type cterm}s, theorems (of type @{ML_type thm}) are abstract objects that can only be built by going through the kernel interfaces, which means that all your proofs will be checked. The basic rules of the Isabelle/Pure logical framework are defined in @{ML_file "Pure/thm.ML"}. Using these rules, which are just ML functions, you can do simple natural deduction proofs on the ML level. For example, the statement @{prop "(\<And>(x::nat). P x \<Longrightarrow> Q x) \<Longrightarrow> P t \<Longrightarrow> Q t"} can be proved like this\footnote{Note that @{text "|>"} is just reverse application. This combinator, and several variants are defined in @{ML_file "Pure/General/basics.ML"}}:*}ML {*let val thy = @{theory} val nat = HOLogic.natT val x = Free ("x", nat) val t = Free ("t", nat) val P = Free ("P", nat --> HOLogic.boolT) val Q = Free ("Q", nat --> HOLogic.boolT) val A1 = Logic.all x (Logic.mk_implies (HOLogic.mk_Trueprop (P $ x), HOLogic.mk_Trueprop (Q $ x))) |> cterm_of thy val A2 = HOLogic.mk_Trueprop (P $ t) |> cterm_of thy val Pt_implies_Qt = assume A1 |> forall_elim (cterm_of thy t) val Qt = implies_elim Pt_implies_Qt (assume A2)in Qt |> implies_intr A2 |> implies_intr A1end*}subsection {* Tactical reasoning *}text {* The goal-oriented tactical style is similar to the @{text apply} style at the user level. Reasoning is centered around a \emph{goal}, which is modified in a sequence of proof steps until it is solved. A goal (or goal state) is a special @{ML_type thm}, which by convention is an implication: @{text[display] "A\<^isub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^isub>n \<Longrightarrow> #(C)"} Since the final result @{term C} could again be an implication, there is the @{text "#"} around the final result, which protects its premises from being misinterpreted as open subgoals. The protection @{text "# :: prop \<Rightarrow> prop"} is just the identity and used as a syntactic marker. Now tactics are just functions that map a goal state to a (lazy) sequence of successor states, hence the type of a tactic is @{ML_type[display] "thm -> thm Seq.seq"} See @{ML_file "Pure/General/seq.ML"} for the implementation of lazy sequences. Of course, tactics are expected to behave nicely and leave the final conclusion @{term C} intact. In order to start a tactical proof for @{term A}, we just set up the trivial goal @{text "A \<Longrightarrow> #(A)"} and run the tactic on it. When the subgoal is solved, we have just @{text "#(A)"} and can remove the protection. The operations in @{ML_file "Pure/goal.ML"} do just that and we can use them. Let us transcribe a simple apply style proof from the tutorial\cite{isa-tutorial} into ML:*}lemma disj_swap: "P \<or> Q \<Longrightarrow> Q \<or> P"apply (erule disjE) apply (rule disjI2) apply assumptionapply (rule disjI1)apply assumptiondoneML {*let val ctxt = @{context} val goal = @{prop "P \<or> Q \<Longrightarrow> Q \<or> P"}in Goal.prove ctxt ["P", "Q"] [] goal (fn _ => eresolve_tac [disjE] 1 THEN resolve_tac [disjI2] 1 THEN assume_tac 1 THEN resolve_tac [disjI1] 1 THEN assume_tac 1)end*}text {* Tactics that affect only a certain subgoal, take a subgoal number as an integer parameter. Here we always work on the first subgoal, following exactly the @{text "apply"} script.*}section {* Case Study: Relation Composition *}text {* \emph{Note: This is completely unfinished. I hoped to have a section with a nontrivial example, but I ran into several problems.} Recall that HOL has special syntax for set comprehensions: @{term "{ f x y |x y. P x y}"} abbreviates @{term[source] "{u. \<exists>x y. u = f x y \<and> P x y}"}. We will automatically prove statements of the following form: @{lemma[display] "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x, r\<^isub>2 x) |x. P\<^isub>2 x} = {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}" by auto} In Isabelle, relation composition is defined to be consistent with function composition, that is, the relation applied ``first'' is written on the right hand side. This different from what many textbooks do. The above statement about composition is not proved automatically by @{method simp}, and it cannot be solved by a fixed set of rewrite rules, since the number of (implicit) quantifiers may vary. Here, we only have one bound variable in each comprehension, but in general there can be more. On the other hand, @{method auto} proves the above statement quickly, by breaking the equality into two parts and proving them separately. However, if e.g.\ @{term "P\<^isub>1"} is a complicated expression, the automated tools may get confused. Our goal is now to develop a small procedure that can compute (with proof) the composition of two relation comprehensions, which can be used to extend the simplifier.*}subsection {*A tactic *}text {* Let's start with a step-by-step proof of the above statement *}lemma "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x, r\<^isub>2 x) |x. P\<^isub>2 x} = {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"apply (rule set_ext)apply (rule iffI) apply (erule rel_compE) -- {* @{text "\<subseteq>"} *} apply (erule CollectE) -- {* eliminate @{text "Collect"}, @{text "\<exists>"}, @{text "\<and>"}, and pairs *} apply (erule CollectE) apply (erule exE) apply (erule exE) apply (erule conjE) apply (erule conjE) apply (erule Pair_inject) apply (erule Pair_inject) apply (simp only:) apply (rule CollectI) -- {* introduce them again *} apply (rule exI) apply (rule exI) apply (rule conjI) apply (rule refl) apply (rule conjI) apply (rule sym) apply (assumption) apply (rule conjI) apply assumption apply assumptionapply (erule CollectE) -- {* @{text "\<subseteq>"} *}apply (erule exE)+apply (erule conjE)+apply (simp only:)apply (rule rel_compI) apply (rule CollectI) apply (rule exI) apply (rule conjI) apply (rule refl) apply assumptionapply (rule CollectI)apply (rule exI)apply (rule conjI)apply (subst Pair_eq)apply (rule conjI) apply assumptionapply (rule refl)apply assumptiondonetext {* The reader will probably need to step through the proof and verify that there is nothing spectacular going on here. The @{text apply} script just applies the usual elimination and introduction rules in the right order. This script is of course totally unreadable. But we are not trying to produce pretty Isar proofs here. We just want to find out which rules are needed and how they must be applied to complete the proof. And a detailed apply-style proof can often be turned into a tactic quite easily. Of course we must resist the temptation to use @{method auto}, @{method blast} and friends, since their behaviour is not predictable enough. But the simple @{method rule} and @{method erule} methods are fine. Notice that this proof depends only in one detail on the concrete equation that we want to prove: The number of bound variables in the comprehension corresponds to the number of existential quantifiers that we have to eliminate and introduce again. In fact this is the only reason why the equations that we want to prove are not just instances of a single rule. Here is the ML equivalent of the tactic script above:*}ML {*val compr_compose_tac = rtac @{thm set_ext} THEN' rtac @{thm iffI} THEN' etac @{thm rel_compE} THEN' etac @{thm CollectE} THEN' etac @{thm CollectE} THEN' (fn i => REPEAT (etac @{thm exE} i)) THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' etac @{thm Pair_inject} THEN' etac @{thm Pair_inject} THEN' asm_full_simp_tac HOL_basic_ss THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' rtac @{thm refl} THEN' rtac @{thm conjI} THEN' rtac @{thm sym} THEN' assume_tac THEN' rtac @{thm conjI} THEN' assume_tac THEN' assume_tac THEN' etac @{thm CollectE} THEN' (fn i => REPEAT (etac @{thm exE} i)) THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' asm_full_simp_tac HOL_basic_ss THEN' rtac @{thm rel_compI} THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' rtac @{thm refl} THEN' assume_tac THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' simp_tac (HOL_basic_ss addsimps [@{thm Pair_eq}]) THEN' rtac @{thm conjI} THEN' assume_tac THEN' rtac @{thm refl} THEN' assume_tac*}lemma test1: "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x, r\<^isub>2 x) |x. P\<^isub>2 x} = {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"by (tactic "compr_compose_tac 1")lemma test3: "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x z, r\<^isub>2 x z) |x z. P\<^isub>2 x z} = {(l\<^isub>2 x z, r\<^isub>1 y) |x y z. r\<^isub>2 x z = l\<^isub>1 y \<and> P\<^isub>2 x z \<and> P\<^isub>1 y}"by (tactic "compr_compose_tac 1")text {* So we have a tactic that works on at least two examples. Getting it really right requires some more effort. Consider the goal*}lemma "{(n, Suc n) |n. n > 0} O {(n, Suc n) |n. P n} = {(n, Suc m)|n m. Suc n = m \<and> P n \<and> m > 0}"(*lemma "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x, r\<^isub>2 x) |x. P\<^isub>2 x} = {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"*)txt {* This is exactly an instance of @{fact test1}, but our tactic fails on it with the usual uninformative \emph{empty result requence}. We are now in the frequent situation that we need to debug. One simple instrument for this is @{ML "print_tac"}, which is the same as @{ML all_tac} (the identity for @{ML_text "THEN"}), i.e.\ it does nothing, but it prints the current goal state as a side effect. Another debugging option is of course to step through the interactive apply script. Finding the problem could be taken as an exercise for the patient reader, and we will go ahead with the solution. The problem is that in this instance the simplifier does more than it did in the general version of lemma @{fact test1}. Since @{text "l\<^isub>1"} and @{text "l\<^isub>2"} are just the identity function, the equation corresponding to @{text "l\<^isub>1 y = r\<^isub>2 x "} becomes @{text "m = Suc n"}. Then the simplifier eagerly replaces all occurences of @{term "m"} by @{term "Suc n"} which destroys the structure of the proof. This is perhaps the most important lesson to learn, when writing tactics: \textbf{Avoid automation at all cost!!!}. Let us look at the proof state at the point where the simplifier is invoked:*}(*<*)apply (rule set_ext)apply (rule iffI) apply (erule rel_compE) apply (erule CollectE) apply (erule CollectE) apply (erule exE) apply (erule exE) apply (erule conjE) apply (erule conjE) apply (erule Pair_inject) apply (erule Pair_inject)(*>*)txt {* @{subgoals[display]} Like in the apply proof, we now want to eliminate the equations that ``define'' @{term x}, @{term xa} and @{term z}. The other equations are just there by coincidence, and we must not touch them. For such purposes, there is the internal tactic @{text "hyp_subst_single"}. Its job is to take exactly one premise of the form @{term "v = t"}, where @{term v} is a variable, and replace @{term "v"} in the whole subgoal. The hypothesis to eliminate is given by its position. We can use this tactic to eliminate @{term x}:*}apply (tactic "single_hyp_subst_tac 0 1")txt {* @{subgoals[display]}*}apply (tactic "single_hyp_subst_tac 2 1")apply (tactic "single_hyp_subst_tac 2 1")apply (tactic "single_hyp_subst_tac 3 1") apply (rule CollectI) -- {* introduce them again *} apply (rule exI) apply (rule exI) apply (rule conjI) apply (rule refl) apply (rule conjI) apply (assumption) apply (rule conjI) apply assumption apply assumptionapply (erule CollectE) -- {* @{text "\<subseteq>"} *}apply (erule exE)+apply (erule conjE)+apply (tactic "single_hyp_subst_tac 0 1")apply (rule rel_compI) apply (rule CollectI) apply (rule exI) apply (rule conjI) apply (rule refl) apply assumptionapply (rule CollectI)apply (rule exI)apply (rule conjI)apply (subst Pair_eq)apply (rule conjI) apply assumptionapply (rule refl)apply assumptiondoneML {*val compr_compose_tac = rtac @{thm set_ext} THEN' rtac @{thm iffI} THEN' etac @{thm rel_compE} THEN' etac @{thm CollectE} THEN' etac @{thm CollectE} THEN' (fn i => REPEAT (etac @{thm exE} i)) THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' etac @{thm Pair_inject} THEN' etac @{thm Pair_inject} THEN' single_hyp_subst_tac 0 THEN' single_hyp_subst_tac 2 THEN' single_hyp_subst_tac 2 THEN' single_hyp_subst_tac 3 THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' rtac @{thm refl} THEN' rtac @{thm conjI} THEN' assume_tac THEN' rtac @{thm conjI} THEN' assume_tac THEN' assume_tac THEN' etac @{thm CollectE} THEN' (fn i => REPEAT (etac @{thm exE} i)) THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' etac @{thm conjE} THEN' single_hyp_subst_tac 0 THEN' rtac @{thm rel_compI} THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' rtac @{thm refl} THEN' assume_tac THEN' rtac @{thm CollectI} THEN' (fn i => REPEAT (rtac @{thm exI} i)) THEN' rtac @{thm conjI} THEN' stac @{thm Pair_eq} THEN' rtac @{thm conjI} THEN' assume_tac THEN' rtac @{thm refl} THEN' assume_tac*}lemma "{(n, Suc n) |n. n > 0 \<and> A} O {(n, Suc n) |n m. P m n} = {(n, Suc m)|n m' m. Suc n = m \<and> P m' n \<and> (m > 0 \<and> A)}"apply (tactic "compr_compose_tac 1")donetext {* The next step is now to turn this tactic into a simplification procedure. This just means that we need some code that builds the term of the composed relation.*}use "comp_simproc"(*<*)(*simproc_setup mysp ("x O y") = {* compose_simproc *}*)lemma "{(n, Suc n) |n. n > 0 \<and> A} O {(n, Suc n) |n m. P m n} = x"(*apply (simp del:ex_simps)*)oopslemma "({(g m, k) | m k. Q m k} O {(h j, f j) | j. R j}) = x"(*apply (simp del:ex_simps) *)oopslemma "{uu. \<exists>j m k. uu = (h j, k) \<and> f j = g m \<and> R j \<and> Q m k}O {(h j, f j) | j. R j} = x"(*apply (simp del:ex_simps)*)oopslemma " { (l x, r x) | x. P x \<and> Q x \<and> Q' x }O { (l1 x, r1 x) | x. P1 x \<and> Q1 x \<and> Q1' x }= A"(*apply (simp del:ex_simps)*)oopslemma " { (l x, r x) | x. P x }O { (l1 x, r1 x) | x. P1 x }O { (l2 x, r2 x) | x. P2 x }= A"(*apply (simp del:ex_simps)*)oopslemma "{(f n, m) |n m. P n m} O ({(g m, k) | m k. Q m k} O {(h j, f j) | j. R j}) = x"(*apply (simp del:ex_simps)*)oopslemma "{u. \<exists>n. u=(f n, g n)} O {u. \<exists>n. u=(h n, j n)} = A"oops(*>*)end