split the document into smaller pieces;
made it standalone by copying antiquote_setup.ML into the repository
added cover page
theory FirstSteps
imports Main
uses "antiquote_setup.ML"
("comp_simproc")
begin
(*<*)
ML {*
local structure O = ThyOutput
in
fun check_exists f =
if File.exists (Path.explode ("~~/src/" ^ f)) then ()
else error ("Source file " ^ quote f ^ " does not exist.")
val _ = O.add_commands
[("ML_file", O.args (Scan.lift Args.name) (O.output (fn _ => fn name =>
(check_exists name; Pretty.str name))))];
end
*}
(*>*)
chapter {* First Steps *}
text {*
Isabelle programming happens in an enhanced dialect of Standard ML,
which adds antiquotations containing references to the logical
context.
Just like all lemmas or proofs, all ML code that you write lives in
a theory, where it is embedded using the \isacommand{ML} command:
*}
ML {*
3 + 4
*}
text {*
The \isacommand{ML} command takes an arbitrary ML expression, which
is evaluated. It can also contain value or function bindings.
*}
section {* Antiquotations *}
text {*
The main advantage of embedding all code in a theory is that the
code can contain references to entities that are defined in the
theory. Let us for example, print out the name of the current
theory:
*}
ML {* Context.theory_name @{theory} *}
text {* The @{text "@{theory}"} antiquotation is substituted with the
current theory, whose name can then be extracted using a the
function @{ML "Context.theory_name"}. Note that antiquotations are
statically scoped. The function
*}
ML {*
fun current_thyname () = Context.theory_name @{theory}
*}
text {*
does \emph{not} return the name of the current theory. Instead, we have
defined the constant function that always returns the string @{ML "\"CookBook\""}, which is
the name of @{text "@{theory}"} at the point where the code
is embedded. Operationally speaking, @{text "@{theory}"} is \emph{not}
replaced with code that will look up the current theory in some
(destructive) data structure and return it. Instead, it is really
replaced with the theory value.
In the course of this introduction, we will learn about more of
these antoquotations, which greatly simplify programming, since you
can directly access all kinds of logical elements from ML.
*}
section {* Terms *}
text {*
We can simply quote Isabelle terms from ML using the @{text "@{term \<dots>}"}
antiquotation:
*}
ML {* @{term "(a::nat) + b = c"} *}
text {*
This shows the term @{term "(a::nat) + b = c"} in the internal
representation, with all gory details. Terms are just an ML
datatype, and they are defined in @{ML_file "Pure/term.ML"}.
The representation of terms uses deBruin indices: Bound variables
are represented by the constructor @{ML Bound}, and the index refers to
the number of lambdas we have to skip until we hit the lambda that
binds the variable. The names of bound variables are kept at the
abstractions, but they are just comments.
See \ichcite{ch:logic} for more details.
\begin{readmore}
Terms are described in detail in \ichcite{ch:logic}. Their
definition and many useful operations can be found in @{ML_file "Pure/term.ML"}.
\end{readmore}
In a similar way we can quote types and theorems:
*}
ML {* @{typ "(int * nat) list"} *}
ML {* @{thm allI} *}
text {*
In the default setup, types and theorems are printed as strings.
*}
text {*
Sometimes the internal representation can be surprisingly different
from what you see at the user level, because the layer of
parsing/type checking/pretty printing can be quite thick.
\begin{exercise}
Look at the internal term representation of the following terms, and
find out why they are represented like this.
\begin{itemize}
\item @{term "case x of 0 \<Rightarrow> 0 | Suc y \<Rightarrow> y"}
\item @{term "\<lambda>(x,y). P y x"}
\item @{term "{ [x::int] | x. x \<le> -2 }"}
\end{itemize}
Hint: The third term is already quite big, and the pretty printer
may omit parts of it by default. If you want to see all of it, you
can use @{ML "print_depth 50"} to set the limit to a value high enough.
\end{exercise}
*}
section {* Type checking *}
text {*
We can freely construct and manipulate terms, since they are just
arbitrary unchecked trees. However, we eventually want to see if a
term is wellformed in a certain context.
Type checking is done via @{ML cterm_of}, which turns a @{ML_type
term} into a @{ML_type cterm}, a \emph{certified} term. Unlike
@{ML_type term}s, which are just trees, @{ML_type
"cterm"}s are abstract objects that are guaranteed to be
type-correct, and can only be constructed via the official
interfaces.
Type
checking is always relative to a theory context. For now we can use
the @{ML "@{theory}"} antiquotation to get hold of the theory at the current
point:
*}
ML {*
let
val natT = @{typ "nat"}
val zero = @{term "0::nat"}(*Const ("HOL.zero_class.zero", natT)*)
in
cterm_of @{theory}
(Const ("HOL.plus_class.plus", natT --> natT --> natT)
$ zero $ zero)
end
*}
ML {*
@{const_name plus}
*}
ML {*
@{term "{ [x::int] | x. x \<le> -2 }"}
*}
text {*
The internal names of constants like @{term "zero"} or @{text "+"} are
often more complex than one first expects. Here, the extra prefixes
@{text zero_class} and @{text plus_class} are present because the
constants are defined within a type class. Guessing such internal
names can be extremely hard, which is why the system provides
another antiquotation: @{ML "@{const_name plus}"} gives just this
name.
\begin{exercise}
Write a function @{ML_text "rev_sum : term -> term"} that takes a
term of the form @{text "t\<^isub>1 + t\<^isub>2 + \<dots> + t\<^isub>n"}
and returns the reversed sum @{text "t\<^isub>n + \<dots> + t\<^isub>2 + t\<^isub>1"}.
Note that @{text "+"} associates to the left.
Try your function on some examples, and see if the result typechecks.
\end{exercise}
\begin{exercise}
Write a function which takes two terms representing natural numbers
in unary (like @{term "Suc (Suc (Suc 0))"}), and produce the unary
number representing their sum.
\end{exercise}
\begin{exercise}
Look at the functions defined in @{ML_file "Pure/logic.ML"} and
@{ML_file "HOL/hologic.ML"} and see if they can make your life
easier.
\end{exercise}
*}
section {* Theorems *}
text {*
Just like @{ML_type cterm}s, theorems (of type @{ML_type thm}) are
abstract objects that can only be built by going through the kernel
interfaces, which means that all your proofs will be checked. The
basic rules of the Isabelle/Pure logical framework are defined in
@{ML_file "Pure/thm.ML"}.
Using these rules, which are just ML functions, you can do simple
natural deduction proofs on the ML level. For example, the statement
@{prop "(\<And>(x::nat). P x \<Longrightarrow> Q x) \<Longrightarrow> P t \<Longrightarrow> Q t"} can be proved like
this\footnote{Note that @{text "|>"} is just reverse
application. This combinator, and several variants are defined in
@{ML_file "Pure/General/basics.ML"}}:
*}
ML {*
let
val thy = @{theory}
val nat = HOLogic.natT
val x = Free ("x", nat)
val t = Free ("t", nat)
val P = Free ("P", nat --> HOLogic.boolT)
val Q = Free ("Q", nat --> HOLogic.boolT)
val A1 = Logic.all x
(Logic.mk_implies (HOLogic.mk_Trueprop (P $ x),
HOLogic.mk_Trueprop (Q $ x)))
|> cterm_of thy
val A2 = HOLogic.mk_Trueprop (P $ t)
|> cterm_of thy
val Pt_implies_Qt =
assume A1
|> forall_elim (cterm_of thy t)
val Qt = implies_elim Pt_implies_Qt (assume A2)
in
Qt
|> implies_intr A2
|> implies_intr A1
end
*}
section {* Tactical reasoning *}
text {*
The goal-oriented tactical style is similar to the @{text apply}
style at the user level. Reasoning is centered around a \emph{goal},
which is modified in a sequence of proof steps until it is solved.
A goal (or goal state) is a special @{ML_type thm}, which by
convention is an implication:
@{text[display] "A\<^isub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^isub>n \<Longrightarrow> #(C)"}
Since the final result @{term C} could again be an implication, there is the
@{text "#"} around the final result, which protects its premises from being
misinterpreted as open subgoals. The protection @{text "# :: prop \<Rightarrow>
prop"} is just the identity and used as a syntactic marker.
Now tactics are just functions that map a goal state to a (lazy)
sequence of successor states, hence the type of a tactic is
@{ML_type[display] "thm -> thm Seq.seq"}
See @{ML_file "Pure/General/seq.ML"} for the implementation of lazy
sequences.
Of course, tactics are expected to behave nicely and leave the final
conclusion @{term C} intact. In order to start a tactical proof for
@{term A}, we
just set up the trivial goal @{text "A \<Longrightarrow> #(A)"} and run the tactic
on it. When the subgoal is solved, we have just @{text "#(A)"} and
can remove the protection.
The operations in @{ML_file "Pure/goal.ML"} do just that and we can use
them.
Let us transcribe a simple apply style proof from the
tutorial\cite{isa-tutorial} into ML:
*}
lemma disj_swap: "P \<or> Q \<Longrightarrow> Q \<or> P"
apply (erule disjE)
apply (rule disjI2)
apply assumption
apply (rule disjI1)
apply assumption
done
ML {*
let
val ctxt = @{context}
val goal = @{prop "P \<or> Q \<Longrightarrow> Q \<or> P"}
in
Goal.prove ctxt ["P", "Q"] [] goal (fn _ =>
eresolve_tac [disjE] 1
THEN resolve_tac [disjI2] 1
THEN assume_tac 1
THEN resolve_tac [disjI1] 1
THEN assume_tac 1)
end
*}
text {*
Tactics that affect only a certain subgoal, take a subgoal number as
an integer parameter. Here we always work on the first subgoal,
following exactly the @{text "apply"} script.
*}
section {* Case Study: Relation Composition *}
text {*
\emph{Note: This is completely unfinished. I hoped to have a section
with a nontrivial example, but I ran into several problems.}
Recall that HOL has special syntax for set comprehensions:
@{term "{ f x y |x y. P x y}"} abbreviates
@{term[source] "{u. \<exists>x y. u = f x y \<and> P x y}"}.
We will automatically prove statements of the following form:
@{lemma[display] "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x, r\<^isub>2 x) |x. P\<^isub>2 x}
= {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and>
P\<^isub>2 x \<and> P\<^isub>1 y}" by auto}
In Isabelle, relation composition is defined to be consistent with
function composition, that is, the relation applied ``first'' is
written on the right hand side. This different from what many
textbooks do.
The above statement about composition is not proved automatically by
@{method simp}, and it cannot be solved by a fixed set of rewrite
rules, since the number of (implicit) quantifiers may vary. Here, we
only have one bound variable in each comprehension, but in general
there can be more. On the other hand, @{method auto} proves the
above statement quickly, by breaking the equality into two parts and
proving them separately. However, if e.g.\ @{term "P\<^isub>1"} is a
complicated expression, the automated tools may get confused.
Our goal is now to develop a small procedure that can compute (with proof) the
composition of two relation comprehensions, which can be used to
extend the simplifier.
*}
section {*A tactic *}
text {* Let's start with a step-by-step proof of the above statement *}
lemma "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2
x, r\<^isub>2 x) |x. P\<^isub>2 x}
= {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"
apply (rule set_ext)
apply (rule iffI)
apply (erule rel_compE) -- {* @{text "\<subseteq>"} *}
apply (erule CollectE) -- {* eliminate @{text "Collect"}, @{text "\<exists>"}, @{text "\<and>"}, and pairs *}
apply (erule CollectE)
apply (erule exE)
apply (erule exE)
apply (erule conjE)
apply (erule conjE)
apply (erule Pair_inject)
apply (erule Pair_inject)
apply (simp only:)
apply (rule CollectI) -- {* introduce them again *}
apply (rule exI)
apply (rule exI)
apply (rule conjI)
apply (rule refl)
apply (rule conjI)
apply (rule sym)
apply (assumption)
apply (rule conjI)
apply assumption
apply assumption
apply (erule CollectE) -- {* @{text "\<subseteq>"} *}
apply (erule exE)+
apply (erule conjE)+
apply (simp only:)
apply (rule rel_compI)
apply (rule CollectI)
apply (rule exI)
apply (rule conjI)
apply (rule refl)
apply assumption
apply (rule CollectI)
apply (rule exI)
apply (rule conjI)
apply (subst Pair_eq)
apply (rule conjI)
apply assumption
apply (rule refl)
apply assumption
done
text {*
The reader will probably need to step through the proof and verify
that there is nothing spectacular going on here.
The @{text apply} script just applies the usual elimination and introduction rules in the right order.
This script is of course totally unreadable. But we are not trying
to produce pretty Isar proofs here. We just want to find out which
rules are needed and how they must be applied to complete the
proof. And a detailed apply-style proof can often be turned into a
tactic quite easily. Of course we must resist the temptation to use
@{method auto}, @{method blast} and friends, since their behaviour
is not predictable enough. But the simple @{method rule} and
@{method erule} methods are fine.
Notice that this proof depends only in one detail on the
concrete equation that we want to prove: The number of bound
variables in the comprehension corresponds to the number of
existential quantifiers that we have to eliminate and introduce
again. In fact this is the only reason why the equations that we
want to prove are not just instances of a single rule.
Here is the ML equivalent of the tactic script above:
*}
ML {*
val compr_compose_tac =
rtac @{thm set_ext}
THEN' rtac @{thm iffI}
THEN' etac @{thm rel_compE}
THEN' etac @{thm CollectE}
THEN' etac @{thm CollectE}
THEN' (fn i => REPEAT (etac @{thm exE} i))
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' etac @{thm Pair_inject}
THEN' etac @{thm Pair_inject}
THEN' asm_full_simp_tac HOL_basic_ss
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' rtac @{thm refl}
THEN' rtac @{thm conjI}
THEN' rtac @{thm sym}
THEN' assume_tac
THEN' rtac @{thm conjI}
THEN' assume_tac
THEN' assume_tac
THEN' etac @{thm CollectE}
THEN' (fn i => REPEAT (etac @{thm exE} i))
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' asm_full_simp_tac HOL_basic_ss
THEN' rtac @{thm rel_compI}
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' rtac @{thm refl}
THEN' assume_tac
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' simp_tac (HOL_basic_ss addsimps [@{thm Pair_eq}])
THEN' rtac @{thm conjI}
THEN' assume_tac
THEN' rtac @{thm refl}
THEN' assume_tac
*}
lemma test1: "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2
x, r\<^isub>2 x) |x. P\<^isub>2 x}
= {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"
by (tactic "compr_compose_tac 1")
lemma test3: "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2 x z, r\<^isub>2 x z) |x z. P\<^isub>2 x z}
= {(l\<^isub>2 x z, r\<^isub>1 y) |x y z. r\<^isub>2 x z = l\<^isub>1 y \<and> P\<^isub>2 x z \<and> P\<^isub>1 y}"
by (tactic "compr_compose_tac 1")
text {*
So we have a tactic that works on at least two examples.
Getting it really right requires some more effort. Consider the goal
*}
lemma "{(n, Suc n) |n. n > 0} O {(n, Suc n) |n. P n}
= {(n, Suc m)|n m. Suc n = m \<and> P n \<and> m > 0}"
(*lemma "{(l\<^isub>1 x, r\<^isub>1 x) |x. P\<^isub>1 x} O {(l\<^isub>2
x, r\<^isub>2 x) |x. P\<^isub>2 x}
= {(l\<^isub>2 x, r\<^isub>1 y) |x y. r\<^isub>2 x = l\<^isub>1 y \<and> P\<^isub>2 x \<and> P\<^isub>1 y}"*)
txt {*
This is exactly an instance of @{fact test1}, but our tactic fails
on it with the usual uninformative
\emph{empty result requence}.
We are now in the frequent situation that we need to debug. One simple instrument for this is @{ML "print_tac"},
which is the same as @{ML all_tac} (the identity for @{ML_text "THEN"}),
i.e.\ it does nothing, but it prints the current goal state as a
side effect.
Another debugging option is of course to step through the interactive apply script.
Finding the problem could be taken as an exercise for the patient
reader, and we will go ahead with the solution.
The problem is that in this instance the simplifier does more than it did in the general version
of lemma @{fact test1}. Since @{text "l\<^isub>1"} and @{text "l\<^isub>2"} are just the identity function,
the equation corresponding to @{text "l\<^isub>1 y = r\<^isub>2 x "}
becomes @{text "m = Suc n"}. Then the simplifier eagerly replaces
all occurences of @{term "m"} by @{term "Suc n"} which destroys the
structure of the proof.
This is perhaps the most important lesson to learn, when writing tactics:
\textbf{Avoid automation at all cost!!!}.
Let us look at the proof state at the point where the simplifier is
invoked:
*}
(*<*)
apply (rule set_ext)
apply (rule iffI)
apply (erule rel_compE)
apply (erule CollectE)
apply (erule CollectE)
apply (erule exE)
apply (erule exE)
apply (erule conjE)
apply (erule conjE)
apply (erule Pair_inject)
apply (erule Pair_inject)(*>*)
txt {*
@{subgoals[display]}
Like in the apply proof, we now want to eliminate the equations that
``define'' @{term x}, @{term xa} and @{term z}. The other equations
are just there by coincidence, and we must not touch them.
For such purposes, there is the internal tactic @{text "hyp_subst_single"}.
Its job is to take exactly one premise of the form @{term "v = t"},
where @{term v} is a variable, and replace @{term "v"} in the whole
subgoal. The hypothesis to eliminate is given by its position.
We can use this tactic to eliminate @{term x}:
*}
apply (tactic "single_hyp_subst_tac 0 1")
txt {*
@{subgoals[display]}
*}
apply (tactic "single_hyp_subst_tac 2 1")
apply (tactic "single_hyp_subst_tac 2 1")
apply (tactic "single_hyp_subst_tac 3 1")
apply (rule CollectI) -- {* introduce them again *}
apply (rule exI)
apply (rule exI)
apply (rule conjI)
apply (rule refl)
apply (rule conjI)
apply (assumption)
apply (rule conjI)
apply assumption
apply assumption
apply (erule CollectE) -- {* @{text "\<subseteq>"} *}
apply (erule exE)+
apply (erule conjE)+
apply (tactic "single_hyp_subst_tac 0 1")
apply (rule rel_compI)
apply (rule CollectI)
apply (rule exI)
apply (rule conjI)
apply (rule refl)
apply assumption
apply (rule CollectI)
apply (rule exI)
apply (rule conjI)
apply (subst Pair_eq)
apply (rule conjI)
apply assumption
apply (rule refl)
apply assumption
done
ML {*
val compr_compose_tac =
rtac @{thm set_ext}
THEN' rtac @{thm iffI}
THEN' etac @{thm rel_compE}
THEN' etac @{thm CollectE}
THEN' etac @{thm CollectE}
THEN' (fn i => REPEAT (etac @{thm exE} i))
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' etac @{thm Pair_inject}
THEN' etac @{thm Pair_inject}
THEN' single_hyp_subst_tac 0
THEN' single_hyp_subst_tac 2
THEN' single_hyp_subst_tac 2
THEN' single_hyp_subst_tac 3
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' rtac @{thm refl}
THEN' rtac @{thm conjI}
THEN' assume_tac
THEN' rtac @{thm conjI}
THEN' assume_tac
THEN' assume_tac
THEN' etac @{thm CollectE}
THEN' (fn i => REPEAT (etac @{thm exE} i))
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' etac @{thm conjE}
THEN' single_hyp_subst_tac 0
THEN' rtac @{thm rel_compI}
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' rtac @{thm refl}
THEN' assume_tac
THEN' rtac @{thm CollectI}
THEN' (fn i => REPEAT (rtac @{thm exI} i))
THEN' rtac @{thm conjI}
THEN' stac @{thm Pair_eq}
THEN' rtac @{thm conjI}
THEN' assume_tac
THEN' rtac @{thm refl}
THEN' assume_tac
*}
lemma "{(n, Suc n) |n. n > 0 \<and> A} O {(n, Suc n) |n m. P m n}
= {(n, Suc m)|n m' m. Suc n = m \<and> P m' n \<and> (m > 0 \<and> A)}"
apply (tactic "compr_compose_tac 1")
done
text {*
The next step is now to turn this tactic into a simplification
procedure. This just means that we need some code that builds the
term of the composed relation.
*}
use "comp_simproc"
(*<*)
(*simproc_setup mysp ("x O y") = {* compose_simproc *}*)
lemma "{(n, Suc n) |n. n > 0 \<and> A} O {(n, Suc n) |n m. P m n} = x"
(*apply (simp del:ex_simps)*)
oops
lemma "({(g m, k) | m k. Q m k}
O {(h j, f j) | j. R j}) = x"
(*apply (simp del:ex_simps) *)
oops
lemma "{uu. \<exists>j m k. uu = (h j, k) \<and> f j = g m \<and> R j \<and> Q m k}
O {(h j, f j) | j. R j} = x"
(*apply (simp del:ex_simps)*)
oops
lemma "
{ (l x, r x) | x. P x \<and> Q x \<and> Q' x }
O { (l1 x, r1 x) | x. P1 x \<and> Q1 x \<and> Q1' x }
= A"
(*apply (simp del:ex_simps)*)
oops
lemma "
{ (l x, r x) | x. P x }
O { (l1 x, r1 x) | x. P1 x }
O { (l2 x, r2 x) | x. P2 x }
= A"
(*
apply (simp del:ex_simps)*)
oops
lemma "{(f n, m) |n m. P n m} O ({(g m, k) | m k. Q m k}
O {(h j, f j) | j. R j}) = x"
(*apply (simp del:ex_simps)*)
oops
lemma "{u. \<exists>n. u=(f n, g n)} O {u. \<exists>n. u=(h n, j n)} = A"
oops
(*>*)
end