theory First_Steps
imports Base
begin
(*<*)
setup{*
open_file_with_prelude
"First_Steps_Code.thy"
["theory First_Steps", "imports Main", "begin"]
*}
(*>*)
chapter {* First Steps\label{chp:firststeps} *}
text {*
\begin{flushright}
{\em
``We will most likely never realize the full importance of painting the Tower,\\
that it is the essential element in the conservation of metal works and the\\
more meticulous the paint job, the longer the tower shall endure.''} \\[1ex]
Gustave Eiffel, in his book {\em The 300-Meter Tower}.\footnote{The Eiffel Tower has been
re-painted 18 times since its initial construction, an average of once every
seven years. It takes more than one year for a team of 25 painters to paint the tower
from top to bottom.}
\end{flushright}
\medskip
Isabelle programming is done in ML. Just like lemmas and proofs, ML-code for
Isabelle must be part of a theory. If you want to follow the code given in
this chapter, we assume you are working inside the theory starting with
\begin{quote}
\begin{tabular}{@ {}l}
\isacommand{theory} First\_Steps\\
\isacommand{imports} Main\\
\isacommand{begin}\\
\ldots
\end{tabular}
\end{quote}
We also generally assume you are working with the logic HOL. The examples
that will be given might need to be adapted if you work in a different logic.
*}
section {* Including ML-Code\label{sec:include} *}
text {*
The easiest and quickest way to include code in a theory is by using the
\isacommand{ML}-command. For example:
\begin{isabelle}
\begin{graybox}
\isacommand{ML}~@{text "\<verbopen>"}\isanewline
\hspace{5mm}@{ML "3 + 4"}\isanewline
@{text "\<verbclose>"}\isanewline
@{text "> 7"}\smallskip
\end{graybox}
\end{isabelle}
If you work with ProofGeneral then like normal Isabelle scripts
\isacommand{ML}-commands can be evaluated by using the advance and
undo buttons of your Isabelle environment. If you work with the
Jedit GUI, then you just have to hover the cursor over the code
and you see the evaluated result in the ``Output'' window.
As mentioned in the Introduction, we will drop the \isacommand{ML}~@{text
"\<verbopen> \<dots> \<verbclose>"} scaffolding whenever we show code. The lines
prefixed with @{text [quotes] ">"} are not part of the code, rather they
indicate what the response is when the code is evaluated. There are also
the commands \isacommand{ML\_val} and \isacommand{ML\_prf} for including
ML-code. The first evaluates the given code, but any effect on the theory,
in which the code is embedded, is suppressed. The second needs to be used if
ML-code is defined inside a proof. For example
\begin{quote}
\begin{isabelle}
\isacommand{lemma}~@{text "test:"}\isanewline
\isacommand{shows}~@{text [quotes] "True"}\isanewline
\isacommand{ML\_prf}~@{text "\<verbopen>"}~@{ML "writeln \"Trivial!\""}~@{text "\<verbclose>"}\isanewline
\isacommand{oops}
\end{isabelle}
\end{quote}
However, both commands will only play minor roles in this tutorial (we will
always arrange that the ML-code is defined outside proofs).
Once a portion of code is relatively stable, you usually want to export it
to a separate ML-file. Such files can then be included somewhere inside a
theory by using the command \isacommand{use}. For example
\begin{quote}
\begin{tabular}{@ {}l}
\isacommand{theory} First\_Steps\\
\isacommand{imports} Main\\
\isacommand{uses}~@{text "(\"file_to_be_included.ML\")"} @{text "\<dots>"}\\
\isacommand{begin}\\
\ldots\\
\isacommand{use}~@{text "\"file_to_be_included.ML\""}\\
\ldots
\end{tabular}
\end{quote}
The \isacommand{uses}-command in the header of the theory is needed in order
to indicate the dependency of the theory on the ML-file. Alternatively, the
file can be included by just writing in the header
\begin{quote}
\begin{tabular}{@ {}l}
\isacommand{theory} First\_Steps\\
\isacommand{imports} Main\\
\isacommand{uses} @{text "\"file_to_be_included.ML\""} @{text "\<dots>"}\\
\isacommand{begin}\\
\ldots
\end{tabular}
\end{quote}
Note that no parentheses are given in this case. Note also that the included
ML-file should not contain any \isacommand{use} itself. Otherwise Isabelle
is unable to record all file dependencies, which is a nuisance if you have
to track down errors.
*}
section {* Printing and Debugging\label{sec:printing} *}
text {*
During development you might find it necessary to inspect data in your
code. This can be done in a ``quick-and-dirty'' fashion using the function
@{ML_ind writeln in Output}. For example
@{ML_response_fake [display,gray] "writeln \"any string\"" "\"any string\""}
will print out @{text [quotes] "any string"} inside the response buffer.
This function expects a string as argument. If you develop under
PolyML, then there is a convenient, though again ``quick-and-dirty'', method
for converting values into strings, namely the antiquotation
@{text "@{make_string}"}:
@{ML_response_fake [display,gray] "writeln (@{make_string} 1)" "\"1\""}
However, @{text "@{makes_tring}"} only works if the type of what
is converted is monomorphic and not a function.
You can print out error messages with the function @{ML_ind error in Library};
for example:
@{ML_response_fake [display,gray]
"if 0 = 1 then true else (error \"foo\")"
"Exception- ERROR \"foo\" raised
At command \"ML\"."}
This function raises the exception @{text ERROR}, which will then
be displayed by the infrastructure. Note that this exception is meant
for ``user-level'' error messages seen by the ``end-user''.
For messages where you want to indicate a genuine program error, then
use the exception @{text Fail}.
Most often you want to inspect data of Isabelle's basic data structures,
namely @{ML_type term}, @{ML_type typ}, @{ML_type cterm}, @{ML_type ctyp}
and @{ML_type thm}. Isabelle contains elaborate pretty-printing functions,
which we will explain in more detail in Section \ref{sec:pretty}. For now
we just use the functions @{ML_ind writeln in Pretty} from the structure
@{ML_struct Pretty} and @{ML_ind pretty_term in Syntax} from the structure
@{ML_struct Syntax}. For more convenience, we bind them to the toplevel.
*}
ML{*val pretty_term = Syntax.pretty_term
val pwriteln = Pretty.writeln*}
text {*
They can now be used as follows
@{ML_response_fake [display,gray]
"pwriteln (pretty_term @{context} @{term \"1::nat\"})"
"\"1\""}
If there is more than one term to be printed, you can use the
function @{ML_ind commas in Pretty} and @{ML_ind block in Pretty}
to separate them.
*}
ML{*fun pretty_terms ctxt trms =
Pretty.block (Pretty.commas (map (pretty_term ctxt) trms))*}
text {*
You can also print out terms together with their typing information.
For this you need to set the configuration value
@{ML_ind show_types in Syntax} to @{ML true}.
*}
ML{*val show_types_ctxt = Config.put show_types true @{context}*}
text {*
Now by using this context @{ML pretty_term} prints out
@{ML_response_fake [display, gray]
"pwriteln (pretty_term show_types_ctxt @{term \"(1::nat, x)\"})"
"(1::nat, x::'a)"}
where @{text 1} and @{text x} are displayed with their inferred type.
Even more type information can be printed by setting
the reference @{ML_ind show_all_types in Syntax} to @{ML true}.
In this case we obtain
*}
text {*
@{ML_response_fake [display, gray]
"let
val show_all_types_ctxt = Config.put show_all_types true @{context}
in
pwriteln (pretty_term show_all_types_ctxt @{term \"(1::nat, x)\"})
end"
"(Pair::nat \<Rightarrow> 'a \<Rightarrow> nat \<times> 'a) (1::nat) (x::'a)"}
where now even @{term Pair} is written with its type (@{term Pair} is the
term-constructor for products). Other configuration values that influence
printing of terms include @{ML_ind show_brackets in Syntax}, @{ML_ind
show_sorts in Syntax} and @{ML_ind eta_contract in Syntax}.
*}
text {*
A @{ML_type cterm} can be printed with the following function.
*}
ML{*fun pretty_cterm ctxt ctrm =
pretty_term ctxt (term_of ctrm)*}
text {*
Here the function @{ML_ind term_of in Thm} extracts the @{ML_type
term} from a @{ML_type cterm}. More than one @{ML_type cterm}s can be
printed again with @{ML commas in Pretty}.
*}
ML{*fun pretty_cterms ctxt ctrms =
Pretty.block (Pretty.commas (map (pretty_cterm ctxt) ctrms))*}
text {*
The easiest way to get the string of a theorem is to transform it
into a @{ML_type term} using the function @{ML_ind prop_of in Thm}.
*}
ML{*fun pretty_thm ctxt thm =
pretty_term ctxt (prop_of thm)*}
text {*
Theorems include schematic variables, such as @{text "?P"},
@{text "?Q"} and so on. They are needed in Isabelle in order to able to
instantiate theorems when they are applied. For example the theorem
@{thm [source] conjI} shown below can be used for any (typable)
instantiation of @{text "?P"} and @{text "?Q"}.
@{ML_response_fake [display, gray]
"pwriteln (pretty_thm @{context} @{thm conjI})"
"\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q"}
However, in order to improve the readability when printing theorems, we
can switch off the question marks as follows:
*}
ML{*fun pretty_thm_no_vars ctxt thm =
let
val ctxt' = Config.put show_question_marks false ctxt
in
pretty_term ctxt' (prop_of thm)
end*}
text {*
With this function, theorem @{thm [source] conjI} is now printed as follows:
@{ML_response_fake [display, gray]
"pwriteln (pretty_thm_no_vars @{context} @{thm conjI})"
"\<lbrakk>P; Q\<rbrakk> \<Longrightarrow> P \<and> Q"}
Again the functions @{ML commas} and @{ML block in Pretty} help
with printing more than one theorem.
*}
ML{*fun pretty_thms ctxt thms =
Pretty.block (Pretty.commas (map (pretty_thm ctxt) thms))
fun pretty_thms_no_vars ctxt thms =
Pretty.block (Pretty.commas (map (pretty_thm_no_vars ctxt) thms))*}
text {*
Printing functions for @{ML_type typ} are
*}
ML{*fun pretty_typ ctxt ty = Syntax.pretty_typ ctxt ty
fun pretty_typs ctxt tys =
Pretty.block (Pretty.commas (map (pretty_typ ctxt) tys))*}
text {*
respectively @{ML_type ctyp}
*}
ML{*fun pretty_ctyp ctxt cty = pretty_typ ctxt (typ_of cty)
fun pretty_ctyps ctxt ctys =
Pretty.block (Pretty.commas (map (pretty_ctyp ctxt) ctys))*}
text {*
\begin{readmore}
The simple conversion functions from Isabelle's main datatypes to
@{ML_type string}s are implemented in @{ML_file "Pure/Syntax/syntax.ML"}.
The configuration values that change the printing information are declared in
@{ML_file "Pure/Syntax/printer.ML"}.
\end{readmore}
Note that for printing out several ``parcels'' of information that belong
together, like a warning message consisting of a term and its type, you
should try to print these parcels together in a single string. Therefore do
\emph{not} print out information as
@{ML_response_fake [display,gray]
"pwriteln (Pretty.str \"First half,\");
pwriteln (Pretty.str \"and second half.\")"
"First half,
and second half."}
but as a single string with appropriate formatting. For example
@{ML_response_fake [display,gray]
"pwriteln (Pretty.str (\"First half,\" ^ \"\\n\" ^ \"and second half.\"))"
"First half,
and second half."}
To ease this kind of string manipulations, there are a number
of library functions in Isabelle. For example, the function
@{ML_ind cat_lines in Library} concatenates a list of strings
and inserts newlines in between each element.
@{ML_response_fake [display, gray]
"pwriteln (Pretty.str (cat_lines [\"foo\", \"bar\"]))"
"foo
bar"}
Section \ref{sec:pretty} will explain the infrastructure that Isabelle
provides for more elaborate pretty printing.
\begin{readmore}
Most of the basic string functions of Isabelle are defined in
@{ML_file "Pure/library.ML"}.
\end{readmore}
*}
section {* Combinators\label{sec:combinators} *}
text {*
For beginners perhaps the most puzzling parts in the existing code of
Isabelle are the combinators. At first they seem to greatly obstruct the
comprehension of code, but after getting familiar with them and handled with
care, they actually ease the understanding and also the programming.
The simplest combinator is @{ML_ind I in Library}, which is just the
identity function defined as
*}
ML{*fun I x = x*}
text {*
Another simple combinator is @{ML_ind K in Library}, defined as
*}
ML{*fun K x = fn _ => x*}
text {*
@{ML K} ``wraps'' a function around @{text "x"} that ignores its argument. As a
result, @{ML K} defines a constant function always returning @{text x}.
The next combinator is reverse application, @{ML_ind "|>" in Basics}, defined as:
*}
ML{*fun x |> f = f x*}
text {* While just syntactic sugar for the usual function application,
the purpose of this combinator is to implement functions in a
``waterfall fashion''. Consider for example the function *}
ML %linenosgray{*fun inc_by_five x =
x |> (fn x => x + 1)
|> (fn x => (x, x))
|> fst
|> (fn x => x + 4)*}
text {*
which increments its argument @{text x} by 5. It does this by first
incrementing the argument by 1 (Line 2); then storing the result in a pair
(Line 3); taking the first component of the pair (Line 4) and finally
incrementing the first component by 4 (Line 5). This kind of cascading
manipulations of values is quite common when dealing with theories. The
reverse application allows you to read what happens in a top-down
manner. This kind of coding should be familiar, if you have been exposed to
Haskell's {\it do}-notation. Writing the function @{ML inc_by_five} using
the reverse application is much clearer than writing
*}
ML{*fun inc_by_five x = fst ((fn x => (x, x)) (x + 1)) + 4*}
text {* or *}
ML{*fun inc_by_five x =
((fn x => x + 4) o fst o (fn x => (x, x)) o (fn x => x + 1)) x*}
text {* and typographically more economical than *}
ML{*fun inc_by_five x =
let val y1 = x + 1
val y2 = (y1, y1)
val y3 = fst y2
val y4 = y3 + 4
in y4 end*}
text {*
Another reason why the let-bindings in the code above are better to be
avoided: it is more than easy to get the intermediate values wrong, not to
mention the nightmares the maintenance of this code causes!
In Isabelle a ``real world'' example for a function written in
the waterfall fashion might be the following code:
*}
ML %linenosgray{*fun apply_fresh_args f ctxt =
f |> fastype_of
|> binder_types
|> map (pair "z")
|> Variable.variant_frees ctxt [f]
|> map Free
|> curry list_comb f *}
text {*
This function takes a term and a context as argument. If the term is of function
type, then @{ML "apply_fresh_args"} returns the term with distinct variables
applied to it. For example below three variables are applied to the term
@{term [show_types] "P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool"}:
@{ML_response_fake [display,gray]
"let
val trm = @{term \"P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool\"}
val ctxt = @{context}
in
apply_fresh_args trm ctxt
|> pretty_term ctxt
|> pwriteln
end"
"P z za zb"}
You can read off this behaviour from how @{ML apply_fresh_args} is coded: in
Line 2, the function @{ML_ind fastype_of in Term} calculates the type of the
term; @{ML_ind binder_types in Term} in the next line produces the list of
argument types (in the case above the list @{text "[nat, int, unit]"}); Line
4 pairs up each type with the string @{text "z"}; the function @{ML_ind
variant_frees in Variable} generates for each @{text "z"} a unique name
avoiding the given @{text f}; the list of name-type pairs is turned into a
list of variable terms in Line 6, which in the last line is applied by the
function @{ML_ind list_comb in Term} to the original term. In this last step we have
to use the function @{ML_ind curry in Library}, because @{ML list_comb}
expects the function and the variables list as a pair.
Functions like @{ML apply_fresh_args} are often needed when constructing
terms involving fresh variables. For this the infrastructure helps
tremendously to avoid any name clashes. Consider for example:
@{ML_response_fake [display,gray]
"let
val trm = @{term \"za::'a \<Rightarrow> 'b \<Rightarrow> 'c\"}
val ctxt = @{context}
in
apply_fresh_args trm ctxt
|> pretty_term ctxt
|> pwriteln
end"
"za z zb"}
where the @{text "za"} is correctly avoided.
The combinator @{ML_ind "#>" in Basics} is the reverse function composition.
It can be used to define the following function
*}
ML{*val inc_by_six =
(fn x => x + 1) #>
(fn x => x + 2) #>
(fn x => x + 3)*}
text {*
which is the function composed of first the increment-by-one function and
then increment-by-two, followed by increment-by-three. Again, the reverse
function composition allows you to read the code top-down. This combinator
is often used for setup functions inside the
\isacommand{setup}- or \isacommand{local\_setup}-command. These functions
have to be of type @{ML_type "theory -> theory"}, respectively
@{ML_type "local_theory -> local_theory"}. More than one such setup function
can be composed with @{ML "#>"}. Consider for example the following code,
where we store the theorems @{thm [source] conjI}, @{thm [source] conjunct1}
and @{thm [source] conjunct2} under alternative names.
*}
local_setup %graylinenos {*
let
fun my_note name thm = Local_Theory.note ((name, []), [thm]) #> snd
in
my_note @{binding "foo_conjI"} @{thm conjI} #>
my_note @{binding "bar_conjunct1"} @{thm conjunct1} #>
my_note @{binding "bar_conjunct2"} @{thm conjunct2}
end *}
text {*
The function @{ML_text "my_note"} in line 3 is just a wrapper for the function
@{ML_ind note in Local_Theory} in the structure @{ML_struct Local_Theory};
its purpose is to store a theorem under a name.
In lines 5 to 6 we call this function to give alternative names for the three
theorems. The point of @{ML "#>"} is that you can sequence such function calls.
The remaining combinators we describe in this section add convenience for
the ``waterfall method'' of writing functions. The combinator @{ML_ind tap
in Basics} allows you to get hold of an intermediate result (to do some
side-calculations or print out an intermediate result, for instance). The
function
*}
ML %linenosgray{*fun inc_by_three x =
x |> (fn x => x + 1)
|> tap (fn x => pwriteln (Pretty.str (@{make_string} x)))
|> (fn x => x + 2)*}
text {*
increments the argument first by @{text "1"} and then by @{text "2"}. In the
middle (Line 3), however, it uses @{ML tap} for printing the ``plus-one''
intermediate result. The function @{ML tap} can only be used for
side-calculations, because any value that is computed cannot be merged back
into the ``main waterfall''. To do this, you can use the next combinator.
The combinator @{ML_ind "`" in Basics} (a backtick) is similar to @{ML tap},
but applies a function to the value and returns the result together with the
value (as a pair). It is defined as
*}
ML{*fun `f = fn x => (f x, x)*}
text {*
An example for this combinator is the function
*}
ML{*fun inc_as_pair x =
x |> `(fn x => x + 1)
|> (fn (x, y) => (x, y + 1))*}
text {*
which takes @{text x} as argument, and then increments @{text x}, but also keeps
@{text x}. The intermediate result is therefore the pair @{ML "(x + 1, x)"
for x}. After that, the function increments the right-hand component of the
pair. So finally the result will be @{ML "(x + 1, x + 1)" for x}.
The combinators @{ML_ind "|>>" in Basics} and @{ML_ind "||>" in Basics} are
defined for functions manipulating pairs. The first applies the function to
the first component of the pair, defined as
*}
ML{*fun (x, y) |>> f = (f x, y)*}
text {*
and the second combinator to the second component, defined as
*}
ML{*fun (x, y) ||> f = (x, f y)*}
text {*
These two functions can, for example, be used to avoid explicit @{text "lets"} for
intermediate values in functions that return pairs. As an example, suppose you
want to separate a list of integers into two lists according to a
threshold. If the threshold is @{ML "5"}, the list @{ML "[1,6,2,5,3,4]"}
should be separated as @{ML "([1,2,3,4], [6,5])"}. Such a function can be
implemented as
*}
ML{*fun separate i [] = ([], [])
| separate i (x::xs) =
let
val (los, grs) = separate i xs
in
if i <= x then (los, x::grs) else (x::los, grs)
end*}
text {*
where the return value of the recursive call is bound explicitly to
the pair @{ML "(los, grs)" for los grs}. However, this function
can be implemented more concisely as
*}
ML{*fun separate i [] = ([], [])
| separate i (x::xs) =
if i <= x
then separate i xs ||> cons x
else separate i xs |>> cons x*}
text {*
avoiding the explicit @{text "let"}. While in this example the gain in
conciseness is only small, in more complicated situations the benefit of
avoiding @{text "lets"} can be substantial.
With the combinator @{ML_ind "|->" in Basics} you can re-combine the
elements from a pair. This combinator is defined as
*}
ML{*fun (x, y) |-> f = f x y*}
text {*
and can be used to write the following roundabout version
of the @{text double} function:
*}
ML{*fun double x =
x |> (fn x => (x, x))
|-> (fn x => fn y => x + y)*}
text {*
The combinator @{ML_ind ||>> in Basics} plays a central rôle whenever your
task is to update a theory and the update also produces a side-result (for
example a theorem). Functions for such tasks return a pair whose second
component is the theory and the fist component is the side-result. Using
@{ML ||>>}, you can do conveniently the update and also
accumulate the side-results. Consider the following simple function.
*}
ML %linenosgray{*fun acc_incs x =
x |> (fn x => ("", x))
||>> (fn x => (x, x + 1))
||>> (fn x => (x, x + 1))
||>> (fn x => (x, x + 1))*}
text {*
The purpose of Line 2 is to just pair up the argument with a dummy value (since
@{ML ||>>} operates on pairs). Each of the next three lines just increment
the value by one, but also nest the intermediate results to the left. For example
@{ML_response [display,gray]
"acc_incs 1"
"((((\"\", 1), 2), 3), 4)"}
You can continue this chain with:
@{ML_response [display,gray]
"acc_incs 1 ||>> (fn x => (x, x + 2))"
"(((((\"\", 1), 2), 3), 4), 6)"}
An example where this combinator is useful is as follows
@{ML_response_fake [display, gray]
"let
val ((names1, names2), _) =
@{context}
|> Variable.variant_fixes (replicate 4 \"x\")
||>> Variable.variant_fixes (replicate 5 \"x\")
in
(names1, names2)
end"
"([\"x\", \"xa\", \"xb\", \"xc\"], [\"xd\", \"xe\", \"xf\", \"xg\", \"xh\"])"}
Its purpose is to create nine variants of the string @{ML "\"x\""} so
that no variant will clash with another. Suppose for some reason we want
to bind four variants to the lists @{ML_text "name1"} and the
rest to @{ML_text "name2"}. In order to obtain non-clashing
variants we have to thread the context through the function calls
(the context records which variants have been previously created).
For the first call we can use @{ML "|>"}, but in the
second and any further call to @{ML_ind variant_fixes in Variable} we
have to use @{ML "||>>"} in order to account for the result(s)
obtained by previous calls.
A more realistic example for this combinator is the following code
*}
ML{*val (((one_def, two_def), three_def), ctxt') =
@{context}
|> Local_Defs.add_def ((@{binding "one"}, NoSyn), @{term "1::nat"})
||>> Local_Defs.add_def ((@{binding "Two"}, NoSyn), @{term "2::nat"})
||>> Local_Defs.add_def ((@{binding "Three"}, NoSyn), @{term "3::nat"})*}
text {*
where we make three definitions, namely @{term "one \<equiv> 1::nat"}, @{term "two \<equiv> 2::nat"}
and @{term "three \<equiv> 3::nat"}. The point of this code is that we augment the initial
context with the definitions. The result we are interested in is the
augmented context, that is @{ML_text "ctxt'"}, but also the side-results containing
information about the definitions---the function @{ML_ind add_def in Local_Defs} returns
both as pairs. We can use this information for example to print out the definiens and
the theorem corresponding to the definitions. For example for the first definition:
@{ML_response_fake [display, gray]
"let
val (one_trm, one_thm) = one_def
in
pwriteln (pretty_term ctxt' one_trm);
pwriteln (pretty_thm ctxt' one_thm)
end"
"one
one \<equiv> 1"}
Recall that @{ML "|>"} is the reverse function application. Recall also that
the related reverse function composition is @{ML "#>"}. In fact all the
combinators @{ML "|->"}, @{ML "|>>"} , @{ML "||>"} and @{ML "||>>"}
described above have related combinators for function composition, namely
@{ML_ind "#->" in Basics}, @{ML_ind "#>>" in Basics}, @{ML_ind "##>" in
Basics} and @{ML_ind "##>>" in Basics}. Using @{ML "#->"}, for example, the
function @{text double} can also be written as:
*}
ML{*val double =
(fn x => (x, x))
#-> (fn x => fn y => x + y)*}
text {*
When using combinators for writing functions in waterfall fashion, it is
sometimes necessary to do some ``plumbing'' in order to fit functions
together. We have already seen such plumbing in the function @{ML
apply_fresh_args}, where @{ML curry} is needed for making the function @{ML
list_comb}, which works over pairs, to fit with the combinator @{ML "|>"}. Such
plumbing is also needed in situations where a function operates over lists,
but one calculates only with a single element. An example is the function
@{ML_ind check_terms in Syntax}, whose purpose is to simultaneously type-check
a list of terms. Consider the code:
@{ML_response_fake [display, gray]
"let
val ctxt = @{context}
in
map (Syntax.parse_term ctxt) [\"m + n\", \"m * n\", \"m - (n::nat)\"]
|> Syntax.check_terms ctxt
|> pretty_terms ctxt
|> pwriteln
end"
"m + n, m * n, m - n"}
*}
text {*
In this example we obtain three terms (using the function
@{ML_ind parse_term in Syntax}) whose variables @{text m} and @{text n}
are of type @{typ "nat"}. If you have only a single term, then @{ML
check_terms in Syntax} needs plumbing. This can be done with the function
@{ML_ind singleton in Library}.\footnote{There is already a function @{ML check_term in
Syntax} in the file @{ML_file "Pure/Syntax/syntax.ML"} that is implemented
in terms of @{ML singleton} and @{ML check_terms in Syntax}.} For example
@{ML_response_fake [display, gray, linenos]
"let
val ctxt = @{context}
in
Syntax.parse_term ctxt \"m - (n::nat)\"
|> singleton (Syntax.check_terms ctxt)
|> pretty_term ctxt
|> pwriteln
end"
"m - n"}
where in Line 5, the function operating over lists fits with the
single term generated in Line 4.
\begin{readmore}
The most frequently used combinators are defined in the files @{ML_file
"Pure/library.ML"}
and @{ML_file "Pure/General/basics.ML"}. Also \isccite{sec:ML-linear-trans}
contains further information about combinators.
\end{readmore}
\begin{exercise}
Find out what the combinator @{ML "K I"} does.
\end{exercise}
*}
section {* ML-Antiquotations\label{sec:antiquote} *}
text {*
Recall from Section \ref{sec:include} that code in Isabelle is always
embedded in a theory. The main advantage of this is that the code can
contain references to entities defined on the logical level of Isabelle. By
this we mean references to definitions, theorems, terms and so on. These
reference are realised in Isabelle with ML-antiquotations, often just called
antiquotations.\footnote{Note that there are two kinds of antiquotations in
Isabelle, which have very different purposes and infrastructures. The first
kind, described in this section, are \emph{\index*{ML-antiquotation}}. They
are used to refer to entities (like terms, types etc) from Isabelle's logic
layer inside ML-code. The other kind of antiquotations are
\emph{document}\index{document antiquotation} antiquotations. They are used
only in the text parts of Isabelle and their purpose is to print logical
entities inside \LaTeX-documents. Document antiquotations are part of the
user level and therefore we are not interested in them in this Tutorial,
except in Appendix \ref{rec:docantiquotations} where we show how to
implement your own document antiquotations.} Syntactically antiquotations
are indicated by the @{ML_text @}-sign followed by text wrapped in @{text
"{\<dots>}"}. For example, one can print out the name of the current theory with
the code
@{ML_response [display,gray] "Context.theory_name @{theory}" "\"First_Steps\""}
where @{text "@{theory}"} is an antiquotation that is substituted with the
current theory (remember that we assumed we are inside the theory
@{text First_Steps}). The name of this theory can be extracted using
the function @{ML_ind theory_name in Context}.
Note, however, that antiquotations are statically linked, that is their value is
determined at ``compile-time'', not at ``run-time''. For example the function
*}
ML{*fun not_current_thyname () = Context.theory_name @{theory} *}
text {*
does \emph{not} return the name of the current theory, if it is run in a
different theory. Instead, the code above defines the constant function
that always returns the string @{text [quotes] "First_Steps"}, no matter where the
function is called. Operationally speaking, the antiquotation @{text "@{theory}"} is
\emph{not} replaced with code that will look up the current theory in
some data structure and return it. Instead, it is literally
replaced with the value representing the theory.
Another important antiquotation is @{text "@{context}"}. (What the
difference between a theory and a context is will be described in Chapter
\ref{chp:advanced}.) A context is for example needed in order to use the
function @{ML print_abbrevs in Proof_Context} that list of all currently
defined abbreviations.
@{ML_response_fake [display, gray]
"Proof_Context.print_abbrevs @{context}"
"Code_Evaluation.valtermify \<equiv> \<lambda>x. (x, \<lambda>u. Code_Evaluation.termify x)
INTER \<equiv> INFI
Inter \<equiv> Inf
\<dots>"}
You can also use antiquotations to refer to proved theorems:
@{text "@{thm \<dots>}"} for a single theorem
@{ML_response_fake [display,gray] "@{thm allI}" "(\<And>x. ?P x) \<Longrightarrow> \<forall>x. ?P x"}
and @{text "@{thms \<dots>}"} for more than one
@{ML_response_fake [display,gray]
"@{thms conj_ac}"
"(?P \<and> ?Q) = (?Q \<and> ?P)
(?P \<and> ?Q \<and> ?R) = (?Q \<and> ?P \<and> ?R)
((?P \<and> ?Q) \<and> ?R) = (?P \<and> ?Q \<and> ?R)"}
The thm-antiquotations can also be used for manipulating theorems. For
example, if you need the version of the theorem @{thm [source] refl} that
has a meta-equality instead of an equality, you can write
@{ML_response_fake [display,gray]
"@{thm refl[THEN eq_reflection]}"
"?x \<equiv> ?x"}
The point of these antiquotations is that referring to theorems in this way
makes your code independent from what theorems the user might have stored
under this name (this becomes especially important when you deal with
theorem lists; see Section \ref{sec:storing}).
It is also possible to prove lemmas with the antiquotation @{text "@{lemma \<dots> by \<dots>}"}
whose first argument is a statement (possibly many of them separated by @{text "and"})
and the second is a proof. For example
*}
ML{*val foo_thm = @{lemma "True" and "False \<Longrightarrow> P" by simp_all} *}
text {*
The result can be printed out as follows.
@{ML_response_fake [gray,display]
"foo_thm |> pretty_thms_no_vars @{context}
|> pwriteln"
"True, False \<Longrightarrow> P"}
You can also refer to the current simpset via an antiquotation. To illustrate
this we implement the function that extracts the theorem names stored in a
simpset.
*}
ML{*fun get_thm_names_from_ss simpset =
let
val {simps,...} = Raw_Simplifier.dest_ss simpset
in
map #1 simps
end*}
text {*
The function @{ML_ind dest_ss in Raw_Simplifier} returns a record containing all
information stored in the simpset, but here we are only interested in the names of the
simp-rules. Now you can feed in the current simpset into this function.
The current simpset can be referred to using the antiquotation @{text "@{simpset}"}.
@{ML_response_fake [display,gray]
"get_thm_names_from_ss @{simpset}"
"[\"Nat.of_nat_eq_id\", \"Int.of_int_eq_id\", \"Nat.One_nat_def\", \<dots>]"}
Again, this way of referencing simpsets makes you independent from additions
of lemmas to the simpset by the user, which can potentially cause loops in your
code.
It is also possible to define your own antiquotations. But you should
exercise care when introducing new ones, as they can also make your code
also difficult to read. In the next chapter we describe how to construct
terms with the (build in) antiquotation @{text "@{term \<dots>}"}. A restriction
of this antiquotation is that it does not allow you to use schematic
variables in terms. If you want to have an antiquotation that does not have
this restriction, you can implement your own using the function @{ML_ind
inline in ML_Antiquote} from the structure @{ML_struct ML_Antiquote}. The code
for the antiquotation @{text "term_pat"} is as follows.
*}
ML %linenosgray{*val term_pat_setup =
let
val parser = Args.context -- Scan.lift Args.name_source
fun term_pat (ctxt, str) =
str |> Proof_Context.read_term_pattern ctxt
|> ML_Syntax.print_term
|> ML_Syntax.atomic
in
ML_Antiquote.inline @{binding "term_pat"} (parser >> term_pat)
end*}
text {*
To use it you also have to install it using \isacommand{setup} like so
*}
setup %gray {* term_pat_setup *}
text {*
The parser in Line 2 provides us with a context and a string; this string is
transformed into a term using the function @{ML_ind read_term_pattern in
Proof_Context} (Line 5); the next two lines transform the term into a string
so that the ML-system can understand it. (All these functions will be explained
in more detail in later sections.) An example for this antiquotation is:
@{ML_response_fake [display,gray]
"@{term_pat \"Suc (?x::nat)\"}"
"Const (\"Suc\", \"nat \<Rightarrow> nat\") $ Var ((\"x\", 0), \"nat\")"}
which shows the internal representation of the term @{text "Suc ?x"}. Similarly
we can write an antiquotation for type patterns. Its code is
*}
ML{*val type_pat_setup =
let
val parser = Args.context -- Scan.lift Args.name_source
fun typ_pat (ctxt, str) =
str |> Syntax.parse_typ ctxt
|> ML_Syntax.print_typ
|> ML_Syntax.atomic
in
ML_Antiquote.inline @{binding "typ_pat"} (parser >> typ_pat)
end*}
text {*
which can be installed with
*}
setup %gray {* type_pat_setup *}
text {*
However, a word of warning is in order: Introducing new antiquotations
should be done only after careful deliberations. They can make your
code harder to read, than making it easier.
\begin{readmore}
The file @{ML_file "Pure/ML/ml_antiquote.ML"} contains the the definitions
for most antiquotations. Most of the basic operations on ML-syntax are implemented
in @{ML_file "Pure/ML/ml_syntax.ML"}.
\end{readmore}
*}
section {* Storing Data in Isabelle\label{sec:storing} *}
text {*
Isabelle provides mechanisms for storing (and retrieving) arbitrary
data. Before we delve into the details, let us digress a bit. Conventional
wisdom has it that the type-system of ML ensures that an
@{ML_type "'a list"}, say, can only hold elements of the same type, namely
@{ML_type "'a"} (or whatever is substitued for it). Despite this common
wisdom, however, it is possible to implement a
universal type in ML, although by some arguably accidental features of ML.
This universal type can be used to store data of different type into a single list.
In fact, it allows one to inject and to project data of \emph{arbitrary} type. This is
in contrast to datatypes, which only allow injection and projection of data for
some \emph{fixed} collection of types. In light of the conventional wisdom cited
above it is important to keep in mind that the universal type does not
destroy type-safety of ML: storing and accessing the data can only be done
in a type-safe manner...though run-time checks are needed for that.
\begin{readmore}
In Isabelle the universal type is implemented as the type @{ML_type
Universal.universal} in the file
@{ML_file "Pure/ML-Systems/universal.ML"}.
\end{readmore}
We will show the usage of the universal type by storing an integer and
a boolean into a single list. Let us first define injection and projection
functions for booleans and integers into and from the type @{ML_type Universal.universal}.
*}
ML{*local
val fn_int = Universal.tag () : int Universal.tag
val fn_bool = Universal.tag () : bool Universal.tag
in
val inject_int = Universal.tagInject fn_int;
val inject_bool = Universal.tagInject fn_bool;
val project_int = Universal.tagProject fn_int;
val project_bool = Universal.tagProject fn_bool
end*}
text {*
Using the injection functions, we can inject the integer @{ML_text "13"}
and the boolean value @{ML_text "true"} into @{ML_type Universal.universal}, and
then store them in a @{ML_type "Universal.universal list"} as follows:
*}
ML{*val foo_list =
let
val thirteen = inject_int 13
val truth_val = inject_bool true
in
[thirteen, truth_val]
end*}
text {*
The data can be retrieved with the projection functions defined above.
@{ML_response_fake [display, gray]
"project_int (nth foo_list 0);
project_bool (nth foo_list 1)"
"13
true"}
Notice that we access the integer as an integer and the boolean as
a boolean. If we attempt to access the integer as a boolean, then we get
a runtime error.
@{ML_response_fake [display, gray]
"project_bool (nth foo_list 0)"
"*** Exception- Match raised"}
This runtime error is the reason why ML is still type-sound despite
containing a universal type.
Now, Isabelle heavily uses this mechanism for storing all sorts of
data: theorem lists, simpsets, facts etc. Roughly speaking, there are two
places where data can be stored in Isabelle: in \emph{theories} and in \emph{proof
contexts}. Data such as simpsets are ``global'' and therefore need to be stored
in a theory (simpsets need to be maintained across proofs and even across
theories). On the other hand, data such as facts change inside a proof and
are only relevant to the proof at hand. Therefore such data needs to be
maintained inside a proof context, which represents ``local'' data.
You can think of a theory as the ``longterm memory'' of Isabelle (nothing will
be deleted from it), and a proof-context as a ``shortterm memory'' (it dynamically
changes according to what is needed at the time).
For theories and proof contexts there are, respectively, the functors
@{ML_funct_ind Theory_Data} and @{ML_funct_ind Proof_Data} that help
with the data storage. Below we show how to implement a table in which you
can store theorems and look them up according to a string key. The
intention in this example is to be able to look up introduction rules for logical
connectives. Such a table might be useful in an automatic proof procedure
and therefore it makes sense to store this data inside a theory.
Consequently we use the functor @{ML_funct Theory_Data}.
The code for the table is:
*}
ML %linenosgray{*structure Data = Theory_Data
(type T = thm Symtab.table
val empty = Symtab.empty
val extend = I
val merge = Symtab.merge (K true))*}
text {*
In order to store data in a theory, we have to specify the type of the data
(Line 2). In this case we specify the type @{ML_type "thm Symtab.table"},
which stands for a table in which @{ML_type string}s can be looked up
producing an associated @{ML_type thm}. We also have to specify four
functions to use this functor: namely how to initialise the data storage
(Line 3), how to extend it (Line 4) and how two
tables should be merged (Line 5). These functions correspond roughly to the
operations performed on theories and we just give some sensible
defaults.\footnote{\bf FIXME: Say more about the
assumptions of these operations.} The result structure @{ML_text Data}
contains functions for accessing the table (@{ML Data.get}) and for updating
it (@{ML Data.map}). There is also the functions @{ML Data.put}, which however is
not relevant here. Below we define two
auxiliary functions, which help us with accessing the table.
*}
ML{*val lookup = Symtab.lookup o Data.get
fun update k v = Data.map (Symtab.update (k, v))*}
text {*
Since we want to store introduction rules associated with their
logical connective, we can fill the table as follows.
*}
setup %gray {*
update "conj" @{thm conjI} #>
update "imp" @{thm impI} #>
update "all" @{thm allI}
*}
text {*
The use of the command \isacommand{setup} makes sure the table in the
\emph{current} theory is updated (this is explained further in
section~\ref{sec:theories}). The lookup can now be performed as follows.
@{ML_response_fake [display, gray]
"lookup @{theory} \"conj\""
"SOME \"\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q\""}
An important point to note is that these tables (and data in general)
need to be treated in a purely functional fashion. Although
we can update the table as follows
*}
setup %gray {* update "conj" @{thm TrueI} *}
text {*
and accordingly, @{ML lookup} now produces the introduction rule for @{term "True"}
@{ML_response_fake [display, gray]
"lookup @{theory} \"conj\""
"SOME \"True\""}
there are no references involved. This is one of the most fundamental
coding conventions for programming in Isabelle. References
interfere with the multithreaded execution model of Isabelle and also
defeat its undo-mechanism. To see the latter, consider the
following data container where we maintain a reference to a list of
integers.
*}
ML{*structure WrongRefData = Theory_Data
(type T = (int list) Unsynchronized.ref
val empty = Unsynchronized.ref []
val extend = I
val merge = fst)*}
text {*
We initialise the reference with the empty list. Consequently a first
lookup produces @{ML "ref []" in Unsynchronized}.
@{ML_response_fake [display,gray]
"WrongRefData.get @{theory}"
"ref []"}
For updating the reference we use the following function
*}
ML{*fun ref_update n = WrongRefData.map
(fn r => let val _ = r := n::(!r) in r end)*}
text {*
which takes an integer and adds it to the content of the reference.
As before, we update the reference with the command
\isacommand{setup}.
*}
setup %gray {* ref_update 1 *}
text {*
A lookup in the current theory gives then the expected list
@{ML "ref [1]" in Unsynchronized}.
@{ML_response_fake [display,gray]
"WrongRefData.get @{theory}"
"ref [1]"}
So far everything is as expected. But, the trouble starts if we attempt to
backtrack to the ``point'' before the \isacommand{setup}-command. There, we
would expect that the list is empty again. But since it is stored in a
reference, Isabelle has no control over it. So it is not empty, but still
@{ML "ref [1]" in Unsynchronized}. Adding to the trouble, if we execute the
\isacommand{setup}-command again, we do not obtain @{ML "ref [1]" in
Unsynchronized}, but
@{ML_response_fake [display,gray]
"WrongRefData.get @{theory}"
"ref [1, 1]"}
Now imagine how often you go backwards and forwards in your proof
scripts.\footnote{The same problem can be triggered in the Jedit GUI by
making the parser to go over and over again over the \isacommand{setup} command.}
By using references in Isabelle code, you are bound to cause all
hell to break loose. Therefore observe the coding convention:
Do not use references for storing data!
\begin{readmore}
The functors for data storage are defined in @{ML_file "Pure/context.ML"}.
Isabelle contains implementations of several container data structures,
including association lists in @{ML_file "Pure/General/alist.ML"},
directed graphs in @{ML_file "Pure/General/graph.ML"}, and
tables and symtables in @{ML_file "Pure/General/table.ML"}.
\end{readmore}
Storing data in a proof context is done in a similar fashion. As mentioned
before, the corresponding functor is @{ML_funct_ind Proof_Data}. With the
following code we can store a list of terms in a proof context.
*}
ML{*structure Data = Proof_Data
(type T = term list
fun init _ = [])*}
text {*
The init-function we have to specify must produce a list for when a context
is initialised (possibly taking the theory into account from which the
context is derived). We choose here to just return the empty list. Next
we define two auxiliary functions for updating the list with a given
term and printing the list.
*}
ML{*fun update trm = Data.map (fn trms => trm::trms)
fun print ctxt =
case (Data.get ctxt) of
[] => pwriteln (Pretty.str "Empty!")
| trms => pwriteln (pretty_terms ctxt trms)*}
text {*
Next we start with the context generated by the antiquotation
@{text "@{context}"} and update it in various ways.
@{ML_response_fake [display,gray]
"let
val ctxt0 = @{context}
val ctxt1 = ctxt0 |> update @{term \"False\"}
|> update @{term \"True \<and> True\"}
val ctxt2 = ctxt0 |> update @{term \"1::nat\"}
val ctxt3 = ctxt2 |> update @{term \"2::nat\"}
in
print ctxt0;
print ctxt1;
print ctxt2;
print ctxt3
end"
"Empty!
True \<and> True, False
1
2, 1"}
Many functions in Isabelle manage and update data in a similar
fashion. Consequently, such calculations with contexts occur frequently in
Isabelle code, although the ``context flow'' is usually only linear.
Note also that the calculation above has no effect on the underlying
theory. Once we throw away the contexts, we have no access to their
associated data. This is different for theories, where the command
\isacommand{setup} registers the data with the current and future
theories, and therefore one can access the data potentially
indefinitely.
Move elsewhere
For convenience there is an abstract layer, namely the type @{ML_type Context.generic},
for treating theories and proof contexts more uniformly. This type is defined as follows
*}
ML_val{*datatype generic =
Theory of theory
| Proof of proof*}
text {*
\footnote{\bf FIXME: say more about generic contexts.}
There are two special instances of the data storage mechanism described
above. The first instance implements named theorem lists using the functor
@{ML_funct_ind Named_Thms}. This is because storing theorems in a list
is such a common task. To obtain a named theorem list, you just declare
*}
ML{*structure FooRules = Named_Thms
(val name = @{binding "foo"}
val description = "Theorems for foo") *}
text {*
and set up the @{ML_struct FooRules} with the command
*}
setup %gray {* FooRules.setup *}
text {*
This code declares a data container where the theorems are stored,
an attribute @{text foo} (with the @{text add} and @{text del} options
for adding and deleting theorems) and an internal ML-interface for retrieving and
modifying the theorems.
Furthermore, the theorems are made available on the user-level under the name
@{text foo}. For example you can declare three lemmas to be a member of the
theorem list @{text foo} by:
*}
lemma rule1[foo]: "A" sorry
lemma rule2[foo]: "B" sorry
lemma rule3[foo]: "C" sorry
text {* and undeclare the first one by: *}
declare rule1[foo del]
text {* You can query the remaining ones with:
\begin{isabelle}
\isacommand{thm}~@{text "foo"}\\
@{text "> ?C"}\\
@{text "> ?B"}
\end{isabelle}
On the ML-level, we can add theorems to the list with @{ML FooRules.add_thm}:
*}
setup %gray {* Context.theory_map (FooRules.add_thm @{thm TrueI}) *}
text {*
The rules in the list can be retrieved using the function
@{ML FooRules.get}:
@{ML_response_fake [display,gray]
"FooRules.get @{context}"
"[\"True\", \"?C\",\"?B\"]"}
Note that this function takes a proof context as argument. This might be
confusing, since the theorem list is stored as theory data. It becomes clear by knowing
that the proof context contains the information about the current theory and so the function
can access the theorem list in the theory via the context.
\begin{readmore}
For more information about named theorem lists see
@{ML_file "Pure/Tools/named_thms.ML"}.
\end{readmore}
The second special instance of the data storage mechanism are configuration
values. They are used to enable users to configure tools without having to
resort to the ML-level (and also to avoid references). Assume you want the
user to control three values, say @{text bval} containing a boolean, @{text
ival} containing an integer and @{text sval} containing a string. These
values can be declared by
*}
ML{*val bval = Attrib.setup_config_bool @{binding "bval"} (K false)
val ival = Attrib.setup_config_int @{binding "ival"} (K 0)
val sval = Attrib.setup_config_string @{binding "sval"} (K "some string") *}
text {*
where each value needs to be given a default.
The user can now manipulate the values from the user-level of Isabelle
with the command
*}
declare [[bval = true, ival = 3]]
text {*
On the ML-level these values can be retrieved using the
function @{ML_ind get in Config} from a proof context
@{ML_response [display,gray]
"Config.get @{context} bval"
"true"}
or directly from a theory using the function @{ML_ind get_global in Config}
@{ML_response [display,gray]
"Config.get_global @{theory} bval"
"true"}
It is also possible to manipulate the configuration values
from the ML-level with the functions @{ML_ind put in Config}
and @{ML_ind put_global in Config}. For example
@{ML_response [display,gray]
"let
val ctxt = @{context}
val ctxt' = Config.put sval \"foo\" ctxt
val ctxt'' = Config.put sval \"bar\" ctxt'
in
(Config.get ctxt sval,
Config.get ctxt' sval,
Config.get ctxt'' sval)
end"
"(\"some string\", \"foo\", \"bar\")"}
A concrete example for a configuration value is
@{ML_ind simp_trace in Raw_Simplifier}, which switches on trace information
in the simplifier. This can be used for example in the following proof
*}
lemma
shows "(False \<or> True) \<and> True"
proof (rule conjI)
show "False \<or> True" using [[simp_trace = true]] by simp
next
show "True" by simp
qed
text {*
in order to inspect how the simplifier solves the first subgoal.
\begin{readmore}
For more information about configuration values see
the files @{ML_file "Pure/Isar/attrib.ML"} and
@{ML_file "Pure/config.ML"}.
\end{readmore}
*}
section {* Summary *}
text {*
This chapter describes the combinators that are used in Isabelle, as well
as a simple printing infrastructure for @{ML_type term}, @{ML_type cterm}
and @{ML_type thm}. The section on ML-antiquotations shows how to refer
statically to entities from the logic level of Isabelle. Isabelle also
contains mechanisms for storing arbitrary data in theory and proof
contexts.
\begin{conventions}
\begin{itemize}
\item Print messages that belong together in a single string.
\item Do not use references in Isabelle code.
\end{itemize}
\end{conventions}
*}
end