theory FirstStepsimports Basebeginchapter {* First Steps *}text {* Isabelle programming is done in ML. Just like lemmas and proofs, ML-code in Isabelle is part of a theory. If you want to follow the code written in this chapter, we assume you are working inside the theory starting with \begin{center} \begin{tabular}{@ {}l} \isacommand{theory} FirstSteps\\ \isacommand{imports} Main\\ \isacommand{begin}\\ \ldots \end{tabular} \end{center}*}section {* Including ML-Code *}text {* The easiest and quickest way to include code in a theory is by using the \isacommand{ML}-command. For example\smallskip\begin{isabelle}\begin{graybox}\isacommand{ML}~@{text "\<verbopen>"}\isanewline\hspace{5mm}@{ML "3 + 4"}\isanewline@{text "\<verbclose>"}\isanewline@{text "> 7"}\smallskip\end{graybox}\end{isabelle} Like ``normal'' Isabelle proof scripts, \isacommand{ML}-commands can be evaluated by using the advance and undo buttons of your Isabelle environment. The code inside the \isacommand{ML}-command can also contain value and function bindings, and even those can be undone when the proof script is retracted. As mentioned earlier, we will drop the \isacommand{ML}~@{text "\<verbopen> \<dots> \<verbclose>"} whenever we show code. Once a portion of code is relatively stable, one usually wants to export it to a separate ML-file. Such files can then be included in a theory by using \isacommand{uses} in the header of the theory, like \begin{center} \begin{tabular}{@ {}l} \isacommand{theory} FirstSteps\\ \isacommand{imports} Main\\ \isacommand{uses} @{text "\"file_to_be_included.ML\""} @{text "\<dots>"}\\ \isacommand{begin}\\ \ldots \end{tabular} \end{center}*}section {* Debugging and Printing *}text {* During development you might find it necessary to inspect some data in your code. This can be done in a ``quick-and-dirty'' fashion using the function @{ML "warning"}. For example @{ML_response_fake [display,gray] "warning \"any string\"" "\"any string\""} will print out @{text [quotes] "any string"} inside the response buffer of Isabelle. This function expects a string as argument. If you develop under PolyML, then there is a convenient, though again ``quick-and-dirty'', method for converting values into strings, namely using the function @{ML makestring}: @{ML_response_fake [display,gray] "warning (makestring 1)" "\"1\""} However @{ML makestring} only works if the type of what is converted is monomorphic and not a function. The function @{ML "warning"} should only be used for testing purposes, because any output this function generates will be overwritten as soon as an error is raised. For printing anything more serious and elaborate, the function @{ML tracing} is more appropriate. This function writes all output into a separate tracing buffer. For example @{ML_response_fake [display,gray] "tracing \"foo\"" "\"foo\""} It is also possible to redirect the ``channel'' where the string @{text "foo"} is printed to a separate file, e.g.~to prevent ProofGeneral from choking on massive amounts of trace output. This redirection can be achieved using the code*}ML{*val strip_specials =let fun strip ("\^A" :: _ :: cs) = strip cs | strip (c :: cs) = c :: strip cs | strip [] = [];in implode o strip o explode end;fun redirect_tracing stream = Output.tracing_fn := (fn s => (TextIO.output (stream, (strip_specials s)); TextIO.output (stream, "\n"); TextIO.flushOut stream)) *}text {* Calling @{ML "redirect_tracing"} with @{ML "(TextIO.openOut \"foo.bar\")"} will cause that all tracing information is printed into the file @{text "foo.bar"}. Error messages can be printed using the function @{ML error}, as in @{ML_response_fake [display,gray] "if 0=1 then 1 else (error \"foo\")" "\"foo\""}*}section {* Antiquotations *}text {* The main advantage of embedding all code in a theory is that the code can contain references to entities defined on the logical level of Isabelle. By this we mean definitions, theorems, terms and so on. This kind of reference is realised with antiquotations. For example, one can print out the name of the current theory by typing @{ML_response [display,gray] "Context.theory_name @{theory}" "\"FirstSteps\""} where @{text "@{theory}"} is an antiquotation that is substituted with the current theory (remember that we assumed we are inside the theory @{text FirstSteps}). The name of this theory can be extracted with the function @{ML "Context.theory_name"}. Note, however, that antiquotations are statically scoped, that is their value is determined at ``compile-time'', not ``run-time''. For example the function*}ML{*fun not_current_thyname () = Context.theory_name @{theory} *}text {* does, as its name suggest, \emph{not} return the name of the current theory, if it is run in a different theory. Instead, the code above defines the constant function that always returns the string @{text [quotes] "FirstSteps"}, no matter where the function is called. Operationally speaking, the antiquotation @{text "@{theory}"} is \emph{not} replaced with code that will look up the current theory in some data structure and return it. Instead, it is literally replaced with the value representing the theory name. In a similar way you can use antiquotations to refer to proved theorems: @{ML_response_fake [display,gray] "@{thm allI}" "(\<And>x. ?P x) \<Longrightarrow> \<forall>x. ?P x"} or simpsets: @{ML_response_fake [display,gray] "let val ({rules,...},_) = MetaSimplifier.rep_ss @{simpset}in map #name (Net.entries rules)end" "[\"Nat.of_nat_eq_id\", \"Int.of_int_eq_id\", \"Nat.One_nat_def\", \<dots>]"} The code about simpsets extracts the theorem names that are stored in the current simpset. We get hold of the current simpset with the antiquotation @{text "@{simpset}"}. The function @{ML rep_ss in MetaSimplifier} returns a record containing all information about the simpset. The rules of a simpset are stored in a \emph{discrimination net} (a datastructure for fast indexing). From this net we can extract the entries using the function @{ML Net.entries}. \begin{readmore} The infrastructure for simpsets is implemented in @{ML_file "Pure/meta_simplifier.ML"} and @{ML_file "Pure/simplifier.ML"}. Discrimination nets are implemented in @{ML_file "Pure/net.ML"}. \end{readmore} While antiquotations have many applications, they were originally introduced in order to avoid explicit bindings for theorems such as*}ML{*val allI = thm "allI" *}text {* These bindings are difficult to maintain and also can be accidentally overwritten by the user. This often breakes definitional packages. Antiquotations solve this problem, since they are ``linked'' statically at compile-time. However, this static linkage also limits their usefulness in cases where data needs to be build up dynamically. In the course of this introduction, we will learn more about these antiquotations: they greatly simplify Isabelle programming since one can directly access all kinds of logical elements from ML.*}section {* Terms and Types *}text {* One way to construct terms of Isabelle on the ML-level is by using the antiquotation \mbox{@{text "@{term \<dots>}"}}. For example @{ML_response [display,gray] "@{term \"(a::nat) + b = c\"}" "Const (\"op =\", \<dots>) $ (Const (\"HOL.plus_class.plus\", \<dots>) $ \<dots> $ \<dots>) $ \<dots>"} This will show the term @{term "(a::nat) + b = c"}, but printed using the internal representation of this term. This internal representation corresponds to the datatype @{ML_type "term"}. The internal representation of terms uses the usual de Bruijn index mechanism where bound variables are represented by the constructor @{ML Bound}. The index in @{ML Bound} refers to the number of Abstractions (@{ML Abs}) we have to skip until we hit the @{ML Abs} that binds the corresponding variable. However, in Isabelle the names of bound variables are kept at abstractions for printing purposes, and so should be treated only as comments. \begin{readmore} Terms are described in detail in \isccite{sec:terms}. Their definition and many useful operations are implemented in @{ML_file "Pure/term.ML"}. \end{readmore} Sometimes the internal representation of terms can be surprisingly different from what you see at the user level, because the layers of parsing/type-checking/pretty printing can be quite elaborate. \begin{exercise} Look at the internal term representation of the following terms, and find out why they are represented like this. \begin{itemize} \item @{term "case x of 0 \<Rightarrow> 0 | Suc y \<Rightarrow> y"} \item @{term "\<lambda>(x,y). P y x"} \item @{term "{ [x::int] | x. x \<le> -2 }"} \end{itemize} Hint: The third term is already quite big, and the pretty printer may omit parts of it by default. If you want to see all of it, you can use the following ML function to set the limit to a value high enough: @{ML [display,gray] "print_depth 50"} \end{exercise} The antiquotation @{text "@{prop \<dots>}"} constructs terms of propositional type, inserting the invisible @{text "Trueprop"}-coercions whenever necessary. Consider for example the pairs @{ML_response [display,gray] "(@{term \"P x\"}, @{prop \"P x\"})" "(Free (\"P\", \<dots>) $ Free (\"x\", \<dots>), Const (\"Trueprop\", \<dots>) $ (Free (\"P\", \<dots>) $ Free (\"x\", \<dots>)))"} where an coercion is inserted in the second component and @{ML_response [display,gray] "(@{term \"P x \<Longrightarrow> Q x\"}, @{prop \"P x \<Longrightarrow> Q x\"})" "(Const (\"==>\", \<dots>) $ \<dots> $ \<dots>, Const (\"==>\", \<dots>) $ \<dots> $ \<dots>)"} where it is not (since it is already constructed by a meta-implication). Types can be constructed using the antiquotation @{text "@{typ \<dots>}"}. For example @{ML_response_fake [display,gray] "@{typ \"bool \<Rightarrow> nat\"}" "bool \<Rightarrow> nat"} \begin{readmore} Types are described in detail in \isccite{sec:types}. Their definition and many useful operations are implemented in @{ML_file "Pure/type.ML"}. \end{readmore}*}section {* Constructing Terms and Types Manually *} text {* While antiquotations are very convenient for constructing terms, they can only construct fixed terms (remember they are ``linked'' at compile-time). See Recipe~\ref{rec:external} on Page~\pageref{rec:external} for a function that pattern-matches over terms and where the pattern are constructed from antiquotations. However, one often needs to construct terms dynamically. For example, a function that returns the implication @{text "\<And>(x::\<tau>). P x \<Longrightarrow> Q x"} taking @{term P}, @{term Q} and the type @{term "\<tau>"} as arguments can only be written as*}ML{*fun make_imp P Q tau = let val x = Free ("x",tau) in Logic.all x (Logic.mk_implies (P $ x, Q $ x)) end *}text {* The reason is that one cannot pass the arguments @{term P}, @{term Q} and @{term "tau"} into an antiquotation. For example the following does \emph{not} work:*}ML{*fun make_wrong_imp P Q tau = @{prop "\<And>x. P x \<Longrightarrow> Q x"} *}text {* To see this apply @{text "@{term S}"}, @{text "@{term T}"} and @{text "@{typ nat}"} to both functions. With @{ML make_imp} we obtain the intended term involving @{term "S"}, @{text "T"} and @{text "@{typ nat}"} @{ML_response [display,gray] "make_imp @{term S} @{term T} @{typ nat}" "Const \<dots> $ Abs (\"x\", Type (\"nat\",[]), Const \<dots> $ (Free (\"S\",\<dots>) $ \<dots>) $ (Free (\"T\",\<dots>) $ \<dots>))"} whereas with @{ML make_wrong_imp} we obtain a term involving the @{term "P"} and @{text "Q"} from the antiquotation. @{ML_response [display,gray] "make_wrong_imp @{term S} @{term T} @{typ nat}" "Const \<dots> $ Abs (\"x\", \<dots>, Const \<dots> $ (Const \<dots> $ (Free (\"P\",\<dots>) $ \<dots>)) $ (Const \<dots> $ (Free (\"Q\",\<dots>) $ \<dots>)))"} (FIXME: expand the following point) One tricky point in constructing terms by hand is to obtain the fully qualified name for constants. For example the names for @{text "zero"} and @{text "+"} are more complex than one first expects, namely \begin{center} @{text "HOL.zero_class.zero"} and @{text "HOL.plus_class.plus"}. \end{center} The extra prefixes @{text zero_class} and @{text plus_class} are present because these constants are defined within type classes; the prefix @{text "HOL"} indicates in which theory they are defined. Guessing such internal names can sometimes be quite hard. Therefore Isabelle provides the antiquotation @{text "@{const_name \<dots>}"} which does the expansion automatically, for example: @{ML_response_fake [display,gray] "@{const_name \"Nil\"}" "List.list.Nil"} (FIXME: Is it useful to explain @{text "@{const_syntax}"}?) Similarly, one can construct types manually. For example the function returning a function type is as follows:*} ML{*fun make_fun_type tau1 tau2 = Type ("fun",[tau1,tau2]) *}text {* This can be equally written as *}ML{*fun make_fun_type tau1 tau2 = tau1 --> tau2 *}text {* \begin{readmore} There are many functions in @{ML_file "Pure/logic.ML"} and @{ML_file "HOL/hologic.ML"} that make such manual constructions of terms and types easier.\end{readmore} Have a look at these files and try to solve the following two exercises:*}text {* \begin{exercise}\label{fun:revsum} Write a function @{text "rev_sum : term -> term"} that takes a term of the form @{text "t\<^isub>1 + t\<^isub>2 + \<dots> + t\<^isub>n"} (whereby @{text "i"} might be zero) and returns the reversed sum @{text "t\<^isub>n + \<dots> + t\<^isub>2 + t\<^isub>1"}. Assume the @{text "t\<^isub>i"} can be arbitrary expressions and also note that @{text "+"} associates to the left. Try your function on some examples. \end{exercise} \begin{exercise}\label{fun:makesum} Write a function which takes two terms representing natural numbers in unary notation (like @{term "Suc (Suc (Suc 0))"}), and produce the number representing their sum. \end{exercise} (FIXME: operation from page 19 of the implementation manual)*}section {* Type-Checking *}text {* We can freely construct and manipulate terms, since they are just arbitrary unchecked trees. However, we eventually want to see if a term is well-formed, or type-checks, relative to a theory. Type-checking is done via the function @{ML cterm_of}, which converts a @{ML_type term} into a @{ML_type cterm}, a \emph{certified} term. Unlike @{ML_type term}s, which are just trees, @{ML_type "cterm"}s are abstract objects that are guaranteed to be type-correct, and they can only be constructed via ``official interfaces''. Type-checking is always relative to a theory context. For now we use the @{ML "@{theory}"} antiquotation to get hold of the current theory. For example we can write @{ML_response_fake [display,gray] "cterm_of @{theory} @{term \"a + b = c\"}" "a + b = c"} This can also be wirtten with an antiquotation @{ML_response_fake [display,gray] "@{cterm \"(a::nat) + b = c\"}" "a + b = c"} Attempting to obtain the certified term for @{ML_response_fake_both [display,gray] "@{cterm \"1 + True\"}" "Type unification failed \<dots>"} yields an error (since the term is not typable). A slightly more elaborate example that type-checks is@{ML_response_fake [display,gray] "let val natT = @{typ \"nat\"} val zero = @{term \"0::nat\"}in cterm_of @{theory} (Const (@{const_name plus}, natT --> natT --> natT) $ zero $ zero)end" "0 + 0"} \begin{exercise} Check that the function defined in Exercise~\ref{fun:revsum} returns a result that type-checks. \end{exercise} (FIXME: @{text "ctyp_of"})*}section {* Theorems *}text {* Just like @{ML_type cterm}s, theorems are abstract objects of type @{ML_type thm} that can only be built by going through interfaces. As a consequence, every proof in Isabelle is correct by construction (FIXME reference LCF-philosophy) To see theorems in ``action'', let us give a proof on the ML-level for the following statement:*} lemma assumes assm\<^isub>1: "\<And>(x::nat). P x \<Longrightarrow> Q x" and assm\<^isub>2: "P t" shows "Q t" (*<*)oops(*>*) text {* The corresponding ML-code is as follows:\footnote{Note that @{text "|>"} is reverse application. See Section~\ref{sec:combinators}.}@{ML_response_fake [display,gray]"let val thy = @{theory} val assm1 = cterm_of thy @{prop \"\<And>(x::nat). P x \<Longrightarrow> Q x\"} val assm2 = cterm_of thy @{prop \"(P::nat\<Rightarrow>bool) t\"} val Pt_implies_Qt = assume assm1 |> forall_elim (cterm_of thy @{term \"t::nat\"}); val Qt = implies_elim Pt_implies_Qt (assume assm2);in Qt |> implies_intr assm2 |> implies_intr assm1end" "\<lbrakk>\<And>x. P x \<Longrightarrow> Q x; P t\<rbrakk> \<Longrightarrow> Q t"} This code-snippet constructs the following proof: \[ \infer[(@{text "\<Longrightarrow>"}$-$intro)]{\vdash @{prop "(\<And>x. P x \<Longrightarrow> Q x) \<Longrightarrow> P t \<Longrightarrow> Q t"}} {\infer[(@{text "\<Longrightarrow>"}$-$intro)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "P t \<Longrightarrow> Q t"}} {\infer[(@{text "\<Longrightarrow>"}$-$elim)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"}, @{prop "P t"} \vdash @{prop "Q t"}} {\infer[(@{text "\<And>"}$-$elim)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "P t \<Longrightarrow> Q t"}} {\infer[(assume)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "\<And>x. P x \<Longrightarrow> Q x"}}{}} & \infer[(assume)]{@{prop "P t"} \vdash @{prop "P t"}}{} } } } \] \begin{readmore} For the functions @{text "assume"}, @{text "forall_elim"} etc see \isccite{sec:thms}. The basic functions for theorems are defined in @{ML_file "Pure/thm.ML"}. \end{readmore}*}section {* Storing Theorems *}section {* Theorem Attributes *}section {* Operations on Constants (Names) *}text {*@{ML_response [display] "Sign.base_name \"List.list.Nil\"" "\"Nil\""} authentic syntax*}section {* Combinators\label{sec:combinators} *}text {* For beginners, perhaps the most puzzling parts in the existing code of Isabelle are the combinators. At first they seem to greatly obstruct the comprehension of the code, but after getting familiar with them, they actually ease the understanding and also the programming. \begin{readmore} The most frequently used combinator are defined in the files @{ML_file "Pure/library.ML"} and @{ML_file "Pure/General/basics.ML"}. Also \isccite{sec:ML-linear-trans} contains further information about combinators. \end{readmore} The simplest combinator is @{ML I}, which is just the identity function.*}ML{*fun I x = x*}text {* Another simple combinator is @{ML K}, defined as *}ML{*fun K x = fn _ => x*}text {* @{ML K} ``wraps'' a function around the argument @{text "x"}. However, this function ignores its argument. As a result @{ML K} defines a constant function returning @{text x}. The next combinator is reverse application, @{ML "|>"}, defined as *}ML{*fun x |> f = f x*}text {* While just syntactic sugar for the usual function application, the purpose of this combinator is to implement functions in a ``waterfall fashion''. Consider for example the function *}ML %linenumbers{*fun inc_by_five x = x |> (fn x => x + 1) |> (fn x => (x, x)) |> fst |> (fn x => x + 4)*}text {* which increments the argument @{text x} by 5. It does this by first incrementing the argument by 1 (Line 2); then storing the result in a pair (Line 3); taking the first component of the pair (Line 4) and finally incrementing the first component by 4 (Line 5). This kind of cascading manipulations of values is quite common when dealing with theories (for example by adding a definition, followed by lemmas and so on). It should also be familiar to anyone who has used Haskell's do-notation. Writing the function @{ML inc_by_five} using the reverse application is much clearer than writing*}ML{*fun inc_by_five x = fst ((fn x => (x, x)) (x + 1)) + 4*}text {* or *}ML{*fun inc_by_five x = ((fn x => x + 4) o fst o (fn x => (x, x)) o (fn x => x + 1)) x*}text {* and typographically more economical than *}ML{*fun inc_by_five x = let val y1 = x + 1 val y2 = (y1, y1) val y3 = fst y2 val y4 = y3 + 4 in y4 end*}text {* Another reason why the let-bindings in the code above are better to be avoided: it is more than easy to get the intermediate values wrong, not to mention the nightmares the maintenance of this code causes! (FIXME: give a real world example involving theories) Similarly, the combinator @{ML "#>"} is the reverse function composition. It can be used to define the following function*}ML{*val inc_by_six = (fn x => x + 1) #> (fn x => x + 2) #> (fn x => x + 3)*}text {* which is the function composed of first the increment-by-one function and then increment-by-two, followed by increment-by-three. Again, the reverse function composition allows one to read the code top-down. The remaining combinators described in this section add convenience for the ``waterfall method'' of writing functions. The combinator @{ML tap} allows one to get hold of an intermediate result (to do some side-calculations for instance). The function *}ML %linenumbers{*fun inc_by_three x = x |> (fn x => x + 1) |> tap (fn x => tracing (makestring x)) |> (fn x => x + 2)*}text {* increments the argument first by one and then by two. In the middle (Line 3), however, it uses @{ML tap} for printing the ``plus-one'' intermediate result inside the tracing buffer. The function @{ML tap} can only be used for side-calculations, because any value that is computed cannot be merged back into the ``main waterfall''. To do this, the next combinator can be used. The combinator @{ML "`"} is similar to @{ML tap}, but applies a function to the value and returns the result together with the value (as a pair). For example the function *}ML{*fun inc_as_pair x = x |> `(fn x => x + 1) |> (fn (x, y) => (x, y + 1))*}text {* takes @{text x} as argument, and then first increments @{text x}, but also keeps @{text x}. The intermediate result is therefore the pair @{ML "(x + 1, x)" for x}. The function then increments the right-hand component of the pair. So finally the result will be @{ML "(x + 1, x + 1)" for x}. The combinators @{ML "|>>"} and @{ML "||>"} are defined for functions manipulating pairs. The first applies the function to the first component of the pair, defined as:*}ML{*fun (x, y) |>> f = (f x, y)*}text {* and the second combinator to the second component, defined as*}ML{*fun (x, y) ||> f = (x, f y)*}text {* With the combinator @{ML "|->"} you can re-combine the elements from a pair. This combinator is defined as*}ML{*fun (x, y) |-> f = f x y*}text {* and can be used to write the following version of the @{text double} function *}ML{*fun double x = x |> (fn x => (x, x)) |-> (fn x => fn y => x + y)*}text {* Recall that @{ML "|>"} is the reverse function applications. Recall also that the related reverse function composition is @{ML "#>"}. In fact all the combinators @{ML "|->"}, @{ML "|>>"} and @{ML "||>"} described above have related combinators for function composition, namely @{ML "#->"}, @{ML "#>>"} and @{ML "##>"}. Using @{ML "#->"}, for example, the function @{text double} can also be written as*}ML{*val double = (fn x => (x, x)) #-> (fn x => fn y => x + y)*}text {* (FIXME: find a good exercise for combinators)*}end