theory First_Stepsimports Basebeginchapter {* First Steps\label{chp:firststeps} *}text {* \begin{flushright} {\em ``The most effective debugging tool is still careful thought,\\ coupled with judiciously placed print statements.''} \\[1ex] Brian Kernighan, in {\em Unix for Beginners}, 1979 \end{flushright} \medskip Isabelle programming is done in ML. Just like lemmas and proofs, ML-code for Isabelle must be part of a theory. If you want to follow the code given in this chapter, we assume you are working inside the theory starting with \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} First\_Steps\\ \isacommand{imports} Main\\ \isacommand{begin}\\ \ldots \end{tabular} \end{quote} We also generally assume you are working with the logic HOL. The examples that will be given might need to be adapted if you work in a different logic.*}section {* Including ML-Code\label{sec:include} *}text {* The easiest and quickest way to include code in a theory is by using the \isacommand{ML}-command. For example: \begin{isabelle} \begin{graybox} \isacommand{ML}~@{text "\<verbopen>"}\isanewline \hspace{5mm}@{ML "3 + 4"}\isanewline @{text "\<verbclose>"}\isanewline @{text "> 7"}\smallskip \end{graybox} \end{isabelle} If you work with ProofGeneral then like normal Isabelle scripts \isacommand{ML}-commands can be evaluated by using the advance and undo buttons of your Isabelle environment. If you work with the Jedit GUI, then you just have to hover the cursor over the code and you see the evaluated result in the ``Output'' window. As mentioned in the Introduction, we will drop the \isacommand{ML}~@{text "\<verbopen> \<dots> \<verbclose>"} scaffolding whenever we show code. The lines prefixed with @{text [quotes] ">"} are not part of the code, rather they indicate what the response is when the code is evaluated. There are also the commands \isacommand{ML\_val} and \isacommand{ML\_prf} for including ML-code. The first evaluates the given code, but any effect on the theory, in which the code is embedded, is suppressed. The second needs to be used if ML-code is defined inside a proof. For example \begin{quote} \begin{isabelle} \isacommand{lemma}~@{text "test:"}\isanewline \isacommand{shows}~@{text [quotes] "True"}\isanewline \isacommand{ML\_prf}~@{text "\<verbopen>"}~@{ML "writeln \"Trivial!\""}~@{text "\<verbclose>"}\isanewline \isacommand{oops} \end{isabelle} \end{quote} However, both commands will only play minor roles in this tutorial (we most of the time make sure that the ML-code is defined outside proofs). Once a portion of code is relatively stable, you usually want to export it to a separate ML-file. Such files can then be included somewhere inside a theory by using the command \isacommand{use}. For example \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} First\_Steps\\ \isacommand{imports} Main\\ \isacommand{uses}~@{text "(\"file_to_be_included.ML\")"} @{text "\<dots>"}\\ \isacommand{begin}\\ \ldots\\ \isacommand{use}~@{text "\"file_to_be_included.ML\""}\\ \ldots \end{tabular} \end{quote} The \isacommand{uses}-command in the header of the theory is needed in order to indicate the dependency of the theory on the ML-file. Alternatively, the file can be included by just writing in the header \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} First\_Steps\\ \isacommand{imports} Main\\ \isacommand{uses} @{text "\"file_to_be_included.ML\""} @{text "\<dots>"}\\ \isacommand{begin}\\ \ldots \end{tabular} \end{quote} Note that no parentheses are given in this case. Note also that the included ML-file should not contain any \isacommand{use} itself. Otherwise Isabelle is unable to record all file dependencies, which is a nuisance if you have to track down errors.*}section {* Printing and Debugging\label{sec:printing} *}text {* During development you might find it necessary to inspect data in your code. This can be done in a ``quick-and-dirty'' fashion using the function @{ML_ind writeln in Output}. For example @{ML_response_fake [display,gray] "writeln \"any string\"" "\"any string\""} will print out @{text [quotes] "any string"} inside the response buffer. This function expects a string as argument. If you develop under PolyML, then there is a convenient, though again ``quick-and-dirty'', method for converting values into strings, namely the antiquotation @{text "@{make_string}"}: @{ML_response_fake [display,gray] "writeln (@{make_string} 1)" "\"1\""} However, @{text "@{makes_tring}"} only works if the type of what is converted is monomorphic and not a function. You can print out error messages with the function @{ML_ind error in Library}; for example: @{ML_response_fake [display,gray] "if 0 = 1 then true else (error \"foo\")" "Exception- ERROR \"foo\" raisedAt command \"ML\"."} This function raises the exception @{text ERROR}, which will then be displayed by the infrastructure. Note that this exception is meant for ``user-level'' error messages seen by the ``end-user''. For messages where you want to indicate a genuine program error, then use the exception @{text Fail}. Most often you want to inspect data of Isabelle's basic data structures, namely @{ML_type term}, @{ML_type typ}, @{ML_type cterm}, @{ML_type ctyp} and @{ML_type thm}. Isabelle contains elaborate pretty-printing functions, which we will explain in more detail in Section \ref{sec:pretty}. For now we just use the functions @{ML_ind writeln in Pretty} from the structure @{ML_struct Pretty} and @{ML_ind pretty_term in Syntax} from the structure @{ML_struct Syntax}. For more convenience, we bind them to the toplevel.*}ML %grayML{*val pretty_term = Syntax.pretty_termval pwriteln = Pretty.writeln*}text {* They can now be used as follows @{ML_response_fake [display,gray] "pwriteln (pretty_term @{context} @{term \"1::nat\"})" "\"1\""} If there is more than one term to be printed, you can use the function @{ML_ind commas in Pretty} and @{ML_ind block in Pretty} to separate them.*} ML %grayML{*fun pretty_terms ctxt trms = Pretty.block (Pretty.commas (map (pretty_term ctxt) trms))*}text {* You can also print out terms together with their typing information. For this you need to set the configuration value @{ML_ind show_types in Syntax} to @{ML true}.*}ML %grayML{*val show_types_ctxt = Config.put show_types true @{context}*}text {* Now by using this context @{ML pretty_term} prints out @{ML_response_fake [display, gray] "pwriteln (pretty_term show_types_ctxt @{term \"(1::nat, x)\"})" "(1::nat, x::'a)"} where @{text 1} and @{text x} are displayed with their inferred type. Even more type information can be printed by setting the reference @{ML_ind show_all_types in Syntax} to @{ML true}. In this case we obtain*}text {* @{ML_response_fake [display, gray] "let val show_all_types_ctxt = Config.put show_all_types true @{context}in pwriteln (pretty_term show_all_types_ctxt @{term \"(1::nat, x)\"})end" "(Pair::nat \<Rightarrow> 'a \<Rightarrow> nat \<times> 'a) (1::nat) (x::'a)"} where now even @{term Pair} is written with its type (@{term Pair} is the term-constructor for products). Other configuration values that influence printing of terms include \begin{itemize} \item @{ML_ind show_brackets in Syntax} \item @{ML_ind show_sorts in Syntax} \item @{ML_ind eta_contract in Syntax} \end{itemize} A @{ML_type cterm} can be printed with the following function.*}ML %grayML %grayML{*fun pretty_cterm ctxt ctrm = pretty_term ctxt (term_of ctrm)*}text {* Here the function @{ML_ind term_of in Thm} extracts the @{ML_type term} from a @{ML_type cterm}. More than one @{ML_type cterm}s can be printed again with @{ML commas in Pretty}.*} ML %grayML{*fun pretty_cterms ctxt ctrms = Pretty.block (Pretty.commas (map (pretty_cterm ctxt) ctrms))*}text {* The easiest way to get the string of a theorem is to transform it into a @{ML_type term} using the function @{ML_ind prop_of in Thm}. *}ML %grayML{*fun pretty_thm ctxt thm = pretty_term ctxt (prop_of thm)*}text {* Theorems include schematic variables, such as @{text "?P"}, @{text "?Q"} and so on. They are needed in Isabelle in order to able to instantiate theorems when they are applied. For example the theorem @{thm [source] conjI} shown below can be used for any (typable) instantiation of @{text "?P"} and @{text "?Q"}. @{ML_response_fake [display, gray] "pwriteln (pretty_thm @{context} @{thm conjI})" "\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q"} However, in order to improve the readability when printing theorems, we can switch off the question marks as follows:*}ML %grayML{*fun pretty_thm_no_vars ctxt thm =let val ctxt' = Config.put show_question_marks false ctxtin pretty_term ctxt' (prop_of thm)end*}text {* With this function, theorem @{thm [source] conjI} is now printed as follows: @{ML_response_fake [display, gray] "pwriteln (pretty_thm_no_vars @{context} @{thm conjI})" "\<lbrakk>P; Q\<rbrakk> \<Longrightarrow> P \<and> Q"} Again the functions @{ML commas} and @{ML block in Pretty} help with printing more than one theorem. *}ML %grayML{*fun pretty_thms ctxt thms = Pretty.block (Pretty.commas (map (pretty_thm ctxt) thms))fun pretty_thms_no_vars ctxt thms = Pretty.block (Pretty.commas (map (pretty_thm_no_vars ctxt) thms))*}text {* Printing functions for @{ML_type typ} are*}ML %grayML{*fun pretty_typ ctxt ty = Syntax.pretty_typ ctxt tyfun pretty_typs ctxt tys = Pretty.block (Pretty.commas (map (pretty_typ ctxt) tys))*}text {* respectively @{ML_type ctyp}*}ML %grayML{*fun pretty_ctyp ctxt cty = pretty_typ ctxt (typ_of cty)fun pretty_ctyps ctxt ctys = Pretty.block (Pretty.commas (map (pretty_ctyp ctxt) ctys))*}text {* \begin{readmore} The simple conversion functions from Isabelle's main datatypes to @{ML_type string}s are implemented in @{ML_file "Pure/Syntax/syntax.ML"}. The configuration values that change the printing information are declared in @{ML_file "Pure/Syntax/printer.ML"}. \end{readmore} Note that for printing out several ``parcels'' of information that belong together, like a warning message consisting of a term and its type, you should try to print these parcels together in a single string. Therefore do \emph{not} print out information as@{ML_response_fake [display,gray]"pwriteln (Pretty.str \"First half,\"); pwriteln (Pretty.str \"and second half.\")""First half,and second half."} but as a single string with appropriate formatting. For example@{ML_response_fake [display,gray]"pwriteln (Pretty.str (\"First half,\" ^ \"\\n\" ^ \"and second half.\"))""First half,and second half."} To ease this kind of string manipulations, there are a number of library functions in Isabelle. For example, the function @{ML_ind cat_lines in Library} concatenates a list of strings and inserts newlines in between each element. @{ML_response_fake [display, gray] "pwriteln (Pretty.str (cat_lines [\"foo\", \"bar\"]))" "foobar"} Section \ref{sec:pretty} will explain the infrastructure that Isabelle provides for more elaborate pretty printing. \begin{readmore} Most of the basic string functions of Isabelle are defined in @{ML_file "Pure/library.ML"}. \end{readmore}*}section {* Combinators\label{sec:combinators} *}text {* For beginners perhaps the most puzzling parts in the existing code of Isabelle are the combinators. At first they seem to greatly obstruct the comprehension of code, but after getting familiar with them and handled with care, they actually ease the understanding and also the programming. The simplest combinator is @{ML_ind I in Library}, which is just the identity function defined as*}ML %grayML{*fun I x = x*}text {* Another simple combinator is @{ML_ind K in Library}, defined as *}ML %grayML{*fun K x = fn _ => x*}text {* @{ML K} ``wraps'' a function around @{text "x"} that ignores its argument. As a result, @{ML K} defines a constant function always returning @{text x}. The next combinator is reverse application, @{ML_ind "|>" in Basics}, defined as: *}ML %grayML{*fun x |> f = f x*}text {* While just syntactic sugar for the usual function application, the purpose of this combinator is to implement functions in a ``waterfall fashion''. Consider for example the function *}ML %linenosgray{*fun inc_by_five x = x |> (fn x => x + 1) |> (fn x => (x, x)) |> fst |> (fn x => x + 4)*}text {* which increments its argument @{text x} by 5. It does this by first incrementing the argument by 1 (Line 2); then storing the result in a pair (Line 3); taking the first component of the pair (Line 4) and finally incrementing the first component by 4 (Line 5). This kind of cascading manipulations of values is quite common when dealing with theories. The reverse application allows you to read what happens in a top-down manner. This kind of coding should be familiar, if you have been exposed to Haskell's {\it do}-notation. Writing the function @{ML inc_by_five} using the reverse application is much clearer than writing*}ML %grayML{*fun inc_by_five x = fst ((fn x => (x, x)) (x + 1)) + 4*}text {* or *}ML %grayML{*fun inc_by_five x = ((fn x => x + 4) o fst o (fn x => (x, x)) o (fn x => x + 1)) x*}text {* and typographically more economical than *}ML %grayML{*fun inc_by_five x =let val y1 = x + 1 val y2 = (y1, y1) val y3 = fst y2 val y4 = y3 + 4in y4 end*}text {* Another reason why the let-bindings in the code above are better to be avoided: it is more than easy to get the intermediate values wrong, not to mention the nightmares the maintenance of this code causes! In Isabelle a ``real world'' example for a function written in the waterfall fashion might be the following code:*}ML %linenosgray{*fun apply_fresh_args f ctxt = f |> fastype_of |> binder_types |> map (pair "z") |> Variable.variant_frees ctxt [f] |> map Free |> curry list_comb f *}text {* This function takes a term and a context as argument. If the term is of function type, then @{ML "apply_fresh_args"} returns the term with distinct variables applied to it. For example below three variables are applied to the term @{term [show_types] "P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool"}: @{ML_response_fake [display,gray]"let val trm = @{term \"P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool\"} val ctxt = @{context}in apply_fresh_args trm ctxt |> pretty_term ctxt |> pwritelnend" "P z za zb"} You can read off this behaviour from how @{ML apply_fresh_args} is coded: in Line 2, the function @{ML_ind fastype_of in Term} calculates the type of the term; @{ML_ind binder_types in Term} in the next line produces the list of argument types (in the case above the list @{text "[nat, int, unit]"}); Line 4 pairs up each type with the string @{text "z"}; the function @{ML_ind variant_frees in Variable} generates for each @{text "z"} a unique name avoiding the given @{text f}; the list of name-type pairs is turned into a list of variable terms in Line 6, which in the last line is applied by the function @{ML_ind list_comb in Term} to the original term. In this last step we have to use the function @{ML_ind curry in Library}, because @{ML list_comb} expects the function and the variables list as a pair. Functions like @{ML apply_fresh_args} are often needed when constructing terms involving fresh variables. For this the infrastructure helps tremendously to avoid any name clashes. Consider for example: @{ML_response_fake [display,gray]"let val trm = @{term \"za::'a \<Rightarrow> 'b \<Rightarrow> 'c\"} val ctxt = @{context}in apply_fresh_args trm ctxt |> pretty_term ctxt |> pwritelnend" "za z zb"} where the @{text "za"} is correctly avoided. The combinator @{ML_ind "#>" in Basics} is the reverse function composition. It can be used to define the following function*}ML %grayML{*val inc_by_six = (fn x => x + 1) #> (fn x => x + 2) #> (fn x => x + 3)*}text {* which is the function composed of first the increment-by-one function and then increment-by-two, followed by increment-by-three. Again, the reverse function composition allows you to read the code top-down. This combinator is often used for setup functions inside the \isacommand{setup}- or \isacommand{local\_setup}-command. These functions have to be of type @{ML_type "theory -> theory"}, respectively @{ML_type "local_theory -> local_theory"}. More than one such setup function can be composed with @{ML "#>"}. Consider for example the following code, where we store the theorems @{thm [source] conjI}, @{thm [source] conjunct1} and @{thm [source] conjunct2} under alternative names.*} local_setup %graylinenos {* let fun my_note name thm = Local_Theory.note ((name, []), [thm]) #> sndin my_note @{binding "foo_conjI"} @{thm conjI} #> my_note @{binding "bar_conjunct1"} @{thm conjunct1} #> my_note @{binding "bar_conjunct2"} @{thm conjunct2}end *}text {* The function @{ML_text "my_note"} in line 3 is just a wrapper for the function @{ML_ind note in Local_Theory} in the structure @{ML_struct Local_Theory}; its purpose is to store a theorem under a name. In lines 5 to 6 we call this function to give alternative names for the three theorems. The point of @{ML "#>"} is that you can sequence such function calls. The remaining combinators we describe in this section add convenience for the ``waterfall method'' of writing functions. The combinator @{ML_ind tap in Basics} allows you to get hold of an intermediate result (to do some side-calculations or print out an intermediate result, for instance). The function *}ML %linenosgray{*fun inc_by_three x = x |> (fn x => x + 1) |> tap (fn x => pwriteln (Pretty.str (@{make_string} x))) |> (fn x => x + 2)*}text {* increments the argument first by @{text "1"} and then by @{text "2"}. In the middle (Line 3), however, it uses @{ML tap} for printing the ``plus-one'' intermediate result. The function @{ML tap} can only be used for side-calculations, because any value that is computed cannot be merged back into the ``main waterfall''. To do this, you can use the next combinator. The combinator @{ML_ind "`" in Basics} (a backtick) is similar to @{ML tap}, but applies a function to the value and returns the result together with the value (as a pair). It is defined as*}ML %grayML{*fun `f = fn x => (f x, x)*}text {* An example for this combinator is the function*}ML %grayML{*fun inc_as_pair x = x |> `(fn x => x + 1) |> (fn (x, y) => (x, y + 1))*}text {* which takes @{text x} as argument, and then increments @{text x}, but also keeps @{text x}. The intermediate result is therefore the pair @{ML "(x + 1, x)" for x}. After that, the function increments the right-hand component of the pair. So finally the result will be @{ML "(x + 1, x + 1)" for x}. The combinators @{ML_ind "|>>" in Basics} and @{ML_ind "||>" in Basics} are defined for functions manipulating pairs. The first applies the function to the first component of the pair, defined as*}ML %grayML{*fun (x, y) |>> f = (f x, y)*}text {* and the second combinator to the second component, defined as*}ML %grayML{*fun (x, y) ||> f = (x, f y)*}text {* These two functions can, for example, be used to avoid explicit @{text "lets"} for intermediate values in functions that return pairs. As an example, suppose you want to separate a list of integers into two lists according to a threshold. If the threshold is @{ML "5"}, the list @{ML "[1,6,2,5,3,4]"} should be separated as @{ML "([1,2,3,4], [6,5])"}. Such a function can be implemented as*}ML %grayML{*fun separate i [] = ([], []) | separate i (x::xs) = let val (los, grs) = separate i xs in if i <= x then (los, x::grs) else (x::los, grs) end*}text {* where the return value of the recursive call is bound explicitly to the pair @{ML "(los, grs)" for los grs}. However, this function can be implemented more concisely as*}ML %grayML{*fun separate i [] = ([], []) | separate i (x::xs) = if i <= x then separate i xs ||> cons x else separate i xs |>> cons x*}text {* avoiding the explicit @{text "let"}. While in this example the gain in conciseness is only small, in more complicated situations the benefit of avoiding @{text "lets"} can be substantial. With the combinator @{ML_ind "|->" in Basics} you can re-combine the elements from a pair. This combinator is defined as*}ML %grayML{*fun (x, y) |-> f = f x y*}text {* and can be used to write the following roundabout version of the @{text double} function: *}ML %grayML{*fun double x = x |> (fn x => (x, x)) |-> (fn x => fn y => x + y)*}text {* The combinator @{ML_ind ||>> in Basics} plays a central rôle whenever your task is to update a theory and the update also produces a side-result (for example a theorem). Functions for such tasks return a pair whose second component is the theory and the fist component is the side-result. Using @{ML ||>>}, you can do conveniently the update and also accumulate the side-results. Consider the following simple function.*}ML %linenosgray{*fun acc_incs x = x |> (fn x => ("", x)) ||>> (fn x => (x, x + 1)) ||>> (fn x => (x, x + 1)) ||>> (fn x => (x, x + 1))*}text {* The purpose of Line 2 is to just pair up the argument with a dummy value (since @{ML ||>>} operates on pairs). Each of the next three lines just increment the value by one, but also nest the intermediate results to the left. For example @{ML_response [display,gray] "acc_incs 1" "((((\"\", 1), 2), 3), 4)"} You can continue this chain with: @{ML_response [display,gray] "acc_incs 1 ||>> (fn x => (x, x + 2))" "(((((\"\", 1), 2), 3), 4), 6)"} An example where this combinator is useful is as follows @{ML_response_fake [display, gray] "let val ((names1, names2), _) = @{context} |> Variable.variant_fixes (replicate 4 \"x\") ||>> Variable.variant_fixes (replicate 5 \"x\")in (names1, names2)end" "([\"x\", \"xa\", \"xb\", \"xc\"], [\"xd\", \"xe\", \"xf\", \"xg\", \"xh\"])"} Its purpose is to create nine variants of the string @{ML "\"x\""} so that no variant will clash with another. Suppose for some reason we want to bind four variants to the lists @{ML_text "name1"} and the rest to @{ML_text "name2"}. In order to obtain non-clashing variants we have to thread the context through the function calls (the context records which variants have been previously created). For the first call we can use @{ML "|>"}, but in the second and any further call to @{ML_ind variant_fixes in Variable} we have to use @{ML "||>>"} in order to account for the result(s) obtained by previous calls. A more realistic example for this combinator is the following code*}ML %grayML{*val (((one_def, two_def), three_def), ctxt') = @{context} |> Local_Defs.add_def ((@{binding "One"}, NoSyn), @{term "1::nat"}) ||>> Local_Defs.add_def ((@{binding "Two"}, NoSyn), @{term "2::nat"}) ||>> Local_Defs.add_def ((@{binding "Three"}, NoSyn), @{term "3::nat"})*}text {* where we make three definitions, namely @{term "One \<equiv> 1::nat"}, @{term "Two \<equiv> 2::nat"} and @{term "Three \<equiv> 3::nat"}. The point of this code is that we augment the initial context with the definitions. The result we are interested in is the augmented context, that is @{ML_text "ctxt'"}, but also the side-results containing information about the definitions---the function @{ML_ind add_def in Local_Defs} returns both as pairs. We can use this information for example to print out the definiens and the theorem corresponding to the definitions. For example for the first definition: @{ML_response_fake [display, gray] "let val (one_trm, one_thm) = one_defin pwriteln (pretty_term ctxt' one_trm); pwriteln (pretty_thm ctxt' one_thm)end" "OneOne \<equiv> 1"} Recall that @{ML "|>"} is the reverse function application. Recall also that the related reverse function composition is @{ML "#>"}. In fact all the combinators @{ML "|->"}, @{ML "|>>"} , @{ML "||>"} and @{ML "||>>"} described above have related combinators for function composition, namely @{ML_ind "#->" in Basics}, @{ML_ind "#>>" in Basics}, @{ML_ind "##>" in Basics} and @{ML_ind "##>>" in Basics}. Using @{ML "#->"}, for example, the function @{text double} can also be written as:*}ML %grayML{*val double = (fn x => (x, x)) #-> (fn x => fn y => x + y)*}text {* When using combinators for writing functions in waterfall fashion, it is sometimes necessary to do some ``plumbing'' in order to fit functions together. We have already seen such plumbing in the function @{ML apply_fresh_args}, where @{ML curry} is needed for making the function @{ML list_comb}, which works over pairs, to fit with the combinator @{ML "|>"}. Such plumbing is also needed in situations where a function operates over lists, but one calculates only with a single element. An example is the function @{ML_ind check_terms in Syntax}, whose purpose is to simultaneously type-check a list of terms. Consider the code: @{ML_response_fake [display, gray] "let val ctxt = @{context}in map (Syntax.parse_term ctxt) [\"m + n\", \"m * n\", \"m - (n::nat)\"] |> Syntax.check_terms ctxt |> pretty_terms ctxt |> pwritelnend" "m + n, m * n, m - n"}*}text {* In this example we obtain three terms (using the function @{ML_ind parse_term in Syntax}) whose variables @{text m} and @{text n} are of type @{typ "nat"}. If you have only a single term, then @{ML check_terms in Syntax} needs plumbing. This can be done with the function @{ML_ind singleton in Library}.\footnote{There is already a function @{ML check_term in Syntax} in the file @{ML_file "Pure/Syntax/syntax.ML"} that is implemented in terms of @{ML singleton} and @{ML check_terms in Syntax}.} For example @{ML_response_fake [display, gray, linenos] "let val ctxt = @{context}in Syntax.parse_term ctxt \"m - (n::nat)\" |> singleton (Syntax.check_terms ctxt) |> pretty_term ctxt |> pwritelnend" "m - n"} where in Line 5, the function operating over lists fits with the single term generated in Line 4. \begin{readmore} The most frequently used combinators are defined in the files @{ML_file "Pure/library.ML"} and @{ML_file "Pure/General/basics.ML"}. Also \isccite{sec:ML-linear-trans} contains further information about combinators. \end{readmore} \begin{exercise} Find out what the combinator @{ML "K I"} does. \end{exercise}*}section {* ML-Antiquotations\label{sec:antiquote} *}text {* Recall from Section \ref{sec:include} that code in Isabelle is always embedded in a theory. The main advantage of this is that the code can contain references to entities defined on the logical level of Isabelle. By this we mean references to definitions, theorems, terms and so on. These reference are realised in Isabelle with ML-antiquotations, often just called antiquotations.\footnote{Note that there are two kinds of antiquotations in Isabelle, which have very different purposes and infrastructures. The first kind, described in this section, are \emph{\index*{ML-antiquotation}}. They are used to refer to entities (like terms, types etc) from Isabelle's logic layer inside ML-code. The other kind of antiquotations are \emph{document}\index{document antiquotation} antiquotations. They are used only in the text parts of Isabelle and their purpose is to print logical entities inside \LaTeX-documents. Document antiquotations are part of the user level and therefore we are not interested in them in this Tutorial, except in Appendix \ref{rec:docantiquotations} where we show how to implement your own document antiquotations.} Syntactically antiquotations are indicated by the @{ML_text @}-sign followed by text wrapped in @{text "{\<dots>}"}. For example, one can print out the name of the current theory with the code @{ML_response [display,gray] "Context.theory_name @{theory}" "\"First_Steps\""} where @{text "@{theory}"} is an antiquotation that is substituted with the current theory (remember that we assumed we are inside the theory @{text First_Steps}). The name of this theory can be extracted using the function @{ML_ind theory_name in Context}. Note, however, that antiquotations are statically linked, that is their value is determined at ``compile-time'', not at ``run-time''. For example the function*}ML %grayML{*fun not_current_thyname () = Context.theory_name @{theory} *}text {* does \emph{not} return the name of the current theory, if it is run in a different theory. Instead, the code above defines the constant function that always returns the string @{text [quotes] "First_Steps"}, no matter where the function is called. Operationally speaking, the antiquotation @{text "@{theory}"} is \emph{not} replaced with code that will look up the current theory in some data structure and return it. Instead, it is literally replaced with the value representing the theory. Another important antiquotation is @{text "@{context}"}. (What the difference between a theory and a context is will be described in Chapter \ref{chp:advanced}.) A context is for example needed in order to use the function @{ML print_abbrevs in Proof_Context} that list of all currently defined abbreviations. @{ML_response_fake [display, gray] "Proof_Context.print_abbrevs @{context}""Code_Evaluation.valtermify \<equiv> \<lambda>x. (x, \<lambda>u. Code_Evaluation.termify x)INTER \<equiv> INFIInter \<equiv> Inf\<dots>"} You can also use antiquotations to refer to proved theorems: @{text "@{thm \<dots>}"} for a single theorem @{ML_response_fake [display,gray] "@{thm allI}" "(\<And>x. ?P x) \<Longrightarrow> \<forall>x. ?P x"} and @{text "@{thms \<dots>}"} for more than one@{ML_response_fake [display,gray] "@{thms conj_ac}" "(?P \<and> ?Q) = (?Q \<and> ?P)(?P \<and> ?Q \<and> ?R) = (?Q \<and> ?P \<and> ?R)((?P \<and> ?Q) \<and> ?R) = (?P \<and> ?Q \<and> ?R)"} The thm-antiquotations can also be used for manipulating theorems. For example, if you need the version of the theorem @{thm [source] refl} that has a meta-equality instead of an equality, you can write@{ML_response_fake [display,gray] "@{thm refl[THEN eq_reflection]}" "?x \<equiv> ?x"} The point of these antiquotations is that referring to theorems in this way makes your code independent from what theorems the user might have stored under this name (this becomes especially important when you deal with theorem lists; see Section \ref{sec:storing}). It is also possible to prove lemmas with the antiquotation @{text "@{lemma \<dots> by \<dots>}"} whose first argument is a statement (possibly many of them separated by @{text "and"}) and the second is a proof. For example*}ML %grayML{*val foo_thm = @{lemma "True" and "False \<Longrightarrow> P" by simp_all} *}text {* The result can be printed out as follows. @{ML_response_fake [gray,display]"foo_thm |> pretty_thms_no_vars @{context} |> pwriteln" "True, False \<Longrightarrow> P"} You can also refer to the current simpset via an antiquotation. To illustrate this we implement the function that extracts the theorem names stored in a simpset.*}ML %grayML{*fun get_thm_names_from_ss simpset =let val {simps,...} = Raw_Simplifier.dest_ss simpsetin map #1 simpsend*}text {* The function @{ML_ind dest_ss in Raw_Simplifier} returns a record containing all information stored in the simpset, but here we are only interested in the names of the simp-rules. Now you can feed in the current simpset into this function. The current simpset can be referred to using the antiquotation @{text "@{simpset}"}. @{ML_response_fake [display,gray] "get_thm_names_from_ss @{simpset}" "[\"Nat.of_nat_eq_id\", \"Int.of_int_eq_id\", \"Nat.One_nat_def\", \<dots>]"} Again, this way of referencing simpsets makes you independent from additions of lemmas to the simpset by the user, which can potentially cause loops in your code. It is also possible to define your own antiquotations. But you should exercise care when introducing new ones, as they can also make your code also difficult to read. In the next chapter we describe how to construct terms with the (build in) antiquotation @{text "@{term \<dots>}"}. A restriction of this antiquotation is that it does not allow you to use schematic variables in terms. If you want to have an antiquotation that does not have this restriction, you can implement your own using the function @{ML_ind inline in ML_Antiquote} from the structure @{ML_struct ML_Antiquote}. The code for the antiquotation @{text "term_pat"} is as follows.*}ML %linenosgray{*val term_pat_setup = let val parser = Args.context -- Scan.lift Args.name_source fun term_pat (ctxt, str) = str |> Proof_Context.read_term_pattern ctxt |> ML_Syntax.print_term |> ML_Syntax.atomicin ML_Antiquote.inline @{binding "term_pat"} (parser >> term_pat)end*}text {* To use it you also have to install it using \isacommand{setup} like so*}setup %gray {* term_pat_setup *}text {* The parser in Line 2 provides us with a context and a string; this string is transformed into a term using the function @{ML_ind read_term_pattern in Proof_Context} (Line 5); the next two lines transform the term into a string so that the ML-system can understand it. (All these functions will be explained in more detail in later sections.) An example for this antiquotation is: @{ML_response_fake [display,gray] "@{term_pat \"Suc (?x::nat)\"}" "Const (\"Suc\", \"nat \<Rightarrow> nat\") $ Var ((\"x\", 0), \"nat\")"} which shows the internal representation of the term @{text "Suc ?x"}. Similarly we can write an antiquotation for type patterns. Its code is*}ML %grayML{*val type_pat_setup = let val parser = Args.context -- Scan.lift Args.name_source fun typ_pat (ctxt, str) = let val ctxt' = Proof_Context.set_mode Proof_Context.mode_schematic ctxt in str |> Syntax.read_typ ctxt' |> ML_Syntax.print_typ |> ML_Syntax.atomic endin ML_Antiquote.inline @{binding "typ_pat"} (parser >> typ_pat)end*}text {* which can be installed with*}setup %gray {* type_pat_setup *}text {* However, a word of warning is in order: Introducing new antiquotations should be done only after careful deliberations. They can make your code harder to read, than making it easier. \begin{readmore} The file @{ML_file "Pure/ML/ml_antiquote.ML"} contains the the definitions for most antiquotations. Most of the basic operations on ML-syntax are implemented in @{ML_file "Pure/ML/ml_syntax.ML"}. \end{readmore}*}section {* Storing Data in Isabelle\label{sec:storing} *}text {* Isabelle provides mechanisms for storing (and retrieving) arbitrary data. Before we delve into the details, let us digress a bit. Conventional wisdom has it that the type-system of ML ensures that an @{ML_type "'a list"}, say, can only hold elements of the same type, namely @{ML_type "'a"} (or whatever is substitued for it). Despite this common wisdom, however, it is possible to implement a universal type in ML, although by some arguably accidental features of ML. This universal type can be used to store data of different type into a single list. In fact, it allows one to inject and to project data of \emph{arbitrary} type. This is in contrast to datatypes, which only allow injection and projection of data for some \emph{fixed} collection of types. In light of the conventional wisdom cited above it is important to keep in mind that the universal type does not destroy type-safety of ML: storing and accessing the data can only be done in a type-safe manner...though run-time checks are needed for that. \begin{readmore} In Isabelle the universal type is implemented as the type @{ML_type Universal.universal} in the file @{ML_file "Pure/ML-Systems/universal.ML"}. \end{readmore} We will show the usage of the universal type by storing an integer and a boolean into a single list. Let us first define injection and projection functions for booleans and integers into and from the type @{ML_type Universal.universal}.*}ML %grayML{*local val fn_int = Universal.tag () : int Universal.tag val fn_bool = Universal.tag () : bool Universal.tagin val inject_int = Universal.tagInject fn_int; val inject_bool = Universal.tagInject fn_bool; val project_int = Universal.tagProject fn_int; val project_bool = Universal.tagProject fn_boolend*}text {* Using the injection functions, we can inject the integer @{ML_text "13"} and the boolean value @{ML_text "true"} into @{ML_type Universal.universal}, and then store them in a @{ML_type "Universal.universal list"} as follows:*}ML %grayML{*val foo_list = let val thirteen = inject_int 13 val truth_val = inject_bool truein [thirteen, truth_val]end*}text {* The data can be retrieved with the projection functions defined above. @{ML_response_fake [display, gray]"project_int (nth foo_list 0); project_bool (nth foo_list 1)" "13true"} Notice that we access the integer as an integer and the boolean as a boolean. If we attempt to access the integer as a boolean, then we get a runtime error. @{ML_response_fake [display, gray]"project_bool (nth foo_list 0)" "*** Exception- Match raised"} This runtime error is the reason why ML is still type-sound despite containing a universal type. Now, Isabelle heavily uses this mechanism for storing all sorts of data: theorem lists, simpsets, facts etc. Roughly speaking, there are two places where data can be stored in Isabelle: in \emph{theories} and in \emph{proof contexts}. Data such as simpsets are ``global'' and therefore need to be stored in a theory (simpsets need to be maintained across proofs and even across theories). On the other hand, data such as facts change inside a proof and are only relevant to the proof at hand. Therefore such data needs to be maintained inside a proof context, which represents ``local'' data. You can think of a theory as the ``longterm memory'' of Isabelle (nothing will be deleted from it), and a proof-context as a ``shortterm memory'' (it dynamically changes according to what is needed at the time). For theories and proof contexts there are, respectively, the functors @{ML_funct_ind Theory_Data} and @{ML_funct_ind Proof_Data} that help with the data storage. Below we show how to implement a table in which you can store theorems and look them up according to a string key. The intention in this example is to be able to look up introduction rules for logical connectives. Such a table might be useful in an automatic proof procedure and therefore it makes sense to store this data inside a theory. Consequently we use the functor @{ML_funct Theory_Data}. The code for the table is:*}ML %linenosgray{*structure Data = Theory_Data (type T = thm Symtab.table val empty = Symtab.empty val extend = I val merge = Symtab.merge (K true))*}text {* In order to store data in a theory, we have to specify the type of the data (Line 2). In this case we specify the type @{ML_type "thm Symtab.table"}, which stands for a table in which @{ML_type string}s can be looked up producing an associated @{ML_type thm}. We also have to specify four functions to use this functor: namely how to initialise the data storage (Line 3), how to extend it (Line 4) and how two tables should be merged (Line 5). These functions correspond roughly to the operations performed on theories and we just give some sensible defaults.\footnote{\bf FIXME: Say more about the assumptions of these operations.} The result structure @{ML_text Data} contains functions for accessing the table (@{ML Data.get}) and for updating it (@{ML Data.map}). There is also the functions @{ML Data.put}, which however is not relevant here. Below we define two auxiliary functions, which help us with accessing the table.*}ML %grayML{*val lookup = Symtab.lookup o Data.getfun update k v = Data.map (Symtab.update (k, v))*}text {* Since we want to store introduction rules associated with their logical connective, we can fill the table as follows.*}setup %gray {* update "conj" @{thm conjI} #> update "imp" @{thm impI} #> update "all" @{thm allI}*}text {* The use of the command \isacommand{setup} makes sure the table in the \emph{current} theory is updated (this is explained further in section~\ref{sec:theories}). The lookup can now be performed as follows. @{ML_response_fake [display, gray]"lookup @{theory} \"conj\"""SOME \"\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q\""} An important point to note is that these tables (and data in general) need to be treated in a purely functional fashion. Although we can update the table as follows*}setup %gray {* update "conj" @{thm TrueI} *}text {* and accordingly, @{ML lookup} now produces the introduction rule for @{term "True"}@{ML_response_fake [display, gray]"lookup @{theory} \"conj\"""SOME \"True\""} there are no references involved. This is one of the most fundamental coding conventions for programming in Isabelle. References interfere with the multithreaded execution model of Isabelle and also defeat its undo-mechanism. To see the latter, consider the following data container where we maintain a reference to a list of integers.*}ML %grayML{*structure WrongRefData = Theory_Data (type T = (int list) Unsynchronized.ref val empty = Unsynchronized.ref [] val extend = I val merge = fst)*}text {* We initialise the reference with the empty list. Consequently a first lookup produces @{ML "ref []" in Unsynchronized}. @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref []"} For updating the reference we use the following function*}ML %grayML{*fun ref_update n = WrongRefData.map (fn r => let val _ = r := n::(!r) in r end)*}text {* which takes an integer and adds it to the content of the reference. As before, we update the reference with the command \isacommand{setup}. *}setup %gray {* ref_update 1 *}text {* A lookup in the current theory gives then the expected list @{ML "ref [1]" in Unsynchronized}. @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref [1]"} So far everything is as expected. But, the trouble starts if we attempt to backtrack to the ``point'' before the \isacommand{setup}-command. There, we would expect that the list is empty again. But since it is stored in a reference, Isabelle has no control over it. So it is not empty, but still @{ML "ref [1]" in Unsynchronized}. Adding to the trouble, if we execute the \isacommand{setup}-command again, we do not obtain @{ML "ref [1]" in Unsynchronized}, but @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref [1, 1]"} Now imagine how often you go backwards and forwards in your proof scripts.\footnote{The same problem can be triggered in the Jedit GUI by making the parser to go over and over again over the \isacommand{setup} command.} By using references in Isabelle code, you are bound to cause all hell to break loose. Therefore observe the coding convention: Do not use references for storing data! \begin{readmore} The functors for data storage are defined in @{ML_file "Pure/context.ML"}. Isabelle contains implementations of several container data structures, including association lists in @{ML_file "Pure/General/alist.ML"}, directed graphs in @{ML_file "Pure/General/graph.ML"}, and tables and symtables in @{ML_file "Pure/General/table.ML"}. \end{readmore} Storing data in a proof context is done in a similar fashion. As mentioned before, the corresponding functor is @{ML_funct_ind Proof_Data}. With the following code we can store a list of terms in a proof context.*}ML %grayML{*structure Data = Proof_Data (type T = term list fun init _ = [])*}text {* The init-function we have to specify must produce a list for when a context is initialised (possibly taking the theory into account from which the context is derived). We choose here to just return the empty list. Next we define two auxiliary functions for updating the list with a given term and printing the list. *}ML %grayML{*fun update trm = Data.map (fn trms => trm::trms)fun print ctxt = case (Data.get ctxt) of [] => pwriteln (Pretty.str "Empty!") | trms => pwriteln (pretty_terms ctxt trms)*}text {* Next we start with the context generated by the antiquotation @{text "@{context}"} and update it in various ways. @{ML_response_fake [display,gray]"let val ctxt0 = @{context} val ctxt1 = ctxt0 |> update @{term \"False\"} |> update @{term \"True \<and> True\"} val ctxt2 = ctxt0 |> update @{term \"1::nat\"} val ctxt3 = ctxt2 |> update @{term \"2::nat\"}in print ctxt0; print ctxt1; print ctxt2; print ctxt3end""Empty!True \<and> True, False12, 1"} Many functions in Isabelle manage and update data in a similar fashion. Consequently, such calculations with contexts occur frequently in Isabelle code, although the ``context flow'' is usually only linear. Note also that the calculation above has no effect on the underlying theory. Once we throw away the contexts, we have no access to their associated data. This is different for theories, where the command \isacommand{setup} registers the data with the current and future theories, and therefore one can access the data potentially indefinitely. Move elsewhere For convenience there is an abstract layer, namely the type @{ML_type Context.generic}, for treating theories and proof contexts more uniformly. This type is defined as follows*}ML_val{*datatype generic = Theory of theory | Proof of proof*}text {* \footnote{\bf FIXME: say more about generic contexts.} There are two special instances of the data storage mechanism described above. The first instance implements named theorem lists using the functor @{ML_funct_ind Named_Thms}. This is because storing theorems in a list is such a common task. To obtain a named theorem list, you just declare*}ML %grayML{*structure FooRules = Named_Thms (val name = @{binding "foo"} val description = "Theorems for foo") *}text {* and set up the @{ML_struct FooRules} with the command*}setup %gray {* FooRules.setup *}text {* This code declares a data container where the theorems are stored, an attribute @{text foo} (with the @{text add} and @{text del} options for adding and deleting theorems) and an internal ML-interface for retrieving and modifying the theorems. Furthermore, the theorems are made available on the user-level under the name @{text foo}. For example you can declare three lemmas to be a member of the theorem list @{text foo} by:*}lemma rule1[foo]: "A" sorrylemma rule2[foo]: "B" sorrylemma rule3[foo]: "C" sorrytext {* and undeclare the first one by: *}declare rule1[foo del]text {* You can query the remaining ones with: \begin{isabelle} \isacommand{thm}~@{text "foo"}\\ @{text "> ?C"}\\ @{text "> ?B"} \end{isabelle} On the ML-level, we can add theorems to the list with @{ML FooRules.add_thm}:*} setup %gray {* Context.theory_map (FooRules.add_thm @{thm TrueI}) *}text {* The rules in the list can be retrieved using the function @{ML FooRules.get}: @{ML_response_fake [display,gray] "FooRules.get @{context}" "[\"True\", \"?C\",\"?B\"]"} Note that this function takes a proof context as argument. This might be confusing, since the theorem list is stored as theory data. It becomes clear by knowing that the proof context contains the information about the current theory and so the function can access the theorem list in the theory via the context. \begin{readmore} For more information about named theorem lists see @{ML_file "Pure/Tools/named_thms.ML"}. \end{readmore} The second special instance of the data storage mechanism are configuration values. They are used to enable users to configure tools without having to resort to the ML-level (and also to avoid references). Assume you want the user to control three values, say @{text bval} containing a boolean, @{text ival} containing an integer and @{text sval} containing a string. These values can be declared by*}ML %grayML{*val bval = Attrib.setup_config_bool @{binding "bval"} (K false)val ival = Attrib.setup_config_int @{binding "ival"} (K 0)val sval = Attrib.setup_config_string @{binding "sval"} (K "some string") *}text {* where each value needs to be given a default. The user can now manipulate the values from the user-level of Isabelle with the command*}declare [[bval = true, ival = 3]]text {* On the ML-level these values can be retrieved using the function @{ML_ind get in Config} from a proof context @{ML_response [display,gray] "Config.get @{context} bval" "true"} or directly from a theory using the function @{ML_ind get_global in Config} @{ML_response [display,gray] "Config.get_global @{theory} bval" "true"} It is also possible to manipulate the configuration values from the ML-level with the functions @{ML_ind put in Config} and @{ML_ind put_global in Config}. For example @{ML_response [display,gray] "let val ctxt = @{context} val ctxt' = Config.put sval \"foo\" ctxt val ctxt'' = Config.put sval \"bar\" ctxt'in (Config.get ctxt sval, Config.get ctxt' sval, Config.get ctxt'' sval)end" "(\"some string\", \"foo\", \"bar\")"} A concrete example for a configuration value is @{ML_ind simp_trace in Raw_Simplifier}, which switches on trace information in the simplifier. This can be used for example in the following proof*}lemma shows "(False \<or> True) \<and> True"proof (rule conjI) show "False \<or> True" using [[simp_trace = true]] by simpnext show "True" by simpqedtext {* in order to inspect how the simplifier solves the first subgoal. \begin{readmore} For more information about configuration values see the files @{ML_file "Pure/Isar/attrib.ML"} and @{ML_file "Pure/config.ML"}. \end{readmore}*}section {* Summary *}text {* This chapter describes the combinators that are used in Isabelle, as well as a simple printing infrastructure for @{ML_type term}, @{ML_type cterm} and @{ML_type thm}. The section on ML-antiquotations shows how to refer statically to entities from the logic level of Isabelle. Isabelle also contains mechanisms for storing arbitrary data in theory and proof contexts. \begin{conventions} \begin{itemize} \item Print messages that belong together in a single string. \item Do not use references in Isabelle code. \end{itemize} \end{conventions}*}end