theory FirstStepsimports Baseuses "FirstSteps.ML"beginchapter {* First Steps *}text {* \begin{flushright} {\em ``We will most likely never realize the full importance of painting the Tower,\\ that it is the essential element in the conservation of metal works and the\\ more meticulous the paint job, the longer the Tower shall endure.''} \\[1ex] Gustave Eiffel, In his book {\em The 300-Meter Tower}.\footnote{The Eiffel Tower has been re-painted 18 times since its initial construction, an average of once every seven years. It takes more than a year for a team of 25 painters to paint the Tower from top to bottom.} \end{flushright} \medskip Isabelle programming is done in ML. Just like lemmas and proofs, ML-code for Isabelle must be part of a theory. If you want to follow the code given in this chapter, we assume you are working inside the theory starting with \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} FirstSteps\\ \isacommand{imports} Main\\ \isacommand{begin}\\ \ldots \end{tabular} \end{quote} We also generally assume you are working with the logic HOL. The examples that will be given might need to be adapted if you work in a different logic.*}section {* Including ML-Code *}text {* The easiest and quickest way to include code in a theory is by using the \isacommand{ML}-command. For example: \begin{isabelle} \begin{graybox} \isacommand{ML}~@{text "\<verbopen>"}\isanewline \hspace{5mm}@{ML "3 + 4"}\isanewline @{text "\<verbclose>"}\isanewline @{text "> 7"}\smallskip \end{graybox} \end{isabelle} Like normal Isabelle scripts, \isacommand{ML}-commands can be evaluated by using the advance and undo buttons of your Isabelle environment. The code inside the \isacommand{ML}-command can also contain value and function bindings, for example*}ML %gray {* val r = Unsynchronized.ref 0 fun f n = n + 1 *}text {* and even those can be undone when the proof script is retracted. As mentioned in the Introduction, we will drop the \isacommand{ML}~@{text "\<verbopen> \<dots> \<verbclose>"} scaffolding whenever we show code. The lines prefixed with @{text [quotes] ">"} are not part of the code, rather they indicate what the response is when the code is evaluated. There are also the commands \isacommand{ML\_val} and \isacommand{ML\_prf} for including ML-code. The first evaluates the given code, but any effect on the theory, in which the code is embedded, is suppressed. The second needs to be used if ML-code is defined inside a proof. For example \begin{quote} \begin{isabelle} \isacommand{lemma}~@{text "test:"}\isanewline \isacommand{shows}~@{text [quotes] "True"}\isanewline \isacommand{ML\_prf}~@{text "\<verbopen>"}~@{ML "writeln \"Trivial!\""}~@{text "\<verbclose>"}\isanewline \isacommand{oops} \end{isabelle} \end{quote} However, both commands will only play minor roles in this tutorial (we will always arrange that the ML-code is defined outside of proofs, for example). Once a portion of code is relatively stable, you usually want to export it to a separate ML-file. Such files can then be included somewhere inside a theory by using the command \isacommand{use}. For example \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} FirstSteps\\ \isacommand{imports} Main\\ \isacommand{uses}~@{text "(\"file_to_be_included.ML\")"} @{text "\<dots>"}\\ \isacommand{begin}\\ \ldots\\ \isacommand{use}~@{text "\"file_to_be_included.ML\""}\\ \ldots \end{tabular} \end{quote} The \isacommand{uses}-command in the header of the theory is needed in order to indicate the dependency of the theory on the ML-file. Alternatively, the file can be included by just writing in the header \begin{quote} \begin{tabular}{@ {}l} \isacommand{theory} FirstSteps\\ \isacommand{imports} Main\\ \isacommand{uses} @{text "\"file_to_be_included.ML\""} @{text "\<dots>"}\\ \isacommand{begin}\\ \ldots \end{tabular} \end{quote} Note that no parentheses are given this time. Note also that the included ML-file should not contain any \isacommand{use} itself. Otherwise Isabelle is unable to record all file dependencies, which is a nuisance if you have to track down errors.*}section {* Printing and Debugging\label{sec:printing} *}text {* During development you might find it necessary to inspect some data in your code. This can be done in a ``quick-and-dirty'' fashion using the function @{ML_ind "writeln"}. For example @{ML_response_eq [display,gray] "writeln \"any string\"" "\"any string\"" with "(op =)"} will print out @{text [quotes] "any string"} inside the response buffer of Isabelle. This function expects a string as argument. If you develop under PolyML, then there is a convenient, though again ``quick-and-dirty'', method for converting values into strings, namely the function @{ML_ind makestring in PolyML}: @{ML_response_eq [display,gray] "writeln (PolyML.makestring 1)" "\"1\"" with "(op =)"} However, @{ML makestring in PolyML} only works if the type of what is converted is monomorphic and not a function. The function @{ML "writeln"} should only be used for testing purposes, because any output this function generates will be overwritten as soon as an error is raised. For printing anything more serious and elaborate, the function @{ML_ind tracing} is more appropriate. This function writes all output into a separate tracing buffer. For example: @{ML_response_eq [display,gray] "tracing \"foo\"" "\"foo\"" with "(op =)"} It is also possible to redirect the ``channel'' where the string @{text "foo"} is printed to a separate file, e.g., to prevent ProofGeneral from choking on massive amounts of trace output. This redirection can be achieved with the code:*}ML{*val strip_specials =let fun strip ("\^A" :: _ :: cs) = strip cs | strip (c :: cs) = c :: strip cs | strip [] = [];in implode o strip o explode endfun redirect_tracing stream = Output.tracing_fn := (fn s => (TextIO.output (stream, (strip_specials s)); TextIO.output (stream, "\n"); TextIO.flushOut stream)) *}text {* Calling now @{ML [display,gray] "redirect_tracing (TextIO.openOut \"foo.bar\")"} will cause that all tracing information is printed into the file @{text "foo.bar"}. You can print out error messages with the function @{ML_ind error}; for example:@{ML_response_fake [display,gray] "if 0=1 then true else (error \"foo\")" "Exception- ERROR \"foo\" raisedAt command \"ML\"."} This function raises the exception @{text ERROR}, which will then be displayed by the infrastructure. \footnote{\bf FIXME Mention how to work with @{ML_ind debug in Toplevel} and @{ML_ind profiling in Toplevel}.}*}(* FIXMEML {* reset Toplevel.debug *}ML {* fun dodgy_fun () = (raise TYPE ("",[],[]); 1) *}ML {* fun innocent () = dodgy_fun () *}ML {* exception_trace (fn () => cterm_of @{theory} (Bound 0)) *}ML {* exception_trace (fn () => innocent ()) *}ML {* Toplevel.program (fn () => cterm_of @{theory} (Bound 0)) *}ML {* Toplevel.program (fn () => innocent ()) *}*)text {* Most often you want to inspect data of Isabelle's most basic data structures, namely @{ML_type term}, @{ML_type cterm} and @{ML_type thm}. Isabelle contains elaborate pretty-printing functions for printing them (see Section \ref{sec:pretty}), but for quick-and-dirty solutions they are a bit unwieldy. One way to transform a term into a string is to use the function @{ML_ind string_of_term in Syntax} in the structure @{ML_struct Syntax}, which we bind for more convenience to the toplevel.*}ML{*val string_of_term = Syntax.string_of_term*}text {* It can now be used as follows @{ML_response_fake [display,gray] "string_of_term @{context} @{term \"1::nat\"}" "\"\\^E\\^Fterm\\^E\\^E\\^Fconst\\^Fname=HOL.one_class.one\\^E1\\^E\\^F\\^E\\^E\\^F\\^E\""} This produces a string corrsponding to the term @{term [show_types] "1::nat"} with some additional information encoded in it. The string can be properly printed by using either the function @{ML_ind writeln} or @{ML_ind tracing}: @{ML_response_fake [display,gray] "writeln (string_of_term @{context} @{term \"1::nat\"})" "\"1\""} or @{ML_response_fake [display,gray] "tracing (string_of_term @{context} @{term \"1::nat\"})" "\"1\""} If there are more than one @{ML_type term}s to be printed, you can use the function @{ML_ind commas} to separate them.*} ML{*fun string_of_terms ctxt ts = commas (map (string_of_term ctxt) ts)*}text {* A @{ML_type cterm} can be transformed into a string by the following function.*}ML{*fun string_of_cterm ctxt ct = string_of_term ctxt (term_of ct)*}text {* In this example the function @{ML_ind term_of} extracts the @{ML_type term} from a @{ML_type cterm}. More than one @{ML_type cterm}s can again be printed with @{ML_ind commas}.*} ML{*fun string_of_cterms ctxt cts = commas (map (string_of_cterm ctxt) cts)*}text {* The easiest way to get the string of a theorem is to transform it into a @{ML_type term} using the function @{ML_ind prop_of}. *}ML{*fun string_of_thm ctxt thm = string_of_term ctxt (prop_of thm)*}text {* Theorems also include schematic variables, such as @{text "?P"}, @{text "?Q"} and so on. They are needed in order to able to instantiate theorems when they are applied. For example the theorem @{thm [source] conjI} shown below can be used for any (typable) instantiation of @{text "?P"} and @{text "?Q"}. @{ML_response_fake [display, gray] "tracing (string_of_thm @{context} @{thm conjI})" "\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q"} However, in order to improve the readability when printing theorems, we convert these schematic variables into free variables using the function @{ML_ind import in Variable}. This is similar to statements like @{text "conjI[no_vars]"} from Isabelle's user-level.*}ML{*fun no_vars ctxt thm =let val ((_, [thm']), _) = Variable.import true [thm] ctxtin thm'endfun string_of_thm_no_vars ctxt thm = string_of_term ctxt (prop_of (no_vars ctxt thm))*}text {* Theorem @{thm [source] conjI} is now printed as follows: @{ML_response_fake [display, gray] "tracing (string_of_thm_no_vars @{context} @{thm conjI})" "\<lbrakk>P; Q\<rbrakk> \<Longrightarrow> P \<and> Q"} Again the function @{ML commas} helps with printing more than one theorem. *}ML{*fun string_of_thms ctxt thms = commas (map (string_of_thm ctxt) thms)fun string_of_thms_no_vars ctxt thms = commas (map (string_of_thm_no_vars ctxt) thms) *}text {* Note, that when printing out several parcels of information that semantically belong together, like a warning message consisting for example of a term and a type, you should try to keep this information together in a single string. Therefore do not print out information as@{ML_response_fake [display,gray]"tracing \"First half,\"; tracing \"and second half.\"""First half,and second half."} but as a single string with appropriate formatting. For example@{ML_response_fake [display,gray]"tracing (\"First half,\" ^ \"\\n\" ^ \"and second half.\")""First half,and second half."} To ease this kind of string manipulations, there are a number of library functions. For example, the function @{ML_ind cat_lines} concatenates a list of strings and inserts newlines. @{ML_response [display, gray] "cat_lines [\"foo\", \"bar\"]" "\"foo\\nbar\""} Section \ref{sec:pretty} will also explain infrastructure that helps with more elaborate pretty printing. *}section {* Combinators\label{sec:combinators} *}text {* For beginners perhaps the most puzzling parts in the existing code of Isabelle are the combinators. At first they seem to greatly obstruct the comprehension of the code, but after getting familiar with them, they actually ease the understanding and also the programming. The simplest combinator is @{ML_ind I}, which is just the identity function defined as*}ML{*fun I x = x*}text {* Another simple combinator is @{ML_ind K}, defined as *}ML{*fun K x = fn _ => x*}text {* @{ML_ind K} ``wraps'' a function around the argument @{text "x"}. However, this function ignores its argument. As a result, @{ML K} defines a constant function always returning @{text x}. The next combinator is reverse application, @{ML_ind "|>"}, defined as: *}ML{*fun x |> f = f x*}text {* While just syntactic sugar for the usual function application, the purpose of this combinator is to implement functions in a ``waterfall fashion''. Consider for example the function *}ML %linenosgray{*fun inc_by_five x = x |> (fn x => x + 1) |> (fn x => (x, x)) |> fst |> (fn x => x + 4)*}text {* which increments its argument @{text x} by 5. It proceeds by first incrementing the argument by 1 (Line 2); then storing the result in a pair (Line 3); taking the first component of the pair (Line 4) and finally incrementing the first component by 4 (Line 5). This kind of cascading manipulations of values is quite common when dealing with theories (for example by adding a definition, followed by lemmas and so on). The reverse application allows you to read what happens in a top-down manner. This kind of coding should also be familiar, if you have been exposed to Haskell's {\it do}-notation. Writing the function @{ML inc_by_five} using the reverse application is much clearer than writing*}ML{*fun inc_by_five x = fst ((fn x => (x, x)) (x + 1)) + 4*}text {* or *}ML{*fun inc_by_five x = ((fn x => x + 4) o fst o (fn x => (x, x)) o (fn x => x + 1)) x*}text {* and typographically more economical than *}ML{*fun inc_by_five x =let val y1 = x + 1 val y2 = (y1, y1) val y3 = fst y2 val y4 = y3 + 4in y4 end*}text {* Another reason why the let-bindings in the code above are better to be avoided: it is more than easy to get the intermediate values wrong, not to mention the nightmares the maintenance of this code causes! In Isabelle, a ``real world'' example for a function written in the waterfall fashion might be the following code:*}ML %linenosgray{*fun apply_fresh_args f ctxt = f |> fastype_of |> binder_types |> map (pair "z") |> Variable.variant_frees ctxt [f] |> map Free |> curry list_comb f *}text {* This function takes a term and a context as argument. If the term is of function type, then @{ML "apply_fresh_args"} returns the term with distinct variables applied to it. For example below variables are applied to the term @{term [show_types] "P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool"}: @{ML_response_fake [display,gray]"let val t = @{term \"P::nat \<Rightarrow> int \<Rightarrow> unit \<Rightarrow> bool\"} val ctxt = @{context}in apply_fresh_args t ctxt |> string_of_term ctxt |> tracingend" "P z za zb"} You can read off this behaviour from how @{ML apply_fresh_args} is coded: in Line 2, the function @{ML_ind fastype_of} calculates the type of the term; @{ML_ind binder_types} in the next line produces the list of argument types (in the case above the list @{text "[nat, int, unit]"}); Line 4 pairs up each type with the string @{text "z"}; the function @{ML_ind variant_frees in Variable} generates for each @{text "z"} a unique name avoiding the given @{text f}; the list of name-type pairs is turned into a list of variable terms in Line 6, which in the last line is applied by the function @{ML_ind list_comb} to the term. In this last step we have to use the function @{ML_ind curry}, because @{ML_ind list_comb} expects the function and the variables list as a pair. This kind of functions is often needed when constructing terms with fresh variables. The infrastructure helps tremendously to avoid any name clashes. Consider for example: @{ML_response_fake [display,gray]"let val t = @{term \"za::'a \<Rightarrow> 'b \<Rightarrow> 'c\"} val ctxt = @{context}in apply_fresh_args t ctxt |> string_of_term ctxt |> tracingend" "za z zb"} where the @{text "za"} is correctly avoided. The combinator @{ML_ind "#>"} is the reverse function composition. It can be used to define the following function*}ML{*val inc_by_six = (fn x => x + 1) #> (fn x => x + 2) #> (fn x => x + 3)*}text {* which is the function composed of first the increment-by-one function and then increment-by-two, followed by increment-by-three. Again, the reverse function composition allows you to read the code top-down. The remaining combinators described in this section add convenience for the ``waterfall method'' of writing functions. The combinator @{ML_ind tap} allows you to get hold of an intermediate result (to do some side-calculations for instance). The function *}ML %linenosgray{*fun inc_by_three x = x |> (fn x => x + 1) |> tap (fn x => tracing (PolyML.makestring x)) |> (fn x => x + 2)*}text {* increments the argument first by @{text "1"} and then by @{text "2"}. In the middle (Line 3), however, it uses @{ML_ind tap} for printing the ``plus-one'' intermediate result inside the tracing buffer. The function @{ML_ind tap} can only be used for side-calculations, because any value that is computed cannot be merged back into the ``main waterfall''. To do this, you can use the next combinator. The combinator @{ML_ind "`"} (a backtick) is similar to @{ML tap}, but applies a function to the value and returns the result together with the value (as a pair). For example the function *}ML{*fun inc_as_pair x = x |> `(fn x => x + 1) |> (fn (x, y) => (x, y + 1))*}text {* takes @{text x} as argument, and then increments @{text x}, but also keeps @{text x}. The intermediate result is therefore the pair @{ML "(x + 1, x)" for x}. After that, the function increments the right-hand component of the pair. So finally the result will be @{ML "(x + 1, x + 1)" for x}. The combinators @{ML_ind "|>>"} and @{ML_ind "||>"} are defined for functions manipulating pairs. The first applies the function to the first component of the pair, defined as*}ML{*fun (x, y) |>> f = (f x, y)*}text {* and the second combinator to the second component, defined as*}ML{*fun (x, y) ||> f = (x, f y)*}text {* These two functions can, for example, be used to avoid explicit @{text "lets"} for intermediate values in functions that return pairs. As an example, suppose you want to separate a list of integers into two lists according to a treshold. If the treshold is @{ML "5"}, the list @{ML "[1,6,2,5,3,4]"} should be separated to @{ML "([1,2,3,4], [6,5])"}. This function can be implemented as*}ML{*fun separate i [] = ([], []) | separate i (x::xs) = let val (los, grs) = separate i xs in if i <= x then (los, x::grs) else (x::los, grs) end*}text {* where however the return value of the recursive call is bound explicitly to the pair @{ML "(los, grs)" for los grs}. You can implement this function more concisely as*}ML{*fun separate i [] = ([], []) | separate i (x::xs) = if i <= x then separate i xs ||> cons x else separate i xs |>> cons x*}text {* avoiding the explicit @{text "let"}. While in this example the gain in conciseness is only small, in more complicated situations the benefit of avoiding @{text "lets"} can be substantial. With the combinator @{ML_ind "|->"} you can re-combine the elements from a pair. This combinator is defined as*}ML{*fun (x, y) |-> f = f x y*}text {* and can be used to write the following roundabout version of the @{text double} function: *}ML{*fun double x = x |> (fn x => (x, x)) |-> (fn x => fn y => x + y)*}text {* The combinator @{ML_ind ||>>} plays a central rôle whenever your task is to update a theory and the update also produces a side-result (for example a theorem). Functions for such tasks return a pair whose second component is the theory and the fist component is the side-result. Using @{ML_ind ||>>}, you can do conveniently the update and also accumulate the side-results. Consider the following simple function. *}ML %linenosgray{*fun acc_incs x = x |> (fn x => ("", x)) ||>> (fn x => (x, x + 1)) ||>> (fn x => (x, x + 1)) ||>> (fn x => (x, x + 1))*}text {* The purpose of Line 2 is to just pair up the argument with a dummy value (since @{ML_ind "||>>"} operates on pairs). Each of the next three lines just increment the value by one, but also nest the intermediate results to the left. For example @{ML_response [display,gray] "acc_incs 1" "((((\"\", 1), 2), 3), 4)"} You can continue this chain with: @{ML_response [display,gray] "acc_incs 1 ||>> (fn x => (x, x + 2))" "(((((\"\", 1), 2), 3), 4), 6)"} \footnote{\bf FIXME: maybe give a ``real world'' example for this combinator.}*}text {* Recall that @{ML_ind "|>"} is the reverse function application. Recall also that the related reverse function composition is @{ML_ind "#>"}. In fact all the combinators @{ML_ind "|->"}, @{ML_ind "|>>"} , @{ML_ind "||>"} and @{ML_ind "||>>"} described above have related combinators for function composition, namely @{ML_ind "#->"}, @{ML_ind "#>>"}, @{ML_ind "##>"} and @{ML_ind "##>>"}. Using @{ML_ind "#->"}, for example, the function @{text double} can also be written as:*}ML{*val double = (fn x => (x, x)) #-> (fn x => fn y => x + y)*}text {* When using combinators for writing functions in waterfall fashion, it is sometimes necessary to do some ``plumbing'' in order to fit functions together. We have already seen such plumbing in the function @{ML apply_fresh_args}, where @{ML curry} is needed for making the function @{ML list_comb} that works over pairs to fit with the combinator @{ML "|>"}. Such plumbing is also needed in situations where a function operate over lists, but one calculates only with a single element. An example is the function @{ML_ind check_terms in Syntax}, whose purpose is to type-check a list of terms. @{ML_response_fake [display, gray] "let val ctxt = @{context}in map (Syntax.parse_term ctxt) [\"m + n\", \"m * n\", \"m - (n::nat)\"] |> Syntax.check_terms ctxt |> string_of_terms ctxt |> tracingend" "m + n, m * n, m - n"}*}text {* In this example we obtain three terms whose variables @{text m} and @{text n} are of type @{typ "nat"}. If you have only a single term, then @{ML check_terms in Syntax} needs plumbing. This can be done with the function @{ML singleton}.\footnote{There is already a function @{ML check_term in Syntax} in the Isabelle sources that is defined in terms of @{ML singleton} and @{ML check_terms in Syntax}.} For example @{ML_response_fake [display, gray] "let val ctxt = @{context}in Syntax.parse_term ctxt \"m - (n::nat)\" |> singleton (Syntax.check_terms ctxt) |> string_of_term ctxt |> tracingend" "m - n"} \begin{readmore} The most frequently used combinators are defined in the files @{ML_file "Pure/library.ML"} and @{ML_file "Pure/General/basics.ML"}. Also \isccite{sec:ML-linear-trans} contains further information about combinators. \end{readmore} \footnote{\bf FIXME: find a good exercise for combinators} \footnote{\bf FIXME: say something about calling conventions} *}section {* ML-Antiquotations *}text {* The main advantage of embedding all code in a theory is that the code can contain references to entities defined on the logical level of Isabelle. By this we mean definitions, theorems, terms and so on. This kind of reference is realised in Isabelle with ML-antiquotations, often just called antiquotations.\footnote{There are two kinds of antiquotations in Isabelle, which have very different purposes and infrastructures. The first kind, described in this section, are \emph{ML-antiquotations}. They are used to refer to entities (like terms, types etc) from Isabelle's logic layer inside ML-code. The other kind of antiquotations are \emph{document} antiquotations. They are used only in the text parts of Isabelle and their purpose is to print logical entities inside \LaTeX-documents. Document antiquotations are part of the user level and therefore we are not interested in them in this Tutorial, except in Appendix \ref{rec:docantiquotations} where we show how to implement your own document antiquotations.} For example, one can print out the name of the current theory by typing @{ML_response [display,gray] "Context.theory_name @{theory}" "\"FirstSteps\""} where @{text "@{theory}"} is an antiquotation that is substituted with the current theory (remember that we assumed we are inside the theory @{text FirstSteps}). The name of this theory can be extracted using the function @{ML_ind theory_name in Context}. Note, however, that antiquotations are statically linked, that is their value is determined at ``compile-time'', not at ``run-time''. For example the function*}ML{*fun not_current_thyname () = Context.theory_name @{theory} *}text {* does \emph{not} return the name of the current theory, if it is run in a different theory. Instead, the code above defines the constant function that always returns the string @{text [quotes] "FirstSteps"}, no matter where the function is called. Operationally speaking, the antiquotation @{text "@{theory}"} is \emph{not} replaced with code that will look up the current theory in some data structure and return it. Instead, it is literally replaced with the value representing the theory name. In a similar way you can use antiquotations to refer to proved theorems: @{text "@{thm \<dots>}"} for a single theorem @{ML_response_fake [display,gray] "@{thm allI}" "(\<And>x. ?P x) \<Longrightarrow> \<forall>x. ?P x"} and @{text "@{thms \<dots>}"} for more than one@{ML_response_fake [display,gray] "@{thms conj_ac}" "(?P \<and> ?Q) = (?Q \<and> ?P)(?P \<and> ?Q \<and> ?R) = (?Q \<and> ?P \<and> ?R)((?P \<and> ?Q) \<and> ?R) = (?P \<and> ?Q \<and> ?R)"} The point of these antiquotations is that referring to theorems in this way makes your code independent from what theorems the user might have stored under this name (this becomes especially important when you deal with theorem lists; see Section \ref{sec:storing}). You can also refer to the current simpset via an antiquotation. To illustrate this we implement the function that extracts the theorem names stored in a simpset.*}ML{*fun get_thm_names_from_ss simpset =let val {simps,...} = MetaSimplifier.dest_ss simpsetin map #1 simpsend*}text {* The function @{ML_ind dest_ss in MetaSimplifier} returns a record containing all information stored in the simpset, but we are only interested in the names of the simp-rules. Now you can feed in the current simpset into this function. The current simpset can be referred to using the antiquotation @{text "@{simpset}"}. @{ML_response_fake [display,gray] "get_thm_names_from_ss @{simpset}" "[\"Nat.of_nat_eq_id\", \"Int.of_int_eq_id\", \"Nat.One_nat_def\", \<dots>]"} Again, this way of referencing simpsets makes you independent from additions of lemmas to the simpset by the user that can potentially cause loops in your code. On the ML-level of Isabelle, you often have to work with qualified names. These are strings with some additional information, such as positional information and qualifiers. Such qualified names can be generated with the antiquotation @{text "@{binding \<dots>}"}. For example @{ML_response [display,gray] "@{binding \"name\"}" "name"} An example where a qualified name is needed is the function @{ML_ind define in LocalTheory}. This function is used below to define the constant @{term "TrueConj"} as the conjunction @{term "True \<and> True"}.*}local_setup %gray {* LocalTheory.define Thm.internalK ((@{binding "TrueConj"}, NoSyn), (Attrib.empty_binding, @{term "True \<and> True"})) #> snd *}text {* Now querying the definition you obtain: \begin{isabelle} \isacommand{thm}~@{text "TrueConj_def"}\\ @{text "> "}~@{thm TrueConj_def} \end{isabelle} \begin{readmore} The basic operations on bindings are implemented in @{ML_file "Pure/General/binding.ML"}. \end{readmore} \footnote{\bf FIXME give a better example why bindings are important; maybe give a pointer to \isacommand{local\_setup}; if not, then explain why @{ML snd} is needed.} \footnote{\bf FIXME: There should probably a separate section on binding, long-names and sign.} It is also possible to define your own antiquotations. But you should exercise care when introducing new ones, as they can also make your code also difficult to read. In the next chapter we will see how the (build in) antiquotation @{text "@{term \<dots>}"} can be used to construct terms. A restriction of this antiquotation is that it does not allow you to use schematic variables. If you want to have an antiquotation that does not have this restriction, you can implement your own using the function @{ML_ind ML_Antiquote.inline}. The code for the antiquotation @{text "term_pat"} is as follows.*}ML %linenosgray{*let val parser = Args.context -- Scan.lift Args.name_source fun term_pat (ctxt, str) = str |> ProofContext.read_term_pattern ctxt |> ML_Syntax.print_term |> ML_Syntax.atomicin ML_Antiquote.inline "term_pat" (parser >> term_pat)end*}text {* The parser in Line 2 provides us with a context and a string; this string is transformed into a term using the function @{ML_ind read_term_pattern in ProofContext} (Line 5); the next two lines transform the term into a string so that the ML-system can understand them. An example for the usage of this antiquotation is: @{ML_response_fake [display,gray] "@{term_pat \"Suc (?x::nat)\"}" "Const (\"Suc\", \"nat \<Rightarrow> nat\") $ Var ((\"x\", 0), \"nat\")"} which shows the internal representation of the term @{text "Suc ?x"}. \begin{readmore} The file @{ML_file "Pure/ML/ml_antiquote.ML"} contains the the definitions for most antiquotations. Most of the basic operations on ML-syntax are implemented in @{ML_file "Pure/ML/ml_syntax.ML"}. \end{readmore}*}section {* Storing Data in Isabelle\label{sec:storing} *}text {* Isabelle provides mechanisms for storing (and retrieving) arbitrary data. Before we delve into the details, let us digress a bit. Conventional wisdom has it that the type-system of ML ensures that for example an @{ML_type "'a list"} can only hold elements of the same type, namely @{ML_type "'a"}. Despite this wisdom, however, it is possible to implement a universal type in ML, although by some arguably accidental features of ML. This universal type can be used to store data of different type into a single list. It allows one to inject and to project data of \emph{arbitrary} type. This is in contrast to datatypes, which only allow injection and projection of data for some fixed collection of types. In light of the conventional wisdom cited above it is important to keep in mind that the universal type does not destroy type-safety of ML: storing and accessing the data can only be done in a type-safe manner. \begin{readmore} In Isabelle the universal type is implemented as the type @{ML_type Universal.universal} in the file @{ML_file "Pure/ML-Systems/universal.ML"}. \end{readmore} We will show the usage of the universal type by storing an integer and a boolean into a single list. Let us first define injection and projection functions for booleans and integers into and from @{ML_type Universal.universal}.*}ML{*local val fn_int = Universal.tag () : int Universal.tag val fn_bool = Universal.tag () : bool Universal.tagin val inject_int = Universal.tagInject fn_int; val inject_bool = Universal.tagInject fn_bool; val project_int = Universal.tagProject fn_int; val project_bool = Universal.tagProject fn_boolend*}text {* Using the injection functions, we can inject the integer @{ML_text "13"} and the boolean value @{ML_text "true"} into @{ML_type Universal.universal}, and then store them in a @{ML_type "Universal.universal list"} as follows:*}ML{*val foo_list = let val thirteen = inject_int 13 val truth_val = inject_bool truein [thirteen, truth_val]end*}text {* The data can be retrieved using the projection functions. @{ML_response [display, gray]"(project_int (nth foo_list 0), project_bool (nth foo_list 1))" "(13, true)"} Notice that we access the integer as an integer and the boolean as a boolean. If we attempt to access the integer as a boolean, then we get a runtime error. @{ML_response_fake [display, gray]"project_bool (nth foo_list 0)" "*** Exception- Match raised"} This runtime error is the reason why ML is still type-sound despite containing a universal type. Now, Isabelle heavily uses this mechanism for storing all sorts of data: theorem lists, simpsets, facts etc. Roughly speaking, there are two places where data can be stored: in \emph{theories} and in \emph{proof contexts}. Again roughly speaking, data such as simpsets need to be stored in a theory, since they need to be maintained across proofs and even across theories. On the other hand, data such as facts change inside a proof and are only relevant to the proof at hand. Therefore such data needs to be maintained inside a proof context.\footnote{\bf TODO: explain more about this in a separate section.} For theories and proof contexts there are, respectively, the functors @{ML_funct_ind TheoryDataFun} and @{ML_funct_ind ProofDataFun} that help with the data storage. Below we show how to implement a table in which we can store theorems and look them up according to a string key. The intention is to be able to look up introduction rules for logical connectives. Such a table might be useful in an automatic proof procedure and therefore it makes sense to store this data inside a theory. The code for the table is:*}ML %linenosgray{*structure Data = TheoryDataFun (type T = thm Symtab.table val empty = Symtab.empty val copy = I val extend = I fun merge _ = Symtab.merge (K true))*}text {* In order to store data in a theory, we have to specify the type of the data (Line 2). In this case we specify the type @{ML_type "thm Symtab.table"}, which stands for tables in which @{ML_type string}s can be looked up producing an associated @{ML_type thm}. We also have to specify four functions: how to initialise the data storage (Line 3), how to copy it (Line 4), how to extend it (Line 5) and how two tables should be merged (Line 6). These functions correspond roughly to the operations performed on theories.\footnote{\bf FIXME: Say more about the assumptions of these operations.} The result structure @{ML_text Data} contains functions for accessing the table (@{ML Data.get}) and for updating it (@{ML Data.map}). There are also two more functions (@{ML Data.init} and @{ML Data.put}), which however we ignore here. Below we define two auxiliary functions, which help us with accessing the table.*}ML{*val lookup = Symtab.lookup o Data.getfun update k v = Data.map (Symtab.update (k, v))*}text {* Since we want to store introduction rules associated with their logical connective, we can fill the table as follows.*}setup %gray {* update "op &" @{thm conjI} #> update "op -->" @{thm impI} #> update "All" @{thm allI}*}text {* The use of the command \isacommand{setup} makes sure the table in the \emph{current} theory is updated. The lookup can now be performed as follows. @{ML_response_fake [display, gray]"lookup @{theory} \"op &\"""SOME \"\<lbrakk>?P; ?Q\<rbrakk> \<Longrightarrow> ?P \<and> ?Q\""} An important point to note is that these tables (and data in general) need to be treated in a purely functional fashion. Although we can update the table as follows*}setup %gray {* update "op &" @{thm TrueI} *}text {* and @{ML lookup} now produces the introduction rule for @{term "True"}@{ML_response_fake [display, gray]"lookup @{theory} \"op &\"""SOME \"True\""} there are no references involved. This is one of the most fundamental coding conventions for programming in Isabelle. References would interfere with the multithreaded execution model of Isabelle and also defeat its undo-mechanism in proof scripts. For this consider the following data container where we maintain a reference to a list of integers.*}ML{*structure WrongRefData = TheoryDataFun (type T = (int list) Unsynchronized.ref val empty = Unsynchronized.ref [] val copy = I val extend = I fun merge _ = fst)*}text {* We initialise the reference with the empty list. Consequently a first lookup produces @{ML "ref []" in Unsynchronized}. @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref []"} For updating the reference we use the following function*}ML{*fun ref_update n = WrongRefData.map (fn r => let val _ = r := n::(!r) in r end)*}text {* which takes an integer and adds it to the content of the reference. As above we update the reference with the command \isacommand{setup}. *}setup %gray{* ref_update 1 *}text {* A lookup in the current theory gives then the expected list @{ML "ref [1]" in Unsynchronized}. @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref [1]"} So far everything is as expected. But, the trouble starts if we attempt to backtrack to ``before'' the \isacommand{setup}-command. There, we would expect that the list is empty again. But since it is stored in a reference, Isabelle has no control over it. So it is not empty, but still @{ML "ref [1]" in Unsynchronized}. Adding to the trouble, if we execute the \isacommand{setup}-command again, we do not obtain @{ML "ref [1]" in Unsynchronized}, but @{ML_response_fake [display,gray] "WrongRefData.get @{theory}" "ref [1, 1]"} Now imagine how often you go backwards and forwards in your proof scripts. By using references in Isabelle code, you are bound to cause all hell to break loose. Therefore observe the coding convention: Do not use references for storing data! \begin{readmore} The functors for data storage are defined in @{ML_file "Pure/context.ML"}. Isabelle contains implementations of several container data structures, including association lists in @{ML_file "Pure/General/alist.ML"}, directed graphs in @{ML_file "Pure/General/graph.ML"}. and tables and symtables in @{ML_file "Pure/General/table.ML"}. \end{readmore} Storing data in a proof context is done in a similar fashion. The corresponding functor is @{ML_funct_ind ProofDataFun}. With the following code we can store a list of terms in a proof context. *}ML{*structure Data = ProofDataFun (type T = term list fun init _ = [])*}text {* The function we have to specify has to produce a list once a context is initialised (taking the theory into account from which the context is derived). We choose to just return the empty list. Next we define two auxiliary functions for updating the list with a given term and printing the list. *}ML{*fun update trm = Data.map (fn trms => trm::trms)fun print ctxt = case (Data.get ctxt) of [] => tracing "Empty!" | trms => tracing (string_of_terms ctxt trms)*}text {* Next we start with the context generated by the antiquotation @{text "@{context}"} and update it in various ways. @{ML_response_fake [display,gray]"let val ctxt = @{context} val ctxt' = ctxt |> update @{term \"False\"} |> update @{term \"True \<and> True\"} val ctxt'' = ctxt |> update @{term \"1::nat\"} val ctxt''' = ctxt'' |> update @{term \"2::nat\"}in print ctxt; print ctxt'; print ctxt''; print ctxt'''end""Empty!True \<and> True, False12, 1"} Many functions in Isabelle manage and update data in a similar fashion. Consequently, such calculation with contexts occur frequently in Isabelle code, although the ``context flow'' is usually only linear. Note also that the calculation above has no effect on the underlying theory. Once we throw away the contexts, we have no access to their associated data. This is different to theories, where the command \isacommand{setup} registers the data with the current and future theories, and therefore one can access the data potentially indefinitely.\footnote{\bf FIXME: check this; it seems that is in conflict with the statements below.} For convenience there is an abstract layer, the type @{ML_type Context.generic}, for theories and proof contexts. This type is defined as follows*}ML_val{*datatype generic = Theory of theory | Proof of proof*}text {* For this type a number of operations are defined which allow the uniform treatment of theories and proof contexts. There are two special instances of the data storage mechanism described above. The first instance are named theorem lists. Since storing theorems in a list is such a common task, there is the special functor @{ML_funct_ind Named_Thms}. To obtain a named theorem list, you just declare*}ML{*structure FooRules = Named_Thms (val name = "foo" val description = "Theorems for foo") *}text {* and set up the @{ML_struct FooRules} with the command*}setup %gray {* FooRules.setup *}text {* This code declares a data container where the theorems are stored, an attribute @{text foo} (with the @{text add} and @{text del} options for adding and deleting theorems) and an internal ML-interface to retrieve and modify the theorems. Furthermore, the facts are made available on the user-level under the dynamic fact name @{text foo}. For example you can declare three lemmas to be of the kind @{text foo} by:*}lemma rule1[foo]: "A" sorrylemma rule2[foo]: "B" sorrylemma rule3[foo]: "C" sorrytext {* and undeclare the first one by: *}declare rule1[foo del]text {* and query the remaining ones with: \begin{isabelle} \isacommand{thm}~@{text "foo"}\\ @{text "> ?C"}\\ @{text "> ?B"} \end{isabelle} On the ML-level, we can add theorems to the list with @{ML FooRules.add_thm}:*} setup %gray {* Context.theory_map (FooRules.add_thm @{thm TrueI}) *}text {* The rules in the list can be retrieved using the function @{ML FooRules.get}: @{ML_response_fake [display,gray] "FooRules.get @{context}" "[\"True\", \"?C\",\"?B\"]"} \begin{readmore} For more information see @{ML_file "Pure/Tools/named_thms.ML"}. \end{readmore} The second special instance of the data storage mechanism are configuration values. They are used to enable users to configure tools without having to resort to the ML-level (and also to avoid references). Assume you want the user to control three values, say @{text bval} containing a boolean, @{text ival} containing an integer and @{text sval} containing a string. These values can be declared by*}ML{*val (bval, setup_bval) = Attrib.config_bool "bval" falseval (ival, setup_ival) = Attrib.config_int "ival" 0val (sval, setup_sval) = Attrib.config_string "sval" "some string" *}text {* where each value needs to be given a default. To enable these values, they need to be set up with*}setup %gray {* setup_bval #> setup_ival *}text {* The user can now manipulate the values from the user-level of Isabelle with the command*}declare [[bval = true, ival = 3]]text {* On the ML-level these values can be retrieved using the function @{ML Config.get}. @{ML_response [display,gray] "Config.get @{context} bval" "true"} \begin{readmore} For more information about configuration values see @{ML_file "Pure/Isar/attrib.ML"} and @{ML_file "Pure/config.ML"}. \end{readmore}*}(**************************************************)(* bak *)(**************************************************)(*section {* Do Not Try This At Home! *}ML {* val my_thms = ref ([]: thm list) *}attribute_setup my_thm_bad = {* Scan.succeed (Thm.declaration_attribute (fn th => fn ctxt => (my_thms := th :: ! my_thms; ctxt))) *} "bad attribute"declare sym [my_thm_bad]declare refl [my_thm_bad]ML "!my_thms"*)end