diff -r 53460ac408b5 -r 5bb2d29553c2 CookBook/Package/Ind_Interface.thy
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CookBook/Package/Ind_Interface.thy	Fri Oct 10 17:13:21 2008 +0200
@@ -0,0 +1,449 @@
+theory Ind_Interface
+imports Base Simple_Inductive_Package
+begin
+
+(*<*)
+ML {*
+structure SIP = SimpleInductivePackage
+*}
+(*>*)
+
+section {* The interface *}
+
+text {*
+\label{sec:ind-interface}
+In order to add a new inductive predicate to a theory with the help of our
+package, the user must \emph{invoke} it. For every package, there are
+essentially two different ways of invoking it, which we will refer to as
+\emph{external} and \emph{internal}. By external invocation we mean that the
+package is called from within a theory document. In this case, the type of the
+inductive predicate, as well as its introduction rules, are given as strings
+by the user. Before the package can actually make the definition, the type and
+introduction rules have to be parsed. In contrast, internal invocation means
+that the package is called by some other package. For example, the function
+definition package \cite{Krauss-IJCAR06} calls the inductive definition
+package to define the graph of the function. However, it is not a good idea
+for the function definition package to pass the introduction rules for the
+function graph to the inductive definition package as strings. In this case,
+it is better to pass the rules directly to the package as a list of terms,
+which is more robust than handling strings that lack the additional structure
+of terms. These two ways of invoking the package are reflected in its ML
+programming interface, which consists of two functions:
+@{ML_chunk [display] SIMPLE_INDUCTIVE_PACKAGE}
+The function for external invocation of the package is called
+@{ML_open add_inductive (SIP)}, whereas the one for internal invocation is
+called @{ML_open add_inductive_i (SIP)}. Both of these functions take as
+arguments the names and types of the inductive predicates, the names and types
+of their parameters, the actual introduction rules and a \emph{local theory}.
+They return a local theory containing the definition, together with a tuple
+containing the introduction and induction rules, which are stored in the
+local theory, too. In contrast to an ordinary theory, which simply consists
+of a type signature, as well as tables for constants, axioms and theorems, a
+local theory also contains additional context information, such as locally
+fixed variables and local assumptions that may be used by the package. The
+type @{ML_type local_theory} is identical to the type of \emph{proof
+contexts} @{ML_type "Proof.context"}, although not every proof context
+constitutes a valid local theory.
+Note that @{ML_open add_inductive_i (SIP)} expects the types of the
+predicates and parameters to be specified using the datatype @{ML_type typ}
+of Isabelle's logical framework, whereas @{ML_open add_inductive (SIP)}
+expects them to be given as optional strings. If no string is given for a
+particular predicate or parameter, the type is inferred by the package.
+Additional \emph{mixfix syntax} may be associated with the predicates and
+parameters as well. Note that @{ML_open add_inductive_i (SIP)} does not allow
+mixfix syntax to be associated with parameters, since mixfix syntax for
+parameters can only be used for parsing, which happens only on external
+invocation. The names of the predicates, parameters and rules are
+represented by the type @{ML_type Name.binding}. Strings can be turned into
+elements of the type @{ML_type Name.binding} using the function
+@{ML [display] "Name.binding : string -> Name.binding"}
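+For example, bindings for the predicate and rule names used in the examples
+further below can be created as follows (a trivial sketch; in practice the
+bindings are produced by the parser):
+*}
+
+ML {*
+(* Plain strings wrapped up as bindings: *)
+val trcl_binding = Name.binding "trcl"
+val base_binding = Name.binding "base"
+*}
+
+text {*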
+Each introduction rule is given as a tuple containing its name, a list of
+\emph{attributes} and a logical formula. Note that the type
+@{ML_type Attrib.binding} used in the list of introduction rules is just a
+shorthand for the type @{ML_type "Name.binding * Attrib.src list"}. The
+function @{ML_open add_inductive_i (SIP)} expects the formula to be specified
+using the datatype @{ML_type term}, whereas @{ML_open add_inductive (SIP)}
+expects it to be given as a string. An attribute specifies additional actions
+and transformations that should be applied to a theorem, such as storing it
+in the rule databases used by automatic tactics like the simplifier. The code
+of the package, which will be described in the following section, will mostly
+treat attributes as a black box and just forward them to other functions for
+storing theorems in local theories.
+The implementation of the function @{ML_open add_inductive (SIP)} for
+external invocation of the package is quite simple. Essentially, it just
+parses the introduction rules and then passes them on to
+@{ML_open add_inductive_i (SIP)}:
+@{ML_chunk [display] add_inductive}
+For parsing and type checking the introduction rules, we use the function
+@{ML_open [display] "Specification.read_specification:
+  (Name.binding * string option * mixfix) list ->  (*{variables}*)
+  (Attrib.binding * string list) list list ->  (*{rules}*)
+  local_theory ->
+  (((Name.binding * typ) * mixfix) list *
+   (Attrib.binding * term list) list) *
+  local_theory"}
+During parsing, both predicates and parameters are treated as variables, so
+the lists \verb!preds_syn! and \verb!params_syn! are just appended
+before being passed to @{ML_open read_specification (Specification)}. Note
+that the format for rules supported by
+@{ML_open read_specification (Specification)} is more general than what is
+required for our package. It allows several rules to be associated with one
+name, and the list of rules can be partitioned into several sublists. In
+order for the list \verb!intro_srcs! of introduction rules to be acceptable
+as an input for @{ML_open read_specification (Specification)}, we first have
+to turn it into a list of singleton lists. This transformation has to be
+reversed later on by applying the function
+@{ML [display] "the_single: 'a list -> 'a"}
+to the list \verb!specs! containing the parsed introduction rules.
+The function @{ML_open read_specification (Specification)} also returns the
+list \verb!vars! of predicates and parameters that contains the inferred
+types as well. This list has to be chopped into the two lists
+\verb!preds_syn'! and \verb!params_syn'! for predicates and parameters,
+respectively. All variables occurring in a rule but not in the list of
+variables passed to @{ML_open read_specification (Specification)} will be
+bound by a meta-level universal quantifier.
+Finally, @{ML_open read_specification (Specification)} also returns another
+local theory, but we can safely discard it.
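+The function @{ML the_single} mentioned above and the function @{ML chop}
+used for splitting the variable list behave as follows (a quick illustration
+with arbitrary values):
+*}
+
+ML {*
+(* the_single extracts the element of a singleton list and raises an
+   exception for lists of any other length; chop splits a list at a
+   given position. *)
+val x = the_single [3]              (* yields 3 *)
+val (ys, zs) = chop 2 [1, 2, 3, 4]  (* yields ([1, 2], [3, 4]) *)
+*}
+
+text {*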
+As an example, let us look at how we can use
+@{ML_open read_specification (Specification)} to parse the introduction rules
+of the @{text trcl} predicate:
+@{ML_response [display]
+"Specification.read_specification
+  [(Name.binding \"trcl\", NONE, NoSyn),
+   (Name.binding \"r\", SOME \"'a \<Rightarrow> 'a \<Rightarrow> bool\", NoSyn)]
+  [[((Name.binding \"base\", []), [\"trcl r x x\"])],
+   [((Name.binding \"step\", []), [\"trcl r x y \<Longrightarrow> r y z \<Longrightarrow> trcl r x z\"])]]
+  @{context}"
+"((\<dots>,
+  [(\<dots>,
+    [Const (\"all\", \<dots>) $ Abs (\"x\", TFree (\"'a\", \<dots>),
+       Const (\"Trueprop\", \<dots>) $
+         (Free (\"trcl\", \<dots>) $ Free (\"r\", \<dots>) $ Bound 0 $ Bound 0))]),
+   (\<dots>,
+    [Const (\"all\", \<dots>) $ Abs (\"x\", TFree (\"'a\", \<dots>),
+       Const (\"all\", \<dots>) $ Abs (\"y\", TFree (\"'a\", \<dots>),
+         Const (\"all\", \<dots>) $ Abs (\"z\", TFree (\"'a\", \<dots>),
+           Const (\"==>\", \<dots>) $
+             (Const (\"Trueprop\", \<dots>) $
+                (Free (\"trcl\", \<dots>) $ Free (\"r\", \<dots>) $ Bound 2 $ Bound 1)) $
+             (Const (\"==>\", \<dots>) $ \<dots> $ \<dots>))))])]),
+ \<dots>)
+: (((Name.binding * typ) * mixfix) list *
+   (Attrib.binding * term list) list) * local_theory"}
+In the list of variables passed to
+@{ML_open read_specification (Specification)}, we have used the mixfix
+annotation @{ML NoSyn} to indicate that we do not want to associate any
+mixfix syntax with the variable. Moreover, we have only specified the type of
+\texttt{r}, whereas the type of \texttt{trcl} is computed using type
+inference. The local variables \texttt{x}, \texttt{y} and \texttt{z} of the
+introduction rules are turned into bound variables represented by de Bruijn
+indices, whereas \texttt{trcl} and \texttt{r} remain free variables.
+
+\paragraph{Parsers for theory syntax}
+
+Although the function @{ML_open add_inductive (SIP)} parses terms and types,
+it still cannot be used to invoke the package directly from within a theory
+document. In order to do this, we have to write another parser. Before we
+describe the process of writing parsers for theory syntax in more detail, we
+first show some examples of how we would like to use the inductive definition
+package.
+
+\noindent
+The definition of the transitive closure should look as follows:
+*}
+
+simple_inductive
+  trcl for r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+where
+  base: "trcl r x x"
+| step: "trcl r x y \<Longrightarrow> r y z \<Longrightarrow> trcl r x z"
+(*<*)
+thm trcl_def
+thm trcl.induct
+thm base
+thm step
+thm trcl.intros
+
+lemma trcl_strong_induct:
+  assumes trcl: "trcl r x y"
+  and I1: "\<And>x. P x x"
+  and I2: "\<And>x y z. P x y \<Longrightarrow> trcl r x y \<Longrightarrow> r y z \<Longrightarrow> P x z"
+  shows "P x y"
+proof -
+  from trcl
+  have "P x y \<and> trcl r x y"
+  proof induct
+    case (base x)
+    from I1 and trcl.base show ?case ..
+  next
+    case (step x y z)
+    then have "P x y" and "trcl r x y" by simp_all
+    from `P x y` `trcl r x y` `r y z` have "P x z"
+      by (rule I2)
+    moreover from `trcl r x y` `r y z` have "trcl r x z"
+      by (rule trcl.step)
+    ultimately show ?case ..
+  qed
+  then show ?thesis ..
+qed
+(*>*)
+
+text {*
+\noindent
+Even and odd numbers can be defined by
+*}
+
+simple_inductive
+  even and odd
+where
+  even0: "even 0"
+| evenS: "odd n \<Longrightarrow> even (Suc n)"
+| oddS: "even n \<Longrightarrow> odd (Suc n)"
+(*<*)
+thm even_def odd_def
+thm even.induct odd.induct
+thm even0
+thm evenS
+thm oddS
+thm even_odd.intros
+(*>*)
+
+text {*
+\noindent
+The accessible part of a relation can be introduced as follows:
+*}
+
+simple_inductive
+  accpart for r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+where
+  accpartI: "(\<And>y. r y x \<Longrightarrow> accpart r y) \<Longrightarrow> accpart r x"
+(*<*)
+thm accpart_def
+thm accpart.induct
+thm accpartI
+(*>*)
+
+text {*
+\noindent
+Moreover, it should also be possible to define the accessible part
+inside a locale fixing the relation @{text r}:
+*}
+
+locale rel =
+  fixes r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+
+simple_inductive (in rel) accpart'
+where
+  accpartI': "\<And>x. (\<And>y. r y x \<Longrightarrow> accpart' y) \<Longrightarrow> accpart' x"
+(*<*)
+context rel
+begin
+
+thm accpartI'
+thm accpart'.induct
+
+end
+
+thm rel.accpartI'
+thm rel.accpart'.induct
+
+ML {*
+val (result, lthy) = SimpleInductivePackage.add_inductive
+  [(Name.binding "trcl'", NONE, NoSyn)]
+  [(Name.binding "r", SOME "'a \<Rightarrow> 'a \<Rightarrow> bool", NoSyn)]
+  [((Name.binding "base", []), "\<And>x. trcl' r x x"),
+   ((Name.binding "step", []), "\<And>x y z. trcl' r x y \<Longrightarrow> r y z \<Longrightarrow> trcl' r x z")]
+  (TheoryTarget.init NONE @{theory})
+*}
+(*>*)
+
+text {*
+\noindent
+In this context, it is important to note that Isabelle distinguishes
+between \emph{outer} and \emph{inner} syntax. Theory commands such as
+\isa{\isacommand{simple{\isacharunderscore}inductive} $\ldots$ \isacommand{for} $\ldots$ \isacommand{where} $\ldots$}
+belong to the outer syntax, whereas items in quotation marks, in particular
+terms such as @{text [source] "trcl r x x"} and types such as
+@{text [source] "'a \<Rightarrow> 'a \<Rightarrow> bool"} belong to the inner syntax.
+Separating the two layers of outer and inner syntax greatly simplifies
+matters, because the parser for terms and types does not have to know
+anything about the possible syntax of theory commands, and the parser
+for theory commands need not be concerned with the syntactic structure
+of terms and types.
+
+\medskip
+\noindent
+The syntax of the \isa{\isacommand{simple{\isacharunderscore}inductive}} command
+can be described by the following railroad diagram:
+\begin{rail}
+  'simple\_inductive' target? fixes ('for' fixes)? \\
+  ('where' (thmdecl? prop + '|'))?
+  ;
+\end{rail}
+
+\paragraph{Functional parsers}
+
+For parsing terms and types, Isabelle uses a rather general and sophisticated
+algorithm due to Earley, which is driven by \emph{priority grammars}.
+In contrast, parsers for theory syntax are built up using a set of combinators.
+Functional parsing using combinators is a well-established technique, which
+has been described by many authors, including Paulson \cite{paulson-ML-91}
+and Wadler \cite{Wadler-AFP95}.
+The central idea is that a parser is a function of type
+@{ML_type "'a list -> 'b * 'a list"}, where @{ML_type "'a"} is a type of
+\emph{tokens}, and @{ML_type "'b"} is a type for encoding items that the
+parser has recognized. When a parser is applied to a list of tokens whose
+prefix it can recognize, it returns an encoding of the prefix as an element
+of type @{ML_type "'b"}, together with the suffix of the list containing the
+remaining tokens. Otherwise, the parser raises an exception indicating a
+syntax error. The library for writing functional parsers in Isabelle can
+roughly be split up into two parts. The first part consists of a collection
+of generic parser combinators that are contained in the structure
+@{ML_struct Scan} defined in the file @{ML_file "Pure/General/scan.ML"} in
+the Isabelle sources. While these combinators do not make any assumptions
+about the concrete structure of the tokens used, the second part of the
+library consists of combinators for dealing with specific token types.
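+To make the parser type concrete, here is a minimal hand-written parser that
+recognizes one given token. This is a sketch only: it signals failure with an
+ad-hoc exception, whereas the parsers in @{ML_struct Scan} use the library's
+own failure exceptions.
+*}
+
+ML {*
+(* A parser of type string -> string list -> string * string list:
+   it consumes the first token if it equals the expected one. *)
+exception SYNTAX of string
+
+fun token (t : string) (u :: us) =
+      if t = u then (t, us) else raise SYNTAX t
+  | token t [] = raise SYNTAX t
+
+(* token "hello" ["hello", "world"] yields ("hello", ["world"]) *)
+*}
+
+text {*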
+The following is an excerpt from the signature of @{ML_struct Scan}:
+\begin{mytable}
+@{ML "|| : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b"} \\
+@{ML "-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e"} \\
+@{ML "|-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e"} \\
+@{ML "--| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e"} \\
+@{ML_open "optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a" (Scan)} \\
+@{ML_open "repeat: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" (Scan)} \\
+@{ML_open "repeat1: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" (Scan)} \\
+@{ML ">> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c"} \\
+@{ML "!! : ('a * string option -> string) -> ('a -> 'b) -> 'a -> 'b"}
+\end{mytable}
+Interestingly, the functions shown above are so generic that they do not
+even rely on the input and output of the parser being a list of tokens.
+If \texttt{p} succeeds, i.e.\ does not raise an exception, the parser
+@{ML_open "p || q" for p q} returns the result of \texttt{p}, otherwise it
+returns the result of \texttt{q}. The parser @{ML_open "p -- q" for p q}
+first parses an item of type @{ML_type "'b"} using \texttt{p}, then passes
+the remaining tokens of type @{ML_type "'c"} to \texttt{q}, which parses an
+item of type @{ML_type "'d"} and returns the remaining tokens of type
+@{ML_type "'e"}. These are finally returned together with the pair of type
+@{ML_type "'b * 'd"} containing the two parsed items. The parsers
+@{ML_open "p |-- q" for p q} and @{ML_open "p --| q" for p q} work
+analogously, except that they discard the item parsed by the first and the
+second parser, respectively.
+If \texttt{p} succeeds, the parser @{ML_open "optional p x" for p x (Scan)}
+returns the result of \texttt{p}, otherwise it returns the default value
+\texttt{x}. The parser @{ML_open "repeat p" for p (Scan)} applies \texttt{p}
+as often as it can, returning a possibly empty list of parsed items. The
+parser @{ML_open "repeat1 p" for p (Scan)} is similar, but requires
+\texttt{p} to succeed at least once. The parser
+@{ML_open "p >> f" for p f} uses \texttt{p} to parse an item of type
+@{ML_type "'b"}, to which it applies the function \texttt{f} yielding a value
+of type @{ML_type "'d"}, which is returned together with the remaining tokens
+of type @{ML_type "'c"}.
+Finally, @{ML "!!"} is used for transforming exceptions produced by parsers.
+If \texttt{p} raises an exception indicating that it cannot parse a given
+input, then an enclosing parser such as
+@{ML_open [display] "q -- p || r" for p q r}
+will try the alternative parser \texttt{r}. By writing
+@{ML_open [display] "q -- !! err p || r" for err p q r}
+instead, one can ensure that a failure of \texttt{p} aborts the whole parser.
+The @{ML "!!"} operator is similar to the \emph{cut} operator in Prolog,
+which prevents the interpreter from backtracking. The \texttt{err} function
+supplied as an argument to @{ML "!!"} can be used to produce an error message
+depending on the current state of the parser, as well as the optional error
+message returned by \texttt{p}.
+
+So far, we have only looked at combinators that construct more complex parsers
+from simpler parsers. In order for these combinators to be useful, we also need
+some basic parsers.
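+With the hand-written @{text token} parser from above we can already try out
+some of these combinators. Note that @{ML "--"} and @{ML ">>"} merely
+propagate exceptions, so our ad-hoc exception does no harm here; backtracking
+combinators such as @{ML "||"} and @{ML_open repeat (Scan)} would
+additionally require the failure exception used by @{ML_struct Scan} itself.
+*}
+
+ML {*
+(* Combining two token parsers and post-processing the result with >> : *)
+val greeting = token "hello" -- token "world" >> (fn (h, w) => h ^ " " ^ w)
+
+(* greeting ["hello", "world", "!"] yields ("hello world", ["!"]) *)
+*}
+
+text {*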
+As an example of such basic parsers, we consider the following two functions
+defined in @{ML_struct Scan}:
+\begin{mytable}
+@{ML_open "one: ('a -> bool) -> 'a list -> 'a * 'a list" (Scan)} \\
+@{ML_open "$$ : string -> string list -> string * string list"}
+\end{mytable}
+The parser @{ML_open "one pred" for pred (Scan)} parses exactly one token that
+satisfies the predicate \texttt{pred}, whereas @{ML_open "$$ s" for s} only
+accepts a token that equals the string \texttt{s}. Note that we can easily
+express @{ML_open "$$ s" for s} using @{ML_open "one" (Scan)}:
+@{ML_open [display] "one (fn s' => s' = s)" for s (Scan)}
+As an example, let us look at how we can use @{ML "$$"} and @{ML "--"} to parse
+the prefix ``\texttt{hello}'' of the character list ``\texttt{hello world}'':
+@{ML_response [display]
+"($$ \"h\" -- $$ \"e\" -- $$ \"l\" -- $$ \"l\" -- $$ \"o\")
+   [\"h\", \"e\", \"l\", \"l\", \"o\", \" \", \"w\", \"o\", \"r\", \"l\", \"d\"]"
+"(((((\"h\", \"e\"), \"l\"), \"l\"), \"o\"),
+ [\" \", \"w\", \"o\", \"r\", \"l\", \"d\"])
+: ((((string * string) * string) * string) * string) * string list"}
+Most of the time, however, we will have to deal with tokens that are not just
+strings. The parsers for the theory syntax, as well as the parsers for the
+argument syntax of proof methods and attributes, use the token type
+@{ML_type OuterParse.token}, which is identical to the type
+@{ML_type OuterLex.token}.
+The parser functions for the theory syntax are contained in the structure
+@{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}.
+In our parser, we will use the following functions:
+\begin{mytable}
+@{ML_open "$$$ : string -> token list -> string * token list" (OuterParse)} \\
+@{ML_open "enum1: string -> (token list -> 'a * token list) -> token list ->
+  'a list * token list" (OuterParse)} \\
+@{ML_open "prop: token list -> string * token list" (OuterParse)} \\
+@{ML_open "opt_target: token list -> string option * token list" (OuterParse)} \\
+@{ML_open "fixes: token list ->
+  (Name.binding * string option * mixfix) list * token list" (OuterParse)} \\
+@{ML_open "for_fixes: token list ->
+  (Name.binding * string option * mixfix) list * token list" (OuterParse)} \\
+@{ML_open "!!! : (token list -> 'a) -> token list -> 'a" (OuterParse)}
+\end{mytable}
+The parsers @{ML_open "$$$" (OuterParse)} and @{ML_open "!!!" (OuterParse)} are
+defined using the parsers @{ML_open "one" (Scan)} and @{ML "!!"} from
+@{ML_struct Scan}.
+The parser @{ML_open "enum1 s p" for s p (OuterParse)} parses a non-empty
+list of items recognized by the parser \texttt{p}, where the items are
+separated by \texttt{s}.
+A proposition can be parsed using the function @{ML_open prop (OuterParse)}.
+Essentially, a proposition is just a string or an identifier, but using the
+specific parser function @{ML_open prop (OuterParse)} leads to more instructive
+error messages, since the parser will complain that a proposition was expected
+when something other than a string or identifier is found.
+An optional locale target specification of the form \isa{(\isacommand{in}\ $\ldots$)}
+can be parsed using @{ML_open opt_target (OuterParse)}.
+The lists of names of the predicates and parameters, together with optional
+types and syntax, are parsed using the functions @{ML_open "fixes" (OuterParse)}
+and @{ML_open for_fixes (OuterParse)}, respectively.
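+For instance, a @{text "|"}-separated list of propositions can be parsed as
+follows. This is a sketch: we assume that @{ML_open scan (OuterSyntax)} lexes
+a string into a list of outer tokens (its exact signature differs between
+Isabelle versions), and we terminate the input with @{ML_open finite (Scan)}
+and the standard token stopper, so that the parser does not demand further
+input at the end of the token list.
+*}
+
+ML {*
+(* Lex a string into outer tokens and parse it as a |-separated list of
+   propositions; props should evaluate to ["P x", "Q x"]. *)
+val toks = OuterSyntax.scan "\"P x\" | \"Q x\""
+val (props, _) =
+  Scan.finite OuterLex.stopper (OuterParse.enum1 "|" OuterParse.prop) toks
+*}
+
+text {*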
+In addition, the following function from @{ML_struct SpecParse} for parsing
+an optional theorem name and attribute, followed by a delimiter, will be useful:
+\begin{mytable}
+@{ML_open "opt_thm_name:
+  string -> token list -> Attrib.binding * token list" (SpecParse)}
+\end{mytable}
+We now have all the necessary tools to write the parser for our
+\isa{\isacommand{simple{\isacharunderscore}inductive}} command:
+@{ML_chunk [display] syntax}
+The definition of the parser \verb!ind_decl! closely follows the railroad
+diagram shown above. In order to make the code more readable, the structures
+@{ML_struct OuterParse} and @{ML_struct OuterKeyword} are abbreviated by
+\texttt{P} and \texttt{K}, respectively. Note how the parser combinator
+@{ML_open "!!!" (OuterParse)} is used: once the keyword \texttt{where}
+has been parsed, a non-empty list of introduction rules must follow.
+Had we not used the combinator @{ML_open "!!!" (OuterParse)}, a
+\texttt{where} not followed by a list of rules would have caused the parser
+to respond with the somewhat misleading error message
+\begin{verbatim}
+  Outer syntax error: end of input expected, but keyword where was found
+\end{verbatim}
+rather than with the more instructive message
+\begin{verbatim}
+  Outer syntax error: proposition expected, but terminator was found
+\end{verbatim}
+Once all arguments of the command have been parsed, we apply the function
+@{ML_open add_inductive (SimpleInductivePackage)}, which yields a local theory
+transformer of type @{ML_type "local_theory -> local_theory"}. Commands in
+Isabelle/Isar are realized by transition transformers of type
+@{ML_type [display] "Toplevel.transition -> Toplevel.transition"}
+We can turn a local theory transformer into a transition transformer by using
+the function
+@{ML [display] "Toplevel.local_theory : string option ->
+  (local_theory -> local_theory) ->
+  Toplevel.transition -> Toplevel.transition"}
+which, apart from the local theory transformer, takes an optional name of a locale
+to be used as a basis for the local theory. The whole parser for our command has type
+@{ML_type [display] "OuterLex.token list ->
+  (Toplevel.transition -> Toplevel.transition) * OuterLex.token list"}
+which is abbreviated by @{ML_type OuterSyntax.parser_fn}. The new command can be added
+to the system via the function
+@{ML [display] "OuterSyntax.command :
+  string -> string -> OuterKeyword.T -> OuterSyntax.parser_fn -> unit"}
+which imperatively updates the parser table behind the scenes. In addition to the parser, this
+function takes two strings representing the name of the command and a short description,
+as well as an element of type @{ML_type OuterKeyword.T} describing which \emph{kind} of
+command we intend to add. Since we want to add a command for declaring new concepts,
+we choose the kind @{ML "OuterKeyword.thy_decl"}. Other kinds include
+@{ML "OuterKeyword.thy_goal"}, which is similar to @{ML_open thy_decl (OuterKeyword)},
+but requires the user to prove a goal before making the declaration, or
+@{ML "OuterKeyword.diag"}, which corresponds to a purely diagnostic command that does
+not change the context. For example, the @{ML_open thy_goal (OuterKeyword)} kind is used
+by the \isa{\isacommand{function}} command \cite{Krauss-IJCAR06}, which requires the user
+to prove that a given set of equations is non-overlapping and covers all cases.
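+To see these pieces in isolation, here is a sketch of how a trivial
+diagnostic command could be registered. The command name @{text "hello"} is
+hypothetical; we assume the function @{ML_open imperative (Toplevel)} for
+building a transition transformer from a side-effecting function, and, as
+with all outer keywords, the command name would typically also have to be
+declared in a keyword file for Proof General.
+*}
+
+ML {*
+(* Register a diagnostic command that merely prints a message.
+   Scan.succeed consumes no tokens and returns the given transformer. *)
+val _ = OuterSyntax.command "hello" "print a greeting" OuterKeyword.diag
+  (Scan.succeed (Toplevel.imperative (fn () => writeln "Hello!")))
+*}
+
+text {*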
+The kind of the command should be chosen with care, since selecting the wrong
+one can cause strange behaviour of the user interface, such as failure of the
+undo mechanism.
+*}
+
+(*<*)
+end
+(*>*)