diff -r 039845fc96bd -r c9ff326e3ce5 CookBook/Package/Ind_Interface.thy --- a/CookBook/Package/Ind_Interface.thy Fri Feb 13 14:15:28 2009 +0000 +++ b/CookBook/Package/Ind_Interface.thy Sat Feb 14 00:11:50 2009 +0000 @@ -4,55 +4,174 @@ section {* The Interface \label{sec:ind-interface} *} +text {* + The purpose of the package we show next is that the user just specifies the + inductive predicate by stating some introduction rules and then the packages + makes the equivalent definition and derives from it useful properties. + To be able to write down the specification in Isabelle, we have to introduce + a new command (see Section~\ref{sec:newcommand}). As the keyword for the new + command we use \simpleinductive{}. The specifications corresponding to our + examples described earlier are: +*} + +simple_inductive + trcl :: "('a \ 'a \ bool) \ 'a \ 'a \ bool" +where + base: "trcl R x x" +| step: "trcl R x y \ R y z \ trcl R x z" + +simple_inductive + even and odd +where + even0: "even 0" +| evenS: "odd n \ even (Suc n)" +| oddS: "even n \ odd (Suc n)" + +simple_inductive + accpart :: "('a \ 'a \ bool) \ 'a \ bool" +where + accpartI: "(\y. R y x \ accpart R y) \ accpart R x" + text {* + We expect a constant (or constants) with possible typing annotations and a + list of introduction rules. While these specifications are all + straightforward, there is a technicality we like to deal with to do with + fixed parameters and locales. Remember we pointed out that the parameter + @{text R} is fixed throughout the specifications of @{text trcl} and @{text + accpart}. The point is that they might be fixed in a locale and we like to + support this. Accordingly we treat some parameters of the inductive + definition specially; see Figure~\ref{fig:inddefsfixed} where the transitive + closure and accessible part are defined with a fixed parameter @{text R} and + also inside a locale fixing @{text R}. +*} + +text_raw {* +\begin{figure}[p] +\begin{isabelle} +*} +simple_inductive + trcl' for R :: "'a \ 'a \ bool" +where + base: "trcl' R x x" +| step: "trcl' R x y \ R y z \ trcl' R x z" + +simple_inductive + accpart' for R :: "'a \ 'a \ bool" +where + accpartI: "(\y. R y x \ accpart' R y) \ accpart' R x" + +locale rel = + fixes R :: "'a \ 'a \ bool" + +simple_inductive (in rel) trcl'' +where + base: "trcl'' x x" +| step: "trcl'' x y \ R y z \ trcl'' x z" + +simple_inductive (in rel) accpart'' +where + accpartI: "(\y. R y x \ accpart'' y) \ accpart'' x" +text_raw {* +\end{isabelle} +\caption{The first definition is for the transitive closure where the +relation @{text R} is explicitly fixed. Simiraly the second definition +of the accessible part of the relation @{text R}. The last two definitions +specify the same inductive predicates, but this time defined inside +a locale.\label{fig:inddefsfixed}} +\end{figure} +*} + +text {* +\begin{figure}[p] +\begin{isabelle} + \railnontermfont{\rmfamily\itshape} + \railterm{simpleinductive,where,for} + \railalias{simpleinductive}{\simpleinductive{}} + \railalias{where}{\isacommand{where}} + \railalias{for}{\isacommand{for}} + \begin{rail} + simpleinductive target? fixes (for fixes)? \\ + (where (thmdecl? prop + '|'))? + ; + \end{rail} +\end{isabelle} +\caption{A railroad diagram describing the syntax of \simpleinductive{}. +The \emph{target} indicates an optional locale; the \emph{fixes} are an +\isacommand{and}-separated list of names for the inductive predicates (they +can also contain typing- and syntax anotations); similarly the \emph{fixes} +after \isacommand{for} to indicate fixed parameters; \emph{prop} stands for a +introduction rule with an optional theorem declaration (\emph{thmdecl}). +\label{fig:railroad}} +\end{figure} +*} + +text {* + The syntax of the \simpleinductive{} command can be described by the + railroad diagram in Figure~\ref{fig:railroad}. This diagram more or less + translates directly into the parser + + @{ML_chunk [display,gray] parser} + + which we also described in Section~\ref{sec:parsingspecs} In order to add a new inductive predicate to a theory with the help of our package, the user must \emph{invoke} it. For every package, there are essentially two different ways of invoking it, which we will refer to as \emph{external} and \emph{internal}. By external invocation we mean that the - package is called from within a theory document. In this case, the type of - the inductive predicate, as well as its introduction rules, are given as - strings by the user. Before the package can actually make the definition, - the type and introduction rules have to be parsed. In contrast, internal - invocation means that the package is called by some other package. For - example, the function definition package \cite{Krauss-IJCAR06} calls the - inductive definition package to define the graph of the function. However, - it is not a good idea for the function definition package to pass the - introduction rules for the function graph to the inductive definition - package as strings. In this case, it is better to directly pass the rules to - the package as a list of terms, which is more robust than handling strings - that are lacking the additional structure of terms. These two ways of - invoking the package are reflected in its ML programming interface, which - consists of two functions: + package is called from within a theory document. In this case, the + specification of the inductive predicate, including type annotations and + introduction rules, are given as strings by the user. Before the package can + actually make the definition, the type and introduction rules have to be + parsed. In contrast, internal invocation means that the package is called by + some other package. For example, the function definition package + \cite{Krauss-IJCAR06} calls the inductive definition package to define the + graph of the function. However, it is not a good idea for the function + definition package to pass the introduction rules for the function graph to + the inductive definition package as strings. In this case, it is better to + directly pass the rules to the package as a list of terms, which is more + robust than handling strings that are lacking the additional structure of + terms. These two ways of invoking the package are reflected in its ML + programming interface, which consists of two functions: + @{ML_chunk [display,gray] SIMPLE_INDUCTIVE_PACKAGE} *} text {* + (FIXME: explain Binding.binding; mixfix; Attrib.src; Attrib.src somewhere else) + + The function for external invocation of the package is called @{ML add_inductive in SimpleInductivePackage}, whereas the one for internal invocation is called @{ML add_inductive_i in SimpleInductivePackage}. Both of these functions take as arguments the names and types of the inductive predicates, the names and types of their parameters, the actual introduction rules and a \emph{local theory}. They return a local theory containing the - definition, together with a tuple containing the introduction and induction - rules, which are stored in the local theory, too. In contrast to an - ordinary theory, which simply consists of a type signature, as well as - tables for constants, axioms and theorems, a local theory also contains - additional context information, such as locally fixed variables and local - assumptions that may be used by the package. The type @{ML_type - local_theory} is identical to the type of \emph{proof contexts} @{ML_type - "Proof.context"}, although not every proof context constitutes a valid local - theory. Note that @{ML add_inductive_i in SimpleInductivePackage} expects + definition and the induction principle as well introduction rules. + + In contrast to an ordinary theory, which simply consists of a type + signature, as well as tables for constants, axioms and theorems, a local + theory also contains additional context information, such as locally fixed + variables and local assumptions that may be used by the package. The type + @{ML_type local_theory} is identical to the type of \emph{proof contexts} + @{ML_type "Proof.context"}, although not every proof context constitutes a + valid local theory. + + + Note that @{ML add_inductive_i in SimpleInductivePackage} expects the types of the predicates and parameters to be specified using the datatype @{ML_type typ} of Isabelle's logical framework, whereas @{ML add_inductive in SimpleInductivePackage} expects them to be given as optional strings. If no string is given for a particular predicate or parameter, this means that the type should be inferred by the - package. Additional \emph{mixfix syntax} may be associated with the + package. + + + Additional \emph{mixfix syntax} may be associated with the predicates and parameters as well. Note that @{ML add_inductive_i in SimpleInductivePackage} does not allow mixfix syntax to be associated with - parameters, since it can only be used for parsing. The names of the + parameters, since it can only be used for parsing.\footnote{FIXME: why ist it there then?} + The names of the predicates, parameters and rules are represented by the type @{ML_type Binding.binding}. Strings can be turned into elements of the type @{ML_type Binding.binding} using the function @{ML [display] "Binding.name : string -> @@ -162,237 +281,8 @@ The definition of the transitive closure should look as follows: *} -simple_inductive - trcl for r :: "'a \ 'a \ bool" -where - base: "trcl r x x" -| step: "trcl r x y \ r y z \ trcl r x z" -(*<*) -thm trcl_def -thm trcl.induct -thm base -thm step -thm trcl.intros - -lemma trcl_strong_induct: - assumes trcl: "trcl r x y" - and I1: "\x. P x x" - and I2: "\x y z. P x y \ trcl r x y \ r y z \ P x z" - shows "P x y" -proof - - from trcl - have "P x y \ trcl r x y" - proof induct - case (base x) - from I1 and trcl.base show ?case .. - next - case (step x y z) - then have "P x y" and "trcl r x y" by simp_all - from `P x y` `trcl r x y` `r y z` have "P x z" - by (rule I2) - moreover from `trcl r x y` `r y z` have "trcl r x z" - by (rule trcl.step) - ultimately show ?case .. - qed - then show ?thesis .. -qed -(*>*) - -text {* Even and odd numbers can be defined by *} - -simple_inductive - even and odd -where - even0: "even 0" -| evenS: "odd n \ even (Suc n)" -| oddS: "even n \ odd (Suc n)" -(*<*) -thm even_def odd_def -thm even.induct odd.induct -thm even0 -thm evenS -thm oddS -thm even_odd.intros -(*>*) - -text {* The accessible part of a relation can be introduced as follows: *} - -simple_inductive - accpart for r :: "'a \ 'a \ bool" -where - accpartI: "(\y. r y x \ accpart r y) \ accpart r x" -(*<*) -thm accpart_def -thm accpart.induct -thm accpartI -(*>*) - -text {* - Moreover, it should also be possible to define the accessible part - inside a locale fixing the relation @{text r}: -*} - -locale rel = - fixes r :: "'a \ 'a \ bool" - -simple_inductive (in rel) accpart' -where - accpartI': "\x. (\y. r y x \ accpart' y) \ accpart' x" -(*<*) -context rel -begin - -thm accpartI' -thm accpart'.induct - -end - -thm rel.accpartI' -thm rel.accpart'.induct - -(*>*) - text {* - In this context, it is important to note that Isabelle distinguishes - between \emph{outer} and \emph{inner} syntax. Theory commands such as - \isa{\isacommand{simple{\isacharunderscore}inductive} $\ldots$ \isacommand{for} $\ldots$ \isacommand{where} $\ldots$} - belong to the outer syntax, whereas items in quotation marks, in particular - terms such as @{text [source] "trcl r x x"} and types such as - @{text [source] "'a \ 'a \ bool"} belong to the inner syntax. - Separating the two layers of outer and inner syntax greatly simplifies - matters, because the parser for terms and types does not have to know - anything about the possible syntax of theory commands, and the parser - for theory commands need not be concerned about the syntactic structure - of terms and types. - - \medskip - \noindent - The syntax of the \isa{\isacommand{simple{\isacharunderscore}inductive}} command - can be described by the following railroad diagram: - \begin{rail} - 'simple\_inductive' target? fixes ('for' fixes)? \\ - ('where' (thmdecl? prop + '|'))? - ; - \end{rail} - - \paragraph{Functional parsers} - - For parsing terms and types, Isabelle uses a rather general and sophisticated - algorithm due to Earley, which is driven by \emph{priority grammars}. - In contrast, parsers for theory syntax are built up using a set of combinators. - Functional parsing using combinators is a well-established technique, which - has been described by many authors, including Paulson \cite{paulson-ML-91} - and Wadler \cite{Wadler-AFP95}. - The central idea is that a parser is a function of type @{ML_type "'a list -> 'b * 'a list"}, - where @{ML_type "'a"} is a type of \emph{tokens}, and @{ML_type "'b"} is a type for - encoding items that the parser has recognized. When a parser is applied to a - list of tokens whose prefix it can recognize, it returns an encoding of the - prefix as an element of type @{ML_type "'b"}, together with the suffix of the list - containing the remaining tokens. Otherwise, the parser raises an exception - indicating a syntax error. The library for writing functional parsers in - Isabelle can roughly be split up into two parts. The first part consists of a - collection of generic parser combinators that are contained in the structure - @{ML_struct Scan} defined in the file @{ML_file "Pure/General/scan.ML"} in the Isabelle - sources. While these combinators do not make any assumptions about the concrete - structure of the tokens used, the second part of the library consists of combinators - for dealing with specific token types. - The following is an excerpt from the signature of @{ML_struct Scan}: - - \begin{table} - @{ML "|| : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b"} \\ - @{ML "-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e"} \\ - @{ML "|-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e"} \\ - @{ML "--| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e"} \\ - @{ML "optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a" in Scan} \\ - @{ML "repeat: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\ - @{ML "repeat1: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\ - @{ML ">> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c"} \\ - @{ML "!! : ('a * string option -> string) -> ('a -> 'b) -> 'a -> 'b"} - \end{table} - - Interestingly, the functions shown above are so generic that they do not - even rely on the input and output of the parser being a list of tokens. - If \texttt{p} succeeds, i.e.\ does not raise an exception, the parser - @{ML "p || q" for p q} returns the result of \texttt{p}, otherwise it returns - the result of \texttt{q}. The parser @{ML "p -- q" for p q} first parses an - item of type @{ML_type "'b"} using \texttt{p}, then passes the remaining tokens - of type @{ML_type "'c"} to \texttt{q}, which parses an item of type @{ML_type "'d"} - and returns the remaining tokens of type @{ML_type "'e"}, which are finally - returned together with a pair of type @{ML_type "'b * 'd"} containing the two - parsed items. The parsers @{ML "p |-- q" for p q} and @{ML "p --| q" for p q} - work in a similar way as the previous one, with the difference that they - discard the item parsed by the first and the second parser, respectively. - If \texttt{p} succeeds, the parser @{ML "optional p x" for p x in Scan} returns the result - of \texttt{p}, otherwise it returns the default value \texttt{x}. The parser - @{ML "repeat p" for p in Scan} applies \texttt{p} as often as it can, returning a possibly - empty list of parsed items. The parser @{ML "repeat1 p" for p in Scan} is similar, - but requires \texttt{p} to succeed at least once. The parser - @{ML "p >> f" for p f} uses \texttt{p} to parse an item of type @{ML_type "'b"}, to which - it applies the function \texttt{f} yielding a value of type @{ML_type "'d"}, which - is returned together with the remaining tokens of type @{ML_type "'c"}. - Finally, @{ML "!!"} is used for transforming exceptions produced by parsers. - If \texttt{p} raises an exception indicating that it cannot parse a given input, - then an enclosing parser such as - @{ML [display] "q -- p || r" for p q r} - will try the alternative parser \texttt{r}. By writing - @{ML [display] "q -- !! err p || r" for err p q r} - instead, one can achieve that a failure of \texttt{p} causes the whole parser to abort. - The @{ML "!!"} operator is similar to the \emph{cut} operator in Prolog, which prevents - the interpreter from backtracking. The \texttt{err} function supplied as an argument - to @{ML "!!"} can be used to produce an error message depending on the current - state of the parser, as well as the optional error message returned by \texttt{p}. - - So far, we have only looked at combinators that construct more complex parsers - from simpler parsers. In order for these combinators to be useful, we also need - some basic parsers. As an example, we consider the following two parsers - defined in @{ML_struct Scan}: - - \begin{table} - @{ML "one: ('a -> bool) -> 'a list -> 'a * 'a list" in Scan} \\ - @{ML "$$ : string -> string list -> string * string list"} - \end{table} - - The parser @{ML "one pred" for pred in Scan} parses exactly one token that - satisfies the predicate \texttt{pred}, whereas @{ML "$$ s" for s} only - accepts a token that equals the string \texttt{s}. Note that we can easily - express @{ML "$$ s" for s} using @{ML "one" in Scan}: - @{ML [display] "one (fn s' => s' = s)" for s in Scan} - As an example, let us look at how we can use @{ML "$$"} and @{ML "--"} to parse - the prefix ``\texttt{hello}'' of the character list ``\texttt{hello world}'': - - @{ML_response [display] - "($$ \"h\" -- $$ \"e\" -- $$ \"l\" -- $$ \"l\" -- $$ \"o\") - [\"h\", \"e\", \"l\", \"l\", \"o\", \" \", \"w\", \"o\", \"r\", \"l\", \"d\"]" - "(((((\"h\", \"e\"), \"l\"), \"l\"), \"o\"), [\" \", \"w\", \"o\", \"r\", \"l\", \"d\"]) - : ((((string * string) * string) * string) * string) * string list"} - - Most of the time, however, we will have to deal with tokens that are not just strings. - The parsers for the theory syntax, as well as the parsers for the argument syntax - of proof methods and attributes use the token type @{ML_type OuterParse.token}, - which is identical to @{ML_type OuterLex.token}. - The parser functions for the theory syntax are contained in the structure - @{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}. - In our parser, we will use the following functions: - - \begin{table} - @{ML "$$$ : string -> token list -> string * token list" in OuterParse} \\ - @{ML "enum1: string -> (token list -> 'a * token list) -> token list -> - 'a list * token list" in OuterParse} \\ - @{ML "prop: token list -> string * token list" in OuterParse} \\ - @{ML "opt_target: token list -> string option * token list" in OuterParse} \\ - @{ML "fixes: token list -> - (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\ - @{ML "for_fixes: token list -> - (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\ - @{ML "!!! : (token list -> 'a) -> token list -> 'a" in OuterParse} - \end{table} - - The parsers @{ML "$$$" in OuterParse} and @{ML "!!!" in OuterParse} are - defined using the parsers @{ML "one" in Scan} and @{ML "!!"} from - @{ML_struct Scan}. - The parser @{ML "enum1 s p" for s p in OuterParse} parses a non-emtpy list of items - recognized by the parser \texttt{p}, where the items are separated by \texttt{s}. A proposition can be parsed using the function @{ML prop in OuterParse}. Essentially, a proposition is just a string or an identifier, but using the specific parser function @{ML prop in OuterParse} leads to more instructive @@ -414,7 +304,7 @@ We now have all the necessary tools to write the parser for our \isa{\isacommand{simple{\isacharunderscore}inductive}} command: - @{ML_chunk [display] syntax} + The definition of the parser \verb!ind_decl! closely follows the railroad diagram shown above. In order to make the code more readable, the structures