diff -r 53460ac408b5 -r 5bb2d29553c2 CookBook/Package/Ind_Interface.thy
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/CookBook/Package/Ind_Interface.thy	Fri Oct 10 17:13:21 2008 +0200
@@ -0,0 +1,449 @@
+theory Ind_Interface
+imports Base Simple_Inductive_Package
+begin
+
+(*<*)
+ML {*
+structure SIP = SimpleInductivePackage
+*}
+(*>*)
+
+section {* The interface *}
+
+text {*
+\label{sec:ind-interface}
+In order to add a new inductive predicate to a theory with the help of our
+package, the user must \emph{invoke} it. For every package, there are
+essentially two different ways of invoking it, which we will refer to as
+\emph{external} and \emph{internal}. By external invocation we mean that the
+package is called from within a theory document. In this case, the type of the
+inductive predicate, as well as its introduction rules, are given as strings
+by the user. Before the package can actually make the definition, the type and
+introduction rules have to be parsed. In contrast, internal invocation means
+that the package is called by some other package. For example, the function
+definition package \cite{Krauss-IJCAR06} calls the inductive definition
+package to define the graph of the function. However, it is not a good idea
+for the function definition package to pass the introduction rules for the
+function graph to the inductive definition package as strings. In this case,
+it is better to pass the rules directly to the package as a list of terms,
+which is more robust than handling strings that lack the additional structure
+of terms. These two ways of invoking the package are reflected in its ML
+programming interface, which consists of two functions:
+@{ML_chunk [display] SIMPLE_INDUCTIVE_PACKAGE}
+The function for external invocation of the package is called
+@{ML_open add_inductive (SIP)}, whereas the one for internal invocation is
+called @{ML_open add_inductive_i (SIP)}. Both of these functions take as
+arguments the names and types of the inductive predicates, the names and types
+of their parameters, the actual introduction rules and a \emph{local theory}.
+They return a local theory containing the definition, together with a tuple
+containing the introduction and induction rules, which are stored in the
+local theory, too. In contrast to an ordinary theory, which simply consists
+of a type signature, as well as tables for constants, axioms and theorems, a
+local theory also contains additional context information, such as locally
+fixed variables and local assumptions that may be used by the package. The
+type @{ML_type local_theory} is identical to the type of \emph{proof
+contexts} @{ML_type "Proof.context"}, although not every proof context
+constitutes a valid local theory.
+Note that @{ML_open add_inductive_i (SIP)} expects the types of the
+predicates and parameters to be specified using the datatype @{ML_type typ}
+of Isabelle's logical framework, whereas @{ML_open add_inductive (SIP)}
+expects them to be given as optional strings. If no string is given for a
+particular predicate or parameter, the type is inferred by the package.
+Additional \emph{mixfix syntax} may be associated with the predicates and
+parameters as well. Note that @{ML_open add_inductive_i (SIP)} does not allow
+mixfix syntax to be associated with parameters, since mixfix syntax for
+parameters can only be used for parsing, which happens only on external
+invocation. The names of the predicates, parameters and rules are
+represented by the type @{ML_type Name.binding}. Strings can be turned into
+elements of the type @{ML_type Name.binding} using the function
+@{ML [display] "Name.binding : string -> Name.binding"}
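+For example, bindings for the predicate and rule names used in the examples
+further below can be created as follows (a trivial sketch; in practice the
+bindings are produced by the parser):
+*}
+
+ML {*
+(* Plain strings wrapped up as bindings: *)
+val trcl_binding = Name.binding "trcl"
+val base_binding = Name.binding "base"
+*}
+
+text {*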
+Each introduction rule is given as a tuple containing its name, a list of
+\emph{attributes} and a logical formula. Note that the type
+@{ML_type Attrib.binding} used in the list of introduction rules is just a
+shorthand for the type @{ML_type "Name.binding * Attrib.src list"}. The
+function @{ML_open add_inductive_i (SIP)} expects the formula to be specified
+using the datatype @{ML_type term}, whereas @{ML_open add_inductive (SIP)}
+expects it to be given as a string. An attribute specifies additional actions
+and transformations that should be applied to a theorem, such as storing it
+in the rule databases used by automatic tactics like the simplifier. The code
+of the package, which will be described in the following section, will mostly
+treat attributes as a black box and just forward them to other functions for
+storing theorems in local theories.
+The implementation of the function @{ML_open add_inductive (SIP)} for
+external invocation of the package is quite simple. Essentially, it just
+parses the introduction rules and then passes them on to
+@{ML_open add_inductive_i (SIP)}:
+@{ML_chunk [display] add_inductive}
+For parsing and type checking the introduction rules, we use the function
+@{ML_open [display] "Specification.read_specification:
+  (Name.binding * string option * mixfix) list ->  (*{variables}*)
+  (Attrib.binding * string list) list list ->  (*{rules}*)
+  local_theory ->
+  (((Name.binding * typ) * mixfix) list *
+   (Attrib.binding * term list) list) *
+  local_theory"}
+During parsing, both predicates and parameters are treated as variables, so
+the lists \verb!preds_syn! and \verb!params_syn! are just appended
+before being passed to @{ML_open read_specification (Specification)}. Note
+that the format for rules supported by
+@{ML_open read_specification (Specification)} is more general than what is
+required for our package. It allows several rules to be associated with one
+name, and the list of rules can be partitioned into several sublists. In
+order for the list \verb!intro_srcs! of introduction rules to be acceptable
+as an input for @{ML_open read_specification (Specification)}, we first have
+to turn it into a list of singleton lists. This transformation has to be
+reversed later on by applying the function
+@{ML [display] "the_single: 'a list -> 'a"}
+to the list \verb!specs! containing the parsed introduction rules.
+The function @{ML_open read_specification (Specification)} also returns the
+list \verb!vars! of predicates and parameters that contains the inferred
+types as well. This list has to be chopped into the two lists
+\verb!preds_syn'! and \verb!params_syn'! for predicates and parameters,
+respectively. All variables occurring in a rule but not in the list of
+variables passed to @{ML_open read_specification (Specification)} will be
+bound by a meta-level universal quantifier.
+Finally, @{ML_open read_specification (Specification)} also returns another
+local theory, but we can safely discard it.
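+The function @{ML the_single} mentioned above and the function @{ML chop}
+used for splitting the variable list behave as follows (a quick illustration
+with arbitrary values):
+*}
+
+ML {*
+(* the_single extracts the element of a singleton list and raises an
+   exception for lists of any other length; chop splits a list at a
+   given position. *)
+val x = the_single [3]              (* yields 3 *)
+val (ys, zs) = chop 2 [1, 2, 3, 4]  (* yields ([1, 2], [3, 4]) *)
+*}
+
+text {*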
+As an example, let us look at how we can use
+@{ML_open read_specification (Specification)} to parse the introduction rules
+of the @{text trcl} predicate:
+@{ML_response [display]
+"Specification.read_specification
+  [(Name.binding \"trcl\", NONE, NoSyn),
+   (Name.binding \"r\", SOME \"'a \<Rightarrow> 'a \<Rightarrow> bool\", NoSyn)]
+  [[((Name.binding \"base\", []), [\"trcl r x x\"])],
+   [((Name.binding \"step\", []), [\"trcl r x y \<Longrightarrow> r y z \<Longrightarrow> trcl r x z\"])]]
+  @{context}"
+"((\<dots>,
+  [(\<dots>,
+    [Const (\"all\", \<dots>) $ Abs (\"x\", TFree (\"'a\", \<dots>),
+       Const (\"Trueprop\", \<dots>) $
+         (Free (\"trcl\", \<dots>) $ Free (\"r\", \<dots>) $ Bound 0 $ Bound 0))]),
+   (\<dots>,
+    [Const (\"all\", \<dots>) $ Abs (\"x\", TFree (\"'a\", \<dots>),
+       Const (\"all\", \<dots>) $ Abs (\"y\", TFree (\"'a\", \<dots>),
+         Const (\"all\", \<dots>) $ Abs (\"z\", TFree (\"'a\", \<dots>),
+           Const (\"==>\", \<dots>) $
+             (Const (\"Trueprop\", \<dots>) $
+                (Free (\"trcl\", \<dots>) $ Free (\"r\", \<dots>) $ Bound 2 $ Bound 1)) $
+             (Const (\"==>\", \<dots>) $ \<dots> $ \<dots>))))])]),
+ \<dots>)
+: (((Name.binding * typ) * mixfix) list *
+   (Attrib.binding * term list) list) * local_theory"}
+In the list of variables passed to
+@{ML_open read_specification (Specification)}, we have used the mixfix
+annotation @{ML NoSyn} to indicate that we do not want to associate any
+mixfix syntax with the variable. Moreover, we have only specified the type of
+\texttt{r}, whereas the type of \texttt{trcl} is computed using type
+inference. The local variables \texttt{x}, \texttt{y} and \texttt{z} of the
+introduction rules are turned into bound variables represented by de Bruijn
+indices, whereas \texttt{trcl} and \texttt{r} remain free variables.
+
+\paragraph{Parsers for theory syntax}
+
+Although the function @{ML_open add_inductive (SIP)} parses terms and types,
+it still cannot be used to invoke the package directly from within a theory
+document. In order to do this, we have to write another parser. Before we
+describe the process of writing parsers for theory syntax in more detail, we
+first show some examples of how we would like to use the inductive definition
+package.
+
+\noindent
+The definition of the transitive closure should look as follows:
+*}
+
+simple_inductive
+  trcl for r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+where
+  base: "trcl r x x"
+| step: "trcl r x y \<Longrightarrow> r y z \<Longrightarrow> trcl r x z"
+(*<*)
+thm trcl_def
+thm trcl.induct
+thm base
+thm step
+thm trcl.intros
+
+lemma trcl_strong_induct:
+  assumes trcl: "trcl r x y"
+  and I1: "\<And>x. P x x"
+  and I2: "\<And>x y z. P x y \<Longrightarrow> trcl r x y \<Longrightarrow> r y z \<Longrightarrow> P x z"
+  shows "P x y"
+proof -
+  from trcl
+  have "P x y \<and> trcl r x y"
+  proof induct
+    case (base x)
+    from I1 and trcl.base show ?case ..
+  next
+    case (step x y z)
+    then have "P x y" and "trcl r x y" by simp_all
+    from `P x y` `trcl r x y` `r y z` have "P x z"
+      by (rule I2)
+    moreover from `trcl r x y` `r y z` have "trcl r x z"
+      by (rule trcl.step)
+    ultimately show ?case ..
+  qed
+  then show ?thesis ..
+qed
+(*>*)
+
+text {*
+\noindent
+Even and odd numbers can be defined by
+*}
+
+simple_inductive
+  even and odd
+where
+  even0: "even 0"
+| evenS: "odd n \<Longrightarrow> even (Suc n)"
+| oddS: "even n \<Longrightarrow> odd (Suc n)"
+(*<*)
+thm even_def odd_def
+thm even.induct odd.induct
+thm even0
+thm evenS
+thm oddS
+thm even_odd.intros
+(*>*)
+
+text {*
+\noindent
+The accessible part of a relation can be introduced as follows:
+*}
+
+simple_inductive
+  accpart for r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+where
+  accpartI: "(\<And>y. r y x \<Longrightarrow> accpart r y) \<Longrightarrow> accpart r x"
+(*<*)
+thm accpart_def
+thm accpart.induct
+thm accpartI
+(*>*)
+
+text {*
+\noindent
+Moreover, it should also be possible to define the accessible part
+inside a locale fixing the relation @{text r}:
+*}
+
+locale rel =
+  fixes r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
+
+simple_inductive (in rel) accpart'
+where
+  accpartI': "\<And>x. (\<And>y. r y x \<Longrightarrow> accpart' y) \<Longrightarrow> accpart' x"
+(*<*)
+context rel
+begin
+
+thm accpartI'
+thm accpart'.induct
+
+end
+
+thm rel.accpartI'
+thm rel.accpart'.induct
+
+ML {*
+val (result, lthy) = SimpleInductivePackage.add_inductive
+  [(Name.binding "trcl'", NONE, NoSyn)]
+  [(Name.binding "r", SOME "'a \<Rightarrow> 'a \<Rightarrow> bool", NoSyn)]
+  [((Name.binding "base", []), "\<And>x. trcl' r x x"),
+   ((Name.binding "step", []), "\<And>x y z. trcl' r x y \<Longrightarrow> r y z \<Longrightarrow> trcl' r x z")]
+  (TheoryTarget.init NONE @{theory})
+*}
+(*>*)
+
+text {*
+\noindent
+In this context, it is important to note that Isabelle distinguishes
+between \emph{outer} and \emph{inner} syntax. Theory commands such as
+\isa{\isacommand{simple{\isacharunderscore}inductive} $\ldots$ \isacommand{for} $\ldots$ \isacommand{where} $\ldots$}
+belong to the outer syntax, whereas items in quotation marks, in particular
+terms such as @{text [source] "trcl r x x"} and types such as
+@{text [source] "'a \<Rightarrow> 'a \<Rightarrow> bool"} belong to the inner syntax.
+Separating the two layers of outer and inner syntax greatly simplifies
+matters, because the parser for terms and types does not have to know
+anything about the possible syntax of theory commands, and the parser
+for theory commands need not be concerned with the syntactic structure
+of terms and types.
+
+\medskip
+\noindent
+The syntax of the \isa{\isacommand{simple{\isacharunderscore}inductive}} command
+can be described by the following railroad diagram:
+\begin{rail}
+  'simple\_inductive' target? fixes ('for' fixes)? \\
+  ('where' (thmdecl? prop + '|'))?
+  ;
+\end{rail}
+
+\paragraph{Functional parsers}
+
+For parsing terms and types, Isabelle uses a rather general and sophisticated
+algorithm due to Earley, which is driven by \emph{priority grammars}.
+In contrast, parsers for theory syntax are built up using a set of combinators.
+Functional parsing using combinators is a well-established technique, which
+has been described by many authors, including Paulson \cite{paulson-ML-91}
+and Wadler \cite{Wadler-AFP95}.
+The central idea is that a parser is a function of type
+@{ML_type "'a list -> 'b * 'a list"}, where @{ML_type "'a"} is a type of
+\emph{tokens}, and @{ML_type "'b"} is a type for encoding items that the
+parser has recognized. When a parser is applied to a list of tokens whose
+prefix it can recognize, it returns an encoding of the prefix as an element
+of type @{ML_type "'b"}, together with the suffix of the list containing the
+remaining tokens. Otherwise, the parser raises an exception indicating a
+syntax error. The library for writing functional parsers in Isabelle can
+roughly be split up into two parts. The first part consists of a collection
+of generic parser combinators that are contained in the structure
+@{ML_struct Scan} defined in the file @{ML_file "Pure/General/scan.ML"} in
+the Isabelle sources. While these combinators do not make any assumptions
+about the concrete structure of the tokens used, the second part of the
+library consists of combinators for dealing with specific token types.
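+To make the parser type concrete, here is a minimal hand-written parser that
+recognizes one given token. This is a sketch only: it signals failure with an
+ad-hoc exception, whereas the parsers in @{ML_struct Scan} use the library's
+own failure exceptions.
+*}
+
+ML {*
+(* A parser of type string -> string list -> string * string list:
+   it consumes the first token if it equals the expected one. *)
+exception SYNTAX of string
+
+fun token (t : string) (u :: us) =
+      if t = u then (t, us) else raise SYNTAX t
+  | token t [] = raise SYNTAX t
+
+(* token "hello" ["hello", "world"] yields ("hello", ["world"]) *)
+*}
+
+text {*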
+The following is an excerpt from the signature of @{ML_struct Scan}:
+\begin{mytable}
+@{ML "|| : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b"} \\
+@{ML "-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e"} \\
+@{ML "|-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e"} \\
+@{ML "--| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e"} \\
+@{ML_open "optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a" (Scan)} \\
+@{ML_open "repeat: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" (Scan)} \\
+@{ML_open "repeat1: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" (Scan)} \\
+@{ML ">> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c"} \\
+@{ML "!! : ('a * string option -> string) -> ('a -> 'b) -> 'a -> 'b"}
+\end{mytable}
+Interestingly, the functions shown above are so generic that they do not
+even rely on the input and output of the parser being a list of tokens.
+If \texttt{p} succeeds, i.e.\ does not raise an exception, the parser
+@{ML_open "p || q" for p q} returns the result of \texttt{p}, otherwise it
+returns the result of \texttt{q}. The parser @{ML_open "p -- q" for p q}
+first parses an item of type @{ML_type "'b"} using \texttt{p}, then passes
+the remaining tokens of type @{ML_type "'c"} to \texttt{q}, which parses an
+item of type @{ML_type "'d"} and returns the remaining tokens of type
+@{ML_type "'e"}. These are finally returned together with the pair of type
+@{ML_type "'b * 'd"} containing the two parsed items. The parsers
+@{ML_open "p |-- q" for p q} and @{ML_open "p --| q" for p q} work
+analogously, except that they discard the item parsed by the first and the
+second parser, respectively.
+If \texttt{p} succeeds, the parser @{ML_open "optional p x" for p x (Scan)}
+returns the result of \texttt{p}, otherwise it returns the default value
+\texttt{x}. The parser @{ML_open "repeat p" for p (Scan)} applies \texttt{p}
+as often as it can, returning a possibly empty list of parsed items. The
+parser @{ML_open "repeat1 p" for p (Scan)} is similar, but requires
+\texttt{p} to succeed at least once. The parser
+@{ML_open "p >> f" for p f} uses \texttt{p} to parse an item of type
+@{ML_type "'b"}, to which it applies the function \texttt{f} yielding a value
+of type @{ML_type "'d"}, which is returned together with the remaining tokens
+of type @{ML_type "'c"}.
+Finally, @{ML "!!"} is used for transforming exceptions produced by parsers.
+If \texttt{p} raises an exception indicating that it cannot parse a given
+input, then an enclosing parser such as
+@{ML_open [display] "q -- p || r" for p q r}
+will try the alternative parser \texttt{r}. By writing
+@{ML_open [display] "q -- !! err p || r" for err p q r}
+instead, one can ensure that a failure of \texttt{p} aborts the whole parser.
+The @{ML "!!"} operator is similar to the \emph{cut} operator in Prolog,
+which prevents the interpreter from backtracking. The \texttt{err} function
+supplied as an argument to @{ML "!!"} can be used to produce an error message
+depending on the current state of the parser, as well as the optional error
+message returned by \texttt{p}.
+
+So far, we have only looked at combinators that construct more complex parsers
+from simpler parsers. In order for these combinators to be useful, we also need
+some basic parsers.
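+With the hand-written @{text token} parser from above we can already try out
+some of these combinators. Note that @{ML "--"} and @{ML ">>"} merely
+propagate exceptions, so our ad-hoc exception does no harm here; backtracking
+combinators such as @{ML "||"} and @{ML_open repeat (Scan)} would
+additionally require the failure exception used by @{ML_struct Scan} itself.
+*}
+
+ML {*
+(* Combining two token parsers and post-processing the result with >> : *)
+val greeting = token "hello" -- token "world" >> (fn (h, w) => h ^ " " ^ w)
+
+(* greeting ["hello", "world", "!"] yields ("hello world", ["!"]) *)
+*}
+
+text {*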
+As an example of such basic parsers, we consider the following two functions
+defined in @{ML_struct Scan}:
+\begin{mytable}
+@{ML_open "one: ('a -> bool) -> 'a list -> 'a * 'a list" (Scan)} \\
+@{ML_open "$$ : string -> string list -> string * string list"}
+\end{mytable}
+The parser @{ML_open "one pred" for pred (Scan)} parses exactly one token that
+satisfies the predicate \texttt{pred}, whereas @{ML_open "$$ s" for s} only
+accepts a token that equals the string \texttt{s}. Note that we can easily
+express @{ML_open "$$ s" for s} using @{ML_open "one" (Scan)}:
+@{ML_open [display] "one (fn s' => s' = s)" for s (Scan)}
+As an example, let us look at how we can use @{ML "$$"} and @{ML "--"} to parse
+the prefix ``\texttt{hello}'' of the character list ``\texttt{hello world}'':
+@{ML_response [display]
+"($$ \"h\" -- $$ \"e\" -- $$ \"l\" -- $$ \"l\" -- $$ \"o\")
+   [\"h\", \"e\", \"l\", \"l\", \"o\", \" \", \"w\", \"o\", \"r\", \"l\", \"d\"]"
+"(((((\"h\", \"e\"), \"l\"), \"l\"), \"o\"),
+ [\" \", \"w\", \"o\", \"r\", \"l\", \"d\"])
+: ((((string * string) * string) * string) * string) * string list"}
+Most of the time, however, we will have to deal with tokens that are not just
+strings. The parsers for the theory syntax, as well as the parsers for the
+argument syntax of proof methods and attributes, use the token type
+@{ML_type OuterParse.token}, which is identical to the type
+@{ML_type OuterLex.token}.
+The parser functions for the theory syntax are contained in the structure
+@{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}.
+In our parser, we will use the following functions:
+\begin{mytable}
+@{ML_open "$$$ : string -> token list -> string * token list" (OuterParse)} \\
+@{ML_open "enum1: string -> (token list -> 'a * token list) -> token list ->
+  'a list * token list" (OuterParse)} \\
+@{ML_open "prop: token list -> string * token list" (OuterParse)} \\
+@{ML_open "opt_target: token list -> string option * token list" (OuterParse)} \\
+@{ML_open "fixes: token list ->
+  (Name.binding * string option * mixfix) list * token list" (OuterParse)} \\
+@{ML_open "for_fixes: token list ->
+  (Name.binding * string option * mixfix) list * token list" (OuterParse)} \\
+@{ML_open "!!! : (token list -> 'a) -> token list -> 'a" (OuterParse)}
+\end{mytable}
+The parsers @{ML_open "$$$" (OuterParse)} and @{ML_open "!!!" (OuterParse)} are
+defined using the parsers @{ML_open "one" (Scan)} and @{ML "!!"} from
+@{ML_struct Scan}.
+The parser @{ML_open "enum1 s p" for s p (OuterParse)} parses a non-empty
+list of items recognized by the parser \texttt{p}, where the items are
+separated by \texttt{s}.
+A proposition can be parsed using the function @{ML_open prop (OuterParse)}.
+Essentially, a proposition is just a string or an identifier, but using the
+specific parser function @{ML_open prop (OuterParse)} leads to more instructive
+error messages, since the parser will complain that a proposition was expected
+when something other than a string or identifier is found.
+An optional locale target specification of the form \isa{(\isacommand{in}\ $\ldots$)}
+can be parsed using @{ML_open opt_target (OuterParse)}.
+The lists of names of the predicates and parameters, together with optional
+types and syntax, are parsed using the functions @{ML_open "fixes" (OuterParse)}
+and @{ML_open for_fixes (OuterParse)}, respectively.
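+For instance, a @{text "|"}-separated list of propositions can be parsed as
+follows. This is a sketch: we assume that @{ML_open scan (OuterSyntax)} lexes
+a string into a list of outer tokens (its exact signature differs between
+Isabelle versions), and we terminate the input with @{ML_open finite (Scan)}
+and the standard token stopper, so that the parser does not demand further
+input at the end of the token list.
+*}
+
+ML {*
+(* Lex a string into outer tokens and parse it as a |-separated list of
+   propositions; props should evaluate to ["P x", "Q x"]. *)
+val toks = OuterSyntax.scan "\"P x\" | \"Q x\""
+val (props, _) =
+  Scan.finite OuterLex.stopper (OuterParse.enum1 "|" OuterParse.prop) toks
+*}
+
+text {*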
+In addition, the following function from @{ML_struct SpecParse} for parsing
+an optional theorem name and attribute, followed by a delimiter, will be useful:
+\begin{mytable}
+@{ML_open "opt_thm_name:
+  string -> token list -> Attrib.binding * token list" (SpecParse)}
+\end{mytable}
+We now have all the necessary tools to write the parser for our
+\isa{\isacommand{simple{\isacharunderscore}inductive}} command:
+@{ML_chunk [display] syntax}
+The definition of the parser \verb!ind_decl! closely follows the railroad
+diagram shown above. In order to make the code more readable, the structures
+@{ML_struct OuterParse} and @{ML_struct OuterKeyword} are abbreviated by
+\texttt{P} and \texttt{K}, respectively. Note how the parser combinator
+@{ML_open "!!!" (OuterParse)} is used: once the keyword \texttt{where}
+has been parsed, a non-empty list of introduction rules must follow.
+Had we not used the combinator @{ML_open "!!!" (OuterParse)}, a
+\texttt{where} not followed by a list of rules would have caused the parser
+to respond with the somewhat misleading error message
+\begin{verbatim}
+  Outer syntax error: end of input expected, but keyword where was found
+\end{verbatim}
+rather than with the more instructive message
+\begin{verbatim}
+  Outer syntax error: proposition expected, but terminator was found
+\end{verbatim}
+Once all arguments of the command have been parsed, we apply the function
+@{ML_open add_inductive (SimpleInductivePackage)}, which yields a local theory
+transformer of type @{ML_type "local_theory -> local_theory"}. Commands in
+Isabelle/Isar are realized by transition transformers of type
+@{ML_type [display] "Toplevel.transition -> Toplevel.transition"}
+We can turn a local theory transformer into a transition transformer by using
+the function
+@{ML [display] "Toplevel.local_theory : string option ->
+  (local_theory -> local_theory) ->
+  Toplevel.transition -> Toplevel.transition"}
+which, apart from the local theory transformer, takes an optional name of a locale
+to be used as a basis for the local theory. The whole parser for our command has type
+@{ML_type [display] "OuterLex.token list ->
+  (Toplevel.transition -> Toplevel.transition) * OuterLex.token list"}
+which is abbreviated by @{ML_type OuterSyntax.parser_fn}. The new command can be added
+to the system via the function
+@{ML [display] "OuterSyntax.command :
+  string -> string -> OuterKeyword.T -> OuterSyntax.parser_fn -> unit"}
+which imperatively updates the parser table behind the scenes. In addition to the parser, this
+function takes two strings representing the name of the command and a short description,
+as well as an element of type @{ML_type OuterKeyword.T} describing which \emph{kind} of
+command we intend to add. Since we want to add a command for declaring new concepts,
+we choose the kind @{ML "OuterKeyword.thy_decl"}. Other kinds include
+@{ML "OuterKeyword.thy_goal"}, which is similar to @{ML_open thy_decl (OuterKeyword)},
+but requires the user to prove a goal before making the declaration, or
+@{ML "OuterKeyword.diag"}, which corresponds to a purely diagnostic command that does
+not change the context. For example, the @{ML_open thy_goal (OuterKeyword)} kind is used
+by the \isa{\isacommand{function}} command \cite{Krauss-IJCAR06}, which requires the user
+to prove that a given set of equations is non-overlapping and covers all cases.
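+To see these pieces in isolation, here is a sketch of how a trivial
+diagnostic command could be registered. The command name @{text "hello"} is
+hypothetical; we assume the function @{ML_open imperative (Toplevel)} for
+building a transition transformer from a side-effecting function, and, as
+with all outer keywords, the command name would typically also have to be
+declared in a keyword file for Proof General.
+*}
+
+ML {*
+(* Register a diagnostic command that merely prints a message.
+   Scan.succeed consumes no tokens and returns the given transformer. *)
+val _ = OuterSyntax.command "hello" "print a greeting" OuterKeyword.diag
+  (Scan.succeed (Toplevel.imperative (fn () => writeln "Hello!")))
+*}
+
+text {*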
+The kind of the command should be chosen with care, since selecting the wrong
+one can cause strange behaviour of the user interface, such as failure of the
+undo mechanism.
+*}
+
+(*<*)
+end
+(*>*)