--- a/CookBook/Package/Ind_Interface.thy Wed Jan 28 06:43:51 2009 +0000
+++ b/CookBook/Package/Ind_Interface.thy Thu Jan 29 09:46:17 2009 +0000
@@ -2,76 +2,82 @@
imports "../Base" Simple_Inductive_Package
begin
-(*<*)
-ML {*
-structure SIP = SimpleInductivePackage
-*}
-(*>*)
-
-section{* The interface *}
+section{* The Interface \label{sec:ind-interface} *}
text {*
-\label{sec:ind-interface}
-In order to add a new inductive predicate to a theory with the help of our package, the user
-must \emph{invoke} it. For every package, there are essentially two different ways of invoking
-it, which we will refer to as \emph{external} and \emph{internal}. By external
-invocation we mean that the package is called from within a theory document. In this case,
-the type of the inductive predicate, as well as its introduction rules, are given as strings
-by the user. Before the package can actually make the definition, the type and introduction
-rules have to be parsed. In contrast, internal invocation means that the package is called
-by some other package. For example, the function definition package \cite{Krauss-IJCAR06}
-calls the inductive definition package to define the graph of the function. However, it is
-not a good idea for the function definition package to pass the introduction rules for the
-function graph to the inductive definition package as strings. In this case, it is better
-to directly pass the rules to the package as a list of terms, which is more robust than
-handling strings that are lacking the additional structure of terms. These two ways of
-invoking the package are reflected in its ML programming interface, which consists of two
-functions:
-@{ML_chunk [display] SIMPLE_INDUCTIVE_PACKAGE}
+ In order to add a new inductive predicate to a theory with the help of our
+ package, the user must \emph{invoke} it. For every package, there are
+ essentially two different ways of invoking it, which we will refer to as
+ \emph{external} and \emph{internal}. By external invocation we mean that the
+ package is called from within a theory document. In this case, the type of
+ the inductive predicate, as well as its introduction rules, are given as
+ strings by the user. Before the package can actually make the definition,
+ the type and introduction rules have to be parsed. In contrast, internal
+ invocation means that the package is called by some other package. For
+ example, the function definition package \cite{Krauss-IJCAR06} calls the
+ inductive definition package to define the graph of the function. However,
+ it is not a good idea for the function definition package to pass the
+ introduction rules for the function graph to the inductive definition
+ package as strings. In this case, it is better to directly pass the rules to
+ the package as a list of terms, which is more robust than handling strings
+ that lack the additional structure of terms. These two ways of
+ invoking the package are reflected in its ML programming interface, which
+ consists of two functions:
+
+ @{ML_chunk [display] SIMPLE_INDUCTIVE_PACKAGE}
*}
text {*
-The function for external invocation of the package is called @{ML add_inductive in SIP},
-whereas the one for internal invocation is called @{ML add_inductive_i in SIP}. Both
-of these functions take as arguments the names and types of the inductive predicates, the
-names and types of their parameters, the actual introduction rules and a \emph{local theory}.
-They return a local theory containing the definition, together with a tuple containing
-the introduction and induction rules, which are stored in the local theory, too.
-In contrast to an ordinary theory, which simply consists of a type signature, as
-well as tables for constants, axioms and theorems, a local theory also contains
-additional context information, such as locally fixed variables and local assumptions
-that may be used by the package. The type @{ML_type local_theory} is identical to the
-type of \emph{proof contexts} @{ML_type "Proof.context"}, although not every proof context
-constitutes a valid local theory.
-Note that @{ML add_inductive_i in SIP} expects the types
-of the predicates and parameters to be specified using the datatype @{ML_type typ} of Isabelle's
-logical framework, whereas @{ML add_inductive in SIP}
-expects them to be given as optional strings. If no string is
-given for a particular predicate or parameter, this means that the type should be
-inferred by the package. Additional \emph{mixfix syntax} may be associated with
-the predicates and parameters as well. Note that @{ML add_inductive_i in SIP} does not
-allow mixfix syntax to be associated with parameters, since it can only be used
-for parsing. The names of the predicates, parameters and rules are represented by the
-type @{ML_type Binding.binding}. Strings can be turned into elements of the type
-@{ML_type Binding.binding} using the function
-@{ML [display] "Binding.name : string -> Binding.binding"}
-Each introduction rule is given as a tuple containing its name, a list of \emph{attributes}
-and a logical formula. Note that the type @{ML_type Attrib.binding} used in the list of
-introduction rules is just a shorthand for the type @{ML_type "Binding.binding * Attrib.src list"}.
-The function @{ML add_inductive_i in SIP} expects the formula to be specified using the datatype
-@{ML_type term}, whereas @{ML add_inductive in SIP} expects it to be given as a string.
-An attribute specifies additional actions and transformations that should be applied to
-a theorem, such as storing it in the rule databases used by automatic tactics
-like the simplifier. The code of the package, which will be described in the following
-section, will mostly treat attributes as a black box and just forward them to other
-functions for storing theorems in local theories.
-The implementation of the function @{ML add_inductive in SIP} for external invocation
-of the package is quite simple. Essentially, it just parses the introduction rules
-and then passes them on to @{ML add_inductive_i in SIP}:
-@{ML_chunk [display] add_inductive}
-For parsing and type checking the introduction rules, we use the function
-@{ML [display] "Specification.read_specification:
+ The function for external invocation of the package is called @{ML
+ add_inductive in SimpleInductivePackage}, whereas the one for internal
+ invocation is called @{ML add_inductive_i in SimpleInductivePackage}. Both
+ of these functions take as arguments the names and types of the inductive
+ predicates, the names and types of their parameters, the actual introduction
+ rules and a \emph{local theory}. They return a local theory containing the
+ definition, together with a tuple containing the introduction and induction
+ rules, which are stored in the local theory, too. In contrast to an
+ ordinary theory, which simply consists of a type signature, as well as
+ tables for constants, axioms and theorems, a local theory also contains
+ additional context information, such as locally fixed variables and local
+ assumptions that may be used by the package. The type @{ML_type
+ local_theory} is identical to the type of \emph{proof contexts} @{ML_type
+ "Proof.context"}, although not every proof context constitutes a valid local
+ theory. Note that @{ML add_inductive_i in SimpleInductivePackage} expects
+ the types of the predicates and parameters to be specified using the
+ datatype @{ML_type typ} of Isabelle's logical framework, whereas @{ML
+ add_inductive in SimpleInductivePackage} expects them to be given as
+ optional strings. If no string is given for a particular predicate or
+ parameter, the type is inferred by the package. Additional
+ \emph{mixfix syntax} may be associated with the
+ predicates and parameters as well. Note that @{ML add_inductive_i in
+ SimpleInductivePackage} does not allow mixfix syntax to be associated with
+ parameters, since it can only be used for parsing. The names of the
+ predicates, parameters and rules are represented by the type @{ML_type
+ Binding.binding}. Strings can be turned into elements of the type @{ML_type
+ Binding.binding} using the function @{ML [display] "Binding.name : string ->
+ Binding.binding"} Each introduction rule is given as a tuple containing its
+ name, a list of \emph{attributes} and a logical formula. Note that the type
+ @{ML_type Attrib.binding} used in the list of introduction rules is just a
+ shorthand for the type @{ML_type "Binding.binding * Attrib.src list"}. The
+ function @{ML add_inductive_i in SimpleInductivePackage} expects the formula
+ to be specified using the datatype @{ML_type term}, whereas @{ML
+ add_inductive in SimpleInductivePackage} expects it to be given as a string.
+ An attribute specifies additional actions and transformations that should be
+ applied to a theorem, such as storing it in the rule databases used by
+ automatic tactics like the simplifier. The code of the package, which will
+ be described in the following section, will mostly treat attributes as a
+ black box and just forward them to other functions for storing theorems in
+ local theories. The implementation of the function @{ML add_inductive in
+ SimpleInductivePackage} for external invocation of the package is quite
+ simple. Essentially, it just parses the introduction rules and then passes
+ them on to @{ML add_inductive_i in SimpleInductivePackage}:
+
+ @{ML_chunk [display] add_inductive}
+
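+ For illustration, a single introduction rule as expected by the external
+ interface, consisting of a name, an empty attribute list and the rule
+ given as a string, might look as follows (mirroring the example at the
+ end of this section):
+
+ @{ML_text [display]
+ "((Binding.name \"base\", []), \"\<And>x. trcl' r x x\")"}
+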
+ For parsing and type checking the introduction rules, we use the function
+
+ @{ML [display] "Specification.read_specification:
(Binding.binding * string option * mixfix) list -> (*{variables}*)
(Attrib.binding * string list) list list -> (*{rules}*)
local_theory ->
@@ -81,32 +87,33 @@
*}
text {*
-During parsing, both predicates and parameters are treated as variables, so
-the lists \verb!preds_syn! and \verb!params_syn! are just appended
-before being passed to @{ML read_specification in Specification}. Note that the format
-for rules supported by @{ML read_specification in Specification} is more general than
-what is required for our package. It allows several rules to be associated
-with one name, and the list of rules can be partitioned into several
-sublists. In order for the list \verb!intro_srcs! of introduction rules
-to be acceptable as an input for @{ML read_specification in Specification}, we first
-have to turn it into a list of singleton lists. This transformation
-has to be reversed later on by applying the function
-@{ML [display] "the_single: 'a list -> 'a"}
-to the list \verb!specs! containing the parsed introduction rules.
-The function @{ML read_specification in Specification} also returns the list \verb!vars!
-of predicates and parameters that contains the inferred types as well.
-This list has to be chopped into the two lists \verb!preds_syn'! and
-\verb!params_syn'! for predicates and parameters, respectively.
-All variables occurring in a rule but not in the list of variables passed to
-@{ML read_specification in Specification} will be bound by a meta-level universal
-quantifier.
+ During parsing, both predicates and parameters are treated as variables, so
+ the lists \verb!preds_syn! and \verb!params_syn! are just appended
+ before being passed to @{ML read_specification in Specification}. Note that the format
+ for rules supported by @{ML read_specification in Specification} is more general than
+ what is required for our package. It allows several rules to be associated
+ with one name, and the list of rules can be partitioned into several
+ sublists. In order for the list \verb!intro_srcs! of introduction rules
+ to be acceptable as an input for @{ML read_specification in Specification}, we first
+ have to turn it into a list of singleton lists. This transformation
+ has to be reversed later on by applying the function
+ @{ML [display] "the_single: 'a list -> 'a"}
+ to the list \verb!specs! containing the parsed introduction rules.
+ The function @{ML read_specification in Specification} also returns the list \verb!vars!
+ of predicates and parameters, which also contains the inferred types.
+ This list has to be chopped into the two lists \verb!preds_syn'! and
+ \verb!params_syn'! for predicates and parameters, respectively.
+ All variables occurring in a rule but not in the list of variables passed to
+ @{ML read_specification in Specification} will be bound by a meta-level universal
+ quantifier.
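+
+ Schematically, these steps might be put together as follows (a sketch
+ only, reusing the variable names from the code chunk of
+ @{ML_text add_inductive} above):
+
+ @{ML_text [display]
+ "val ((vars, specs), _) =
+    Specification.read_specification
+      (preds_syn @ params_syn) (map single intro_srcs) lthy
+  val intrs = map (apsnd the_single) specs
+  val (preds_syn', params_syn') = chop (length preds_syn) vars"}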
*}
text {*
-Finally, @{ML read_specification in Specification} also returns another local theory,
-but we can safely discard it. As an example, let us look at how we can use this
-function to parse the introduction rules of the @{text trcl} predicate:
-@{ML_response [display]
+ Finally, @{ML read_specification in Specification} also returns another local theory,
+ but we can safely discard it. As an example, let us look at how we can use this
+ function to parse the introduction rules of the @{text trcl} predicate:
+
+ @{ML_response [display]
"Specification.read_specification
[(Binding.name \"trcl\", NONE, NoSyn),
(Binding.name \"r\", SOME \"'a \<Rightarrow> 'a \<Rightarrow> bool\", NoSyn)]
@@ -129,27 +136,30 @@
\<dots>)
: (((Binding.binding * typ) * mixfix) list *
(Attrib.binding * term list) list) * local_theory"}
-In the list of variables passed to @{ML read_specification in Specification}, we have
-used the mixfix annotation @{ML NoSyn} to indicate that we do not want to associate any
-mixfix syntax with the variable. Moreover, we have only specified the type of \texttt{r},
-whereas the type of \texttt{trcl} is computed using type inference.
-The local variables \texttt{x}, \texttt{y} and \texttt{z} of the introduction rules
-are turned into bound variables with the de Bruijn indices,
-whereas \texttt{trcl} and \texttt{r} remain free variables.
+
+ In the list of variables passed to @{ML read_specification in Specification}, we have
+ used the mixfix annotation @{ML NoSyn} to indicate that we do not want to associate any
+ mixfix syntax with the variable. Moreover, we have only specified the type of \texttt{r},
+ whereas the type of \texttt{trcl} is computed using type inference.
+ The local variables \texttt{x}, \texttt{y} and \texttt{z} of the introduction rules
+ are turned into bound variables with de Bruijn indices,
+ whereas \texttt{trcl} and \texttt{r} remain free variables.
+
*}
text {*
-\paragraph{Parsers for theory syntax}
+
+ \paragraph{Parsers for theory syntax}
-Although the function @{ML add_inductive in SIP} parses terms and types, it still
-cannot be used to invoke the package directly from within a theory document.
-In order to do this, we have to write another parser. Before we describe
-the process of writing parsers for theory syntax in more detail, we first
-show some examples of how we would like to use the inductive definition
-package.
+ Although the function @{ML add_inductive in SimpleInductivePackage} parses terms and types, it still
+ cannot be used to invoke the package directly from within a theory document.
+ In order to do this, we have to write another parser. Before we describe
+ the process of writing parsers for theory syntax in more detail, we first
+ show some examples of how we would like to use the inductive definition
+ package.
-\noindent
-The definition of the transitive closure should look as follows:
+
+ The definition of the transitive closure should look as follows:
*}
simple_inductive
@@ -188,10 +198,7 @@
qed
(*>*)
-text {*
-\noindent
-Even and odd numbers can be defined by
-*}
+text {* Even and odd numbers can be defined by *}
simple_inductive
even and odd
@@ -208,10 +215,7 @@
thm even_odd.intros
(*>*)
-text {*
-\noindent
-The accessible part of a relation can be introduced as follows:
-*}
+text {* The accessible part of a relation can be introduced as follows: *}
simple_inductive
accpart for r :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
@@ -224,9 +228,8 @@
(*>*)
text {*
-\noindent
-Moreover, it should also be possible to define the accessible part
-inside a locale fixing the relation @{text r}:
+ Moreover, it should also be possible to define the accessible part
+ inside a locale fixing the relation @{text r}:
*}
locale rel =
@@ -247,8 +250,7 @@
thm rel.accpartI'
thm rel.accpart'.induct
-ML {*
-val (result, lthy) = SimpleInductivePackage.add_inductive
+ML{*val (result, lthy) = SimpleInductivePackage.add_inductive
[(Binding.name "trcl'", NONE, NoSyn)] [(Binding.name "r", SOME "'a \<Rightarrow> 'a \<Rightarrow> bool", NoSyn)]
[((Binding.name "base", []), "\<And>x. trcl' r x x"), ((Binding.name "step", []), "\<And>x y z. trcl' r x y \<Longrightarrow> r y z \<Longrightarrow> trcl' r x z")]
(TheoryTarget.init NONE @{theory})
@@ -256,218 +258,229 @@
(*>*)
text {*
-\noindent
-In this context, it is important to note that Isabelle distinguishes
-between \emph{outer} and \emph{inner} syntax. Theory commands such as
-\isa{\isacommand{simple{\isacharunderscore}inductive} $\ldots$ \isacommand{for} $\ldots$ \isacommand{where} $\ldots$}
-belong to the outer syntax, whereas items in quotation marks, in particular
-terms such as @{text [source] "trcl r x x"} and types such as
-@{text [source] "'a \<Rightarrow> 'a \<Rightarrow> bool"} belong to the inner syntax.
-Separating the two layers of outer and inner syntax greatly simplifies
-matters, because the parser for terms and types does not have to know
-anything about the possible syntax of theory commands, and the parser
-for theory commands need not be concerned about the syntactic structure
-of terms and types.
-\medskip
-\noindent
-The syntax of the \isa{\isacommand{simple{\isacharunderscore}inductive}} command
-can be described by the following railroad diagram:
-\begin{rail}
+ In this context, it is important to note that Isabelle distinguishes
+ between \emph{outer} and \emph{inner} syntax. Theory commands such as
+ \isa{\isacommand{simple{\isacharunderscore}inductive} $\ldots$ \isacommand{for} $\ldots$ \isacommand{where} $\ldots$}
+ belong to the outer syntax, whereas items in quotation marks, in particular
+ terms such as @{text [source] "trcl r x x"} and types such as
+ @{text [source] "'a \<Rightarrow> 'a \<Rightarrow> bool"} belong to the inner syntax.
+ Separating the two layers of outer and inner syntax greatly simplifies
+ matters, because the parser for terms and types does not have to know
+ anything about the possible syntax of theory commands, and the parser
+ for theory commands need not be concerned about the syntactic structure
+ of terms and types.
+
+ \medskip
+ \noindent
+ The syntax of the \isa{\isacommand{simple{\isacharunderscore}inductive}} command
+ can be described by the following railroad diagram:
+ \begin{rail}
'simple\_inductive' target? fixes ('for' fixes)? \\
('where' (thmdecl? prop + '|'))?
;
-\end{rail}
-
-\paragraph{Functional parsers}
+ \end{rail}
-For parsing terms and types, Isabelle uses a rather general and sophisticated
-algorithm due to Earley, which is driven by \emph{priority grammars}.
-In contrast, parsers for theory syntax are built up using a set of combinators.
-Functional parsing using combinators is a well-established technique, which
-has been described by many authors, including Paulson \cite{paulson-ML-91}
-and Wadler \cite{Wadler-AFP95}.
-The central idea is that a parser is a function of type @{ML_type "'a list -> 'b * 'a list"},
-where @{ML_type "'a"} is a type of \emph{tokens}, and @{ML_type "'b"} is a type for
-encoding items that the parser has recognized. When a parser is applied to a
-list of tokens whose prefix it can recognize, it returns an encoding of the
-prefix as an element of type @{ML_type "'b"}, together with the suffix of the list
-containing the remaining tokens. Otherwise, the parser raises an exception
-indicating a syntax error. The library for writing functional parsers in
-Isabelle can roughly be split up into two parts. The first part consists of a
-collection of generic parser combinators that are contained in the structure
-@{ML_struct Scan} defined in the file @{ML_file "Pure/General/scan.ML"} in the Isabelle
-sources. While these combinators do not make any assumptions about the concrete
-structure of the tokens used, the second part of the library consists of combinators
-for dealing with specific token types.
-The following is an excerpt from the signature of @{ML_struct Scan}:
+ \paragraph{Functional parsers}
-\begin{table}
-@{ML "|| : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b"} \\
-@{ML "-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e"} \\
-@{ML "|-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e"} \\
-@{ML "--| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e"} \\
-@{ML "optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a" in Scan} \\
-@{ML "repeat: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\
-@{ML "repeat1: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\
-@{ML ">> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c"} \\
-@{ML "!! : ('a * string option -> string) -> ('a -> 'b) -> 'a -> 'b"}
-\end{table}
-Interestingly, the functions shown above are so generic that they do not
-even rely on the input and output of the parser being a list of tokens.
-If \texttt{p} succeeds, i.e.\ does not raise an exception, the parser
-@{ML "p || q" for p q} returns the result of \texttt{p}, otherwise it returns
-the result of \texttt{q}. The parser @{ML "p -- q" for p q} first parses an
-item of type @{ML_type "'b"} using \texttt{p}, then passes the remaining tokens
-of type @{ML_type "'c"} to \texttt{q}, which parses an item of type @{ML_type "'d"}
-and returns the remaining tokens of type @{ML_type "'e"}, which are finally
-returned together with a pair of type @{ML_type "'b * 'd"} containing the two
-parsed items. The parsers @{ML "p |-- q" for p q} and @{ML "p --| q" for p q}
-work in a similar way as the previous one, with the difference that they
-discard the item parsed by the first and the second parser, respectively.
-If \texttt{p} succeeds, the parser @{ML "optional p x" for p x in Scan} returns the result
-of \texttt{p}, otherwise it returns the default value \texttt{x}. The parser
-@{ML "repeat p" for p in Scan} applies \texttt{p} as often as it can, returning a possibly
-empty list of parsed items. The parser @{ML "repeat1 p" for p in Scan} is similar,
-but requires \texttt{p} to succeed at least once. The parser
-@{ML "p >> f" for p f} uses \texttt{p} to parse an item of type @{ML_type "'b"}, to which
-it applies the function \texttt{f} yielding a value of type @{ML_type "'d"}, which
-is returned together with the remaining tokens of type @{ML_type "'c"}.
-Finally, @{ML "!!"} is used for transforming exceptions produced by parsers.
-If \texttt{p} raises an exception indicating that it cannot parse a given input,
-then an enclosing parser such as
-@{ML [display] "q -- p || r" for p q r}
-will try the alternative parser \texttt{r}. By writing
-@{ML [display] "q -- !! err p || r" for err p q r}
-instead, one can achieve that a failure of \texttt{p} causes the whole parser to abort.
-The @{ML "!!"} operator is similar to the \emph{cut} operator in Prolog, which prevents
-the interpreter from backtracking. The \texttt{err} function supplied as an argument
-to @{ML "!!"} can be used to produce an error message depending on the current
-state of the parser, as well as the optional error message returned by \texttt{p}.
+ For parsing terms and types, Isabelle uses a rather general and sophisticated
+ algorithm due to Earley, which is driven by \emph{priority grammars}.
+ In contrast, parsers for theory syntax are built up using a set of combinators.
+ Functional parsing using combinators is a well-established technique, which
+ has been described by many authors, including Paulson \cite{paulson-ML-91}
+ and Wadler \cite{Wadler-AFP95}.
+ The central idea is that a parser is a function of type @{ML_type "'a list -> 'b * 'a list"},
+ where @{ML_type "'a"} is a type of \emph{tokens}, and @{ML_type "'b"} is a type for
+ encoding items that the parser has recognized. When a parser is applied to a
+ list of tokens whose prefix it can recognize, it returns an encoding of the
+ prefix as an element of type @{ML_type "'b"}, together with the suffix of the list
+ containing the remaining tokens. Otherwise, the parser raises an exception
+ indicating a syntax error. The library for writing functional parsers in
+ Isabelle can roughly be split up into two parts. The first part consists of a
+ collection of generic parser combinators that are contained in the structure
+ @{ML_struct Scan} defined in the file @{ML_file "Pure/General/scan.ML"} in the Isabelle
+ sources. While these combinators do not make any assumptions about the concrete
+ structure of the tokens used, the second part of the library consists of combinators
+ for dealing with specific token types.
+ The following is an excerpt from the signature of @{ML_struct Scan}:
-So far, we have only looked at combinators that construct more complex parsers
-from simpler parsers. In order for these combinators to be useful, we also need
-some basic parsers. As an example, we consider the following two parsers
-defined in @{ML_struct Scan}:
-
-\begin{table}
-@{ML "one: ('a -> bool) -> 'a list -> 'a * 'a list" in Scan} \\
-@{ML "$$ : string -> string list -> string * string list"}
-\end{table}
+ \begin{table}
+ @{ML "|| : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b"} \\
+ @{ML "-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e"} \\
+ @{ML "|-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e"} \\
+ @{ML "--| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e"} \\
+ @{ML "optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a" in Scan} \\
+ @{ML "repeat: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\
+ @{ML "repeat1: ('a -> 'b * 'a) -> 'a -> 'b list * 'a" in Scan} \\
+ @{ML ">> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c"} \\
+ @{ML "!! : ('a * string option -> string) -> ('a -> 'b) -> 'a -> 'b"}
+ \end{table}
-The parser @{ML "one pred" for pred in Scan} parses exactly one token that
-satisfies the predicate \texttt{pred}, whereas @{ML "$$ s" for s} only
-accepts a token that equals the string \texttt{s}. Note that we can easily
-express @{ML "$$ s" for s} using @{ML "one" in Scan}:
-@{ML [display] "one (fn s' => s' = s)" for s in Scan}
-As an example, let us look at how we can use @{ML "$$"} and @{ML "--"} to parse
-the prefix ``\texttt{hello}'' of the character list ``\texttt{hello world}'':
-@{ML_response [display]
-"($$ \"h\" -- $$ \"e\" -- $$ \"l\" -- $$ \"l\" -- $$ \"o\")
-[\"h\", \"e\", \"l\", \"l\", \"o\", \" \", \"w\", \"o\", \"r\", \"l\", \"d\"]"
-"(((((\"h\", \"e\"), \"l\"), \"l\"), \"o\"), [\" \", \"w\", \"o\", \"r\", \"l\", \"d\"])
-: ((((string * string) * string) * string) * string) * string list"}
-Most of the time, however, we will have to deal with tokens that are not just strings.
-The parsers for the theory syntax, as well as the parsers for the argument syntax
-of proof methods and attributes use the token type @{ML_type OuterParse.token},
-which is identical to @{ML_type OuterLex.token}.
-The parser functions for the theory syntax are contained in the structure
-@{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}.
-In our parser, we will use the following functions:
-
-\begin{table}
-@{ML "$$$ : string -> token list -> string * token list" in OuterParse} \\
-@{ML "enum1: string -> (token list -> 'a * token list) -> token list ->
- 'a list * token list" in OuterParse} \\
-@{ML "prop: token list -> string * token list" in OuterParse} \\
-@{ML "opt_target: token list -> string option * token list" in OuterParse} \\
-@{ML "fixes: token list ->
- (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\
-@{ML "for_fixes: token list ->
- (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\
-@{ML "!!! : (token list -> 'a) -> token list -> 'a" in OuterParse}
-\end{table}
+ Interestingly, the functions shown above are so generic that they do not
+ even rely on the input and output of the parser being a list of tokens.
+ If \texttt{p} succeeds, i.e.\ does not raise an exception, the parser
+ @{ML "p || q" for p q} returns the result of \texttt{p}, otherwise it returns
+ the result of \texttt{q}. The parser @{ML "p -- q" for p q} first parses an
+ item of type @{ML_type "'b"} using \texttt{p}, then passes the remaining tokens
+ of type @{ML_type "'c"} to \texttt{q}, which parses an item of type @{ML_type "'d"}
+ and returns the remaining tokens of type @{ML_type "'e"}, which are finally
+ returned together with a pair of type @{ML_type "'b * 'd"} containing the two
+ parsed items. The parsers @{ML "p |-- q" for p q} and @{ML "p --| q" for p q}
+ work similarly to the previous one, except that they
+ discard the item parsed by the first and the second parser, respectively.
+ If \texttt{p} succeeds, the parser @{ML "optional p x" for p x in Scan} returns the result
+ of \texttt{p}; otherwise it returns the default value \texttt{x}. The parser
+ @{ML "repeat p" for p in Scan} applies \texttt{p} as often as it can, returning a possibly
+ empty list of parsed items. The parser @{ML "repeat1 p" for p in Scan} is similar,
+ but requires \texttt{p} to succeed at least once. The parser
+ @{ML "p >> f" for p f} uses \texttt{p} to parse an item of type @{ML_type "'b"}, to which
+ it applies the function \texttt{f} yielding a value of type @{ML_type "'d"}, which
+ is returned together with the remaining tokens of type @{ML_type "'c"}.
+ Finally, @{ML "!!"} is used for transforming exceptions produced by parsers.
+ If \texttt{p} raises an exception indicating that it cannot parse a given input,
+ then an enclosing parser such as
+ @{ML [display] "q -- p || r" for p q r}
+ will try the alternative parser \texttt{r}. By writing
+ @{ML [display] "q -- !! err p || r" for err p q r}
+ instead, one can ensure that a failure of \texttt{p} causes the whole parser to abort.
+ The @{ML "!!"} operator is similar to the \emph{cut} operator in Prolog, which prevents
+ the interpreter from backtracking. The \texttt{err} function supplied as an argument
+ to @{ML "!!"} can be used to produce an error message depending on the current
+ state of the parser, as well as the optional error message returned by \texttt{p}.
+
+ So far, we have only looked at combinators that construct more complex parsers
+ from simpler parsers. In order for these combinators to be useful, we also need
+ some basic parsers. As an example, we consider the following two parsers
+ defined in @{ML_struct Scan}:
+
+ \begin{table}
+ @{ML "one: ('a -> bool) -> 'a list -> 'a * 'a list" in Scan} \\
+ @{ML "$$ : string -> string list -> string * string list"}
+ \end{table}
+
+ The parser @{ML "one pred" for pred in Scan} parses exactly one token that
+ satisfies the predicate \texttt{pred}, whereas @{ML "$$ s" for s} only
+ accepts a token that equals the string \texttt{s}. Note that we can easily
+ express @{ML "$$ s" for s} using @{ML "one" in Scan}:
+ @{ML [display] "one (fn s' => s' = s)" for s in Scan}
+ As an example, let us look at how we can use @{ML "$$"} and @{ML "--"} to parse
+ the prefix ``\texttt{hello}'' of the character list ``\texttt{hello world}'':
+
+ @{ML_response [display]
+ "($$ \"h\" -- $$ \"e\" -- $$ \"l\" -- $$ \"l\" -- $$ \"o\")
+ [\"h\", \"e\", \"l\", \"l\", \"o\", \" \", \"w\", \"o\", \"r\", \"l\", \"d\"]"
+ "(((((\"h\", \"e\"), \"l\"), \"l\"), \"o\"), [\" \", \"w\", \"o\", \"r\", \"l\", \"d\"])
+ : ((((string * string) * string) * string) * string) * string list"}
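+
+ Building on this, the combinators \texttt{repeat} and \texttt{>>} from the
+ table above can, for instance, be combined to collect a repeated token and
+ post-process the result (a small sketch, not taken from the package):
+
+ @{ML_text [display]
+ "(Scan.repeat ($$ \"l\") >> implode) [\"l\", \"l\", \"o\"]
+ (* yields (\"ll\", [\"o\"]) *)"}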
-The parsers @{ML "$$$" in OuterParse} and @{ML "!!!" in OuterParse} are
-defined using the parsers @{ML "one" in Scan} and @{ML "!!"} from
-@{ML_struct Scan}.
-The parser @{ML "enum1 s p" for s p in OuterParse} parses a non-emtpy list of items
-recognized by the parser \texttt{p}, where the items are separated by \texttt{s}.
-A proposition can be parsed using the function @{ML prop in OuterParse}.
-Essentially, a proposition is just a string or an identifier, but using the
-specific parser function @{ML prop in OuterParse} leads to more instructive
-error messages, since the parser will complain that a proposition was expected
-when something else than a string or identifier is found.
-An optional locale target specification of the form \isa{(\isacommand{in}\ $\ldots$)}
-can be parsed using @{ML opt_target in OuterParse}.
-The lists of names of the predicates and parameters, together with optional
-types and syntax, are parsed using the functions @{ML "fixes" in OuterParse}
-and @{ML for_fixes in OuterParse}, respectively.
-In addition, the following function from @{ML_struct SpecParse} for parsing
-an optional theorem name and attribute, followed by a delimiter, will be useful:
-
-\begin{table}
-@{ML "opt_thm_name:
- string -> token list -> Attrib.binding * token list" in SpecParse}
-\end{table}
+ Most of the time, however, we will have to deal with tokens that are not just strings.
+ The parsers for the theory syntax, as well as the parsers for the argument syntax
+ of proof methods and attributes, use the token type @{ML_type OuterParse.token},
+ which is identical to @{ML_type OuterLex.token}.
+ The parser functions for the theory syntax are contained in the structure
+ @{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}.
+ In our parser, we will use the following functions:
+
+ \begin{table}
+ @{ML "$$$ : string -> token list -> string * token list" in OuterParse} \\
+ @{ML "enum1: string -> (token list -> 'a * token list) -> token list ->
+ 'a list * token list" in OuterParse} \\
+ @{ML "prop: token list -> string * token list" in OuterParse} \\
+ @{ML "opt_target: token list -> string option * token list" in OuterParse} \\
+ @{ML "fixes: token list ->
+ (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\
+ @{ML "for_fixes: token list ->
+ (Binding.binding * string option * mixfix) list * token list" in OuterParse} \\
+ @{ML "!!! : (token list -> 'a) -> token list -> 'a" in OuterParse}
+ \end{table}
-We now have all the necessary tools to write the parser for our
-\isa{\isacommand{simple{\isacharunderscore}inductive}} command:
-@{ML_chunk [display] syntax}
-The definition of the parser \verb!ind_decl! closely follows the railroad
-diagram shown above. In order to make the code more readable, the structures
-@{ML_struct OuterParse} and @{ML_struct OuterKeyword} are abbreviated by
-\texttt{P} and \texttt{K}, respectively. Note how the parser combinator
-@{ML "!!!" in OuterParse} is used: once the keyword \texttt{where}
-has been parsed, a non-empty list of introduction rules must follow.
-Had we not used the combinator @{ML "!!!" in OuterParse}, a
-\texttt{where} not followed by a list of rules would have caused the parser
-to respond with the somewhat misleading error message
-\begin{verbatim}
+ The parsers @{ML "$$$" in OuterParse} and @{ML "!!!" in OuterParse} are
+ defined using the parsers @{ML "one" in Scan} and @{ML "!!"} from
+ @{ML_struct Scan}.
+ The parser @{ML "enum1 s p" for s p in OuterParse} parses a non-emtpy list of items
+ recognized by the parser \texttt{p}, where the items are separated by \texttt{s}.
+ A proposition can be parsed using the function @{ML prop in OuterParse}.
+ Essentially, a proposition is just a string or an identifier, but using the
+ specific parser function @{ML prop in OuterParse} leads to more instructive
+ error messages, since the parser will complain that a proposition was expected
+ when something other than a string or identifier is found.
+ An optional locale target specification of the form \isa{(\isacommand{in}\ $\ldots$)}
+ can be parsed using @{ML opt_target in OuterParse}.
+ The lists of names of the predicates and parameters, together with optional
+ types and syntax, are parsed using the functions @{ML "fixes" in OuterParse}
+ and @{ML for_fixes in OuterParse}, respectively.
+ In addition, the following function from @{ML_struct SpecParse} for parsing
+ an optional theorem name and attribute, followed by a delimiter, will be useful:
+
+ \begin{table}
+ @{ML "opt_thm_name:
+ string -> token list -> Attrib.binding * token list" in SpecParse}
+ \end{table}
+
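+ For instance, the non-empty list of introduction rules following the
+ \texttt{where} keyword could be parsed along the following lines (a
+ sketch; the actual definition is contained in the code chunk below):
+
+ @{ML_text [display]
+ "OuterParse.enum1 \"|\" (SpecParse.opt_thm_name \":\" -- OuterParse.prop)"}
+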
+ We now have all the necessary tools to write the parser for our
+ \isa{\isacommand{simple{\isacharunderscore}inductive}} command:
+
+ @{ML_chunk [display] syntax}
+
+ The definition of the parser \verb!ind_decl! closely follows the railroad
+ diagram shown above. In order to make the code more readable, the structures
+ @{ML_struct OuterParse} and @{ML_struct OuterKeyword} are abbreviated by
+ \texttt{P} and \texttt{K}, respectively. Note how the parser combinator
+ @{ML "!!!" in OuterParse} is used: once the keyword \texttt{where}
+ has been parsed, a non-empty list of introduction rules must follow.
+ Had we not used the combinator @{ML "!!!" in OuterParse}, a
+ \texttt{where} not followed by a list of rules would have caused the parser
+ to respond with the somewhat misleading error message
+
+ \begin{verbatim}
Outer syntax error: end of input expected, but keyword where was found
-\end{verbatim}
-rather than with the more instructive message
-\begin{verbatim}
+ \end{verbatim}
+
+ rather than with the more instructive message
+
+ \begin{verbatim}
Outer syntax error: proposition expected, but terminator was found
-\end{verbatim}
-Once all arguments of the command have been parsed, we apply the function
-@{ML add_inductive in SimpleInductivePackage}, which yields a local theory
-transformer of type @{ML_type "local_theory -> local_theory"}. Commands in
-Isabelle/Isar are realized by transition transformers of type
-@{ML_type [display] "Toplevel.transition -> Toplevel.transition"}
-We can turn a local theory transformer into a transition transformer by using
-the function
-@{ML [display] "Toplevel.local_theory : string option ->
+ \end{verbatim}
+
+ Once all arguments of the command have been parsed, we apply the function
+ @{ML add_inductive in SimpleInductivePackage}, which yields a local theory
+ transformer of type @{ML_type "local_theory -> local_theory"}. Commands in
+ Isabelle/Isar are realized by transition transformers of type
+ @{ML_type [display] "Toplevel.transition -> Toplevel.transition"}
+ We can turn a local theory transformer into a transition transformer by using
+ the function
+
+ @{ML [display] "Toplevel.local_theory : string option ->
(local_theory -> local_theory) ->
Toplevel.transition -> Toplevel.transition"}
-which, apart from the local theory transformer, takes an optional name of a locale
-to be used as a basis for the local theory.
+
+ which, apart from the local theory transformer, takes an optional name of a locale
+ to be used as a basis for the local theory.
-(FIXME : needs to be adjusted to new parser type)
+ (FIXME: needs to be adjusted to the new parser type)
-{\it
-The whole parser for our command has type
-@{ML_text [display] "OuterLex.token list ->
+ {\it
+ The whole parser for our command has type
+ @{ML_text [display] "OuterLex.token list ->
(Toplevel.transition -> Toplevel.transition) * OuterLex.token list"}
-which is abbreviated by @{ML_text OuterSyntax.parser_fn}. The new command can be added
-to the system via the function
-@{ML_text [display] "OuterSyntax.command :
+ which is abbreviated by @{ML_text OuterSyntax.parser_fn}. The new command can be added
+ to the system via the function
+ @{ML_text [display] "OuterSyntax.command :
string -> string -> OuterKeyword.T -> OuterSyntax.parser_fn -> unit"}
-which imperatively updates the parser table behind the scenes. }
+ which imperatively updates the parser table behind the scenes. }
-In addition to the parser, this
-function takes two strings representing the name of the command and a short description,
-as well as an element of type @{ML_type OuterKeyword.T} describing which \emph{kind} of
-command we intend to add. Since we want to add a command for declaring new concepts,
-we choose the kind @{ML "OuterKeyword.thy_decl"}. Other kinds include
-@{ML "OuterKeyword.thy_goal"}, which is similar to @{ML thy_decl in OuterKeyword},
-but requires the user to prove a goal before making the declaration, or
-@{ML "OuterKeyword.diag"}, which corresponds to a purely diagnostic command that does
-not change the context. For example, the @{ML thy_goal in OuterKeyword} kind is used
-by the \isa{\isacommand{function}} command \cite{Krauss-IJCAR06}, which requires the user
-to prove that a given set of equations is non-overlapping and covers all cases. The kind
-of the command should be chosen with care, since selecting the wrong one can cause strange
-behaviour of the user interface, such as failure of the undo mechanism.
+ In addition to the parser, this
+ function takes two strings representing the name of the command and a short description,
+ as well as an element of type @{ML_type OuterKeyword.T} describing which \emph{kind} of
+ command we intend to add. Since we want to add a command for declaring new concepts,
+ we choose the kind @{ML "OuterKeyword.thy_decl"}. Other kinds include
+ @{ML "OuterKeyword.thy_goal"}, which is similar to @{ML thy_decl in OuterKeyword},
+ but requires the user to prove a goal before making the declaration, or
+ @{ML "OuterKeyword.diag"}, which corresponds to a purely diagnostic command that does
+ not change the context. For example, the @{ML thy_goal in OuterKeyword} kind is used
+ by the \isa{\isacommand{function}} command \cite{Krauss-IJCAR06}, which requires the user
+ to prove that a given set of equations is non-overlapping and covers all cases. The kind
+ of the command should be chosen with care, since selecting the wrong one can cause strange
+ behaviour of the user interface, such as failure of the undo mechanism.
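+
+ Schematically, registering our command could then look as follows (a
+ sketch; as noted above, it may need to be adjusted to the new parser
+ type):
+
+ @{ML_text [display]
+ "val _ = OuterSyntax.command \"simple_inductive\"
+   \"define inductive predicates\" OuterKeyword.thy_decl ind_decl"}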
*}
(*<*)