theory Parsing
imports Base
begin
chapter {* Parsing *}
text {*
Isabelle distinguishes between \emph{outer} and \emph{inner} syntax.
Theory commands, such as \isacommand{definition}, \isacommand{inductive} and so
on, belong to the outer syntax, whereas items inside double quotation marks, such
as terms and types, belong to the inner syntax. For parsing inner syntax,
Isabelle uses a rather general and sophisticated algorithm due to Earley, which
is driven by priority grammars. Parsers for outer syntax are built up by functional
parsing combinators. These combinators are a well-established technique for parsing,
which has, for example, been described in Paulson's classic book \cite{paulson-ml2}.
Isabelle developers are usually concerned with writing parsers for outer
syntax, either for new definitional packages or for calling tactics. The library
for writing such parsers in can roughly be split up into two parts.
The first part consists of a collection of generic parser combinators defined
in the structure @{ML_struct Scan} in the file
@{ML_file "Pure/General/scan.ML"}. The second part of the library consists of
combinators for dealing with specific token types, which are defined in the
structure @{ML_struct OuterParse} in the file
@{ML_file "Pure/Isar/outer_parse.ML"}.
*}
section {* Building Up Generic Parsers *}
text {*
Let us first have a look at parsing strings using generic parsing combinators.
The function @{ML "$$"} takes a string and will ``consume'' this string from
a given input list of strings. Consume in this context means that it will
return a pair consisting of this string and the rest of the list.
For example:
*}
ML {* ($$ "h") (explode "hello");
($$ "w") (explode "world") *}
text {*
returns @{ML "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} and then
@{ML "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}. This function will either succeed (as in
the two examples above) or raise the exeption @{ML_text "FAIL"} if no string
can be consumed: for example @{ML_text "($$ \"x\") (explode \"world\")"}.
Two such parser can be connected in sequence using the funtion @{ML "(op --)"}.
For example
*}
ML {* (($$ "h") -- ($$ "e") -- ($$ "l")) (explode "hello") *}
text {*
returns @{ML "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}. Note how the result of
consumed strings builds up on the left as nested pairs.
Parsers that explore
alternatives can be constructed using the function @{ML "(op ||)"}. For example, the
parser @{ML_open "p || q" for p q} returns the result of @{ML_text "p"}, if it succeeds,
otherwise it returns the result of @{ML_text "q"}. For example
*}
ML {*
let val hw = ($$ "h") || ($$ "w")
val input1 = (explode "hello")
val input2 = (explode "world")
in
(hw input1, hw input2)
end
*}
text {*
will in the first case consume the @{ML_text "h"} and in the second the @{ML_text "w"}.
The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing funtion
for parsers, except that they discard the item parsed by the first (respectively second)
parser. For example
*}
ML {*
let val just_h = ($$ "h") --| ($$ "e")
val just_e = ($$ "h") |-- ($$ "e")
val input = (explode "hello")
in
(just_h input, just_e input)
end
*}
text {*
produces @{ML "(\"h\", [\"l\", \"l\", \"o\"])"} and @{ML "(\"e\", [\"l\", \"l\", \"o\"])"},
respectively.
(FIXME does it make sense to explain the other functions from @{ML_text BASIC_SCAN}?)
The parser @{ML_open "Scan.optional p x" for p x} returns the result of the parser
@{ML_text "p"}, if it succeeds; otherwise it returns
the default value @{ML_text "x"}. For example
*}
ML {*
let val p = Scan.optional ($$ "h") "x"
val input1 = (explode "hello")
val input2 = (explode "world")
in
(p input1, p input2)
end
*}
text {*
returns @{ML "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} in the first case and
in the second case @{ML "(\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"])"}.
*}
text {* (FIXME: explain @{ML "op !!"}) *}
ML {*
val err_fn = (fn _ => "foo");
val p = (!! err_fn ($$ "h")) || ($$ "w");
val input1 = (explode "hello");
val input2 = (explode "world");
p input1;
*}
text {* (FIXME: why does @{ML_text "p input2"} not do anything with foo?) *}
text {* (FIXME: explain function application) *}
ML {* fun parse_fn (x,y) = (x,y^y) *}
ML {* ((($$ "h") -- ($$ "e")) >> parse_fn) (explode "hello") *}
text {* (FIXME: explain @{ML_text "lift"}) *}
chapter {* Parsing *}
text {*
Lots of Standard ML code is given in this document, for various reasons,
including:
\begin{itemize}
\item direct quotation of code found in the Isabelle source files,
or simplified versions of such code
\item identifiers found in the Isabelle source code, with their types
(or specialisations of their types)
\item code examples, which can be run by the reader, to help illustrate the
behaviour of functions found in the Isabelle source code
\item ancillary functions, not from the Isabelle source code,
which enable the reader to run relevant code examples
\item type abbreviations, which help explain the uses of certain functions
\end{itemize}
*}
section {* Parsing Isar input *}
text {*
The typical parsing function has the type
\texttt{'src -> 'res * 'src}, with input
of type \texttt{'src}, returning a result
of type \texttt{'res}, which is (or is derived from) the first part of the
input, and also returning the remainder of the input.
(In the common case, when it is clear what the ``remainder of the input''
means, we will just say that the functions ``returns'' the
value of type \texttt{'res}).
An exception is raised if an appropriate value
cannot be produced from the input.
A range of exceptions can be used to identify different reasons
for the failure of a parse.
This contrasts the standard parsing function in Standard ML,
which is of type
\texttt{type ('res, 'src) reader = 'src -> ('res * 'src) option};
(for example, \texttt{List.getItem} and \texttt{Substring.getc}).
However, much of the discussion at
FIX file:/home/jeremy/html/ml/SMLBasis/string-cvt.html
is relevant.
Naturally one may convert between the two different sorts of parsing functions
as follows:
\begin{verbatim}
open StringCvt ;
type ('res, 'src) ex_reader = 'src -> 'res * 'src
(* ex_reader : ('res, 'src) reader -> ('res, 'src) ex_reader *)
fun ex_reader rdr src = Option.valOf (rdr src) ;
(* reader : ('res, 'src) ex_reader -> ('res, 'src) reader *)
fun reader exrdr src = SOME (exrdr src) handle _ => NONE ;
\end{verbatim}
*}
section{* The \texttt{Scan} structure *}
text {*
The source file is \texttt{src/General/scan.ML}.
This structure provides functions for using and combining parsing functions
of the type \texttt{'src -> 'res * 'src}.
Three exceptions are used:
\begin{verbatim}
exception MORE of string option; (*need more input (prompt)*)
exception FAIL of string option; (*try alternatives (reason of failure)*)
exception ABORT of string; (*dead end*)
\end{verbatim}
Many functions in this structure (generally those with names composed of
symbols) are declared as infix.
Some functions from that structure are
\begin{verbatim}
|-- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
'src -> 'res2 * 'src''
--| : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
'src -> 'res1 * 'src''
-- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
'src -> ('res1 * 'res2) * 'src''
^^ : ('src -> string * 'src') * ('src' -> string * 'src'') ->
'src -> string * 'src''
\end{verbatim}
These functions parse a result off the input source twice.
\texttt{|--} and \texttt{--|}
return the first result and the second result, respectively.
\texttt{--} returns both.
\verb|^^| returns the result of concatenating the two results
(which must be strings).
Note how, although the types
\texttt{'src}, \texttt{'src'} and \texttt{'src''} will normally be the same,
the types as shown help suggest the behaviour of the functions.
\begin{verbatim}
:-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
'src -> ('res1 * 'res2) * 'src''
:|-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
'src -> 'res2 * 'src''
\end{verbatim}
These are similar to \texttt{|--} and \texttt{--|},
except that the second parsing function can depend on the result of the first.
\begin{verbatim}
>> : ('src -> 'res1 * 'src') * ('res1 -> 'res2) -> 'src -> 'res2 * 'src'
|| : ('src -> 'res_src) * ('src -> 'res_src) -> 'src -> 'res_src
\end{verbatim}
\texttt{p >> f} applies a function \texttt{f} to the result of a parse.
\texttt{||} tries a second parsing function if the first one
fails by raising an exception of the form \texttt{FAIL \_}.
\begin{verbatim}
succeed : 'res -> ('src -> 'res * 'src) ;
fail : ('src -> 'res_src) ;
!! : ('src * string option -> string) ->
('src -> 'res_src) -> ('src -> 'res_src) ;
\end{verbatim}
\texttt{succeed r} returns \texttt{r}, with the input unchanged.
\texttt{fail} always fails, raising exception \texttt{FAIL NONE}.
\texttt{!! f} only affects the failure mode, turning a failure that
raises \texttt{FAIL \_} into a failure that raises \texttt{ABORT ...}.
This is used to prevent recovery from the failure ---
thus, in \texttt{!! parse1 || parse2}, if \texttt{parse1} fails,
it won't recover by trying \texttt{parse2}.
\begin{verbatim}
one : ('si -> bool) -> ('si list -> 'si * 'si list) ;
some : ('si -> 'res option) -> ('si list -> 'res * 'si list) ;
\end{verbatim}
These require the input to be a list of items:
they fail, raising \texttt{MORE NONE} if the list is empty.
On other failures they raise \texttt{FAIL NONE}
\texttt{one p} takes the first
item from the list if it satisfies \texttt{p}, otherwise fails.
\texttt{some f} takes the first
item from the list and applies \texttt{f} to it, failing if this returns
\texttt{NONE}.
\begin{verbatim}
many : ('si -> bool) -> 'si list -> 'si list * 'si list ;
\end{verbatim}
\texttt{many p} takes items from the input until it encounters one
which does not satisfy \texttt{p}. If it reaches the end of the input
it fails, raising \texttt{MORE NONE}.
\texttt{many1} (with the same type) fails if the first item
does not satisfy \texttt{p}.
\begin{verbatim}
option : ('src -> 'res * 'src) -> ('src -> 'res option * 'src)
optional : ('src -> 'res * 'src) -> 'res -> ('src -> 'res * 'src)
\end{verbatim}
\texttt{option}:
where the parser \texttt{f} succeeds with result \texttt{r}
or raises \texttt{FAIL \_},
\texttt{option f} gives the result \texttt{SOME r} or \texttt{NONE}.
\texttt{optional}: if parser \texttt{f} fails by raising \texttt{FAIL \_},
\texttt{optional f default} provides the result \texttt{default}.
\begin{verbatim}
repeat : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
repeat1 : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
bulk : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
\end{verbatim}
\texttt{repeat f} repeatedly parses an item off the remaining input until
\texttt{f} fails with \texttt{FAIL \_}
\texttt{repeat1} is as for \texttt{repeat}, but requires at least one
successful parse.
\begin{verbatim}
lift : ('src -> 'res * 'src) -> ('ex * 'src -> 'res * ('ex * 'src))
\end{verbatim}
\texttt{lift} changes the source type of a parser by putting in an extra
component \texttt{'ex}, which is ignored in the parsing.
The \texttt{Scan} structure also provides the type \texttt{lexicon},
HOW DO THEY WORK ?? TO BE COMPLETED
\begin{verbatim}
dest_lexicon: lexicon -> string list ;
make_lexicon: string list list -> lexicon ;
empty_lexicon: lexicon ;
extend_lexicon: string list list -> lexicon -> lexicon ;
merge_lexicons: lexicon -> lexicon -> lexicon ;
is_literal: lexicon -> string list -> bool ;
literal: lexicon -> string list -> string list * string list ;
\end{verbatim}
Two lexicons, for the commands and keywords, are stored and can be retrieved
by:
\begin{verbatim}
val (command_lexicon, keyword_lexicon) = OuterSyntax.get_lexicons () ;
val commands = Scan.dest_lexicon command_lexicon ;
val keywords = Scan.dest_lexicon keyword_lexicon ;
\end{verbatim}
*}
section{* The \texttt{OuterLex} structure *}
text {*
The source file is @{text "src/Pure/Isar/outer_lex.ML"}.
In some other source files its name is abbreviated:
\begin{verbatim}
structure T = OuterLex;
\end{verbatim}
This structure defines the type \texttt{token}.
(The types
\texttt{OuterLex.token},
\texttt{OuterParse.token} and
\texttt{SpecParse.token} are all the same).
Input text is split up into tokens, and the input source type for many parsing
functions is \texttt{token list}.
The datatype definition (which is not published in the signature) is
\begin{verbatim}
datatype token = Token of Position.T * (token_kind * string);
\end{verbatim}
but here are some runnable examples for viewing tokens:
*}
text {*
FIXME
@{text "
begin{verbatim}
type token = T.token ;
val toks : token list = OuterSyntax.scan ``theory,imports;begin x.y.z apply ?v1 ?'a 'a -- || 44 simp (* xx *) { * fff * }'' ;
print_depth 20 ;
List.map T.text_of toks ;
val proper_toks = List.filter T.is_proper toks ;
List.map T.kind_of proper_toks ;
List.map T.unparse proper_toks ;
List.map T.val_of proper_toks ;
end{verbatim}"}
*}
text {*
The function \texttt{is\_proper : token -> bool} identifies tokens which are
not white space or comments: many parsing functions assume require spaces or
comments to have been filtered out.
There is a special end-of-file token:
\begin{verbatim}
val (tok_eof : token, is_eof : token -> bool) = T.stopper ;
(* end of file token *)
\end{verbatim}
*}
section {* The \texttt{OuterParse} structure *}
text {*
The source file is \texttt{src/Pure/Isar/outer\_parse.ML}.
In some other source files its name is abbreviated:
\begin{verbatim}
structure P = OuterParse;
\end{verbatim}
Here the parsers use \texttt{token list} as the input source type.
Some of the parsers simply select the first token, provided that it is of the
right kind (as returned by \texttt{T.kind\_of}): these are
\texttt{ command, keyword, short\_ident, long\_ident, sym\_ident, term\_var,
type\_ident, type\_var, number, string, alt\_string, verbatim, sync, eof}
Others select the first token, provided that it is one of several kinds,
(eg, \texttt{name, xname, text, typ}).
\begin{verbatim}
type 'a tlp = token list -> 'a * token list ; (* token list parser *)
$$$ : string -> string tlp
nat : int tlp ;
maybe : 'a tlp -> 'a option tlp ;
\end{verbatim}
\texttt{\$\$\$ s} returns the first token,
if it equals \texttt{s} \emph{and} \texttt{s} is a keyword.
\texttt{nat} returns the first token, if it is a number, and evaluates it.
\texttt{maybe}: if \texttt{p} returns \texttt{r},
then \texttt{maybe p} returns \texttt{SOME r} ;
if the first token is an underscore, it returns \texttt{NONE}.
A few examples:
\begin{verbatim}
P.list : 'a tlp -> 'a list tlp ; (* likewise P.list1 *)
P.and_list : 'a tlp -> 'a list tlp ; (* likewise P.and_list1 *)
val toks : token list = OuterSyntax.scan "44 ,_, 66,77" ;
val proper_toks = List.filter T.is_proper toks ;
P.list P.nat toks ; (* OK, doesn't recognize white space *)
P.list P.nat proper_toks ; (* fails, doesn't recognize what follows ',' *)
P.list (P.maybe P.nat) proper_toks ; (* fails, end of input *)
P.list (P.maybe P.nat) (proper_toks @ [tok_eof]) ; (* OK *)
val toks : token list = OuterSyntax.scan "44 and 55 and 66 and 77" ;
P.and_list P.nat (List.filter T.is_proper toks @ [tok_eof]) ; (* ??? *)
\end{verbatim}
The following code helps run examples:
\begin{verbatim}
fun parse_str tlp str =
let val toks : token list = OuterSyntax.scan str ;
val proper_toks = List.filter T.is_proper toks @ [tok_eof] ;
val (res, rem_toks) = tlp proper_toks ;
val rem_str = String.concat
(Library.separate " " (List.map T.unparse rem_toks)) ;
in (res, rem_str) end ;
\end{verbatim}
Some examples from \texttt{src/Pure/Isar/outer\_parse.ML}
\begin{verbatim}
val type_args =
type_ident >> Library.single ||
$$$ "(" |-- !!! (list1 type_ident --| $$$ ")") ||
Scan.succeed [];
\end{verbatim}
There are three ways parsing a list of type arguments can succeed.
The first line reads a single type argument, and turns it into a singleton
list.
The second line reads "(", and then the remainder, ignoring the "(" ;
the remainder consists of a list of type identifiers (at least one),
and then a ")" which is also ignored.
The \texttt{!!!} ensures that if the parsing proceeds this far and then fails,
it won't try the third line (see the description of \texttt{Scan.!!}).
The third line consumes no input and returns the empty list.
\begin{verbatim}
fun triple2 (x, (y, z)) = (x, y, z);
val arity = xname -- ($$$ "::" |-- !!! (
Scan.optional ($$$ "(" |-- !!! (list1 sort --| $$$ ")")) []
-- sort)) >> triple2;
\end{verbatim}
The parser \texttt{arity} reads a typename $t$, then ``\texttt{::}'' (which is
ignored), then optionally a list $ss$ of sorts and then another sort $s$.
The result $(t, (ss, s))$ is transformed by \texttt{triple2} to $(t, ss, s)$.
The second line reads the optional list of sorts:
it reads first ``\texttt{(}'' and last ``\texttt{)}'', which are both ignored,
and between them a comma-separated list of sorts.
If this list is absent, the default \texttt{[]} provides the list of sorts.
\begin{verbatim}
parse_str P.type_args "('a, 'b) ntyp" ;
parse_str P.type_args "'a ntyp" ;
parse_str P.type_args "ntyp" ;
parse_str P.arity "ty :: tycl" ;
parse_str P.arity "ty :: (tycl1, tycl2) tycl" ;
\end{verbatim}
*}
section {* The \texttt{SpecParse} structure *}
text {*
The source file is \texttt{src/Pure/Isar/spec\_parse.ML}.
This structure contains token list parsers for more complicated values.
For example,
\begin{verbatim}
open SpecParse ;
attrib : Attrib.src tok_rdr ;
attribs : Attrib.src list tok_rdr ;
opt_attribs : Attrib.src list tok_rdr ;
xthm : (thmref * Attrib.src list) tok_rdr ;
xthms1 : (thmref * Attrib.src list) list tok_rdr ;
parse_str attrib "simp" ;
parse_str opt_attribs "hello" ;
val (ass, "") = parse_str attribs "[standard, xxxx, simp, intro, OF sym]" ;
map Args.dest_src ass ;
val (asrc, "") = parse_str attrib "THEN trans [THEN sym]" ;
parse_str xthm "mythm [attr]" ;
parse_str xthms1 "thm1 [attr] thms2" ;
\end{verbatim}
As you can see, attributes are described using types of the \texttt{Args}
structure, described below.
*}
section{* The \texttt{Args} structure *}
text {*
The source file is \texttt{src/Pure/Isar/args.ML}.
The primary type of this structure is the \texttt{src} datatype;
the single constructors not published in the signature, but
\texttt{Args.src} and \texttt{Args.dest\_src}
are in fact the constructor and destructor functions.
Note that the types \texttt{Attrib.src} and \texttt{Method.src}
are in fact \texttt{Args.src}.
\begin{verbatim}
src : (string * Args.T list) * Position.T -> Args.src ;
dest_src : Args.src -> (string * Args.T list) * Position.T ;
Args.pretty_src : Proof.context -> Args.src -> Pretty.T ;
fun pr_src ctxt src = Pretty.string_of (Args.pretty_src ctxt src) ;
val thy = ML_Context.the_context () ;
val ctxt = ProofContext.init thy ;
map (pr_src ctxt) ass ;
\end{verbatim}
So an \texttt{Args.src} consists of the first word, then a list of further
``arguments'', of type \texttt{Args.T}, with information about position in the
input.
\begin{verbatim}
(* how an Args.src is parsed *)
P.position : 'a tlp -> ('a * Position.T) tlp ;
P.arguments : Args.T list tlp ;
val parse_src : Args.src tlp =
P.position (P.xname -- P.arguments) >> Args.src ;
\end{verbatim}
\begin{verbatim}
val ((first_word, args), pos) = Args.dest_src asrc ;
map Args.string_of args ;
\end{verbatim}
The \texttt{Args} structure contains more parsers and parser transformers
for which the input source type is \texttt{Args.T list}. For example,
\begin{verbatim}
type 'a atlp = Args.T list -> 'a * Args.T list ;
open Args ;
nat : int atlp ; (* also Args.int *)
thm_sel : PureThy.interval list atlp ;
list : 'a atlp -> 'a list atlp ;
attribs : (string -> string) -> Args.src list atlp ;
opt_attribs : (string -> string) -> Args.src list atlp ;
(* parse_atl_str : 'a atlp -> (string -> 'a * string) ;
given an Args.T list parser, to get a string parser *)
fun parse_atl_str atlp str =
let val (ats, rem_str) = parse_str P.arguments str ;
val (res, rem_ats) = atlp ats ;
in (res, String.concat (Library.separate " "
(List.map Args.string_of rem_ats @ [rem_str]))) end ;
parse_atl_str Args.int "-1-," ;
parse_atl_str (Scan.option Args.int) "x1-," ;
parse_atl_str Args.thm_sel "(1-,4,13-22)" ;
val (ats as atsrc :: _, "") = parse_atl_str (Args.attribs I)
"[THEN trans [THEN sym], simp, OF sym]" ;
\end{verbatim}
From here, an attribute is interpreted using \texttt{Attrib.attribute}.
\texttt{Args} has a large number of functions which parse an \texttt{Args.src}
and also refer to a generic context.
Note the use of \texttt{Scan.lift} for this.
(as does \texttt{Attrib} - RETHINK THIS)
(\texttt{Args.syntax} shown below has type specialised)
\begin{verbatim}
type ('res, 'src) parse_fn = 'src -> 'res * 'src ;
type 'a cgatlp = ('a, Context.generic * Args.T list) parse_fn ;
Scan.lift : 'a atlp -> 'a cgatlp ;
term : term cgatlp ;
typ : typ cgatlp ;
Args.syntax : string -> 'res cgatlp -> src -> ('res, Context.generic) parse_fn ;
Attrib.thm : thm cgatlp ;
Attrib.thms : thm list cgatlp ;
Attrib.multi_thm : thm list cgatlp ;
(* parse_cgatl_str : 'a cgatlp -> (string -> 'a * string) ;
given a (Context.generic * Args.T list) parser, to get a string parser *)
fun parse_cgatl_str cgatlp str =
let
(* use the current generic context *)
val generic = Context.Theory thy ;
val (ats, rem_str) = parse_str P.arguments str ;
(* ignore any change to the generic context *)
val (res, (_, rem_ats)) = cgatlp (generic, ats) ;
in (res, String.concat (Library.separate " "
(List.map Args.string_of rem_ats @ [rem_str]))) end ;
\end{verbatim}
*}
section{* Attributes, and the \texttt{Attrib} structure *}
text {*
The type \texttt{attribute} is declared in \texttt{src/Pure/thm.ML}.
The source file for the \texttt{Attrib} structure is
\texttt{src/Pure/Isar/attrib.ML}.
Most attributes use a theorem to change a generic context (for example,
by declaring that the theorem should be used, by default, in simplification),
or change a theorem (which most often involves referring to the current
theory).
The functions \texttt{Thm.rule\_attribute} and
\texttt{Thm.declaration\_attribute} create attributes of these kinds.
\begin{verbatim}
type attribute = Context.generic * thm -> Context.generic * thm;
type 'a trf = 'a -> 'a ; (* transformer of a given type *)
Thm.rule_attribute : (Context.generic -> thm -> thm) -> attribute ;
Thm.declaration_attribute : (thm -> Context.generic trf) -> attribute ;
Attrib.print_attributes : theory -> unit ;
Attrib.pretty_attribs : Proof.context -> src list -> Pretty.T list ;
List.app Pretty.writeln (Attrib.pretty_attribs ctxt ass) ;
\end{verbatim}
An attribute is stored in a theory as indicated by:
\begin{verbatim}
Attrib.add_attributes :
(bstring * (src -> attribute) * string) list -> theory trf ;
(*
Attrib.add_attributes [("THEN", THEN_att, "resolution with rule")] ;
*)
\end{verbatim}
where the first and third arguments are name and description of the attribute,
and the second is a function which parses the attribute input text
(including the attribute name, which has necessarily already been parsed).
Here, \texttt{THEN\_att} is a function declared in the code for the
structure \texttt{Attrib}, but not published in its signature.
The source file \texttt{src/Pure/Isar/attrib.ML} shows the use of
\texttt{Attrib.add\_attributes} to add a number of attributes.
\begin{verbatim}
FullAttrib.THEN_att : src -> attribute ;
FullAttrib.THEN_att atsrc (generic, ML_Context.thm "sym") ;
FullAttrib.THEN_att atsrc (generic, ML_Context.thm "all_comm") ;
\end{verbatim}
\begin{verbatim}
Attrib.syntax : attribute cgatlp -> src -> attribute ;
Attrib.no_args : attribute -> src -> attribute ;
\end{verbatim}
When this is called as \texttt{syntax scan src (gc, th)}
the generic context \texttt{gc} is used
(and potentially changed to \texttt{gc'})
by \texttt{scan} in parsing to obtain an attribute \texttt{attr} which would
then be applied to \texttt{(gc', th)}.
The source for parsing the attribute is the arguments part of \texttt{src},
which must all be consumed by the parse.
For example, for \texttt{Attrib.no\_args attr src}, the attribute parser
simply returns \texttt{attr}, requiring that the arguments part of
\texttt{src} must be empty.
Some examples from \texttt{src/Pure/Isar/attrib.ML}, modified:
\begin{verbatim}
fun rot_att_n n (gc, th) = (gc, rotate_prems n th) ;
rot_att_n : int -> attribute ;
val rot_arg = Scan.lift (Scan.optional Args.int 1 : int atlp) : int cgatlp ;
val rotated_att : src -> attribute =
Attrib.syntax (rot_arg >> rot_att_n : attribute cgatlp) ;
val THEN_arg : int cgatlp = Scan.lift
(Scan.optional (Args.bracks Args.nat : int atlp) 1 : int atlp) ;
Attrib.thm : thm cgatlp ;
THEN_arg -- Attrib.thm : (int * thm) cgatlp ;
fun THEN_att_n (n, tht) (gc, th) = (gc, th RSN (n, tht)) ;
THEN_att_n : int * thm -> attribute ;
val THEN_att : src -> attribute = Attrib.syntax
(THEN_arg -- Attrib.thm >> THEN_att_n : attribute cgatlp);
\end{verbatim}
The functions I've called \texttt{rot\_arg} and \texttt{THEN\_arg}
read an optional argument, which for \texttt{rotated} is an integer,
and for \texttt{THEN} is a natural enclosed in square brackets;
the default, if the argument is absent, is 1 in each case.
Functions \texttt{rot\_att\_n} and \texttt{THEN\_att\_n} turn these into
attributes, where \texttt{THEN\_att\_n} also requires a theorem, which is
parsed by \texttt{Attrib.thm}.
Infix operators \texttt{--} and \texttt{>>} are in the structure \texttt{Scan}.
*}
section{* Methods, and the \texttt{Method} structure *}
text {*
The source file is \texttt{src/Pure/Isar/method.ML}.
The type \texttt{method} is defined by the datatype declaration
\begin{verbatim}
(* datatype method = Meth of thm list -> cases_tactic; *)
RuleCases.NO_CASES : tactic -> cases_tactic ;
\end{verbatim}
In fact \texttt{RAW\_METHOD\_CASES} (below) is exactly the constructor
\texttt{Meth}.
A \texttt{cases\_tactic} is an elaborated version of a tactic.
\texttt{NO\_CASES tac} is a \texttt{cases\_tactic} which consists of a
\texttt{cases\_tactic} without any further case information.
For further details see the description of structure \texttt{RuleCases} below.
The list of theorems to be passed to a method consists of the current
\emph{facts} in the proof.
\begin{verbatim}
RAW_METHOD : (thm list -> tactic) -> method ;
METHOD : (thm list -> tactic) -> method ;
SIMPLE_METHOD : tactic -> method ;
SIMPLE_METHOD' : (int -> tactic) -> method ;
SIMPLE_METHOD'' : ((int -> tactic) -> tactic) -> (int -> tactic) -> method ;
RAW_METHOD_CASES : (thm list -> cases_tactic) -> method ;
METHOD_CASES : (thm list -> cases_tactic) -> method ;
\end{verbatim}
A method is, in its simplest form, a tactic; applying the method is to apply
the tactic to the current goal state.
Applying \texttt{RAW\_METHOD tacf} creates a tactic by applying
\texttt{tacf} to the current {facts}, and applying that tactic to the
goal state.
\texttt{METHOD} is similar but also first applies
\texttt{Goal.conjunction\_tac} to all subgoals.
\texttt{SIMPLE\_METHOD tac} inserts the facts into all subgoals and then
applies \texttt{tacf}.
\texttt{SIMPLE\_METHOD' tacf} inserts the facts and then
applies \texttt{tacf} to subgoal 1.
\texttt{SIMPLE\_METHOD'' quant tacf} does this for subgoal(s) selected by
\texttt{quant}, which may be, for example,
\texttt{ALLGOALS} (all subgoals),
\texttt{TRYALL} (try all subgoals, failure is OK),
\texttt{FIRSTGOAL} (try subgoals until it succeeds once),
\texttt{(fn tacf => tacf 4)} (subgoal 4), etc
(see the \texttt{Tactical} structure, FIXME) %%\cite[Chapter 4]{ref}).
A method is stored in a theory as indicated by:
\begin{verbatim}
Method.add_method :
(bstring * (src -> Proof.context -> method) * string) -> theory trf ;
( *
* )
\end{verbatim}
where the first and third arguments are name and description of the method,
and the second is a function which parses the method input text
(including the method name, which has necessarily already been parsed).
Here, \texttt{xxx} is a function declared in the code for the
structure \texttt{Method}, but not published in its signature.
The source file \texttt{src/Pure/Isar/method.ML} shows the use of
\texttt{Method.add\_method} to add a number of methods.
*}
end