CookBook/Parsing.thy
changeset 4 2a69b119cdee
child 16 5045dec52d2b
equal deleted inserted replaced
3:cd861a121f60 4:2a69b119cdee
       
     1 theory Parsing
       
     2 imports Main
       
     3 
       
     4 begin
       
     5 
       
     6 chapter {* Parsing *}
       
     7 
       
     8 text {*
       
     9 
       
    10   Lots of Standard ML code is given in this document, for various reasons,
       
    11   including:
       
    12   \begin{itemize}
       
    13   \item direct quotation of code found in the Isabelle source files,
       
    14   or simplified versions of such code
       
    15   \item identifiers found in the Isabelle source code, with their types 
       
    16   (or specialisations of their types)
       
    17   \item code examples, which can be run by the reader, to help illustrate the
       
    18   behaviour of functions found in the Isabelle source code
       
    19   \item ancillary functions, not from the Isabelle source code, 
       
    20   which enable the reader to run relevant code examples
       
    21   \item type abbreviations, which help explain the uses of certain functions
       
    22   \end{itemize}
       
    23 
       
    24 *}
       
    25 
       
    26 section {* Parsing Isar input *}
       
    27 
       
    28 text {*
       
    29 
       
    30   The typical parsing function has the type
       
    31   \texttt{'src -> 'res * 'src}, with input  
       
    32   of type \texttt{'src}, returning a result 
       
    33   of type \texttt{'res}, which is (or is derived from) the first part of the
       
    34   input, and also returning the remainder of the input.
       
    35   (In the common case, when it is clear what the ``remainder of the input''
       
    36   means, we will just say that the functions ``returns'' the
       
    37   value of type \texttt{'res}). 
       
    38   An exception is raised if an appropriate value 
       
    39   cannot be produced from the input.
       
    40   A range of exceptions can be used to identify different reasons 
       
    41   for the failure of a parse.
       
    42   
       
    43   This contrasts the standard parsing function in Standard ML,
       
    44   which is of type 
       
    45   \texttt{type ('res, 'src) reader = 'src -> ('res * 'src) option};
       
    46   (for example, \texttt{List.getItem} and \texttt{Substring.getc}).
       
    47   However, much of the discussion at 
       
    48   FIX file:/home/jeremy/html/ml/SMLBasis/string-cvt.html
       
    49   is relevant.
       
    50 
       
    51   Naturally one may convert between the two different sorts of parsing functions
       
    52   as follows:
       
    53   \begin{verbatim}
       
    54   open StringCvt ;
       
    55   type ('res, 'src) ex_reader = 'src -> 'res * 'src
       
    56   (* ex_reader : ('res, 'src) reader -> ('res, 'src) ex_reader *)
       
    57   fun ex_reader rdr src = Option.valOf (rdr src) ;
       
    58   (* reader : ('res, 'src) ex_reader -> ('res, 'src) reader *)
       
    59   fun reader exrdr src = SOME (exrdr src) handle _ => NONE ;
       
    60   \end{verbatim}
       
    61   
       
    62 *}
       
    63 
       
    64 section{* The \texttt{Scan} structure *}
       
    65 
       
    66 text {* 
       
    67   The source file is \texttt{src/General/scan.ML}.
       
    68   This structure provides functions for using and combining parsing functions
       
    69   of the type \texttt{'src -> 'res * 'src}.
       
    70   Three exceptions are used:
       
    71   \begin{verbatim}
       
    72   exception MORE of string option;  (*need more input (prompt)*)
       
    73   exception FAIL of string option;  (*try alternatives (reason of failure)*)
       
    74   exception ABORT of string;        (*dead end*)
       
    75   \end{verbatim}
       
    76   Many functions in this structure (generally those with names composed of
       
    77   symbols) are declared as infix.
       
    78 
       
    79   Some functions from that structure are
       
    80   \begin{verbatim}
       
    81   |-- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
       
    82   'src -> 'res2 * 'src''
       
    83   --| : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
       
    84   'src -> 'res1 * 'src''
       
    85   -- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
       
    86   'src -> ('res1 * 'res2) * 'src''
       
    87   ^^ : ('src -> string * 'src') * ('src' -> string * 'src'') ->
       
    88   'src -> string * 'src''
       
    89   \end{verbatim}
       
    90   These functions parse a result off the input source twice.
       
    91 
       
    92   \texttt{|--} and \texttt{--|} 
       
    93   return the first result and the second result, respectively.
       
    94 
       
    95   \texttt{--} returns both.
       
    96 
       
    97   \verb|^^| returns the result of concatenating the two results
       
    98   (which must be strings).
       
    99 
       
   100   Note how, although the types 
       
   101   \texttt{'src}, \texttt{'src'} and \texttt{'src''} will normally be the same,
       
   102   the types as shown help suggest the behaviour of the functions.
       
   103   \begin{verbatim}
       
   104   :-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
       
   105   'src -> ('res1 * 'res2) * 'src''
       
   106   :|-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
       
   107   'src -> 'res2 * 'src''
       
   108   \end{verbatim}
       
   109   These are similar to \texttt{|--} and \texttt{--|},
       
   110   except that the second parsing function can depend on the result of the first.
       
   111   \begin{verbatim}
       
   112   >> : ('src -> 'res1 * 'src') * ('res1 -> 'res2) -> 'src -> 'res2 * 'src'
       
   113   || : ('src -> 'res_src) * ('src -> 'res_src) -> 'src -> 'res_src
       
   114   \end{verbatim}
       
   115   \texttt{p >> f} applies a function \texttt{f} to the result of a parse.
       
   116   
       
   117   \texttt{||} tries a second parsing function if the first one
       
   118   fails by raising an exception of the form \texttt{FAIL \_}.
       
   119   
       
   120   \begin{verbatim}
       
   121   succeed : 'res -> ('src -> 'res * 'src) ;
       
   122   fail : ('src -> 'res_src) ;
       
   123   !! : ('src * string option -> string) -> 
       
   124   ('src -> 'res_src) -> ('src -> 'res_src) ;
       
   125   \end{verbatim}
       
   126   \texttt{succeed r} returns \texttt{r}, with the input unchanged.
       
   127   \texttt{fail} always fails, raising exception \texttt{FAIL NONE}.
       
   128   \texttt{!! f} only affects the failure mode, turning a failure that
       
   129   raises \texttt{FAIL \_} into a failure that raises \texttt{ABORT ...}.
       
   130   This is used to prevent recovery from the failure ---
       
   131   thus, in \texttt{!! parse1 || parse2}, if \texttt{parse1} fails, 
       
   132   it won't recover by trying \texttt{parse2}.
       
   133 
       
   134   \begin{verbatim}
       
   135   one : ('si -> bool) -> ('si list -> 'si * 'si list) ;
       
   136   some : ('si -> 'res option) -> ('si list -> 'res * 'si list) ;
       
   137   \end{verbatim}
       
   138   These require the input to be a list of items:
       
   139   they fail, raising \texttt{MORE NONE} if the list is empty.
       
   140   On other failures they raise \texttt{FAIL NONE} 
       
   141 
       
   142   \texttt{one p} takes the first
       
   143   item from the list if it satisfies \texttt{p}, otherwise fails.
       
   144 
       
   145   \texttt{some f} takes the first
       
   146   item from the list and applies \texttt{f} to it, failing if this returns
       
   147   \texttt{NONE}.  
       
   148 
       
   149   \begin{verbatim}
       
   150   many : ('si -> bool) -> 'si list -> 'si list * 'si list ; 
       
   151   \end{verbatim}
       
   152   \texttt{many p} takes items from the input until it encounters one 
       
   153   which does not satisfy \texttt{p}.  If it reaches the end of the input
       
   154   it fails, raising \texttt{MORE NONE}.
       
   155 
       
   156   \texttt{many1} (with the same type) fails if the first item 
       
   157   does not satisfy \texttt{p}.  
       
   158 
       
   159   \begin{verbatim}
       
   160   option : ('src -> 'res * 'src) -> ('src -> 'res option * 'src)
       
   161   optional : ('src -> 'res * 'src) -> 'res -> ('src -> 'res * 'src)
       
   162   \end{verbatim}
       
   163   \texttt{option}: 
       
   164   where the parser \texttt{f} succeeds with result \texttt{r} 
       
   165   or raises \texttt{FAIL \_},
       
   166   \texttt{option f} gives the result \texttt{SOME r} or \texttt{NONE}.
       
   167   
       
   168   \texttt{optional}: if parser \texttt{f} fails by raising \texttt{FAIL \_},
       
   169   \texttt{optional f default} provides the result \texttt{default}.
       
   170 
       
   171   \begin{verbatim}
       
   172   repeat : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
       
   173   repeat1 : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
       
   174   bulk : ('src -> 'res * 'src) -> 'src -> 'res list * 'src 
       
   175   \end{verbatim}
       
   176   \texttt{repeat f} repeatedly parses an item off the remaining input until 
       
   177   \texttt{f} fails with \texttt{FAIL \_}
       
   178 
       
   179   \texttt{repeat1} is as for \texttt{repeat}, but requires at least one
       
   180   successful parse.
       
   181 
       
   182   \begin{verbatim}
       
   183   lift : ('src -> 'res * 'src) -> ('ex * 'src -> 'res * ('ex * 'src))
       
   184   \end{verbatim}
       
   185   \texttt{lift} changes the source type of a parser by putting in an extra
       
   186   component \texttt{'ex}, which is ignored in the parsing.
       
   187 
       
   188   The \texttt{Scan} structure also provides the type \texttt{lexicon}, 
       
   189   HOW DO THEY WORK ?? TO BE COMPLETED
       
   190   \begin{verbatim}
       
   191   dest_lexicon: lexicon -> string list ;
       
   192   make_lexicon: string list list -> lexicon ;
       
   193   empty_lexicon: lexicon ;
       
   194   extend_lexicon: string list list -> lexicon -> lexicon ;
       
   195   merge_lexicons: lexicon -> lexicon -> lexicon ;
       
   196   is_literal: lexicon -> string list -> bool ;
       
   197   literal: lexicon -> string list -> string list * string list ;
       
   198   \end{verbatim}
       
   199   Two lexicons, for the commands and keywords, are stored and can be retrieved
       
   200   by:
       
   201   \begin{verbatim}
       
   202   val (command_lexicon, keyword_lexicon) = OuterSyntax.get_lexicons () ;
       
   203   val commands = Scan.dest_lexicon command_lexicon ;
       
   204   val keywords = Scan.dest_lexicon keyword_lexicon ;
       
   205   \end{verbatim}
       
   206 *}
       
   207 
       
   208 section{* The \texttt{OuterLex} structure *}
       
   209 
       
   210 text {*
       
   211   The source file is @{text "src/Pure/Isar/outer_lex.ML"}.
       
   212   In some other source files its name is abbreviated:
       
   213   \begin{verbatim}
       
   214   structure T = OuterLex;
       
   215   \end{verbatim}
       
   216   This structure defines the type \texttt{token}.
       
   217   (The types
       
   218   \texttt{OuterLex.token},
       
   219   \texttt{OuterParse.token} and
       
   220   \texttt{SpecParse.token} are all the same).
       
   221   
       
   222   Input text is split up into tokens, and the input source type for many parsing
       
   223   functions is \texttt{token list}.
       
   224 
       
   225   The datatype definition (which is not published in the signature) is
       
   226   \begin{verbatim}
       
   227   datatype token = Token of Position.T * (token_kind * string);
       
   228   \end{verbatim}
       
   229   but here are some runnable examples for viewing tokens: 
       
   230 
       
   231 *}
       
   232 
       
   233 text {*
       
   234   FIXME
       
   235 
       
   236   @{text "
       
   237   begin{verbatim}
       
   238   type token = T.token ;
       
   239   val toks : token list = OuterSyntax.scan ``theory,imports;begin x.y.z apply ?v1 ?'a 'a -- || 44 simp (* xx *) { * fff * }'' ;
       
   240   print_depth 20 ;
       
   241   List.map T.text_of toks ;
       
   242   val proper_toks = List.filter T.is_proper toks ;
       
   243   List.map T.kind_of proper_toks ;
       
   244   List.map T.unparse proper_toks ;
       
   245   List.map T.val_of proper_toks ;
       
   246   end{verbatim}"}
       
   247 
       
   248 *}
       
   249 
       
   250 text {*
       
   251 
       
   252   The function \texttt{is\_proper : token -> bool} identifies tokens which are
       
   253   not white space or comments: many parsing functions assume require spaces or
       
   254   comments to have been filtered out.
       
   255   
       
   256   There is a special end-of-file token:
       
   257   \begin{verbatim}
       
   258   val (tok_eof : token, is_eof : token -> bool) = T.stopper ; 
       
   259   (* end of file token *)
       
   260   \end{verbatim}
       
   261 
       
   262 *}
       
   263 
       
   264 section {* The \texttt{OuterParse} structure *}
       
   265 
       
   266 text {*
       
   267   The source file is \texttt{src/Pure/Isar/outer\_parse.ML}.
       
   268   In some other source files its name is abbreviated:
       
   269   \begin{verbatim}
       
   270   structure P = OuterParse;
       
   271   \end{verbatim}
       
   272   Here the parsers use \texttt{token list} as the input source type. 
       
   273   
       
   274   Some of the parsers simply select the first token, provided that it is of the
       
   275   right kind (as returned by \texttt{T.kind\_of}): these are 
       
   276   \texttt{ command, keyword, short\_ident, long\_ident, sym\_ident, term\_var,
       
   277   type\_ident, type\_var, number, string, alt\_string, verbatim, sync, eof}
       
   278   Others select the first token, provided that it is one of several kinds,
       
   279   (eg, \texttt{name, xname, text, typ}).
       
   280 
       
   281   \begin{verbatim}
       
   282   type 'a tlp = token list -> 'a * token list ; (* token list parser *)
       
   283   $$$ : string -> string tlp
       
   284   nat : int tlp ;
       
   285   maybe : 'a tlp -> 'a option tlp ;
       
   286   \end{verbatim}
       
   287 
       
   288   \texttt{\$\$\$ s} returns the first token,
       
   289   if it equals \texttt{s} \emph{and} \texttt{s} is a keyword.
       
   290 
       
   291   \texttt{nat} returns the first token, if it is a number, and evaluates it.
       
   292 
       
   293   \texttt{maybe}: if \texttt{p} returns \texttt{r}, 
       
   294   then \texttt{maybe p} returns \texttt{SOME r} ;
       
   295   if the first token is an underscore, it returns \texttt{NONE}.
       
   296 
       
   297   A few examples:
       
   298   \begin{verbatim}
       
   299   P.list : 'a tlp -> 'a list tlp ; (* likewise P.list1 *)
       
   300   P.and_list : 'a tlp -> 'a list tlp ; (* likewise P.and_list1 *)
       
   301   val toks : token list = OuterSyntax.scan "44 ,_, 66,77" ;
       
   302   val proper_toks = List.filter T.is_proper toks ;
       
   303   P.list P.nat toks ; (* OK, doesn't recognize white space *)
       
   304   P.list P.nat proper_toks ; (* fails, doesn't recognize what follows ',' *)
       
   305   P.list (P.maybe P.nat) proper_toks ; (* fails, end of input *)
       
   306   P.list (P.maybe P.nat) (proper_toks @ [tok_eof]) ; (* OK *)
       
   307   val toks : token list = OuterSyntax.scan "44 and 55 and 66 and 77" ;
       
   308   P.and_list P.nat (List.filter T.is_proper toks @ [tok_eof]) ; (* ??? *)
       
   309   \end{verbatim}
       
   310 
       
   311   The following code helps run examples:
       
   312   \begin{verbatim}
       
   313   fun parse_str tlp str = 
       
   314   let val toks : token list = OuterSyntax.scan str ;
       
   315   val proper_toks = List.filter T.is_proper toks @ [tok_eof] ;
       
   316   val (res, rem_toks) = tlp proper_toks ;
       
   317   val rem_str = String.concat
       
   318   (Library.separate " " (List.map T.unparse rem_toks)) ;
       
   319   in (res, rem_str) end ;
       
   320   \end{verbatim}
       
   321 
       
   322   Some examples from \texttt{src/Pure/Isar/outer\_parse.ML}
       
   323   \begin{verbatim}
       
   324   val type_args =
       
   325   type_ident >> Library.single ||
       
   326   $$$ "(" |-- !!! (list1 type_ident --| $$$ ")") ||
       
   327   Scan.succeed [];
       
   328   \end{verbatim}
       
   329   There are three ways parsing a list of type arguments can succeed.
       
   330   The first line reads a single type argument, and turns it into a singleton
       
   331   list.
       
   332   The second line reads "(", and then the remainder, ignoring the "(" ;
       
   333   the remainder consists of a list of type identifiers (at least one),
       
   334   and then a ")" which is also ignored.
       
   335   The \texttt{!!!} ensures that if the parsing proceeds this far and then fails,
       
   336   it won't try the third line (see the description of \texttt{Scan.!!}).
       
   337   The third line consumes no input and returns the empty list.
       
   338 
       
   339   \begin{verbatim}
       
   340   fun triple2 (x, (y, z)) = (x, y, z);
       
   341   val arity = xname -- ($$$ "::" |-- !!! (
       
   342   Scan.optional ($$$ "(" |-- !!! (list1 sort --| $$$ ")")) []
       
   343   -- sort)) >> triple2;
       
   344   \end{verbatim}
       
   345   The parser \texttt{arity} reads a typename $t$, then ``\texttt{::}'' (which is
       
   346   ignored), then optionally a list $ss$ of sorts and then another sort $s$.
       
   347   The result $(t, (ss, s))$ is transformed by \texttt{triple2} to $(t, ss, s)$.
       
   348   The second line reads the optional list of sorts:
       
   349   it reads first ``\texttt{(}'' and last ``\texttt{)}'', which are both ignored,
       
   350   and between them a comma-separated list of sorts.
       
   351   If this list is absent, the default \texttt{[]} provides the list of sorts.
       
   352 
       
   353   \begin{verbatim}
       
   354   parse_str P.type_args "('a, 'b) ntyp" ;
       
   355   parse_str P.type_args "'a ntyp" ;
       
   356   parse_str P.type_args "ntyp" ;
       
   357   parse_str P.arity "ty :: tycl" ;
       
   358   parse_str P.arity "ty :: (tycl1, tycl2) tycl" ;
       
   359   \end{verbatim}
       
   360 
       
   361 *}
       
   362 
       
   363 section {* The \texttt{SpecParse} structure *}
       
   364 
       
   365 text {*
       
   366   The source file is \texttt{src/Pure/Isar/spec\_parse.ML}.
       
   367   This structure contains token list parsers for more complicated values.
       
   368   For example, 
       
   369   \begin{verbatim}
       
   370   open SpecParse ;
       
   371   attrib : Attrib.src tok_rdr ; 
       
   372   attribs : Attrib.src list tok_rdr ;
       
   373   opt_attribs : Attrib.src list tok_rdr ;
       
   374   xthm : (thmref * Attrib.src list) tok_rdr ;
       
   375   xthms1 : (thmref * Attrib.src list) list tok_rdr ;
       
   376   
       
   377   parse_str attrib "simp" ;
       
   378   parse_str opt_attribs "hello" ;
       
   379   val (ass, "") = parse_str attribs "[standard, xxxx, simp, intro, OF sym]" ;
       
   380   map Args.dest_src ass ;
       
   381   val (asrc, "") = parse_str attrib "THEN trans [THEN sym]" ;
       
   382   
       
   383   parse_str xthm "mythm [attr]" ;
       
   384   parse_str xthms1 "thm1 [attr] thms2" ;
       
   385   \end{verbatim}
       
   386   
       
   387   As you can see, attributes are described using types of the \texttt{Args}
       
   388   structure, described below.
       
   389 *}
       
   390 
       
   391 section{* The \texttt{Args} structure *}
       
   392 
       
   393 text {*
       
   394   The source file is \texttt{src/Pure/Isar/args.ML}.
       
   395   The primary type of this structure is the \texttt{src} datatype;
       
   396   the single constructors not published in the signature, but 
       
   397   \texttt{Args.src} and \texttt{Args.dest\_src}
       
   398   are in fact the constructor and destructor functions.
       
   399   Note that the types \texttt{Attrib.src} and \texttt{Method.src}
       
   400   are in fact \texttt{Args.src}.
       
   401 
       
   402   \begin{verbatim}
       
   403   src : (string * Args.T list) * Position.T -> Args.src ;
       
   404   dest_src : Args.src -> (string * Args.T list) * Position.T ;
       
   405   Args.pretty_src : Proof.context -> Args.src -> Pretty.T ;
       
   406   fun pr_src ctxt src = Pretty.string_of (Args.pretty_src ctxt src) ;
       
   407 
       
   408   val thy = ML_Context.the_context () ;
       
   409   val ctxt = ProofContext.init thy ;
       
   410   map (pr_src ctxt) ass ;
       
   411   \end{verbatim}
       
   412 
       
   413   So an \texttt{Args.src} consists of the first word, then a list of further 
       
   414   ``arguments'', of type \texttt{Args.T}, with information about position in the
       
   415   input.
       
   416   \begin{verbatim}
       
   417   (* how an Args.src is parsed *)
       
   418   P.position : 'a tlp -> ('a * Position.T) tlp ;
       
   419   P.arguments : Args.T list tlp ;
       
   420 
       
   421   val parse_src : Args.src tlp =
       
   422   P.position (P.xname -- P.arguments) >> Args.src ;
       
   423   \end{verbatim}
       
   424 
       
   425   \begin{verbatim}
       
   426   val ((first_word, args), pos) = Args.dest_src asrc ;
       
   427   map Args.string_of args ;
       
   428   \end{verbatim}
       
   429 
       
   430   The \texttt{Args} structure contains more parsers and parser transformers 
       
   431   for which the input source type is \texttt{Args.T list}.  For example,
       
   432   \begin{verbatim}
       
   433   type 'a atlp = Args.T list -> 'a * Args.T list ;
       
   434   open Args ;
       
   435   nat : int atlp ; (* also Args.int *)
       
   436   thm_sel : PureThy.interval list atlp ;
       
   437   list : 'a atlp -> 'a list atlp ;
       
   438   attribs : (string -> string) -> Args.src list atlp ;
       
   439   opt_attribs : (string -> string) -> Args.src list atlp ;
       
   440   
       
   441   (* parse_atl_str : 'a atlp -> (string -> 'a * string) ;
       
   442   given an Args.T list parser, to get a string parser *)
       
   443   fun parse_atl_str atlp str = 
       
   444   let val (ats, rem_str) = parse_str P.arguments str ;
       
   445   val (res, rem_ats) = atlp ats ;
       
   446   in (res, String.concat (Library.separate " "
       
   447   (List.map Args.string_of rem_ats @ [rem_str]))) end ;
       
   448 
       
   449   parse_atl_str Args.int "-1-," ;
       
   450   parse_atl_str (Scan.option Args.int) "x1-," ;
       
   451   parse_atl_str Args.thm_sel "(1-,4,13-22)" ;
       
   452 
       
   453   val (ats as atsrc :: _, "") = parse_atl_str (Args.attribs I)
       
   454   "[THEN trans [THEN sym], simp, OF sym]" ;
       
   455   \end{verbatim}
       
   456 
       
   457   From here, an attribute is interpreted using \texttt{Attrib.attribute}.
       
   458 
       
   459   \texttt{Args} has a large number of functions which parse an \texttt{Args.src}
       
   460   and also refer to a generic context.  
       
   461   Note the use of \texttt{Scan.lift} for this.
       
   462   (as does \texttt{Attrib} - RETHINK THIS)
       
   463   
       
   464   (\texttt{Args.syntax} shown below has type specialised)
       
   465 
       
   466   \begin{verbatim}
       
   467   type ('res, 'src) parse_fn = 'src -> 'res * 'src ;
       
   468   type 'a cgatlp = ('a, Context.generic * Args.T list) parse_fn ;
       
   469   Scan.lift : 'a atlp -> 'a cgatlp ;
       
   470   term : term cgatlp ;
       
   471   typ : typ cgatlp ;
       
   472   
       
   473   Args.syntax : string -> 'res cgatlp -> src -> ('res, Context.generic) parse_fn ;
       
   474   Attrib.thm : thm cgatlp ;
       
   475   Attrib.thms : thm list cgatlp ;
       
   476   Attrib.multi_thm : thm list cgatlp ;
       
   477   
       
   478   (* parse_cgatl_str : 'a cgatlp -> (string -> 'a * string) ;
       
   479   given a (Context.generic * Args.T list) parser, to get a string parser *)
       
   480   fun parse_cgatl_str cgatlp str = 
       
   481   let 
       
   482     (* use the current generic context *)
       
   483     val generic = Context.Theory thy ;
       
   484     val (ats, rem_str) = parse_str P.arguments str ;
       
   485     (* ignore any change to the generic context *)
       
   486     val (res, (_, rem_ats)) = cgatlp (generic, ats) ;
       
   487   in (res, String.concat (Library.separate " "
       
   488       (List.map Args.string_of rem_ats @ [rem_str]))) end ;
       
   489   \end{verbatim}
       
   490 *}
       
   491 
       
   492 section{* Attributes, and the \texttt{Attrib} structure *}
       
   493 
       
   494 text {*
       
   495   The type \texttt{attribute} is declared in \texttt{src/Pure/thm.ML}.
       
   496   The source file for the \texttt{Attrib} structure is
       
   497   \texttt{src/Pure/Isar/attrib.ML}.
       
   498   Most attributes use a theorem to change a generic context (for example, 
       
   499   by declaring that the theorem should be used, by default, in simplification),
       
   500   or change a theorem (which most often involves referring to the current
       
   501   theory). 
       
   502   The functions \texttt{Thm.rule\_attribute} and
       
   503   \texttt{Thm.declaration\_attribute} create attributes of these kinds.
       
   504 
       
   505   \begin{verbatim}
       
   506   type attribute = Context.generic * thm -> Context.generic * thm;
       
   507   type 'a trf = 'a -> 'a ; (* transformer of a given type *)
       
   508   Thm.rule_attribute  : (Context.generic -> thm -> thm) -> attribute ;
       
   509   Thm.declaration_attribute : (thm -> Context.generic trf) -> attribute ;
       
   510 
       
   511   Attrib.print_attributes : theory -> unit ;
       
   512   Attrib.pretty_attribs : Proof.context -> src list -> Pretty.T list ;
       
   513 
       
   514   List.app Pretty.writeln (Attrib.pretty_attribs ctxt ass) ;
       
   515   \end{verbatim}
       
   516 
       
   517   An attribute is stored in a theory as indicated by:
       
   518   \begin{verbatim}
       
   519   Attrib.add_attributes : 
       
   520   (bstring * (src -> attribute) * string) list -> theory trf ; 
       
   521   (*
       
   522   Attrib.add_attributes [("THEN", THEN_att, "resolution with rule")] ;
       
   523   *)
       
   524   \end{verbatim}
       
   525   where the first and third arguments are name and description of the attribute,
       
   526   and the second is a function which parses the attribute input text 
       
   527   (including the attribute name, which has necessarily already been parsed).
       
   528   Here, \texttt{THEN\_att} is a function declared in the code for the
       
   529   structure \texttt{Attrib}, but not published in its signature.
       
   530   The source file \texttt{src/Pure/Isar/attrib.ML} shows the use of 
       
   531   \texttt{Attrib.add\_attributes} to add a number of attributes.
       
   532 
       
   533   \begin{verbatim}
       
   534   FullAttrib.THEN_att : src -> attribute ;
       
   535   FullAttrib.THEN_att atsrc (generic, ML_Context.thm "sym") ;
       
   536   FullAttrib.THEN_att atsrc (generic, ML_Context.thm "all_comm") ;
       
   537   \end{verbatim}
       
   538 
       
   539   \begin{verbatim}
       
   540   Attrib.syntax : attribute cgatlp -> src -> attribute ;
       
   541   Attrib.no_args : attribute -> src -> attribute ;
       
   542   \end{verbatim}
       
   543   When this is called as \texttt{syntax scan src (gc, th)}
       
   544   the generic context \texttt{gc} is used 
       
   545   (and potentially changed to \texttt{gc'})
       
   546   by \texttt{scan} in parsing to obtain an attribute \texttt{attr} which would
       
   547   then be applied to \texttt{(gc', th)}.
       
   548   The source for parsing the attribute is the arguments part of \texttt{src},
       
   549   which must all be consumed by the parse.
       
   550 
       
   551   For example, for \texttt{Attrib.no\_args attr src}, the attribute parser 
       
   552   simply returns \texttt{attr}, requiring that the arguments part of
       
   553   \texttt{src} must be empty.
       
   554 
       
   555   Some examples from \texttt{src/Pure/Isar/attrib.ML}, modified:
       
   556   \begin{verbatim}
       
   557   fun rot_att_n n (gc, th) = (gc, rotate_prems n th) ;
       
   558   rot_att_n : int -> attribute ;
       
   559   val rot_arg = Scan.lift (Scan.optional Args.int 1 : int atlp) : int cgatlp ;
       
   560   val rotated_att : src -> attribute =
       
   561   Attrib.syntax (rot_arg >> rot_att_n : attribute cgatlp) ;
       
   562   
       
   563   val THEN_arg : int cgatlp = Scan.lift 
       
   564   (Scan.optional (Args.bracks Args.nat : int atlp) 1 : int atlp) ;
       
   565 
       
   566   Attrib.thm : thm cgatlp ;
       
   567 
       
   568   THEN_arg -- Attrib.thm : (int * thm) cgatlp ;
       
   569 
       
   570   fun THEN_att_n (n, tht) (gc, th) = (gc, th RSN (n, tht)) ;
       
   571   THEN_att_n : int * thm -> attribute ;
       
   572 
       
   573   val THEN_att : src -> attribute = Attrib.syntax
       
   574   (THEN_arg -- Attrib.thm >> THEN_att_n : attribute cgatlp);
       
   575   \end{verbatim}
       
   576   The functions I've called \texttt{rot\_arg} and \texttt{THEN\_arg}
       
   577   read an optional argument, which for \texttt{rotated} is an integer, 
       
   578   and for \texttt{THEN} is a natural enclosed in square brackets;
       
   579   the default, if the argument is absent, is 1 in each case.
       
   580   Functions \texttt{rot\_att\_n} and \texttt{THEN\_att\_n} turn these into
       
   581   attributes, where \texttt{THEN\_att\_n} also requires a theorem, which is
       
   582   parsed by \texttt{Attrib.thm}.  
       
   583   Infix operators \texttt{--} and \texttt{>>} are in the structure \texttt{Scan}.
       
   584 
       
   585 *}
       
   586 
       
   587 section{* Methods, and the \texttt{Method} structure *}
       
   588 
       
   589 text {*
       
   590   The source file is \texttt{src/Pure/Isar/method.ML}.
       
   591   The type \texttt{method} is defined by the datatype declaration
       
   592   \begin{verbatim}
       
   593   (* datatype method = Meth of thm list -> cases_tactic; *)
       
   594   RuleCases.NO_CASES : tactic -> cases_tactic ;
       
   595   \end{verbatim}
       
   596   In fact \texttt{RAW\_METHOD\_CASES} (below) is exactly the constructor
       
   597   \texttt{Meth}.
       
   598   A \texttt{cases\_tactic} is an elaborated version of a tactic.
       
   599   \texttt{NO\_CASES tac} is a \texttt{cases\_tactic} which consists of a
       
   600   \texttt{cases\_tactic} without any further case information.
       
   601   For further details see the description of structure \texttt{RuleCases} below.
       
   602   The list of theorems to be passed to a method consists of the current
       
   603   \emph{facts} in the proof.
       
   604   
       
   605   \begin{verbatim}
       
   606   RAW_METHOD : (thm list -> tactic) -> method ;
       
   607   METHOD : (thm list -> tactic) -> method ;
       
   608   
       
   609   SIMPLE_METHOD : tactic -> method ;
       
   610   SIMPLE_METHOD' : (int -> tactic) -> method ;
       
   611   SIMPLE_METHOD'' : ((int -> tactic) -> tactic) -> (int -> tactic) -> method ;
       
   612 
       
   613   RAW_METHOD_CASES : (thm list -> cases_tactic) -> method ;
       
   614   METHOD_CASES : (thm list -> cases_tactic) -> method ;
       
   615   \end{verbatim}
       
   616   A method is, in its simplest form, a tactic; applying the method is to apply
       
   617   the tactic to the current goal state.
       
   618 
       
   619   Applying \texttt{RAW\_METHOD tacf} creates a tactic by applying 
       
   620   \texttt{tacf} to the current {facts}, and applying that tactic to the
       
   621   goal state.
       
   622 
       
   623   \texttt{METHOD} is similar but also first applies
       
   624   \texttt{Goal.conjunction\_tac} to all subgoals.
       
   625 
       
   626   \texttt{SIMPLE\_METHOD tac} inserts the facts into all subgoals and then
       
   627   applies \texttt{tacf}.
       
   628 
       
   629   \texttt{SIMPLE\_METHOD' tacf} inserts the facts and then
       
   630   applies \texttt{tacf} to subgoal 1.
       
   631 
       
   632   \texttt{SIMPLE\_METHOD'' quant tacf} does this for subgoal(s) selected by
       
   633   \texttt{quant}, which may be, for example,
       
   634   \texttt{ALLGOALS} (all subgoals),
       
   635   \texttt{TRYALL} (try all subgoals, failure is OK),
       
   636   \texttt{FIRSTGOAL} (try subgoals until it succeeds once), 
       
   637   \texttt{(fn tacf => tacf 4)} (subgoal 4), etc
       
   638   (see the \texttt{Tactical} structure, \cite[Chapter 4]{ref}).
       
   639 
       
   640   A method is stored in a theory as indicated by:
       
   641   \begin{verbatim}
       
   642   Method.add_method : 
       
   643   (bstring * (src -> Proof.context -> method) * string) -> theory trf ; 
       
   644   ( *
       
   645   * )
       
   646   \end{verbatim}
       
   647   where the first and third arguments are name and description of the method,
       
   648   and the second is a function which parses the method input text 
       
   649   (including the method name, which has necessarily already been parsed).
       
   650 
       
   651   Here, \texttt{xxx} is a function declared in the code for the
       
   652   structure \texttt{Method}, but not published in its signature.
       
   653   The source file \texttt{src/Pure/Isar/method.ML} shows the use of 
       
   654   \texttt{Method.add\_method} to add a number of methods.
       
   655 
       
   656 
       
   657 *}
       
   658 
       
   659 
       
   660 end