ProgTutorial/Parsing.thy
changeset 344 83d5bca38bec
parent 328 c0cae24b9d46
child 346 0fea8b7a14a1
--- a/ProgTutorial/Parsing.thy	Sun Oct 11 22:45:29 2009 +0200
+++ b/ProgTutorial/Parsing.thy	Sun Oct 11 23:16:34 2009 +0200
@@ -37,7 +37,7 @@
 text {*
 
   Let us first have a look at parsing strings using generic parsing
-  combinators. The function @{ML_ind  "$$"} takes a string as argument and will
+  combinators. The function @{ML_ind "$$" in Scan} takes a string as argument and will
   ``consume'' this string from a given input list of strings. ``Consume'' in
   this context means that it will return a pair consisting of this string and
   the rest of the input list. For example:
@@ -71,11 +71,12 @@
   However, note that these exceptions are private to the parser and cannot be accessed
   by the programmer (for example to handle them).
 
-  In the examples above we use the function @{ML_ind Symbol.explode}, instead of the 
-  more standard library function @{ML_ind explode}, for obtaining an input list for 
-  the parser. The reason is that @{ML_ind Symbol.explode} is aware of character sequences, 
-  for example @{text "\<foo>"}, that have a special meaning in Isabelle. To see
-  the difference consider
+  In the examples above we use the function @{ML_ind explode in Symbol} in the
+  structure @{ML_struct Symbol}, instead of the more standard library function
+  @{ML_ind explode}, for obtaining an input list for the parser. The reason is
+  that @{ML_ind explode} in @{ML_struct Symbol} is aware of character
+  sequences, for example @{text "\<foo>"}, that have a special meaning in
+  Isabelle. To see the difference consider
 
 @{ML_response_fake [display,gray]
 "let 
@@ -87,7 +88,7 @@
  [\"\<foo>\", \" \", \"b\", \"a\", \"r\"])"}
 
   Slightly more general than the parser @{ML "$$"} is the function 
-  @{ML_ind  one in Scan}, in that it takes a predicate as argument and 
+  @{ML_ind one in Scan}, in that it takes a predicate as argument and 
   then parses exactly
   one item from the input list satisfying this predicate. For example the
   following parser either consumes an @{text [quotes] "h"} or a @{text
@@ -103,7 +104,7 @@
 end"
     "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
 
-  Two parsers can be connected in sequence by using the function @{ML_ind  "--"}. 
+  Two parsers can be connected in sequence by using the function @{ML_ind "--" in Scan}. 
   For example, parsing @{text "h"}, @{text "e"} and @{text "l"} (in this 
   order) can be achieved by:
 
@@ -121,7 +122,7 @@
   "(\"hell\", [\"o\"])"}
 
   Parsers that explore alternatives can be constructed using the function 
-  @{ML_ind  "||"}. The parser @{ML "(p || q)" for p q} returns the
+  @{ML_ind  "||" in Scan}. The parser @{ML "(p || q)" for p q} returns the
   result of @{text "p"}, in case it succeeds, otherwise it returns the
   result of @{text "q"}. For example:
 
@@ -136,7 +137,7 @@
 end"
   "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
 
-  The functions @{ML_ind "|--"} and @{ML_ind "--|"} work like the sequencing
+  The functions @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} work like the sequencing
   function for parsers, except that they discard the item being parsed by the
   first (respectively second) parser. For example:
   
@@ -164,7 +165,7 @@
 end" 
  "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
 
-  The function @{ML_ind  option in Scan} works similarly, except no default value can
+  The function @{ML_ind option in Scan} works similarly, except no default value can
   be given. Instead, the result is wrapped as an @{text "option"}-type. For example:
 
 @{ML_response [display,gray]
@@ -176,14 +177,14 @@
   (p input1, p input2)
 end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
 
-  The function @{ML_ind  ahead in Scan} parses some input, but leaves the original
+  The function @{ML_ind ahead in Scan} parses some input, but leaves the original
   input unchanged. For example:
 
   @{ML_response [display,gray]
   "Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")" 
   "(\"foo\", [\"f\", \"o\", \"o\"])"} 
 
-  The function @{ML_ind  "!!"} helps with producing appropriate error messages
+  The function @{ML_ind "!!" in Scan} helps with producing appropriate error messages
   during parsing. For example if you want to parse @{text p} immediately 
   followed by @{text q}, or start a completely different parser @{text r},
   you might write:
@@ -219,13 +220,13 @@
 
   then the parsing aborts and the error message @{text "foo"} is printed. In order to
   see the error message properly, you need to prefix the parser with the function 
-  @{ML_ind  error in Scan}. For example:
+  @{ML_ind error in Scan}. For example:
 
   @{ML_response_fake [display,gray] 
   "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))"
   "Exception Error \"foo\" raised"}
 
-  This ``prefixing'' is usually done by wrappers such as @{ML_ind  local_theory in OuterSyntax} 
+  This ``prefixing'' is usually done by wrappers such as @{ML_ind local_theory in OuterSyntax} 
   (see Section~\ref{sec:newcommand} which explains this function in more detail). 
   
   Let us now return to our example of parsing @{ML "(p -- q) || r" for p q
@@ -262,14 +263,14 @@
   @{ML_response [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"hhhhello\")" 
                 "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
   
-  Note that @{ML_ind  repeat in Scan} stores the parsed items in a list. The function
-  @{ML_ind  repeat1 in Scan} is similar, but requires that the parser @{text "p"} 
+  Note that @{ML_ind repeat in Scan} stores the parsed items in a list. The function
+  @{ML_ind repeat1 in Scan} is similar, but requires that the parser @{text "p"} 
   succeeds at least once.
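+
+  For example, while @{ML repeat in Scan} succeeds on input that does not
+  start with @{text [quotes] "h"}, @{ML repeat1 in Scan} fails (the printout
+  below is only indicative):
+
+  @{ML_response_fake_both [display,gray]
+  "Scan.repeat1 ($$ \"h\") (Symbol.explode \"world\")"
+  "Exception FAIL raised"}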
 
   Also note that the parser would have aborted with the exception @{text MORE} if
   you had run it with the string @{text [quotes] "hhhh"}. This can be avoided by using
-  the wrapper @{ML_ind  finite in Scan} and the ``stopper-token'' 
-  @{ML_ind  stopper in Symbol}. With them you can write:
+  the wrapper @{ML_ind finite in Scan} and the ``stopper-token'' 
+  @{ML_ind stopper in Symbol}. With them you can write:
   
   @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")" 
                 "([\"h\", \"h\", \"h\", \"h\"], [])"}
@@ -278,7 +279,7 @@
   other stoppers need to be used when parsing, for example, tokens. However, this kind of 
   manual wrapping is often already done by the surrounding infrastructure. 
 
-  The function @{ML_ind  repeat in Scan} can be used with @{ML_ind  one in Scan} to read any 
+  The function @{ML_ind repeat in Scan} can be used with @{ML_ind one in Scan} to read any 
   string as in
 
   @{ML_response [display,gray] 
@@ -290,10 +291,10 @@
 end" 
 "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}
 
-  where the function @{ML_ind  not_eof in Symbol} ensures that we do not read beyond the 
+  where the function @{ML_ind not_eof in Symbol} ensures that we do not read beyond the 
   end of the input string (i.e.~the stopper symbol).
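+
+  As a quick illustrative check of this function (assuming the end-of-input
+  symbol is available as @{ML "Symbol.eof"}):
+
+  @{ML_response [display,gray]
+  "(Symbol.not_eof \"h\", Symbol.not_eof Symbol.eof)"
+  "(true, false)"}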
 
-  The function @{ML_ind  unless in Scan} takes two parsers: if the first one can 
+  The function @{ML_ind unless in Scan} takes two parsers: if the first one can 
   parse the input, then the whole parser fails; if not, then the second is tried. Therefore
   
   @{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"hello\")"
@@ -306,7 +307,7 @@
 
   succeeds. 
 
-  The functions @{ML_ind  repeat in Scan} and @{ML_ind  unless in Scan} can 
+  The functions @{ML_ind repeat in Scan} and @{ML_ind unless in Scan} can 
   be combined to read any input until a certain marker symbol is reached. In the 
   example below the marker symbol is a @{text [quotes] "*"}.
 
@@ -324,7 +325,7 @@
 
   
   After parsing is done, you almost always want to apply a function to the parsed 
-  items. One way to do this is the function @{ML_ind ">>"} where 
+  items. One way to do this is the function @{ML_ind ">>" in Scan} where 
   @{ML "(p >> f)" for p f} runs 
   first the parser @{text p} and upon successful completion applies the 
   function @{text f} to the result. For example
@@ -352,7 +353,7 @@
   where the single-character strings in the parsed output are transformed
   back into one string.
 
-  The function @{ML_ind  lift in Scan} takes a parser and a pair as arguments. This function applies
+  The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. It applies
   the given parser to the second component of the pair and leaves the first component 
   untouched. For example
 
@@ -393,11 +394,11 @@
   \end{readmore}
 
   The structure @{ML_struct  OuterLex} defines several kinds of tokens (for
-  example @{ML_ind  Ident in OuterLex} for identifiers, @{ML Keyword in
-  OuterLex} for keywords and @{ML_ind  Command in OuterLex} for commands). Some
+  example @{ML_ind Ident in OuterLex} for identifiers, @{ML Keyword in
+  OuterLex} for keywords and @{ML_ind Command in OuterLex} for commands). Some
   token parsers take into account the kind of tokens. The first example shows
   how to generate a token list out of a string using the function 
-  @{ML_ind  scan in OuterSyntax}. It is given the argument 
+  @{ML_ind scan in OuterSyntax}. It is given the argument 
   @{ML "Position.none"} since,
   at the moment, we are not interested in generating precise error
   messages. The following code\footnote{Note that because of a possible bug in
@@ -415,7 +416,7 @@
   other syntactic category. The second indicates a space.
 
   We can easily change what is recognised as a keyword with the function
-  @{ML_ind  keyword in OuterKeyword}. For example calling it with 
+  @{ML_ind keyword in OuterKeyword}. For example calling it with 
 *}
 
 ML{*val _ = OuterKeyword.keyword "hello"*}
@@ -430,7 +431,7 @@
 
   Many parsing functions later on will require white space, comments and the like
   to have already been filtered out.  So from now on we are going to use the 
-  functions @{ML filter} and @{ML_ind  is_proper in OuterLex} to do this. 
+  functions @{ML filter} and @{ML_ind is_proper in OuterLex} to do this. 
   For example:
 
 @{ML_response_fake [display,gray]
@@ -468,10 +469,10 @@
 end" 
 "([\"}\", \"{\", \<dots>], [\"\<rightleftharpoons>\", \"\<leftharpoondown>\", \<dots>])"}
 
-  You might have to adjust the @{ML_ind  print_depth} in order to
+  You might have to adjust the @{ML_ind print_depth} in order to
   see the complete list.
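+  This can be done, for example, by evaluating something like
+  @{ML "print_depth 200"} in an ML block (the concrete limit is only
+  illustrative).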
 
-  The parser @{ML_ind  "$$$" in OuterParse} parses a single keyword. For example:
+  The parser @{ML_ind "$$$" in OuterParse} parses a single keyword. For example:
 
 @{ML_response [display,gray]
 "let 
@@ -482,7 +483,7 @@
 end"
 "((\"where\",\<dots>), (\"|\",\<dots>))"}
 
-  Any non-keyword string can be parsed with the function @{ML_ind  reserved in OuterParse}.
+  Any non-keyword string can be parsed with the function @{ML_ind reserved in OuterParse}.
   For example:
 
   @{ML_response [display,gray]
@@ -494,7 +495,7 @@
 end"
   "(\"bar\",[])"}
 
-  Like before, you can sequentially connect parsers with @{ML_ind  "--"}. For example: 
+  Like before, you can sequentially connect parsers with @{ML "--"}. For example: 
 
 @{ML_response [display,gray]
 "let 
@@ -558,7 +559,7 @@
 
   \begin{exercise} (FIXME)
   A type-identifier, for example @{typ "'a"}, is a token of 
-  kind @{ML_ind  Keyword in OuterLex}. It can be parsed using 
+  kind @{ML_ind Keyword in OuterLex}. It can be parsed using 
   the function @{ML type_ident in OuterParse}.
   \end{exercise}
 
@@ -739,7 +740,7 @@
   variables with optional type-annotation and syntax-annotation, and a list of
   rules where every rule has optionally a name and an attribute.
 
-  The function @{ML_ind  "fixes" in OuterParse} in Line 2 of the parser reads an 
+  The function @{ML_ind "fixes" in OuterParse} in Line 2 of the parser reads an 
   \isacommand{and}-separated 
   list of variables that can include optional type annotations and syntax translations. 
   For example:\footnote{Note that in the code we need to write 
@@ -763,8 +764,8 @@
   not yet used to type the variables: this must be done by type-inference later
   on. Since types are part of the inner syntax they are strings with some
   encoded information (see previous section). If a mixfix-syntax is
-  present for a variable, then it is stored in the @{ML_ind  Mixfix} data structure;
-  no syntax translation is indicated by @{ML_ind  NoSyn}.
+  present for a variable, then it is stored in the @{ML_ind Mixfix} data structure;
+  no syntax translation is indicated by @{ML_ind NoSyn}.
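+  Purely as an illustration, an annotation for some infix-like syntax might be
+  written as the value @{ML "Mixfix (\"_ ++ _\", [100, 100], 100)"}.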
 
   \begin{readmore}
   The data structure for mixfix annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}.
@@ -773,7 +774,7 @@
   Lines 3 to 7 in the function @{ML spec_parser} implement the parser for a
   list of introduction rules, that is, propositions with theorem annotations
   such as rule names and attributes. The introduction rules are propositions
-  parsed by @{ML_ind  prop  in OuterParse}. However, they can include an optional
+  parsed by @{ML_ind prop in OuterParse}. However, they can include an optional
   theorem name plus some attributes. For example
 
 @{ML_response [display,gray] "let 
@@ -783,8 +784,8 @@
   (name, map Args.dest_src attrib)
 end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"}
  
-  The function @{ML_ind  opt_thm_name in SpecParse} is the ``optional'' variant of
-  @{ML_ind  thm_name in SpecParse}. Theorem names can contain attributes. The name 
+  The function @{ML_ind opt_thm_name in SpecParse} is the ``optional'' variant of
+  @{ML_ind thm_name in SpecParse}. Theorem names can contain attributes. The name 
   has to end with @{text [quotes] ":"}---see the argument of 
   the function @{ML SpecParse.opt_thm_name} in Line 7.
 
@@ -799,7 +800,7 @@
   Have a look at how the parser @{ML SpecParse.where_alt_specs} is implemented
   in file @{ML_file "Pure/Isar/spec_parse.ML"}. This parser corresponds
   to the ``where-part'' of the introduction rules given above. Below
-  we paraphrase the code of @{ML_ind  where_alt_specs in SpecParse} adapted to our
+  we paraphrase the code of @{ML_ind where_alt_specs in SpecParse} adapted to our
   purposes. 
   \begin{isabelle}
 *}
@@ -850,7 +851,7 @@
 end *}
 
 text {*
-  The crucial function @{ML_ind  local_theory in OuterSyntax} expects a name for the command, a
+  The crucial function @{ML_ind local_theory in OuterSyntax} expects a name for the command, a
   short description, a kind indicator (which we will explain more thoroughly later) and a
   parser producing a local theory transition (its purpose will also be explained
   later). 
@@ -996,7 +997,7 @@
 
   The crucial part of a command is the function that determines the behaviour
   of the command. In the code above we used a ``do-nothing''-function, which
-  because of @{ML_ind  succeed in Scan} does not parse any argument, but immediately
+  because of @{ML_ind succeed in Scan} does not parse any argument, but immediately
   returns the simple function @{ML "LocalTheory.theory I"}. We can
   replace this code by a function that first parses a proposition (using the
   parser @{ML OuterParse.prop}), then prints out the tracing
@@ -1022,7 +1023,7 @@
 foobar_trace "True \<and> False"
 
 text {*
-  Note that so far we used @{ML_ind  thy_decl in OuterKeyword} as the kind
+  Note that so far we used @{ML_ind thy_decl in OuterKeyword} as the kind
   indicator for the command.  This means that the command finishes as soon as
   the arguments are processed. Examples of this kind of command are
   \isacommand{definition} and \isacommand{declare}.  In other cases, commands
@@ -1030,7 +1031,7 @@
   ``open up'' a proof in order to prove the proposition (for example
   \isacommand{lemma}) or prove some other properties (for example
   \isacommand{function}). To achieve this kind of behaviour, you have to use
-  the kind indicator @{ML_ind  thy_goal in OuterKeyword} and the function @{ML
+  the kind indicator @{ML_ind thy_goal in OuterKeyword} and the function @{ML
   "local_theory_to_proof" in OuterSyntax} to set up the command.  Note,
   however, that once you change the ``kind'' of a command from @{ML thy_decl in
   OuterKeyword} to @{ML thy_goal in OuterKeyword}, the keyword file needs
@@ -1059,8 +1060,8 @@
 text {*
   The function @{text goal_prop} in Lines 2 to 7 takes a string (the proposition to be
   proved) and a context as arguments.  The context is necessary in order to be able to use
-  @{ML_ind  read_prop in Syntax}, which converts a string into a proper proposition.
-  In Line 6 the function @{ML_ind  theorem_i in Proof} starts the proof for the
+  @{ML_ind read_prop in Syntax}, which converts a string into a proper proposition.
+  In Line 6 the function @{ML_ind theorem_i in Proof} starts the proof for the
   proposition. Its argument @{ML NONE} stands for a locale (which we chose to
   omit); the argument @{ML "(K I)"} stands for a function that determines what
   should be done with the theorem once it is proved (we chose to just forget
@@ -1085,7 +1086,7 @@
 text {*
   {\bf TBD below}
 
-  (FIXME: read a name and show how to store theorems; see @{ML_ind  note in LocalTheory})
+  (FIXME: read a name and show how to store theorems; see @{ML_ind note in LocalTheory})
   
 *}
 
@@ -1266,7 +1267,7 @@
   It defines the method @{text foo}, which takes no arguments (therefore the
   parser @{ML Scan.succeed}) and only applies a single tactic, namely the tactic which 
   applies @{thm [source] conjE} and then @{thm [source] conjI}. The function 
-  @{ML_ind  SIMPLE_METHOD}
+  @{ML_ind SIMPLE_METHOD in Method}
   turns such a tactic into a method. The method @{text "foo"} can be used as follows
 *}