diff -r de49d5780f57 -r 74f0a06f751f ProgTutorial/Parsing.thy --- a/ProgTutorial/Parsing.thy Thu Aug 20 14:19:39 2009 +0200 +++ b/ProgTutorial/Parsing.thy Thu Aug 20 22:30:20 2009 +0200 @@ -37,7 +37,7 @@ text {* Let us first have a look at parsing strings using generic parsing - combinators. The function @{ML_ind [index] "$$"} takes a string as argument and will + combinators. The function @{ML_ind "$$"} takes a string as argument and will ``consume'' this string from a given input list of strings. ``Consume'' in this context means that it will return a pair consisting of this string and the rest of the input list. For example: @@ -71,11 +71,9 @@ However, note that these exceptions are private to the parser and cannot be accessed by the programmer (for example to handle them). - \indexdef{explode@ {\tt\slshape{}explode}}{in {\tt\slshape Symbol}} - \index{explode@ {\tt\slshape{}explode}} - In the examples above we use the function @{ML Symbol.explode}, instead of the - more standard library function @{ML explode}, for obtaining an input list for - the parser. The reason is that @{ML Symbol.explode} is aware of character sequences, + In the examples above we use the function @{ML_ind Symbol.explode}, instead of the + more standard library function @{ML_ind explode}, for obtaining an input list for + the parser. The reason is that @{ML_ind Symbol.explode} is aware of character sequences, for example @{text "\"}, that have a special meaning in Isabelle. To see the difference consider @@ -89,7 +87,7 @@ [\"\\", \" \", \"b\", \"a\", \"r\"])"} Slightly more general than the parser @{ML "$$"} is the function - @{ML_ind [index] one in Scan}, in that it takes a predicate as argument and + @{ML_ind one in Scan}, in that it takes a predicate as argument and then parses exactly one item from the input list satisfying this predicate. For example the following parser either consumes an @{text [quotes] "h"} or a @{text @@ -105,7 +103,7 @@ end" "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} - Two parsers can be connected in sequence by using the function @{ML_ind [index] "--"}. + Two parsers can be connected in sequence by using the function @{ML_ind "--"}. For example parsing @{text "h"}, @{text "e"} and @{text "l"} (in this order) you can achieve by: @@ -116,14 +114,14 @@ Note how the result of consumed strings builds up on the left as nested pairs. If, as in the previous example, you want to parse a particular string, - then you should use the function @{ML_ind [index] this_string in Scan}: + then you should use the function @{ML_ind this_string in Scan}: @{ML_response [display,gray] "Scan.this_string \"hell\" (Symbol.explode \"hello\")" "(\"hell\", [\"o\"])"} Parsers that explore alternatives can be constructed using the function - @{ML_ind [index] "||"}. The parser @{ML "(p || q)" for p q} returns the + @{ML_ind "||"}. The parser @{ML "(p || q)" for p q} returns the result of @{text "p"}, in case it succeeds, otherwise it returns the result of @{text "q"}. For example: @@ -138,7 +136,7 @@ end" "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} - The functions @{ML_ind [index] "|--"} and @{ML_ind [index] "--|"} work like the sequencing function + The functions @{ML_ind "|--"} and @{ML_ind "--|"} work like the sequencing function for parsers, except that they discard the item being parsed by the first (respectively second) parser. 
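As an aside, the combinators covered so far compose freely. Below is a minimal ML sketch, not part of the changeset, that combines || with |--: it accepts either of two two-character prefixes and keeps only the second character. The expected results follow directly from the combinator semantics described above.

  (* accept "he" or "wo" as a prefix; |-- keeps only the second character *)
  let
    val p = ($$ "h" |-- $$ "e") || ($$ "w" |-- $$ "o")
  in
    (p (Symbol.explode "hello"), p (Symbol.explode "world"))
  end
  (* expected: (("e", ["l", "l", "o"]), ("o", ["r", "l", "d"])) *)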
For example: @@ -152,7 +150,6 @@ end" "((\"e\", [\"l\", \"l\", \"o\"]), (\"h\", [\"l\", \"l\", \"o\"]))"} - \indexdef{optional@ {\tt\slshape{optional}}}{in {\tt\slshape Scan}} The parser @{ML "Scan.optional p x" for p x} returns the result of the parser @{text "p"}, if it succeeds; otherwise it returns the default value @{text "x"}. For example: @@ -167,7 +164,7 @@ end" "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} - The function @{ML_ind [index] option in Scan} works similarly, except no default value can + The function @{ML_ind option in Scan} works similarly, except no default value can be given. Instead, the result is wrapped as an @{text "option"}-type. For example: @{ML_response [display,gray] @@ -179,7 +176,7 @@ (p input1, p input2) end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} - The function @{ML_ind [index] "!!"} helps to produce appropriate error messages + The function @{ML_ind "!!"} helps to produce appropriate error messages for parsing. For example if you want to parse @{text p} immediately followed by @{text q}, or start a completely different parser @{text r}, you might write: @@ -215,13 +212,13 @@ then the parsing aborts and the error message @{text "foo"} is printed. In order to see the error message properly, you need to prefix the parser with the function - @{ML_ind [index] error in Scan}. For example: + @{ML_ind error in Scan}. For example: @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))" "Exception Error \"foo\" raised"} - This ``prefixing'' is usually done by wrappers such as @{ML_ind [index] local_theory in OuterSyntax} + This ``prefixing'' is usually done by wrappers such as @{ML_ind local_theory in OuterSyntax} (see Section~\ref{sec:newcommand} which explains this function in more detail). Let us now return to our example of parsing @{ML "(p -- q) || r" for p q @@ -258,14 +255,14 @@ @{ML_response [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"hhhhello\")" "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"} - Note that @{ML_ind [index] repeat in Scan} stores the parsed items in a list. The function - @{ML_ind [index] repeat1 in Scan} is similar, but requires that the parser @{text "p"} + Note that @{ML_ind repeat in Scan} stores the parsed items in a list. The function + @{ML_ind repeat1 in Scan} is similar, but requires that the parser @{text "p"} succeeds at least once. Also note that the parser would have aborted with the exception @{text MORE}, if you had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using - the wrapper @{ML_ind [index] finite in Scan} and the ``stopper-token'' - @{ML_ind [index] stopper in Symbol}. With them you can write: + the wrapper @{ML_ind finite in Scan} and the ``stopper-token'' + @{ML_ind stopper in Symbol}. With them you can write: @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")" "([\"h\", \"h\", \"h\", \"h\"], [])"} @@ -274,7 +271,7 @@ other stoppers need to be used when parsing, for example, tokens. However, this kind of manually wrapping is often already done by the surrounding infrastructure. 
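To make the interaction of Scan.repeat1 with the stopper wrapping concrete, here is a small sketch, not part of the changeset; the expected results follow from the behaviour of Scan.repeat and Scan.finite shown above.

  let
    (* repeat1 demands at least one "h"; Scan.finite with Symbol.stopper
       avoids the MORE exception when the h's run to the end of the input *)
    val hs = Scan.finite Symbol.stopper (Scan.repeat1 ($$ "h"))
  in
    (hs (Symbol.explode "hhhh"), hs (Symbol.explode "hhhhello"))
  end
  (* expected: ((["h", "h", "h", "h"], []),
                (["h", "h", "h", "h"], ["e", "l", "l", "o"])) *)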
- The function @{ML_ind [index] repeat in Scan} can be used with @{ML_ind [index] one in Scan} to read any + The function @{ML_ind repeat in Scan} can be used with @{ML_ind one in Scan} to read any string as in @{ML_response [display,gray] @@ -286,10 +283,10 @@ end" "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"} - where the function @{ML_ind [index] not_eof in Symbol} ensures that we do not read beyond the + where the function @{ML_ind not_eof in Symbol} ensures that we do not read beyond the end of the input string (i.e.~stopper symbol). - The function @{ML_ind [index] unless in Scan} takes two parsers: if the first one can + The function @{ML_ind unless in Scan} takes two parsers: if the first one can parse the input, then the whole parser fails; if not, then the second is tried. Therefore @{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"hello\")" @@ -302,7 +299,7 @@ succeeds. - The functions @{ML_ind [index] repeat in Scan} and @{ML_ind [index] unless in Scan} can + The functions @{ML_ind repeat in Scan} and @{ML_ind unless in Scan} can be combined to read any input until a certain marker symbol is reached. In the example below the marker symbol is a @{text [quotes] "*"}. @@ -320,7 +317,7 @@ After parsing is done, you almost always want to apply a function to the parsed - items. One way to do this is the function @{ML_ind [index]">>"} where + items. One way to do this is the function @{ML_ind ">>"} where @{ML "(p >> f)" for p f} runs first the parser @{text p} and upon successful completion applies the function @{text f} to the result. For example @@ -349,14 +346,14 @@ (FIXME: move to an earlier place) - The function @{ML_ind [index] ahead in Scan} parses some input, but leaves the original + The function @{ML_ind ahead in Scan} parses some input, but leaves the original input unchanged. For example: @{ML_response [display,gray] "Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")" "(\"foo\", [\"f\", \"o\", \"o\"])"} - The function @{ML_ind [index] lift in Scan} takes a parser and a pair as arguments. This function applies + The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. This function applies the given parser to the second component of the pair and leaves the first component untouched. For example @@ -396,12 +393,12 @@ The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}. \end{readmore} - The structure @{ML_struct [index] OuterLex} defines several kinds of tokens (for - example @{ML_ind [index] Ident in OuterLex} for identifiers, @{ML Keyword in - OuterLex} for keywords and @{ML_ind [index] Command in OuterLex} for commands). Some + The structure @{ML_struct OuterLex} defines several kinds of tokens (for + example @{ML_ind Ident in OuterLex} for identifiers, @{ML Keyword in + OuterLex} for keywords and @{ML_ind Command in OuterLex} for commands). Some token parsers take into account the kind of tokens. The first example shows how to generate a token list out of a string using the function - @{ML_ind [index] scan in OuterSyntax}. It is given the argument + @{ML_ind scan in OuterSyntax}. It is given the argument @{ML "Position.none"} since, at the moment, we are not interested in generating precise error messages. The following code\footnote{Note that because of a possible bug in @@ -419,7 +416,7 @@ other syntactic category. The second indicates a space. 
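For reference, the scanning step described above amounts to a one-liner; the sketch below is not part of the changeset, and the exact printing of the resulting OuterLex.token values depends on the runtime system, as the footnote mentions.

  (* scan a plain string into outer-lexer tokens: per the text above,
     "hello world" yields an identifier, a space token and an identifier *)
  val toks = OuterSyntax.scan Position.none "hello world"
  val no_of_tokens = length toks   (* expected to be 3 *)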
We can easily change what is recognised as a keyword with - @{ML_ind [index] keyword in OuterKeyword}. For example calling this function + @{ML_ind keyword in OuterKeyword}. For example calling this function *} ML{*val _ = OuterKeyword.keyword "hello"*} @@ -434,7 +431,7 @@ Many parsing functions later on will require white space, comments and the like to have already been filtered out. So from now on we are going to use the - functions @{ML filter} and @{ML_ind [index] is_proper in OuterLex} to do this. + functions @{ML filter} and @{ML_ind is_proper in OuterLex} to do this. For example: @{ML_response_fake [display,gray] @@ -472,10 +469,10 @@ end" "([\"}\", \"{\", \], [\"\\", \"\\", \])"} - You might have to adjust the @{ML_ind [index] print_depth} in order to + You might have to adjust the @{ML_ind print_depth} in order to see the complete list. - The parser @{ML_ind [index] "$$$" in OuterParse} parses a single keyword. For example: + The parser @{ML_ind "$$$" in OuterParse} parses a single keyword. For example: @{ML_response [display,gray] "let @@ -486,7 +483,7 @@ end" "((\"where\",\), (\"|\",\))"} - Any non-keyword string can be parsed with the function @{ML_ind [index] reserved in OuterParse}. + Any non-keyword string can be parsed with the function @{ML_ind reserved in OuterParse}. For example: @{ML_response [display,gray] @@ -498,7 +495,7 @@ end" "(\"bar\",[])"} - Like before, you can sequentially connect parsers with @{ML_ind [index] "--"}. For example: + Like before, you can sequentially connect parsers with @{ML_ind "--"}. For example: @{ML_response [display,gray] "let @@ -508,7 +505,6 @@ end" "((\"|\", \"in\"), [])"} - \indexdef{enum@ {\tt\slshape{enum}}}{in {\tt\slshape OuterParse}} The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty list of items recognised by the parser @{text p}, where the items being parsed are separated by the string @{text s}. For example: @@ -521,7 +517,7 @@ end" "([\"in\", \"in\", \"in\"], [\])"} - @{ML_ind [index] enum1 in OuterParse} works similarly, except that the parsed list must + @{ML_ind enum1 in OuterParse} works similarly, except that the parsed list must be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the end of the parsed string, otherwise the parser would have consumed all tokens and then failed with the exception @{text "MORE"}. Like in the @@ -545,7 +541,7 @@ text {* - The function @{ML_ind [index] "!!!" in OuterParse} can be used to force termination of the + The function @{ML_ind "!!!" in OuterParse} can be used to force termination of the parser in case of a dead end, just like @{ML "Scan.!!"} (see previous section). Except that the error message of @{ML "OuterParse.!!!"} is fixed to be @{text [quotes] "Outer syntax error"} @@ -564,7 +560,7 @@ \begin{exercise} (FIXME) A type-identifier, for example @{typ "'a"}, is a token of - kind @{ML_ind [index] Keyword in OuterLex}. It can be parsed using + kind @{ML_ind Keyword in OuterLex}. It can be parsed using the function @{ML type_ident in OuterParse}. \end{exercise} @@ -626,7 +622,7 @@ text {* There is usually no need to write your own parser for parsing inner syntax, that is for terms and types: you can just call the predefined parsers. Terms can - be parsed using the function @{ML_ind [index] term in OuterParse}. For example: + be parsed using the function @{ML_ind term in OuterParse}. 
For example: @{ML_response [display,gray] "let @@ -636,10 +632,10 @@ end" "(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"} - The function @{ML_ind [index] prop in OuterParse} is similar, except that it gives a different + The function @{ML_ind prop in OuterParse} is similar, except that it gives a different error message, when parsing fails. As you can see, the parser not just returns the parsed string, but also some encoded information. You can decode the - information with the function @{ML_ind [index] parse in YXML}. For example + information with the function @{ML_ind parse in YXML}. For example @{ML_response [display,gray] "YXML.parse \"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\"" @@ -723,7 +719,7 @@ variables with optional type-annotation and syntax-annotation, and a list of rules where every rule has optionally a name and an attribute. - The function @{ML_ind [index] "fixes" in OuterParse} in Line 2 of the parser reads an + The function @{ML_ind "fixes" in OuterParse} in Line 2 of the parser reads an \isacommand{and}-separated list of variables that can include optional type annotations and syntax translations. For example:\footnote{Note that in the code we need to write @@ -747,8 +743,8 @@ not yet used to type the variables: this must be done by type-inference later on. Since types are part of the inner syntax they are strings with some encoded information (see previous section). If a mixfix-syntax is - present for a variable, then it is stored in the @{ML_ind [index] Mixfix} data structure; - no syntax translation is indicated by @{ML_ind [index] NoSyn}. + present for a variable, then it is stored in the @{ML_ind Mixfix} data structure; + no syntax translation is indicated by @{ML_ind NoSyn}. \begin{readmore} The data structure for mixfix annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}. @@ -757,7 +753,7 @@ Lines 3 to 7 in the function @{ML spec_parser} implement the parser for a list of introduction rules, that is propositions with theorem annotations such as rule names and attributes. The introduction rules are propositions - parsed by @{ML_ind [index] prop in OuterParse}. However, they can include an optional + parsed by @{ML_ind prop in OuterParse}. However, they can include an optional theorem name plus some attributes. For example @{ML_response [display,gray] "let @@ -767,8 +763,8 @@ (name, map Args.dest_src attrib) end" "(foo_lemma, [((\"intro\", []), \), ((\"dest\", [\]), \)])"} - The function @{ML_ind [index] opt_thm_name in SpecParse} is the ``optional'' variant of - @{ML_ind [index] thm_name in SpecParse}. Theorem names can contain attributes. The name + The function @{ML_ind opt_thm_name in SpecParse} is the ``optional'' variant of + @{ML_ind thm_name in SpecParse}. Theorem names can contain attributes. The name has to end with @{text [quotes] ":"}---see the argument of the function @{ML SpecParse.opt_thm_name} in Line 7. @@ -783,7 +779,7 @@ Have a look at how the parser @{ML SpecParse.where_alt_specs} is implemented in file @{ML_file "Pure/Isar/spec_parse.ML"}. This parser corresponds to the ``where-part'' of the introduction rules given above. Below - we paraphrase the code of @{ML_ind [index] where_alt_specs in SpecParse} adapted to our + we paraphrase the code of @{ML_ind where_alt_specs in SpecParse} adapted to our purposes. 
\begin{isabelle} *} @@ -834,7 +830,7 @@ end *} text {* - The crucial function @{ML_ind [index] local_theory in OuterSyntax} expects a name for the command, a + The crucial function @{ML_ind local_theory in OuterSyntax} expects a name for the command, a short description, a kind indicator (which we will explain later more thoroughly) and a parser producing a local theory transition (its purpose will also explained later). @@ -974,7 +970,7 @@ The crucial part of a command is the function that determines the behaviour of the command. In the code above we used a ``do-nothing''-function, which - because of @{ML_ind [index] succeed in Scan} does not parse any argument, but immediately + because of @{ML_ind succeed in Scan} does not parse any argument, but immediately returns the simple function @{ML "LocalTheory.theory I"}. We can replace this code by a function that first parses a proposition (using the parser @{ML OuterParse.prop}), then prints out the tracing @@ -1004,7 +1000,7 @@ and see the proposition in the tracing buffer. - Note that so far we used @{ML_ind [index] thy_decl in OuterKeyword} as the kind + Note that so far we used @{ML_ind thy_decl in OuterKeyword} as the kind indicator for the command. This means that the command finishes as soon as the arguments are processed. Examples of this kind of commands are \isacommand{definition} and \isacommand{declare}. In other cases, commands @@ -1012,7 +1008,7 @@ ``open up'' a proof in order to prove the proposition (for example \isacommand{lemma}) or prove some other properties (for example \isacommand{function}). To achieve this kind of behaviour, you have to use - the kind indicator @{ML_ind [index] thy_goal in OuterKeyword} and the function @{ML + the kind indicator @{ML_ind thy_goal in OuterKeyword} and the function @{ML "local_theory_to_proof" in OuterSyntax} to set up the command. Note, however, once you change the ``kind'' of a command from @{ML thy_decl in OuterKeyword} to @{ML thy_goal in OuterKeyword} then the keyword file needs @@ -1041,8 +1037,8 @@ text {* The function @{text prove_prop} in Lines 2 to 7 takes a string (the proposition to be proved) and a context as argument. The context is necessary in order to be able to use - @{ML_ind [index] read_prop in Syntax}, which converts a string into a proper proposition. - In Line 6 the function @{ML_ind [index] theorem_i in Proof} starts the proof for the + @{ML_ind read_prop in Syntax}, which converts a string into a proper proposition. + In Line 6 the function @{ML_ind theorem_i in Proof} starts the proof for the proposition. Its argument @{ML NONE} stands for a locale (which we chose to omit); the argument @{ML "(K I)"} stands for a function that determines what should be done with the theorem once it is proved (we chose to just forget @@ -1066,7 +1062,7 @@ \isacommand{done} \end{isabelle} - (FIXME: read a name and show how to store theorems; see @{ML_ind [index] note in LocalTheory}) + (FIXME: read a name and show how to store theorems; see @{ML_ind note in LocalTheory}) *} @@ -1099,7 +1095,7 @@ It defines the method @{text foo}, which takes no arguments (therefore the parser @{ML Scan.succeed}) and only applies a single tactic, namely the tactic which applies @{thm [source] conjE} and then @{thm [source] conjI}. The function - @{ML_ind [index] SIMPLE_METHOD} + @{ML_ind SIMPLE_METHOD} turns such a tactic into a method. The method @{text "foo"} can be used as follows *}
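As a hedged sketch of the kind of proof script this announces (the goal below is a made-up example, not the tutorial's own): the method foo first eliminates the conjunction in the premise via conjE and then splits the conclusion via conjI, leaving two subgoals that assumption closes.

  lemma "A \<and> B \<Longrightarrow> B \<and> A"
  apply(foo)
  (* conjE followed by conjI leaves two subgoals, each solved by assumption *)
  apply(assumption)+
  done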