diff -r 123401a5c8e9 -r 5e309df58557 CookBook/Parsing.thy --- a/CookBook/Parsing.thy Fri Feb 06 06:19:52 2009 +0000 +++ b/CookBook/Parsing.thy Sat Feb 07 12:05:02 2009 +0000 @@ -86,14 +86,14 @@ Two parser can be connected in sequence by using the function @{ML "(op --)"}. For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this - sequence can be achieved by + sequence you can achieve by: @{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} Note how the result of consumed strings builds up on the left as nested pairs. - If, as in the previous example, one wants to parse a particular string, + If, as in the previous example, you want to parse a particular string, then one should use the function @{ML Scan.this_string}: @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" @@ -102,7 +102,7 @@ Parsers that explore alternatives can be constructed using the function @{ML "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the result of @{text "p"}, in case it succeeds, otherwise it returns the - result of @{text "q"}. For example + result of @{text "q"}. For example: @{ML_response [display,gray] @@ -117,7 +117,7 @@ The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing function for parsers, except that they discard the item being parsed by the first (respectively second) - parser. For example + parser. For example: @{ML_response [display,gray] "let @@ -156,24 +156,25 @@ end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} The function @{ML "(op !!)"} helps to produce appropriate error messages - during parsing. For example if one wants to parse that @{text p} is immediately + during parsing. For example if you want to parse that @{text p} is immediately followed by @{text q}, or start a completely different parser @{text r}, - one might write + you might write @{ML [display,gray] "(p -- q) || r" for p q r} - However, this parser is problematic for producing an appropriate error message, in case - the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses the information - that @{text p} should be followed by @{text q}. To see this consider the case in - which @{text p} - is present in the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail - and the - alternative parser @{text r} will be tried. However in many circumstance this will be the wrong - parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then - caused by the - failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation - can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of - parsing in case of a failure and prints an error message. For example if we invoke the parser + However, this parser is problematic for producing an appropriate error + message, in case the parsing of @{ML "(p -- q)" for p q} fails. Because in + that case you lose the information that @{text p} should be followed by + @{text q}. To see this consider the case in which @{text p} is present in + the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail + and the alternative parser @{text r} will be tried. However in many + circumstance this will be the wrong parser for the input ``p-followed-by-q'' + and therefore will also fail. The error message is then caused by the + failure of @{text r}, not by the absence of @{text q} in the input. This + kind of situation can be avoided when using the function @{ML "(op + !!)"}. This function aborts the whole process of parsing in case of a + failure and prints an error message. For example if you invoke the parser + @{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"} @@ -183,24 +184,24 @@ "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} - but if we invoke it on @{text [quotes] "world"} + but if you invoke it on @{text [quotes] "world"} @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" "Exception ABORT raised"} then the parsing aborts and the error message @{text "foo"} is printed out. In order to see the error message, we need to prefix the parser with the function - @{ML "Scan.error"}. For example + @{ML "Scan.error"}. For example: @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))" "Exception Error \"foo\" raised"} This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} - (FIXME: give reference to later place). + (see Section~\ref{sec:newcommand} which explains this function in more detail). - Let us now return to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want - to generate the correct error message for p-followed-by-q, then - we have to write: + Let us now return to our example of parsing @{ML "(p -- q) || r" for p q + r}. If you want to generate the correct error message for p-followed-by-q, + then you have to write: *} ML{*fun p_followed_by_q p q r = @@ -226,7 +227,7 @@ yields the expected parsing. The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as - often as it succeeds. For example + often as it succeeds. For example: @{ML_response [display,gray] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")" "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"} @@ -236,15 +237,15 @@ succeeds at least once. Also note that the parser would have aborted with the exception @{text MORE}, if - we had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using + you had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With - them we can write + them you can write: @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" "([\"h\", \"h\", \"h\", \"h\"], [])"} @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; - other stoppers need to be used when parsing token, for example. However, this kind of + other stoppers need to be used when parsing tokens, for example. However, this kind of manually wrapping is often already done by the surrounding infrastructure. The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any @@ -291,14 +292,14 @@ "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []), ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"} - After parsing is done, one nearly always wants to apply a function on the parsed + After parsing is done, you nearly always want to apply a function on the parsed items. To do this the function @{ML "(p >> f)" for p f} can be employed, which runs first the parser @{text p} and upon successful completion applies the function @{text f} to the result. For example @{ML_response [display,gray] "let - fun double (x,y) = (x^x,y^y) + fun double (x,y) = (x ^ x, y ^ y) in (($$ \"h\") -- ($$ \"e\") >> double) (explode \"hello\") end" @@ -323,7 +324,7 @@ inside @{text "(*\*)"} is replaced by a the same comment but enclosed inside @{text "(**\**)"} in the output string. To enclose a string, you can use the function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML - "s1^s^s2" for s1 s2 s}. + "s1 ^ s ^ s2" for s1 s2 s}. \end{exercise} @@ -341,11 +342,12 @@ section {* Parsing Theory Syntax *} text {* - Most of the time, however, Isabelle developers have to deal with parsing tokens, not strings. - This is because the parsers for the theory syntax, as well as the parsers for the - argument syntax of proof methods and attributes use the type - @{ML_type OuterLex.token} (which is identical to the type @{ML_type OuterParse.token}). - There are also parsers for ML-expressions and ML-files. + Most of the time, however, Isabelle developers have to deal with parsing + tokens, not strings. This is because the parsers for the theory syntax, as + well as the parsers for the argument syntax of proof methods and attributes + use the type @{ML_type OuterLex.token} (which is identical to the type + @{ML_type OuterParse.token}). There are also parsers for ML-expressions and + ML-files, which can be sometimes handy. \begin{readmore} The parser functions for the theory syntax are contained in the structure @@ -360,11 +362,10 @@ *} text {* - For the examples below, we can generate a token list out of a string using - the function @{ML "OuterSyntax.scan"}, which we give below @{ML - "Position.none"} as argument since, at the moment, we are not interested in - generating precise error messages. The following code - + With the examples below, you can generate a token list out of a string using + the function @{ML "OuterSyntax.scan"}, which is given @{ML "Position.none"} + as argument since, at the moment, we are not interested in generating + precise error messages. The following code @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" "[Token (\,(Ident, \"hello\"),\), @@ -374,13 +375,12 @@ produces three tokens where the first and the last are identifiers, since @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any other syntactic category.\footnote{Note that because of a possible a bug in - the PolyML runtime system the result is printed as @{text "?"}, instead of + the PolyML runtime system the result is printed as @{text [quotes] "?"}, instead of the tokens.} The second indicates a space. Many parsing functions later on will require spaces, comments and the like to have already been filtered out. So from now on we are going to use the - functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example - + functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example: @{ML_response_fake [display,gray] "let @@ -390,7 +390,7 @@ end" "[Token (\,(Ident, \"hello\"), \), Token (\,(Ident, \"world\"), \)]"} - For convenience we define the function + For convenience we define the function: *} @@ -399,7 +399,7 @@ text {* - If we parse + If you now parse @{ML_response_fake [display,gray] "filtered_input \"inductive | for\"" @@ -407,7 +407,7 @@ Token (\,(Keyword, \"|\"),\), Token (\,(Keyword, \"for\"),\)]"} - we obtain a list consisting of only a command and two keyword tokens. + you obtain a list consisting of only a command and two keyword tokens. If you want to see which keywords and commands are currently known, type in the following code (you might have to adjust the @{ML print_depth} in order to see the complete list): @@ -420,7 +420,7 @@ end" "([\"}\",\"{\",\],[\"\\",\"\\",\])"} - Now the parser @{ML "OuterParse.$$$"} parses a single keyword. For example + The parser @{ML "OuterParse.$$$"} parses a single keyword. For example: @{ML_response [display,gray] "let @@ -431,7 +431,7 @@ end" "((\"where\",\),(\"|\",\))"} - Like before, we can sequentially connect parsers with @{ML "(op --)"}. For example + Like before, you can sequentially connect parsers with @{ML "(op --)"}. For example: @{ML_response [display,gray] "let @@ -443,7 +443,7 @@ The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty list of items recognised by the parser @{text p}, where the items being parsed - are separated by the string @{text s}. For example + are separated by the string @{text s}. For example: @{ML_response [display,gray] "let @@ -454,12 +454,12 @@ "([\"in\",\"in\",\"in\"],[\])"} @{ML "OuterParse.enum1"} works similarly, except that the parsed list must - be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the end - of the parsed string, otherwise the parser would have consumed all tokens - and then failed with the exception @{text "MORE"}. Like in the previous - section, we can avoid this exception using the wrapper @{ML + be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the + end of the parsed string, otherwise the parser would have consumed all + tokens and then failed with the exception @{text "MORE"}. Like in the + previous section, we can avoid this exception using the wrapper @{ML Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML - OuterLex.stopper}. We can write + OuterLex.stopper}. We can write: @{ML_response [display,gray] "let @@ -528,9 +528,9 @@ section {* Parsing Specifications *} text {* - There are a number of special purpose parsers that help for parsing specifications. - For example the function @{ML OuterParse.target} reads a target in order to indicate - a locale. + There are a number of special purpose parsers that help with parsing specifications + of functions, inductive definitions and so on. For example the function + @{ML OuterParse.target} reads a target in order to indicate a locale. @{ML_response [display,gray] "let @@ -539,12 +539,13 @@ parse OuterParse.target input end" "(\"test\",[])"} - The function @{ML OuterParse.opt_target} makes this parser ``optional''. + The function @{ML OuterParse.opt_target} makes this parser ``optional'', that + is wrapping the result into an option type and returning @{ML NONE} if no + target is present. The function @{ML OuterParse.fixes} reads a list of constants - which can include a type annotation and a syntax translation. - The list needs to be separated by \isacommand{and}. - + that can include type annotations and syntax translations. + The list is separated by \isacommand{and}. For example: *} text {* @@ -560,14 +561,28 @@ (bar, SOME \, NoSyn), (blonk, NONE, NoSyn)],[])"} - The types of the constants is stored in the @{ML SOME}s. - We need to write @{text "\\\"int \ bool\\\""} to properly - escape the type. + Whenever types are given, then they are stored in the @{ML SOME}s. + Note that we had to write @{text "\\\"int \ bool\\\""} to properly escape + the double quotes in the compound type. @{ML OuterParse.for_fixes} is an ``optional'' that prefixes @{ML OuterParse.fixes} with the comman \isacommand{for}. + (FIXME give an example and explain more) + +@{ML_response [display,gray] +"let + val input = filtered_input + \"for foo::\\\"int \ bool\\\" (\\\"FOO\\\" [100] 100) and bar::nat and blonk\" +in + parse OuterParse.for_fixes input +end" +"([(foo, SOME \, Mixfix (\"FOO\",[100],100)), + (bar, SOME \, NoSyn), + (blonk, NONE, NoSyn)],[])"} + *} + ML{*let val input = filtered_input "test_lemma[intro,foo,elim,dest!,bar]:" in @@ -614,7 +629,7 @@ and a list of rules where each rule has optionally a name and an attribute. *} -section {* New Commands and Keyword Files *} +section {* New Commands and Keyword Files\label{sec:newcommand} *} text {* Often new commands, for example for providing new definitional principles, @@ -625,7 +640,7 @@ To keep things simple, let us start with a ``silly'' command that does nothing at all. We shall name this command \isacommand{foobar}. On the ML-level it can be - defined as + defined as: *} ML{*let @@ -688,23 +703,22 @@ @{text Pure} and @{text HOL} are usually compiled during the installation of Isabelle. So log files for them should be already available. If not, then they can be conveniently compiled with the help of the build-script from the Isabelle - distribution - + distribution. @{text [display] "$ ./build -m \"Pure\" $ ./build -m \"HOL\""} - The @{text "Pure-ProofGeneral"} theory needs to be compiled with + The @{text "Pure-ProofGeneral"} theory needs to be compiled with: @{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""} For the theory @{text "Command.thy"}, you first need to create a ``managed'' subdirectory - with + with: @{text [display] "$ isabelle mkdir FoobarCommand"} - This generates a directory containing the files + This generates a directory containing the files: @{text [display] "./IsaMakefile @@ -718,7 +732,7 @@ @{text [display] "use_thy \"Command\";"} - to the file @{text "./FoobarCommand/ROOT.ML"}. You can now compile the theory by just typing + to the file @{text "./FoobarCommand/ROOT.ML"}. You can now compile the theory by just typing: @{text [display] "$ isabelle make"} @@ -732,7 +746,7 @@ @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"} - on the Unix prompt. The directory should include the files + on the Unix prompt. The directory should include the files: @{text [display] "Pure.gz @@ -754,10 +768,10 @@ non-empty.} This keyword file needs to be copied into the directory @{text "~/.isabelle/etc"}. To make Isabelle aware of this keyword file, you have to start Isabelle with the option @{text - "-k foobar"}, i.e. + "-k foobar"}, that is: - @{text [display] "$ isabelle -k foobar a_theory_file"} + @{text [display] "$ isabelle emacs -k foobar a_theory_file"} If you now build a theory on top of @{text "Command.thy"}, then the command \isacommand{foobar} can be used. @@ -765,8 +779,8 @@ At the moment \isacommand{foobar} is not very useful. Let us refine it a bit - next by taking a proposition as argument and printing this proposition inside - the tracing buffer. + next by letting it take a proposition as argument and printing this proposition + inside the tracing buffer. The crucial part of a command is the function that determines the behaviour of the command. In the code above we used a ``do-nothing''-function, which @@ -775,7 +789,7 @@ replace this code by a function that first parses a proposition (using the parser @{ML OuterParse.prop}), then prints out the tracing information (using a new top-level function @{text trace_top_lvl}) and - finally does nothing. For this we can write + finally does nothing. For this you can write: *} ML{*let @@ -790,7 +804,7 @@ end *} text {* - Now we can type + Now you can type \begin{isabelle} \isacommand{foobar}~@{text [quotes] "True \ False"}\\ @@ -810,8 +824,8 @@ indicator @{ML thy_goal in OuterKeyword}. Below we change \isacommand{foobar} so that it takes a proposition as - argument and then starts a proof in order to prove it. Therefore in Line 13 - below, we set the kind indicator to @{ML thy_goal in OuterKeyword}. + argument and then starts a proof in order to prove it. Therefore in Line 13, + we set the kind indicator to @{ML thy_goal in OuterKeyword}. *} ML%linenumbers{*let @@ -833,24 +847,26 @@ text {* The function @{text set_up_thm} in Lines 2 to 7 takes a string (the proposition to be - proved) and a context. The context is necessary in order to be able to use + proved) and a context as argument. The context is necessary in order to be able to use @{ML Syntax.read_prop}, which converts a string into a proper proposition. In Line 6 the function @{ML Proof.theorem_i} starts the proof for the proposition. Its argument @{ML NONE} stands for a locale (which we chose to omit); the argument @{ML "(K I)"} stands for a function that determines what should be done with the theorem once it is proved (we chose to just forget - about it). In Lines 9 to 11 contain the parser for the proposition. + about it). Lines 9 to 11 contain the parser for the proposition. + + (FIXME: explain @{ML Toplevel.print} etc) - If we now type \isacommand{foobar}~@{text [quotes] "True \ True"}, we obtain the following + If you now type \isacommand{foobar}~@{text [quotes] "True \ True"}, you obtain the following proof state: - + \begin{isabelle} \isacommand{foobar}~@{text [quotes] "True \ True"}\\ - @{text "goal (1 subgoal)"}\\ + @{text "goal (1 subgoal):"}\\ @{text "1. True \ True"} \end{isabelle} - and we can build the proof + and you can build the proof \begin{isabelle} \isacommand{foobar}~@{text [quotes] "True \ True"}\\ @@ -866,11 +882,8 @@ @{ML "Toplevel.print"} @{ML Toplevel.local_theory}?) - - (FIXME read a name and show how to store theorems) - (FIXME possibly also read a locale) *} (*<*)