diff -r 14c3dd5ee2ad -r 7b8c4fe235aa CookBook/Parsing.thy --- a/CookBook/Parsing.thy Wed Jan 14 21:46:04 2009 +0000 +++ b/CookBook/Parsing.thy Wed Jan 14 23:44:14 2009 +0000 @@ -42,13 +42,14 @@ return a pair consisting of this string and the rest of the input list. For example: - @{ML_response [display] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} - @{ML_response [display] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"} + @{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} + + @{ML_response [display,gray] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"} This function will either succeed (as in the two examples above) or raise the exception @{text "FAIL"} if no string can be consumed. For example trying to parse - @{ML_response_fake [display] "($$ \"x\") (explode \"world\")" + @{ML_response_fake [display,gray] "($$ \"x\") (explode \"world\")" "Exception FAIL raised"} will raise the exception @{text "FAIL"}. @@ -73,7 +74,7 @@ [quotes] "w"}: -@{ML_response [display] +@{ML_response [display,gray] "let val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\") val input1 = (explode \"hello\") @@ -87,7 +88,7 @@ For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this sequence can be achieved by - @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" + @{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} Note how the result of consumed strings builds up on the left as nested pairs. @@ -95,7 +96,7 @@ If, as in the previous example, one wants to parse a particular string, then one should use the function @{ML Scan.this_string}: - @{ML_response [display] "Scan.this_string \"hell\" (explode \"hello\")" + @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" "(\"hell\", [\"o\"])"} Parsers that explore alternatives can be constructed using the function @{ML @@ -104,7 +105,7 @@ result of @{text "q"}. For example -@{ML_response [display] +@{ML_response [display,gray] "let val hw = ($$ \"h\") || ($$ \"w\") val input1 = (explode \"hello\") @@ -118,7 +119,7 @@ for parsers, except that they discard the item being parsed by the first (respectively second) parser. For example -@{ML_response [display] +@{ML_response [display,gray] "let val just_e = ($$ \"h\") |-- ($$ \"e\") val just_h = ($$ \"h\") --| ($$ \"e\") @@ -132,7 +133,7 @@ @{text "p"}, if it succeeds; otherwise it returns the default value @{text "x"}. For example -@{ML_response [display] +@{ML_response [display,gray] "let val p = Scan.optional ($$ \"h\") \"x\" val input1 = (explode \"hello\") @@ -145,7 +146,7 @@ The function @{ML Scan.option} works similarly, except no default value can be given. Instead, the result is wrapped as an @{text "option"}-type. For example: -@{ML_response [display] +@{ML_response [display,gray] "let val p = Scan.option ($$ \"h\") val input1 = (explode \"hello\") @@ -159,7 +160,7 @@ followed by @{text q}, or start a completely different parser @{text r}, one might write - @{ML [display] "(p -- q) || r" for p q r} + @{ML [display,gray] "(p -- q) || r" for p q r} However, this parser is problematic for producing an appropriate error message, in case the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses the information @@ -174,24 +175,24 @@ can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of parsing in case of a failure and prints an error message. For example if we invoke the parser - @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"} + @{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"} on @{text [quotes] "hello"}, the parsing succeeds - @{ML_response [display] + @{ML_response [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} but if we invoke it on @{text [quotes] "world"} - @{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" + @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" "Exception ABORT raised"} then the parsing aborts and the error message @{text "foo"} is printed out. In order to see the error message properly, we need to prefix the parser with the function @{ML "Scan.error"}. For example - @{ML_response_fake [display] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))" + @{ML_response_fake [display,gray] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))" "Exception Error \"foo\" raised"} This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} @@ -214,12 +215,12 @@ Running this parser with the @{text [quotes] "h"} and @{text [quotes] "e"}, and the input @{text [quotes] "holle"} - @{ML_response_fake [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")" + @{ML_response_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")" "Exception ERROR \"h is not followed by e\" raised"} produces the correct error message. Running it with - @{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")" + @{ML_response [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")" "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"} yields the expected parsing. @@ -227,7 +228,7 @@ The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as often as it succeeds. For example - @{ML_response [display] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")" + @{ML_response [display,gray] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")" "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"} Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function @@ -239,7 +240,7 @@ the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With them we can write - @{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" + @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" "([\"h\", \"h\", \"h\", \"h\"], [])"} @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; @@ -249,7 +250,7 @@ The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any string as in - @{ML_response [display] + @{ML_response [display,gray] "let val p = Scan.repeat (Scan.one Symbol.not_eof) val input = (explode \"foo bar foo\") @@ -264,12 +265,12 @@ The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can parse the input, then the whole parser fails; if not, then the second is tried. Therefore - @{ML_response_fake_both [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")" + @{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")" "Exception FAIL raised"} fails, while - @{ML_response [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")" + @{ML_response [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")" "(\"w\",[\"o\", \"r\", \"l\", \"d\"])"} succeeds. @@ -278,7 +279,7 @@ input until a certain marker symbol is reached. In the example below the marker symbol is a @{text [quotes] "*"}. - @{ML_response [display] + @{ML_response [display,gray] "let val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof)) val input1 = (explode \"fooooo\") @@ -295,7 +296,7 @@ first the parser @{text p} and upon successful completion applies the function @{text f} to the result. For example -@{ML_response [display] +@{ML_response [display,gray] "let fun double (x,y) = (x^x,y^y) in @@ -305,7 +306,7 @@ doubles the two parsed input strings. Or - @{ML_response [display] + @{ML_response [display,gray] "let val p = Scan.repeat (Scan.one Symbol.not_eof) >> implode val input = (explode \"foo bar foo\") @@ -330,7 +331,7 @@ the given parser to the second component of the pair and leaves the first component untouched. For example -@{ML_response [display] +@{ML_response [display,gray] "Scan.lift (($$ \"h\") -- ($$ \"e\")) (1,(explode \"hello\"))" "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"} @@ -365,7 +366,7 @@ generating precise error messages. The following -@{ML_response_fake [display] "OuterSyntax.scan Position.none \"hello world\"" +@{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" "[Token (\,(Ident, \"hello\"),\), Token (\,(Space, \" \"),\), Token (\,(Ident, \"world\"),\)]"} @@ -381,7 +382,7 @@ functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example -@{ML_response_fake [display] +@{ML_response_fake [display,gray] "let val input = OuterSyntax.scan Position.none \"hello world\" in @@ -400,7 +401,7 @@ If we parse -@{ML_response_fake [display] +@{ML_response_fake [display,gray] "filtered_input \"inductive | for\"" "[Token (\,(Command, \"inductive\"),\), Token (\,(Keyword, \"|\"),\), @@ -411,7 +412,7 @@ the following (you might have to adjust the @{ML print_depth} in order to see the complete list): -@{ML_response_fake [display] +@{ML_response_fake [display,gray] "let val (keywords, commands) = OuterKeyword.get_lexicons () in @@ -421,7 +422,7 @@ Now the parser @{ML "OuterParse.$$$"} parses a single keyword. For example -@{ML_response [display] +@{ML_response [display,gray] "let val input1 = filtered_input \"where for\" val input2 = filtered_input \"| in\" @@ -432,7 +433,7 @@ Like before, we can sequentially connect parsers with @{ML "(op --)"}. For example -@{ML_response [display] +@{ML_response [display,gray] "let val input = filtered_input \"| in\" in @@ -444,7 +445,7 @@ list of items recognised by the parser @{text p}, where the items being parsed are separated by the string @{text s}. For example -@{ML_response [display] +@{ML_response [display,gray] "let val input = filtered_input \"in | in | in foo\" in @@ -460,7 +461,7 @@ Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML OuterLex.stopper}. We can write -@{ML_response [display] +@{ML_response [display,gray] "let val input = filtered_input \"in | in | in\" in @@ -482,7 +483,7 @@ except that the error message is fixed to be @{text [quotes] "Outer syntax error"} with a relatively precise description of the failure. For example: -@{ML_response_fake [display] +@{ML_response_fake [display,gray] "let val input = filtered_input \"in |\" val parse_bar_then_in = OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\" @@ -728,7 +729,7 @@ text {* Now we can type for example - @{ML_response_fake_both [display] "foobar \"True \ False\"" "True \ False"} + @{ML_response_fake_both [display,gray] "foobar \"True \ False\"" "True \ False"} and see the proposition in the tracing buffer. @@ -774,13 +775,13 @@ If we now type @{text "foobar \"True \ True\""}, we obtain the following proof state: - @{ML_response_fake_both [display] "foobar \"True \ True\"" + @{ML_response_fake_both [display,gray] "foobar \"True \ True\"" "goal (1 subgoal): 1. True \ True"} and we can build the proof - @{text [display] "foobar \"True \ True\" + @{text [display,gray] "foobar \"True \ True\" apply(rule conjI) apply(rule TrueI)+ done"}