# HG changeset patch # User Christian Urban # Date 1224279694 14400 # Node ID 35e1dff0d9bbb4ba8c8dc0a2fd819bc9616608df # Parent 631d12c25bde39bd4e923b827639e65a4ee60c70 more on the parsing section diff -r 631d12c25bde -r 35e1dff0d9bb CookBook/FirstSteps.thy --- a/CookBook/FirstSteps.thy Fri Oct 17 05:02:04 2008 -0400 +++ b/CookBook/FirstSteps.thy Fri Oct 17 17:41:34 2008 -0400 @@ -5,7 +5,6 @@ chapter {* First Steps *} - text {* Isabelle programming is done in Standard ML. Just like lemmas and proofs, code in Isabelle is part of a @@ -50,7 +49,7 @@ then Isabelle's undo operation has no effect on the definition of @{ML_text "foo"}. Note that from now on we will drop the \isacommand{ML} @{ML_text "{"}@{ML_text "* \ *"}@{ML_text "}"} whenever - we show code. + we show code and its response. Once a portion of code is relatively stable, one usually wants to export it to a separate ML-file. Such files can then be included in a @@ -572,7 +571,7 @@ ml_val ys ("let open " ^ space_implode " " xs ^ " in " ^ txt ^ " end"); fun ml_pat (rhs, pat) = - let val pat' = implode (map (fn "\\" => "_" | s => s) + let val pat' = implode (map (fn "\" => "_" | s => s) (Symbol.explode pat)) in "val " ^ pat' ^ " = " ^ rhs diff -r 631d12c25bde -r 35e1dff0d9bb CookBook/Parsing.thy --- a/CookBook/Parsing.thy Fri Oct 17 05:02:04 2008 -0400 +++ b/CookBook/Parsing.thy Fri Oct 17 17:41:34 2008 -0400 @@ -50,12 +50,12 @@ There are three exceptions used in the parsing combinators: - (FIXME: describe) + (FIXME: describe exceptions) \begin{itemize} \item @{ML_text "FAIL"} - \item @{ML_text "MORE"} - \item @{ML_text "ABORT"} + \item @{ML_text "MORE"} @{ML_text "($$ \"h\") []"} + \item @{ML_text "ABORT"} dead end \end{itemize} Slightly more general than @{ML "(op $$)"} is the function @{ML Scan.one} in that it @@ -64,7 +64,8 @@ or a @{ML_text "w"}: @{ML_response [display] -"let val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\") +"let + val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\") val input1 = (explode \"hello\") val input2 = (explode \"world\") in @@ -86,7 +87,8 @@ otherwise it returns the result of @{ML_text "q"}. For example @{ML_response [display] -"let val hw = ($$ \"h\") || ($$ \"w\") +"let + val hw = ($$ \"h\") || ($$ \"w\") val input1 = (explode \"hello\") val input2 = (explode \"world\") in @@ -94,13 +96,13 @@ end" "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} - will in the first case consume the @{ML_text "h"} and in the second the @{ML_text "w"}. The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing funtion for parsers, except that they discard the item parsed by the first (respectively second) parser. For example @{ML_response [display] -"let val just_h = ($$ \"h\") |-- ($$ \"e\") +"let + val just_h = ($$ \"h\") |-- ($$ \"e\") val just_e = ($$ \"h\") --| ($$ \"e\") val input = (explode \"hello\") in @@ -113,7 +115,8 @@ the default value @{ML_text "x"}. For example @{ML_response [display] -"let val p = Scan.optional ($$ \"h\") \"x\" +"let + val p = Scan.optional ($$ \"h\") \"x\" val input1 = (explode \"hello\") val input2 = (explode \"world\") in @@ -122,32 +125,115 @@ "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} The function @{ML "(op !!)"} helps to produce appropriate error messages - during parsing. + during parsing. For example if one wants to parse @{ML_text p} immediately + followed by @{ML_text q}, or start a completely different parser @{ML_text r}, + one might write + + @{ML_open [display] "(p -- q) || r" for p q r} + + However, this way is problematic for producing an appropriate error message, in case + the parsing of @{ML_open "(p -- q)" for p q} fails. Because one loses the information + that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case that @{ML_text p} + is present in the input, but not @{ML_text q}. So @{ML_open "(p -- q)" for p q} will fail and the + alternative parser @{ML_text r} will be tried. In many circumstances this will be the wrong + parser for the input and therefore probably fail. However, the error message is then caused by the + failure of @{ML_text r}, not by the absense of @{ML_text p} in the input. These situations + can be avoided using the funtion @{ML "(op !!)"}, which aborts the whole process of + parsing and invokes an error message. For example if we invoke the parser + + @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"} + + on @{ML_text "hello"}, the parsing succeeds + @{ML_response [display] + "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"hello\")" + "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} + + In contrast if we invoke it on @{ML_text "world"} + + @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"} + + the parsing aborts and the error message @{ML_text "foo"} is printed out. In order to + see the error message properly, we need to prefix the parser with the function + @{ML "Scan.error"}. For example + + @{ML [display] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))"} + + This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} + (FIXME: see below). + + Lets return to our example of parsing @{ML_open "(p -- q) || r" for p q r}. If we want + to generate the correct error message for @{ML_text q} not following @{ML_text p}, then + we have to write *} ML {* - - val err_fn = (fn _ => "foo"); - val p = (!! err_fn ($$ "h")) || ($$ "w"); - val input1 = (explode "hello"); - val input2 = (explode "world"); -*} - -ML {* - - (*Scan.error p input2;*) + fun p_followed_by_q p q r = + let + val err = (fn _ => p ^ " is not followed by " ^ q) + in + (($$ p) -- (!! err ($$ q))) || (($$ r) -- ($$ r)) + end *} -text {* (FIXME: why does @{ML_text "p input2"} not do anything with foo?) *} +text {* + Running this parser with + + @{ML_text [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")"} + + gives the correct error message and running it with + + @{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")" + "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"} + + yields the expected parsing. -text {* (FIXME: explain function application) *} + The function @{ML "Scan.repeat"} will apply a parser as often as it succeeds. For examle + + @{ML_response "Scan.repeat ($$ \"h\") (explode \"hhhhello\")" + "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"} + + Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function + @{ML "Scan.repeat1"} is similar, but requires that in @{ML_open "Scan.repeat1 p" for p} + the parse @{ML_text "p"} succeeds at least once. +*} + +text {* + After parsing succeeded, one wants to apply functions on the parsed items. This is + done using the function @{ML_open "(p >> f)" for p f} which applies first the + parser @{ML_text p} upon successful completion applies the function @{ML_text f}. + For example -ML {* fun parse_fn (x,y) = (x,y^y) *} +@{ML_response [display] +"let + fun double (x,y) = (x^x,y^y) +in + (($$ \"h\") -- ($$ \"e\") >> double) (explode \"hello\") +end" +"((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"} + + The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies + the given parser to the second component of the pair and leaves the first component + untouched. For example -ML {* ((($$ "h") -- ($$ "e")) >> parse_fn) (explode "hello") *} +@{ML_response [display] +"Scan.lift (($$ \"h\") -- ($$ \"e\")) (1,(explode \"hello\"))" +"((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"} + + (FIXME: In which situations is this useful?) +*} + +section {* Parsing Tokens *} -text {* (FIXME: explain @{ML_text "lift"}) *} +text {* + Most of the time, however, we will have to deal with tokens that are not just strings. + The parsers for the theory syntax, as well as the parsers for the argument syntax + of proof methods and attributes use the token type @{ML_type OuterParse.token}, + which is identical to @{ML_type OuterLex.token}. + The parser functions for the theory syntax are contained in the structure + @{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}. +*} + chapter {* Parsing *} diff -r 631d12c25bde -r 35e1dff0d9bb cookbook.pdf Binary file cookbook.pdf has changed