isabelle-cookbook: comparison CookBook/Parsing.thy


-:14c3dd5ee2ad
+:7b8c4fe235aa
 The function @{ML "(op $$)"} takes a string as argument and will ``consume'' this string from
 a given input list of strings. ``Consume'' in this context means that it will
 return a pair consisting of this string and the rest of the input list.
 For example:
-@{ML_response [display] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
+@{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
-@{ML_response [display] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
+@{ML_response [display,gray] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
 This function will either succeed (as in the two examples above) or raise the exception
 @{text "FAIL"} if no string can be consumed. For example trying to parse
-@{ML_response_fake [display] "($$ \"x\") (explode \"world\")"
+@{ML_response_fake [display,gray] "($$ \"x\") (explode \"world\")"
 "Exception FAIL raised"}
 will raise the exception @{text "FAIL"}.
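The behaviour described above can be imitated outside Isabelle. The following is only a minimal Python model, not the Isabelle implementation: the names `Fail` and `item` are made up for this sketch, and for simplicity empty input is treated like any other failure.

```python
class Fail(Exception):
    """Stands in for the FAIL exception of the parsing combinators."""


def item(expected):
    """Model of ($$ expected): consume one matching item from the front."""
    def parser(tokens):
        # Simplification: empty input is reported as Fail as well.
        if tokens and tokens[0] == expected:
            return expected, tokens[1:]
        raise Fail(expected)
    return parser


print(item("h")(list("hello")))  # ('h', ['e', 'l', 'l', 'o'])

try:
    item("x")(list("world"))
except Fail:
    print("Exception FAIL raised")
```

The parser returns a pair of the consumed string and the remaining input, mirroring the responses shown above.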
 There are three exceptions used in the parsing combinators:
 one item from the input list satisfying this predicate. For example the
 following parser either consumes an @{text [quotes] "h"} or a @{text
 [quotes] "w"}:
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
 val input1 = (explode \"hello\")
 val input2 = (explode \"world\")
 in
 Two parsers can be connected in sequence by using the function @{ML "(op --)"}.
 For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this
 sequence can be achieved by
-@{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")"
+@{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")"
 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}
 Note how the result of consumed strings builds up on the left as nested pairs.
 If, as in the previous example, one wants to parse a particular string,
 then one should use the function @{ML Scan.this_string}:
-@{ML_response [display] "Scan.this_string \"hell\" (explode \"hello\")"
+@{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")"
 "(\"hell\", [\"o\"])"}
 Parsers that explore alternatives can be constructed using the function @{ML
 "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the
 result of @{text "p"}, in case it succeeds, otherwise it returns the
 result of @{text "q"}. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val hw = ($$ \"h\") || ($$ \"w\")
 val input1 = (explode \"hello\")
 val input2 = (explode \"world\")
 in
 The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing function
 for parsers, except that they discard the item being parsed by the first (respectively second)
 parser. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val just_e = ($$ \"h\") |-- ($$ \"e\")
 val just_h = ($$ \"h\") --| ($$ \"e\")
 val input = (explode \"hello\")
 in
 The parser @{ML "Scan.optional p x" for p x} returns the result of the parser
 @{text "p"}, if it succeeds; otherwise it returns
 the default value @{text "x"}. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val p = Scan.optional ($$ \"h\") \"x\"
 val input1 = (explode \"hello\")
 val input2 = (explode \"world\")
 in
 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
 The function @{ML Scan.option} works similarly, except no default value can
 be given. Instead, the result is wrapped as an @{text "option"}-type. For example:
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val p = Scan.option ($$ \"h\")
 val input1 = (explode \"hello\")
 val input2 = (explode \"world\")
 in
 The function @{ML "(op !!)"} helps to produce appropriate error messages
 during parsing. For example if one wants to parse that @{text p} is immediately
 followed by @{text q}, or start a completely different parser @{text r},
 one might write
-@{ML [display] "(p -- q) || r" for p q r}
+@{ML [display,gray] "(p -- q) || r" for p q r}
 However, this parser is problematic for producing an appropriate error message in case
 the parsing of @{ML "(p -- q)" for p q} fails, because in that case one loses the information
 that @{text p} should be followed by @{text q}. To see this consider the case in
 which @{text p}
 caused by the
 failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation
 can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of
 parsing in case of a failure and prints an error message. For example if we invoke the parser
-@{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
+@{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
 on @{text [quotes] "hello"}, the parsing succeeds
-@{ML_response [display]
+@{ML_response [display,gray]
 "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"hello\")"
 "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
 but if we invoke it on @{text [quotes] "world"}
-@{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
+@{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
 "Exception ABORT raised"}
 then the parsing aborts and the error message @{text "foo"} is printed out. In order to
 see the error message properly, we need to prefix the parser with the function
 @{ML "Scan.error"}. For example
-@{ML_response_fake [display] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))"
+@{ML_response_fake [display,gray] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))"
 "Exception ERROR \"foo\" raised"}
 This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"}
 (FIXME: give reference to later place).
 text {*
 Running this parser with the arguments @{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and
 the input @{text [quotes] "holle"}
-@{ML_response_fake [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")"
+@{ML_response_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")"
 "Exception ERROR \"h is not followed by e\" raised"}
 produces the correct error message. Running it with
-@{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")"
+@{ML_response [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")"
 "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}
 yields the expected parsing.
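The interplay between alternatives and aborting can be modelled in Python. Everything in this sketch is an assumption made for illustration: the names `Fail`, `Abort`, `item`, `seq`, `alt` and `cut` are invented here, and the body of `p_followed_by_q` is only a guess that is consistent with the two responses shown above, not necessarily the definition used in the theory file.

```python
class Fail(Exception):
    """Stands in for FAIL: alternatives may recover from it."""


class Abort(Exception):
    """Stands in for ABORT: alternatives do not catch it."""


def item(expected):
    def parser(tokens):
        if tokens and tokens[0] == expected:
            return expected, tokens[1:]
        raise Fail(expected)
    return parser


def seq(p, q):
    def parser(tokens):
        x, rest = p(tokens)
        y, rest = q(rest)
        return (x, y), rest
    return parser


def alt(p, q):
    """Model of (p || q): try q only if p raises Fail."""
    def parser(tokens):
        try:
            return p(tokens)
        except Fail:
            return q(tokens)
    return parser


def cut(msg, p):
    """Model of (!! msg p): a failure of p becomes an Abort with a message."""
    def parser(tokens):
        try:
            return p(tokens)
        except Fail:
            raise Abort(msg(tokens)) from None
    return parser


def p_followed_by_q(p, q, r):
    # Hypothetical reconstruction: p then q (with a cut on q), or r twice.
    err = lambda _: f"{p} is not followed by {q}"
    return alt(seq(item(p), cut(err, item(q))), seq(item(r), item(r)))


parser = p_followed_by_q("h", "e", "w")
print(parser(list("wworld")))  # (('w', 'w'), ['o', 'r', 'l', 'd'])
try:
    parser(list("holle"))
except Abort as e:
    print(e)  # h is not followed by e
```

Because `alt` only catches `Fail`, the `Abort` raised after a successful `h` bypasses the second alternative, which is what keeps the error message tied to the missing `e`.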
 The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as
 often as it succeeds. For example
-@{ML_response [display] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")"
+@{ML_response [display,gray] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")"
 "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
 Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function
 @{ML "Scan.repeat1"} is similar, but requires that the parser @{text "p"}
 succeeds at least once.
 Also note that the parser would have aborted with the exception @{text MORE}, if
 we had run it on just @{text [quotes] "hhhh"}. This can be avoided by using
 the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With
 them we can write
-@{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")"
+@{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")"
 "([\"h\", \"h\", \"h\", \"h\"], [])"}
 @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings;
 other stoppers need to be used when parsing tokens, for example. However, this kind of
 manual wrapping is often already done by the surrounding infrastructure.
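The role of the stopper can be illustrated with a simplified Python model. The names `More`, `STOPPER`, `repeat` and `finite` are assumptions of this sketch; the real @{ML Scan.finite} performs further checks that are omitted here.

```python
class Fail(Exception):
    """Stands in for FAIL."""


class More(Exception):
    """Stands in for MORE: the input ran out while parsing."""


STOPPER = object()  # hypothetical end-of-input marker


def item(expected):
    def parser(tokens):
        if not tokens:
            raise More()  # nothing left to inspect
        if tokens[0] == expected:
            return expected, tokens[1:]
        raise Fail(expected)
    return parser


def repeat(p):
    """Apply p as often as it succeeds; only Fail ends the loop."""
    def parser(tokens):
        results = []
        while True:
            try:
                x, tokens = p(tokens)
            except Fail:
                return results, tokens
            results.append(x)
    return parser


def finite(stopper, p):
    """Append the stopper, run p, then strip the stopper again."""
    def parser(tokens):
        result, rest = p(tokens + [stopper])
        if not rest or rest[-1] is not stopper:
            raise RuntimeError("lost stopper")
        return result, rest[:-1]
    return parser


hs = repeat(item("h"))
try:
    hs(list("hhhh"))
except More:
    print("Exception MORE raised")

print(finite(STOPPER, hs)(list("hhhh")))  # (['h', 'h', 'h', 'h'], [])
```

With the stopper appended, the last attempt of `item("h")` sees a non-matching item instead of empty input, so it fails in the ordinary way and `repeat` can finish.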
 The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any
 string as in
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val p = Scan.repeat (Scan.one Symbol.not_eof)
 val input = (explode \"foo bar foo\")
 in
 Scan.finite Symbol.stopper p input
 end of the input string (i.e.~stopper symbol).
 The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can
 parse the input, then the whole parser fails; if not, then the second is tried. Therefore
-@{ML_response_fake_both [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")"
+@{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")"
 "Exception FAIL raised"}
 fails, while
-@{ML_response [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")"
+@{ML_response [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")"
 "(\"w\",[\"o\", \"r\", \"l\", \"d\"])"}
 succeeds.
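A small Python model may make the negative test explicit. It is a sketch only, with `Fail`, `item` and `unless` as invented names rather than Isabelle functions.

```python
class Fail(Exception):
    """Stands in for FAIL."""


def item(expected):
    def parser(tokens):
        if tokens and tokens[0] == expected:
            return expected, tokens[1:]
        raise Fail(expected)
    return parser


def unless(p, q):
    """Fail if p would succeed on the input; otherwise behave like q."""
    def parser(tokens):
        try:
            p(tokens)  # result is discarded: p is only used as a test
        except Fail:
            return q(tokens)
        raise Fail("unless")
    return parser


not_h_then_w = unless(item("h"), item("w"))
print(not_h_then_w(list("world")))  # ('w', ['o', 'r', 'l', 'd'])

try:
    not_h_then_w(list("hello"))
except Fail:
    print("Exception FAIL raised")
```

Note that the first parser never consumes anything: it is run on the input only to decide whether the second parser gets a chance.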
 The functions @{ML Scan.repeat} and @{ML Scan.unless} can be combined to read any
 input until a certain marker symbol is reached. In the example below the marker
 symbol is a @{text [quotes] "*"}.
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof))
 val input1 = (explode \"fooooo\")
 val input2 = (explode \"foo*ooo\")
 in
 After parsing is done, one nearly always wants to apply a function to the parsed
 items. To do this the function @{ML "(p >> f)" for p f} can be employed, which runs
 first the parser @{text p} and upon successful completion applies the
 function @{text f} to the result. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 fun double (x,y) = (x^x,y^y)
 in
 (($$ \"h\") -- ($$ \"e\") >> double) (explode \"hello\")
 end"
 "((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"}
 doubles the two parsed input strings. Or
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val p = Scan.repeat (Scan.one Symbol.not_eof) >> implode
 val input = (explode \"foo bar foo\")
 in
 Scan.finite Symbol.stopper p input
 The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies
 the given parser to the second component of the pair and leaves the  first component
 untouched. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "Scan.lift (($$ \"h\") -- ($$ \"e\")) (1,(explode \"hello\"))"
 "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}
 (FIXME: In which situations is this useful? Give examples.)
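The effect of lifting can be modelled in Python as follows. This is merely a sketch of the behaviour shown in the response above, with hypothetical names `Fail`, `item`, `seq` and `lift`; it says nothing about where lifting is needed in practice.

```python
class Fail(Exception):
    """Stands in for FAIL."""


def item(expected):
    def parser(tokens):
        if tokens and tokens[0] == expected:
            return expected, tokens[1:]
        raise Fail(expected)
    return parser


def seq(p, q):
    def parser(tokens):
        x, rest = p(tokens)
        y, rest = q(rest)
        return (x, y), rest
    return parser


def lift(p):
    """Run p on the second component of a pair; pass the first one through."""
    def parser(state_and_tokens):
        state, tokens = state_and_tokens
        result, rest = p(tokens)
        return result, (state, rest)
    return parser


he = lift(seq(item("h"), item("e")))
print(he((1, list("hello"))))  # (('h', 'e'), (1, ['l', 'l', 'o']))
```

The first component travels alongside the input unchanged, so a parser that knows nothing about it can still be used where such a pair is expected.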
 *}
 the function @{ML "OuterSyntax.scan"}, which we give below @{ML
 "Position.none"} as argument since, at the moment, we are not interested in
 generating precise error messages. The following
-@{ML_response_fake [display] "OuterSyntax.scan Position.none \"hello world\""
+@{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\""
 "[Token (\<dots>,(Ident, \"hello\"),\<dots>),
 Token (\<dots>,(Space, \" \"),\<dots>),
 Token (\<dots>,(Ident, \"world\"),\<dots>)]"}
 produces three tokens where the first and the last are identifiers, since
 Many parsing functions later on will require spaces, comments and the like
 to have already been filtered out.  So from now on we are going to use the
 functions @{ML filter} and @{ML OuterLex.is_proper} to do this. For example
-@{ML_response_fake [display]
+@{ML_response_fake [display,gray]
 "let
 val input = OuterSyntax.scan Position.none \"hello world\"
 in
 filter OuterLex.is_proper input
 end"
 text {*
 If we parse
-@{ML_response_fake [display]
+@{ML_response_fake [display,gray]
 "filtered_input \"inductive | for\""
 "[Token (\<dots>,(Command, \"inductive\"),\<dots>),
 Token (\<dots>,(Keyword, \"|\"),\<dots>),
 Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
 we obtain a list consisting of only a command and two keyword tokens.
 If you want to see which keywords and commands are currently known, type in
 the following (you might have to adjust the @{ML print_depth} in order to
 see the complete list):
-@{ML_response_fake [display]
+@{ML_response_fake [display,gray]
 "let
 val (keywords, commands) = OuterKeyword.get_lexicons ()
 in
 (Scan.dest_lexicon commands, Scan.dest_lexicon keywords)
 end"
 "([\"}\",\"{\",\<dots>],[\"\<rightleftharpoons>\",\"\<leftharpoondown>\",\<dots>])"}
 Now the parser @{ML "OuterParse.$$$"} parses a single keyword. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val input1 = filtered_input \"where for\"
 val input2 = filtered_input \"| in\"
 in
 (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2)
 end"
 "((\"where\",\<dots>),(\"|\",\<dots>))"}
 Like before, we can sequentially connect parsers with @{ML "(op --)"}. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val input = filtered_input \"| in\"
 in
 (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input
 end"
 The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty
 list of items recognised by the parser @{text p}, where the items being parsed
 are separated by the string @{text s}. For example
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val input = filtered_input \"in | in | in foo\"
 in
 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
 end"
 and then failed with the exception @{text "MORE"}. Like in the previous
 section, we can avoid this exception using the wrapper @{ML
 Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML
 OuterLex.stopper}. We can write
-@{ML_response [display]
+@{ML_response [display,gray]
 "let
 val input = filtered_input \"in | in | in\"
 in
 Scan.finite OuterLex.stopper
 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
 The function @{ML "OuterParse.!!!"} can be used to force termination of the
 parser in case of a dead end, just like @{ML "Scan.!!"} (see previous section),
 except that the error message is fixed to be @{text [quotes] "Outer syntax error"}
 with a relatively precise description of the failure. For example:
-@{ML_response_fake [display]
+@{ML_response_fake [display,gray]
 "let
 val input = filtered_input \"in |\"
 val parse_bar_then_in = OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\"
 in
 parse (OuterParse.!!! parse_bar_then_in) input
 end *}
 text {*
 Now we can type for example
-@{ML_response_fake_both [display] "foobar \"True \<and> False\"" "True \<and> False"}
+@{ML_response_fake_both [display,gray] "foobar \"True \<and> False\"" "True \<and> False"}
 and see the proposition in the tracing buffer.
 Note that so far we used @{ML thy_decl in OuterKeyword} as kind indicator
 for the command.  This means that the command finishes as soon as the
 11 contain the parser for the proposition.
 If we now type @{text "foobar \"True \<and> True\""}, we obtain the following
 proof state:
-@{ML_response_fake_both [display] "foobar \"True \<and> True\""
+@{ML_response_fake_both [display,gray] "foobar \"True \<and> True\""
 "goal (1 subgoal):
 1. True \<and> True"}
 and we can build the proof
-@{text [display] "foobar \"True \<and> True\"
+@{text [display,gray] "foobar \"True \<and> True\"
 apply(rule conjI)
 apply(rule TrueI)+
 done"}
 (FIXME What does @{text "Toplevel.theory"} do?)

changeset 72	7b8c4fe235aa
parent 69	19106a9975c1
child 74	f6f8f8ba1eb1