isabelle-cookbook: comparison ProgTutorial/Parsing.thy

equal deleted inserted replaced

-:be23597e81db
+:f875a25aa72d
 ``consume'' this string from a given input list of strings. ``Consume'' in
 this context means that it will return a pair consisting of this string and
 the rest of the input list. For example:
 @{ML_matchresult [display,gray]
-"($$ \"h\") (Symbol.explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
+\<open>($$ "h") (Symbol.explode "hello")\<close> \<open>("h", ["e", "l", "l", "o"])\<close>}
 @{ML_matchresult [display,gray]
-"($$ \"w\") (Symbol.explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
+\<open>($$ "w") (Symbol.explode "world")\<close> \<open>("w", ["o", "r", "l", "d"])\<close>}
-The function @{ML "$$"} will either succeed (as in the two examples above)
+The function @{ML \<open>$$\<close>} will either succeed (as in the two examples above)
 or raise the exception \<open>FAIL\<close> if no string can be consumed. For
 example trying to parse
 @{ML_matchresult_fake [display,gray]
-"($$ \"x\") (Symbol.explode \"world\")"
+\<open>($$ "x") (Symbol.explode "world")\<close>
-"Exception FAIL raised"}
+\<open>Exception FAIL raised\<close>}
 will raise the exception \<open>FAIL\<close>. The function @{ML_ind "$$" in Scan} will also
 fail if you try to consume more than a single character.
 There are three exceptions that are raised by the parsing combinators:
 \item \<open>FAIL\<close> indicates that alternative routes of parsing
 might be explored.
 \item \<open>MORE\<close> indicates that there is not enough input for the parser. For example
 in \<open>($$ "h") []\<close>.
 \item \<open>ABORT\<close> indicates that a dead end is reached.
-It is used for example in the function @{ML "!!"} (see below).
+It is used for example in the function @{ML \<open>!!\<close>} (see below).
 \end{itemize}
 However, note that these exceptions are private to the parser and cannot be accessed
 by the programmer (for example to handle them).
 that @{ML explode in Symbol} is aware of character
 sequences, for example \<open>\<foo>\<close>, that have a special meaning in
 Isabelle. To see the difference consider
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = \"\<foo> bar\"
+val input = "\<foo> bar"
 in
 (String.explode input, Symbol.explode input)
-end"
+end\<close>
-"([\"\\\", \"<\", \"f\", \"o\", \"o\", \">\", \" \", \"b\", \"a\", \"r\"],
+\<open>(["\\", "<", "f", "o", "o", ">", " ", "b", "a", "r"],
-[\"\<foo>\", \" \", \"b\", \"a\", \"r\"])"}
+["\<foo>", " ", "b", "a", "r"])\<close>}
-Slightly more general than the parser @{ML "$$"} is the function
+Slightly more general than the parser @{ML \<open>$$\<close>} is the function
 @{ML_ind one in Scan}, in that it takes a predicate as argument and
 then parses exactly
 one item from the input list satisfying this predicate. For example the
 following parser either consumes an @{text [quotes] "h"} or a @{text
 [quotes] "w"}:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
+val hw = Scan.one (fn x => x = "h" orelse x = "w")
-val input1 = Symbol.explode \"hello\"
+val input1 = Symbol.explode "hello"
-val input2 = Symbol.explode \"world\"
+val input2 = Symbol.explode "world"
 in
 (hw input1, hw input2)
-end"
+end\<close>
-"((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
+\<open>(("h", ["e", "l", "l", "o"]),("w", ["o", "r", "l", "d"]))\<close>}
 Two parsers can be connected in sequence by using the function @{ML_ind "--" in Scan}.
 For example parsing \<open>h\<close>, \<open>e\<close> and \<open>l\<close> (in this
 order) you can achieve by:
 @{ML_matchresult [display,gray]
-"($$ \"h\" -- $$ \"e\" -- $$ \"l\") (Symbol.explode \"hello\")"
+\<open>($$ "h" -- $$ "e" -- $$ "l") (Symbol.explode "hello")\<close>
-"(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}
+\<open>((("h", "e"), "l"), ["l", "o"])\<close>}
 Note how the result of consumed strings builds up on the left as nested pairs.
 If, as in the previous example, you want to parse a particular string,
 then you can use the function @{ML_ind this_string in Scan}.
 @{ML_matchresult [display,gray]
-"Scan.this_string \"hell\" (Symbol.explode \"hello\")"
+\<open>Scan.this_string "hell" (Symbol.explode "hello")\<close>
-"(\"hell\", [\"o\"])"}
+\<open>("hell", ["o"])\<close>}
 Parsers that explore alternatives can be constructed using the function
-@{ML_ind  "||" in Scan}. The parser @{ML "(p || q)" for p q} returns the
+@{ML_ind  "||" in Scan}. The parser @{ML \<open>(p || q)\<close> for p q} returns the
 result of \<open>p\<close>, in case it succeeds; otherwise it returns the
 result of \<open>q\<close>. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val hw = $$ \"h\" || $$ \"w\"
+val hw = $$ "h" || $$ "w"
-val input1 = Symbol.explode \"hello\"
+val input1 = Symbol.explode "hello"
-val input2 = Symbol.explode \"world\"
+val input2 = Symbol.explode "world"
 in
 (hw input1, hw input2)
-end"
+end\<close>
-"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
+\<open>(("h", ["e", "l", "l", "o"]), ("w", ["o", "r", "l", "d"]))\<close>}
 The functions @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} work like the sequencing
 function for parsers, except that they discard the item being parsed by the
 first (respectively second) parser. That means the item being dropped is the
 one that @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} ``point'' away.
 For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val just_e = $$ \"h\" |-- $$ \"e\"
+val just_e = $$ "h" |-- $$ "e"
-val just_h = $$ \"h\" --| $$ \"e\"
+val just_h = $$ "h" --| $$ "e"
-val input = Symbol.explode \"hello\"
+val input = Symbol.explode "hello"
 in
 (just_e input, just_h input)
-end"
+end\<close>
-"((\"e\", [\"l\", \"l\", \"o\"]), (\"h\", [\"l\", \"l\", \"o\"]))"}
+\<open>(("e", ["l", "l", "o"]), ("h", ["l", "l", "o"]))\<close>}
-The parser @{ML "Scan.optional p x" for p x} returns the result of the parser
+The parser @{ML \<open>Scan.optional p x\<close> for p x} returns the result of the parser
 \<open>p\<close>, in case it succeeds; otherwise it returns
 the default value \<open>x\<close>. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val p = Scan.optional ($$ \"h\") \"x\"
+val p = Scan.optional ($$ "h") "x"
-val input1 = Symbol.explode \"hello\"
+val input1 = Symbol.explode "hello"
-val input2 = Symbol.explode \"world\"
+val input2 = Symbol.explode "world"
 in
 (p input1, p input2)
-end"
+end\<close>
-"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
+\<open>(("h", ["e", "l", "l", "o"]), ("x", ["w", "o", "r", "l", "d"]))\<close>}
 The function @{ML_ind option in Scan} works similarly, except no default value can
 be given. Instead, the result is wrapped as an \<open>option\<close>-type. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val p = Scan.option ($$ \"h\")
+val p = Scan.option ($$ "h")
-val input1 = Symbol.explode \"hello\"
+val input1 = Symbol.explode "hello"
-val input2 = Symbol.explode \"world\"
+val input2 = Symbol.explode "world"
 in
 (p input1, p input2)
-end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
+end\<close> \<open>((SOME "h", ["e", "l", "l", "o"]), (NONE, ["w", "o", "r", "l", "d"]))\<close>}
 The function @{ML_ind ahead in Scan} parses some input, but leaves the original
 input unchanged. For example:
 @{ML_matchresult [display,gray]
-"Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")"
+\<open>Scan.ahead (Scan.this_string "foo") (Symbol.explode "foo")\<close>
-"(\"foo\", [\"f\", \"o\", \"o\"])"}
+\<open>("foo", ["f", "o", "o"])\<close>}
 The function @{ML_ind "!!" in Scan} helps with producing appropriate error messages
 during parsing. For example if you want to parse \<open>p\<close> immediately
 followed by \<open>q\<close>, or start a completely different parser \<open>r\<close>,
 you might write:
-@{ML [display,gray] "(p -- q) || r" for p q r}
+@{ML [display,gray] \<open>(p -- q) || r\<close> for p q r}
 However, this parser is problematic for producing a useful error
-message, if the parsing of @{ML "(p -- q)" for p q} fails. Because with the
+message, if the parsing of @{ML \<open>(p -- q)\<close> for p q} fails. Because with the
 parser above you lose the information that \<open>p\<close> should be followed by \<open>q\<close>.
 To see this assume that \<open>p\<close> is present in the input, but it is not
-followed by \<open>q\<close>. That means @{ML "(p -- q)" for p q} will fail and
+followed by \<open>q\<close>. That means @{ML \<open>(p -- q)\<close> for p q} will fail and
 hence the alternative parser \<open>r\<close> will be tried. However, in many
 circumstances this will be the wrong parser for the input ``\<open>p\<close>-followed-by-something''
 and therefore will also fail. The error message is then caused by the failure
 of \<open>r\<close>, not by the absence of \<open>q\<close> in the input. Such
-situation can be avoided when using the function @{ML "!!"}.  This function
+situation can be avoided when using the function @{ML \<open>!!\<close>}.  This function
 aborts the whole process of parsing in case of a failure and prints an error
 message. For example if you invoke the parser
-@{ML [display,gray] "!! (fn _ => fn _ =>\"foo\") ($$ \"h\")"}
+@{ML [display,gray] \<open>!! (fn _ => fn _ =>"foo") ($$ "h")\<close>}
 on @{text [quotes] "hello"}, the parsing succeeds
 @{ML_matchresult [display,gray]
-"(!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"hello\")"
+\<open>(!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "hello")\<close>
-"(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
+\<open>("h", ["e", "l", "l", "o"])\<close>}
 but if you invoke it on @{text [quotes] "world"}
-@{ML_matchresult_fake [display,gray] "(!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
+@{ML_matchresult_fake [display,gray] \<open>(!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "world")\<close>
-"Exception ABORT raised"}
+\<open>Exception ABORT raised\<close>}
 then the parsing aborts and the error message \<open>foo\<close> is printed. In order to
 see the error message properly, you need to prefix the parser with the function
 @{ML_ind error in Scan}. For example:
 @{ML_matchresult_fake [display,gray]
-"Scan.error (!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
+\<open>Scan.error (!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "world")\<close>
-"Exception Error \"foo\" raised"}
+\<open>Exception Error "foo" raised\<close>}
 This kind of ``prefixing'' to see the correct error message is usually done by wrappers
 such as @{ML_ind local_theory in Outer_Syntax}
 (see Section~\ref{sec:newcommand} which explains this function in more detail).
-Let us now return to our example of parsing @{ML "(p -- q) || r" for p q
+Let us now return to our example of parsing @{ML \<open>(p -- q) || r\<close> for p q
 r}. If you want to generate the correct error message for failure
 of parsing \<open>p\<close>-followed-by-\<open>q\<close>, then you have to write:
 \<close>
 ML %grayML\<open>fun p_followed_by_q p q r =
 text \<open>
 Running this parser with the arguments
 @{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and
 the input @{text [quotes] "holle"}
-@{ML_matchresult_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"holle\")"
+@{ML_matchresult_fake [display,gray] \<open>Scan.error (p_followed_by_q "h" "e" "w") (Symbol.explode "holle")\<close>
-"Exception ERROR \"h is not followed by e\" raised"}
+\<open>Exception ERROR "h is not followed by e" raised\<close>}
 produces the correct error message. Running it with
-@{ML_matchresult [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"wworld\")"
+@{ML_matchresult [display,gray] \<open>Scan.error (p_followed_by_q "h" "e" "w") (Symbol.explode "wworld")\<close>
-"((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}
+\<open>(("w", "w"), ["o", "r", "l", "d"])\<close>}
 yields the expected parsing.
-The function @{ML "Scan.repeat p" for p} will apply a parser \<open>p\<close> as
+The function @{ML \<open>Scan.repeat p\<close> for p} will apply a parser \<open>p\<close> as
 long as it succeeds. For example:
-@{ML_matchresult [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"hhhhello\")"
+@{ML_matchresult [display,gray] \<open>Scan.repeat ($$ "h") (Symbol.explode "hhhhello")\<close>
-"([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
+\<open>(["h", "h", "h", "h"], ["e", "l", "l", "o"])\<close>}
 Note that @{ML_ind repeat in Scan} stores the parsed items in a list. The function
 @{ML_ind repeat1 in Scan} is similar, but requires that the parser \<open>p\<close>
 succeeds at least once.
 Also note that the parser would have aborted with the exception \<open>MORE\<close>, if
 you had it run with the string @{text [quotes] "hhhh"}. This can be avoided by using
 the wrapper @{ML_ind finite in Scan} and the ``stopper-token''
 @{ML_ind stopper in Symbol}. With them you can write:
-@{ML_matchresult [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")"
+@{ML_matchresult [display,gray] \<open>Scan.finite Symbol.stopper (Scan.repeat ($$ "h")) (Symbol.explode "hhhh")\<close>
-"([\"h\", \"h\", \"h\", \"h\"], [])"}
+\<open>(["h", "h", "h", "h"], [])\<close>}
 The function @{ML stopper in Symbol} is the ``end-of-input'' indicator for parsing strings;
 other stoppers need to be used when parsing, for example, tokens. However, this kind of
 manually wrapping is often already done by the surrounding infrastructure.
 The function @{ML_ind repeat in Scan} can be used with @{ML_ind one in Scan} to read any
 string as in
 @{ML_matchresult [display,gray]
-"let
+\<open>let
 val p = Scan.repeat (Scan.one Symbol.not_eof)
-val input = Symbol.explode \"foo bar foo\"
+val input = Symbol.explode "foo bar foo"
 in
 Scan.finite Symbol.stopper p input
-end"
+end\<close>
-"([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}
+\<open>(["f", "o", "o", " ", "b", "a", "r", " ", "f", "o", "o"], [])\<close>}
 where the function @{ML_ind not_eof in Symbol} ensures that we do not read beyond the
 end of the input string (i.e.~stopper symbol).
 The function @{ML_ind unless in Scan} takes two parsers: if the first one can
 parse the input, then the whole parser fails; if not, then the second is tried. Therefore
-@{ML_matchresult_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"hello\")"
+@{ML_matchresult_fake_both [display,gray] \<open>Scan.unless ($$ "h") ($$ "w") (Symbol.explode "hello")\<close>
-"Exception FAIL raised"}
+\<open>Exception FAIL raised\<close>}
 fails, while
-@{ML_matchresult [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"world\")"
+@{ML_matchresult [display,gray] \<open>Scan.unless ($$ "h") ($$ "w") (Symbol.explode "world")\<close>
-"(\"w\",[\"o\", \"r\", \"l\", \"d\"])"}
+\<open>("w",["o", "r", "l", "d"])\<close>}
 succeeds.
 The functions @{ML_ind repeat in Scan} and @{ML_ind unless in Scan} can
 be combined to read any input until a certain marker symbol is reached. In the
 example below the marker symbol is a @{text [quotes] "*"}.
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof))
+val p = Scan.repeat (Scan.unless ($$ "*") (Scan.one Symbol.not_eof))
-val input1 = Symbol.explode \"fooooo\"
+val input1 = Symbol.explode "fooooo"
-val input2 = Symbol.explode \"foo*ooo\"
+val input2 = Symbol.explode "foo*ooo"
 in
 (Scan.finite Symbol.stopper p input1,
 Scan.finite Symbol.stopper p input2)
-end"
+end\<close>
-"(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []),
+\<open>((["f", "o", "o", "o", "o", "o"], []),
-([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"}
+(["f", "o", "o"], ["*", "o", "o", "o"]))\<close>}
 After parsing is done, you almost always want to apply a function to the parsed
 items. One way to do this is the function @{ML_ind ">>" in Scan} where
-@{ML "(p >> f)" for p f} runs
+@{ML \<open>(p >> f)\<close> for p f} runs
 first the parser \<open>p\<close> and upon successful completion applies the
 function \<open>f\<close> to the result. For example
 @{ML_matchresult [display,gray]
-"let
+\<open>let
 fun double (x, y) = (x ^ x, y ^ y)
-val parser = $$ \"h\" -- $$ \"e\"
+val parser = $$ "h" -- $$ "e"
 in
-(parser >> double) (Symbol.explode \"hello\")
+(parser >> double) (Symbol.explode "hello")
-end"
+end\<close>
-"((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"}
+\<open>(("hh", "ee"), ["l", "l", "o"])\<close>}
 doubles the two parsed input strings; or
 @{ML_matchresult [display,gray]
-"let
+\<open>let
 val p = Scan.repeat (Scan.one Symbol.not_eof)
-val input = Symbol.explode \"foo bar foo\"
+val input = Symbol.explode "foo bar foo"
 in
 Scan.finite Symbol.stopper (p >> implode) input
-end"
+end\<close>
-"(\"foo bar foo\",[])"}
+\<open>("foo bar foo",[])\<close>}
 where the single-character strings in the parsed output are transformed
 back into one string.
 The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. This function applies
 the given parser to the second component of the pair and leaves the  first component
 untouched. For example
 @{ML_matchresult [display,gray]
-"Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")"
+\<open>Scan.lift ($$ "h" -- $$ "e") (1, Symbol.explode "hello")\<close>
-"((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}
+\<open>(("h", "e"), (1, ["l", "l", "o"]))\<close>}
 \footnote{\bf FIXME: In which situations is \<open>lift\<close> useful? Give examples.}
 Be aware of recursive parsers. Suppose you want to read strings separated by
 commas and by parentheses into a tree datastructure; for example, generating
 separated by a comma, or acts like @{ML parse_basic}. Unfortunately,
 because of the mutual recursion, this parser will immediately run into a
 loop, even if it is called without any input. For example
 @{ML_matchresult_fake_both [display, gray]
-"parse_tree \"A\""
+\<open>parse_tree "A"\<close>
-"*** Exception- TOPLEVEL_ERROR raised"}
+\<open>*** Exception- TOPLEVEL_ERROR raised\<close>}
 raises an exception indicating that the stack limit is reached. Such
 looping parser are not useful, because of ML's strict evaluation of
 arguments. Therefore we need to delay the execution of the
 parser until an input is given. This can be done by adding the parsed
 changed: with this version of the parser the execution is delayed until
 some string is  applied for the argument \<open>xs\<close>. This gives us
 exactly the parser what we wanted. An example is as follows:
 @{ML_matchresult [display, gray]
-"let
+\<open>let
-val input = Symbol.explode \"(A,((A))),A\"
+val input = Symbol.explode "(A,((A))),A"
 in
-Scan.finite Symbol.stopper (parse_tree \"A\") input
+Scan.finite Symbol.stopper (parse_tree "A") input
-end"
+end\<close>
-"(Br (Br (Lf \"A\", Lf \"A\"), Lf \"A\"), [])"}
+\<open>(Br (Br (Lf "A", Lf "A"), Lf "A"), [])\<close>}
 \begin{exercise}\label{ex:scancmts}
 Write a parser that parses an input string so that any comment enclosed
 within \<open>(*\<dots>*)\<close> is replaced by the same comment but enclosed within
 \<open>(**\<dots>**)\<close> in the output string. To enclose a string, you can use the
-function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
+function @{ML \<open>enclose s1 s2 s\<close> for s1 s2 s} which produces the string @{ML \<open>s1 ^ s ^ s2\<close> for s1 s2 s}. Hint: To simplify the task ignore the proper
-"s1 ^ s ^ s2" for s1 s2 s}. Hint: To simplify the task ignore the proper
 nesting of comments.
 \end{exercise}
 \<close>
 section \<open>Parsing Theory Syntax\<close>
 example @{ML_ind Ident in Token} for identifiers, @{ML Keyword in
 Token} for keywords and @{ML_ind Command in Token} for commands). Some
 token parsers take into account the kind of tokens. The first example shows
 how to generate a token list out of a string using the function
 @{ML_ind explode in Token}. It is given the argument
-@{ML "Position.none"} since,
+@{ML \<open>Position.none\<close>} since,
 at the moment, we are not interested in generating precise error
 messages. The following code
 @{ML_matchresult_fake [display,gray] \<open>Token.explode
 text \<open>
 then lexing @{text [quotes] "hello world"} will produce
-@{ML_matchresult_fake [display,gray] "Token.explode
+@{ML_matchresult_fake [display,gray] \<open>Token.explode
-(Thy_Header.get_keywords' @{context}) Position.none \"hello world\""
+(Thy_Header.get_keywords' @{context}) Position.none "hello world"\<close>
-"[Token (_,(Keyword, \"hello\"),_),
+\<open>[Token (_,(Keyword, "hello"),_),
-Token (_,(Space, \" \"),_),
+Token (_,(Space, " "),_),
-Token (_,(Ident, \"world\"),_)]"}
+Token (_,(Ident, "world"),_)]\<close>}
 Many parsing functions later on will require white space, comments and the like
 to have already been filtered out.  So from now on we are going to use the
 functions @{ML filter} and @{ML_ind is_proper in Token} to do this.
 For example:
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none \"hello world\"
+val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none "hello world"
 in
 filter Token.is_proper input
-end"
+end\<close>
-"[Token (_,(Ident, \"hello\"), _), Token (_,(Ident, \"world\"), _)]"}
+\<open>[Token (_,(Ident, "hello"), _), Token (_,(Ident, "world"), _)]\<close>}
 For convenience we define the function:
 \<close>
 ML %grayML\<open>fun filtered_input str =
 ML \<open>filtered_input "inductive | for"\<close>
 text \<open>
 If you now parse
 @{ML_matchresult_fake [display,gray]
-"filtered_input \"inductive | for\""
+\<open>filtered_input "inductive | for"\<close>
-"[Token (_,(Command, \"inductive\"),_),
+\<open>[Token (_,(Command, "inductive"),_),
-Token (_,(Keyword, \"|\"),_),
+Token (_,(Keyword, "|"),_),
-Token (_,(Keyword, \"for\"),_)]"}
+Token (_,(Keyword, "for"),_)]\<close>}
 you obtain a list consisting of only one command and two keyword tokens.
 If you want to see which keywords and commands are currently known to Isabelle,
 use the function @{ML_ind get_keywords' in Thy_Header}:
 see the complete list.
 The parser @{ML_ind "$$$" in Parse} parses a single keyword. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val input1 = filtered_input \"where for\"
+val input1 = filtered_input "where for"
-val input2 = filtered_input \"| in\"
+val input2 = filtered_input "| in"
 in
-(Parse.$$$ \"where\" input1, Parse.$$$ \"|\" input2)
+(Parse.$$$ "where" input1, Parse.$$$ "|" input2)
-end"
+end\<close>
-"((\"where\",_), (\"|\",_))"}
+\<open>(("where",_), ("|",_))\<close>}
 Any non-keyword string can be parsed with the function @{ML_ind reserved in Parse}.
 For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val p = Parse.reserved \"bar\"
+val p = Parse.reserved "bar"
-val input = filtered_input \"bar\"
+val input = filtered_input "bar"
 in
 p input
-end"
+end\<close>
-"(\"bar\",[])"}
+\<open>("bar",[])\<close>}
-Like before, you can sequentially connect parsers with @{ML "--"}. For example:
+Like before, you can sequentially connect parsers with @{ML \<open>--\<close>}. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val input = filtered_input \"| in\"
+val input = filtered_input "| in"
 in
-(Parse.$$$ \"|\" -- Parse.$$$ \"in\") input
+(Parse.$$$ "|" -- Parse.$$$ "in") input
-end"
+end\<close>
-"((\"|\", \"in\"), [])"}
+\<open>(("|", "in"), [])\<close>}
-The parser @{ML "Parse.enum s p" for s p} parses a possibly empty
+The parser @{ML \<open>Parse.enum s p\<close> for s p} parses a possibly empty
 list of items recognised by the parser \<open>p\<close>, where the items being parsed
 are separated by the string \<open>s\<close>. For example:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val input = filtered_input \"in | in | in foo\"
+val input = filtered_input "in | in | in foo"
 in
-(Parse.enum \"|\" (Parse.$$$ \"in\")) input
+(Parse.enum "|" (Parse.$$$ "in")) input
-end"
+end\<close>
-"([\"in\", \"in\", \"in\"], [_])"}
+\<open>(["in", "in", "in"], [_])\<close>}
 The function @{ML_ind enum1 in Parse} works similarly, except that the
 parsed list must be non-empty. Note that we had to add a string @{text
 [quotes] "foo"} at the end of the parsed string, otherwise the parser would
 have consumed all tokens and then failed with the exception \<open>MORE\<close>. Like in the previous section, we can avoid this exception using the
 wrapper @{ML Scan.finite}. This time, however, we have to use the
 ``stopper-token'' @{ML Token.stopper}. We can write:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
-val input = filtered_input \"in | in | in\"
+val input = filtered_input "in | in | in"
-val p = Parse.enum \"|\" (Parse.$$$ \"in\")
+val p = Parse.enum "|" (Parse.$$$ "in")
 in
 Scan.finite Token.stopper p input
-end"
+end\<close>
-"([\"in\", \"in\", \"in\"], [])"}
+\<open>(["in", "in", "in"], [])\<close>}
 The following function will help to run examples.
 \<close>
 ML %grayML\<open>fun parse p input = Scan.finite Token.stopper (Scan.error p) input\<close>
 text \<open>
 The function @{ML_ind "!!!" in Parse} can be used to force termination
-of the parser in case of a dead end, just like @{ML "Scan.!!"} (see previous
+of the parser in case of a dead end, just like @{ML \<open>Scan.!!\<close>} (see previous
-section). A difference, however, is that the error message of @{ML
+section). A difference, however, is that the error message of @{ML \<open>Parse.!!!\<close>} is fixed to be @{text [quotes] "Outer syntax error"}
-"Parse.!!!"} is fixed to be @{text [quotes] "Outer syntax error"}
 together with a relatively precise description of the failure. For example:
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = filtered_input \"in |\"
+val input = filtered_input "in |"
-val parse_bar_then_in = Parse.$$$ \"|\" -- Parse.$$$ \"in\"
+val parse_bar_then_in = Parse.$$$ "|" -- Parse.$$$ "in"
 in
 parse (Parse.!!! parse_bar_then_in) input
-end"
+end\<close>
-"Exception ERROR \"Outer syntax error: keyword \"|\" expected,
+\<open>Exception ERROR "Outer syntax error: keyword "|" expected,
-but keyword in was found\" raised"
+but keyword in was found" raised\<close>
 }
 \begin{exercise} (FIXME)
 A type-identifier, for example @{typ "'a"}, is a token of
 kind @{ML_ind Keyword in Token}. It can be parsed using
 text \<open>
 where we pretend the parsed string starts on line 7. An example is
 @{ML_matchresult_fake [display,gray]
-"filtered_input' \"foo \\n bar\""
+\<open>filtered_input' "foo \\n bar"\<close>
-"[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), _),
+\<open>[Token (("foo", ({line=7, end_line=7}, {line=7})), (Ident, "foo"), _),
-Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), _)]"}
+Token (("bar", ({line=8, end_line=8}, {line=8})), (Ident, "bar"), _)]\<close>}
 in which the @{text [quotes] "\\n"} causes the second token to be in
 line 8.
 By using the parser @{ML position in Parse} you can access the token
 position and return it as part of the parser result. For example
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = filtered_input' \"where\"
+val input = filtered_input' "where"
 in
-parse (Parse.position (Parse.$$$ \"where\")) input
+parse (Parse.position (Parse.$$$ "where")) input
-end"
+end\<close>
-"((\"where\", {line=7, end_line=7}), [])"}
+\<open>(("where", {line=7, end_line=7}), [])\<close>}
 \begin{readmore}
 The functions related to positions are implemented in the file
 @{ML_file "Pure/General/position.ML"}.
 \end{readmore}
 There is usually no need to write your own parser for parsing inner syntax, that is
 for terms and  types: you can just call the predefined parsers. Terms can
 be parsed using the function @{ML_ind term in Parse}. For example:
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none \"foo\"
+val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none "foo"
 in
 Parse.term input
-end"
+end\<close>
-"(\"<markup>\", [])"}
+\<open>("<markup>", [])\<close>}
 The function @{ML_ind prop in Parse} is similar, except that it gives a different
 error message, when parsing fails. As you can see, the parser not just returns
 the parsed string, but also some markup information. You can decode the
 information with the function @{ML_ind parse in YXML} in @{ML_structure YXML}.
 The result of the decoding is an XML-tree. You can see better what is going on if
-you replace @{ML Position.none} by @{ML "Position.line 42"}, say:
+you replace @{ML Position.none} by @{ML \<open>Position.line 42\<close>}, say:
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
-val input = Token.explode (Thy_Header.get_keywords' @{context}) (Position.line 42) \"foo\"
+val input = Token.explode (Thy_Header.get_keywords' @{context}) (Position.line 42) "foo"
 in
 YXML.parse (fst (Parse.term input))
-end"
+end\<close>
-"Elem (\"token\", [(\"line\", \"42\"), (\"end_line\", \"42\")], [XML.Text \"foo\"])"}
+\<open>Elem ("token", [("line", "42"), ("end_line", "42")], [XML.Text "foo"])\<close>}
 The positional information is stored as part of an XML-tree so that code
 called later on will be able to give more precise error messages.
 \begin{readmore}
 To see what the parser returns, let us parse the string corresponding to the
 definition of @{term even} and @{term odd}:
 @{ML_matchresult [display,gray]
-"let
+\<open>let
 val input = filtered_input
-(\"even and odd \" ^
+("even and odd " ^
-\"where \" ^
+"where " ^
-\"  even0[intro]: \\\"even 0\\\" \" ^
+"  even0[intro]: \"even 0\" " ^
-\"| evenS[intro]: \\\"odd n \<Longrightarrow> even (Suc n)\\\" \" ^
+"| evenS[intro]: \"odd n \<Longrightarrow> even (Suc n)\" " ^
-\"| oddS[intro]:  \\\"even n \<Longrightarrow> odd (Suc n)\\\"\")
+"| oddS[intro]:  \"even n \<Longrightarrow> odd (Suc n)\"")
 in
 parse spec_parser input
-end"
+end\<close>
-"(([(even, NONE, NoSyn), (odd, NONE, NoSyn)],
+\<open>(([(even, NONE, NoSyn), (odd, NONE, NoSyn)],
 [((even0,_),_),
 ((evenS,_),_),
-((oddS,_),_)]), [])"}
+((oddS,_),_)]), [])\<close>}
 \<close>
 ML \<open>let
 val input = filtered_input
 "foo::\"int \<Rightarrow> bool\" and bar::nat (\"BAR\" 100) and blonk"
 The function @{ML_ind "vars" in Parse} in Line 2 of the parser reads an
 \isacommand{and}-separated
 list of variables that can include optional type annotations and syntax translations.
 For example:\footnote{Note that in the code we need to write
-\<open>\"int \<Rightarrow> bool\"\<close> in order to properly escape the double quotes
+\<open>"int \<Rightarrow> bool"\<close> in order to properly escape the double quotes
 in the compound type.}
 @{ML_matchresult_fake [display,gray]
-"let
+\<open>let
 val input = filtered_input
-\"foo::\\\"int \<Rightarrow> bool\\\" and bar::nat (\\\"BAR\\\" 100) and blonk\"
+"foo::\"int \<Rightarrow> bool\" and bar::nat (\"BAR\" 100) and blonk"
 in
 parse Parse.vars input
-end"
+end\<close>
-"([(foo, SOME _, NoSyn),
+\<open>([(foo, SOME _, NoSyn),
-(bar, SOME _, Mixfix (Source {text=\"BAR\",...}, [], 100, _)),
+(bar, SOME _, Mixfix (Source {text="BAR",...}, [], 100, _)),
-(blonk, NONE, NoSyn)],[])"}
+(blonk, NONE, NoSyn)],[])\<close>}
 \<close>
 text \<open>
 Whenever types are given, they are stored in the @{ML SOME}s. The types are
 not yet used to type the variables: this must be done by type-inference later
 list of introduction rules, that is propositions with theorem annotations
 such as rule names and attributes. The introduction rules are propositions
 parsed by @{ML_ind prop in Parse}. However, they can include an optional
 theorem name plus some attributes. For example
-@{ML_matchresult [display,gray] "let
+@{ML_matchresult [display,gray] \<open>let
-val input = filtered_input \"foo_lemma[intro,dest!]:\"
+val input = filtered_input "foo_lemma[intro,dest!]:"
-val ((name, attrib), _) = parse (Parse_Spec.thm_name \":\") input
+val ((name, attrib), _) = parse (Parse_Spec.thm_name ":") input
 in
 (name, map Token.name_of_src attrib)
-end" "(foo_lemma, [(\"intro\", _), (\"dest\", _)])"}
+end\<close> \<open>(foo_lemma, [("intro", _), ("dest", _)])\<close>}
 The function @{ML_ind opt_thm_name in Parse_Spec} is the ``optional'' variant of
 @{ML_ind thm_name in Parse_Spec}. Theorem names can contain attributes. The name
 has to end with @{text [quotes] ":"}---see the argument of
 the function @{ML Parse_Spec.opt_thm_name} in Line 7.
 The crucial part of a command is the function that determines the
 behaviour of the command. In the code above we used a
 ``do-nothing''-function, which because of the parser @{ML_ind succeed in Scan}
 does not parse any argument, but immediately returns the simple
-function @{ML "Local_Theory.background_theory I"}. We can replace
+function @{ML \<open>Local_Theory.background_theory I\<close>}. We can replace
 this code by a function that first parses a proposition (using the
 parser @{ML Parse.prop}), then prints out some tracing information
 (using the function \<open>trace_prop\<close>) and finally does
 nothing. For this you can write:
 \<close>
 \isacommand{definition} and \isacommand{declare}.  In other cases, commands
 are expected to parse some arguments, for example a proposition, and then
 ``open up'' a proof in order to prove the proposition (for example
 \isacommand{lemma}) or prove some other properties (for example
 \isacommand{function}). To achieve this kind of behaviour, you have to use
-the kind indicator @{ML_ind thy_goal in Keyword} and the function @{ML
+the kind indicator @{ML_ind thy_goal in Keyword} and the function @{ML \<open>local_theory_to_proof\<close> in Outer_Syntax} to set up the command.
-"local_theory_to_proof" in Outer_Syntax} to set up the command.
 Below we show the command \isacommand{foobar\_goal} which takes a
 proposition as argument and then starts a proof in order to prove
 it. Therefore, we need to announce this command in the header
 as \<open>thy_goal\<close>.
 The function \<open>goal_prop\<close> in Lines 2 to 7 takes a string (the proposition to be
 proved) and a context as argument.  The context is necessary in order to be able to use
 @{ML_ind read_prop in Syntax}, which converts a string into a proper proposition.
 In Line 6 the function @{ML_ind theorem in Proof} starts the proof for the
 proposition. Its argument @{ML NONE} stands for a locale (which we chose to
-omit); the argument @{ML "(K I)"} stands for a function that determines what
+omit); the argument @{ML \<open>(K I)\<close>} stands for a function that determines what
 should be done with the theorem once it is proved (we chose to just forget
 about it).
 If you now type \isacommand{foobar\_goal}~@{text [quotes] "True \<and> True"},
 you obtain the following proof state:

changeset 569	f875a25aa72d
parent 567	f7c97e64cc2a
child 572	438703674711