--- a/ProgTutorial/Parsing.thy Sun Nov 15 13:47:31 2009 +0100
+++ b/ProgTutorial/Parsing.thy Tue Nov 17 18:40:11 2009 +0100
@@ -372,7 +372,70 @@
"Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")"
"((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}
- (FIXME: In which situations is this useful? Give examples.)
+ \footnote{\bf FIXME: In which situations is @{text "lift"} useful? Give examples.}
+
+ Be aware of recursive parsers. Suppose you want to read strings
+ separated by commas and by parentheses into a tree. We assume
+ the trees are represented by the datatype:
+*}
+
+ML{*datatype tree =
+ Lf of string
+ | Br of tree * tree*}
+
+text {*
+ Since nested parentheses should be treated in a meaningful way---for example
+ the string @{text [quotes] "((A))"} should be read into a single
+ leaf---you might implement the following parser.
+*}
+
+ML{*fun parse_basic s =
+ $$ s >> Lf
+ || $$ "(" |-- parse_tree s --| $$ ")"
+and parse_tree s =
+ parse_basic s --| $$ "," -- parse_tree s >> Br
+ || parse_basic s*}
+
+text {*
+ The parameter @{text "s"} is the string over which the tree is parsed. The
+ parser @{ML parse_basic} reads either a leaf or a tree enclosed in
+ parentheses. The parser @{ML parse_tree} reads either a pair of trees
+ separated by a comma, or acts like @{ML parse_basic}. Unfortunately,
+ because of the mutual recursion, this parser will immediately run into a
+ loop, even if it is called without any input. For example
+
+ @{ML_response_fake_both [display, gray]
+ "parse_tree \"A\""
+ "*** Exception- TOPLEVEL_ERROR raised"}
+
+ raises an exception indicating that the stack limit is reached. Such
+ looping parser are not useful at all, because of ML's strict evaluation of
+ arguments. Therefore we need to delay the execution of the
+ parser until an input is given. This can be done by adding the parsed
+ string as an explicit argument.
+*}
+
+ML{*fun parse_basic s xs =
+ ($$ s >> Lf
+ || $$ "(" |-- parse_tree s --| $$ ")") xs
+and parse_tree s xs =
+ (parse_basic s --| $$ "," -- parse_tree s >> Br
+ || parse_basic s) xs*}
+
+text {*
+ While the type of the parser is unchanged by the addition, its behaviour
+ changed: with this version of the parser the execution is delayed until
+ some string is applied for the argument @{text "xs"}. This gives us
+ exactly the parser what we wanted. An example is as follows:
+
+ @{ML_response [display, gray]
+ "let
+ val input = Symbol.explode \"(A,((A))),A\"
+in
+ Scan.finite Symbol.stopper (parse_tree \"A\") input
+end"
+ "(Br (Br (Lf \"A\", Lf \"A\"), Lf \"A\"), [])"}
+
\begin{exercise}\label{ex:scancmts}
Write a parser that parses an input string so that any comment enclosed