diff -r 4cc6df387ce9 -r 8ad407e77ea0 ProgTutorial/Parsing.thy --- a/ProgTutorial/Parsing.thy Sun Nov 15 13:47:31 2009 +0100 +++ b/ProgTutorial/Parsing.thy Tue Nov 17 18:40:11 2009 +0100 @@ -372,7 +372,70 @@ "Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")" "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"} - (FIXME: In which situations is this useful? Give examples.) + \footnote{\bf FIXME: In which situations is @{text "lift"} useful? Give examples.} + + Be aware of recursive parsers. Suppose you want to read strings + separated by commas and by parentheses into a tree. We assume + the trees are represented by the datatype: +*} + +ML{*datatype tree = + Lf of string + | Br of tree * tree*} + +text {* + Since nested parentheses should be treated in a meaningful way---for example + the string @{text [quotes] "((A))"} should be read into a single + leaf---you might implement the following parser. +*} + +ML{*fun parse_basic s = + $$ s >> Lf + || $$ "(" |-- parse_tree s --| $$ ")" +and parse_tree s = + parse_basic s --| $$ "," -- parse_tree s >> Br + || parse_basic s*} + +text {* + The parameter @{text "s"} is the string over which the tree is parsed. The + parser @{ML parse_basic} reads either a leaf or a tree enclosed in + parentheses. The parser @{ML parse_tree} reads either a pair of trees + separated by a comma, or acts like @{ML parse_basic}. Unfortunately, + because of the mutual recursion, this parser will immediately run into a + loop, even if it is called without any input. For example + + @{ML_response_fake_both [display, gray] + "parse_tree \"A\"" + "*** Exception- TOPLEVEL_ERROR raised"} + + raises an exception indicating that the stack limit is reached. Such + looping parser are not useful at all, because of ML's strict evaluation of + arguments. Therefore we need to delay the execution of the + parser until an input is given. This can be done by adding the parsed + string as an explicit argument. +*} + +ML{*fun parse_basic s xs = + ($$ s >> Lf + || $$ "(" |-- parse_tree s --| $$ ")") xs +and parse_tree s xs = + (parse_basic s --| $$ "," -- parse_tree s >> Br + || parse_basic s) xs*} + +text {* + While the type of the parser is unchanged by the addition, its behaviour + changed: with this version of the parser the execution is delayed until + some string is applied for the argument @{text "xs"}. This gives us + exactly the parser what we wanted. An example is as follows: + + @{ML_response [display, gray] + "let + val input = Symbol.explode \"(A,((A))),A\" +in + Scan.finite Symbol.stopper (parse_tree \"A\") input +end" + "(Br (Br (Lf \"A\", Lf \"A\"), Lf \"A\"), [])"} + \begin{exercise}\label{ex:scancmts} Write a parser that parses an input string so that any comment enclosed