372 "Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")" |
372 "Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")" |
373 "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"} |
373 "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"} |
374 |
374 |
375 \footnote{\bf FIXME: In which situations is @{text "lift"} useful? Give examples.} |
375 \footnote{\bf FIXME: In which situations is @{text "lift"} useful? Give examples.} |
376 |
376 |
377 Be aware of recursive parsers. Suppose you want to read strings |
377 Be aware of recursive parsers. Suppose you want to read strings separated by |
378 separated by commas and by parentheses into a tree datastructure. |
378 commas and by parentheses into a tree datastructure; for example, generating |
379 We assume the trees are represented by the datatype: |
379 the tree corresponding to the string @{text [quotes] "(A, A), (A, A)"} where |
|
380 the @{text "A"}s will be the leaves. We assume the trees are represented by the |
|
381 datatype: |
380 *} |
382 *} |
381 |
383 |
382 ML{*datatype tree = |
384 ML{*datatype tree = |
383 Lf of string |
385 Lf of string |
384 | Br of tree * tree*} |
386 | Br of tree * tree*} |
388 the string @{text [quotes] "((A))"} should be read into a single |
390 the string @{text [quotes] "((A))"} should be read into a single |
389 leaf---you might implement the following parser. |
391 leaf---you might implement the following parser. |
390 *} |
392 *} |
391 |
393 |
392 ML{*fun parse_basic s = |
394 ML{*fun parse_basic s = |
393 $$ s >> Lf |
395 $$ s >> Lf || $$ "(" |-- parse_tree s --| $$ ")" |
394 || $$ "(" |-- parse_tree s --| $$ ")" |
396 |
395 and parse_tree s = |
397 and parse_tree s = |
396 parse_basic s --| $$ "," -- parse_tree s >> Br |
398 parse_basic s --| $$ "," -- parse_tree s >> Br || parse_basic s*} |
397 || parse_basic s*} |
399 |
398 |
400 text {* |
399 text {* |
401 This parser corrsponds to the grammar: |
|
402 |
|
403 \begin{center} |
|
404 \begin{tabular}{lcl} |
|
405 @{text "<Basic>"} & @{text "::="} & @{text "<String> | (<Tree>)"}\\ |
|
406 @{text "<Tree>"} & @{text "::="} & @{text "<Basic>, <Tree> | <Basic>"}\\ |
|
407 \end{tabular} |
|
408 \end{center} |
|
409 |
400 The parameter @{text "s"} is the string over which the tree is parsed. The |
410 The parameter @{text "s"} is the string over which the tree is parsed. The |
401 parser @{ML parse_basic} reads either a leaf or a tree enclosed in |
411 parser @{ML parse_basic} reads either a leaf or a tree enclosed in |
402 parentheses. The parser @{ML parse_tree} reads either a pair of trees |
412 parentheses. The parser @{ML parse_tree} reads either a pair of trees |
403 separated by a comma, or acts like @{ML parse_basic}. Unfortunately, |
413 separated by a comma, or acts like @{ML parse_basic}. Unfortunately, |
404 because of the mutual recursion, this parser will immediately run into a |
414 because of the mutual recursion, this parser will immediately run into a |
410 |
420 |
411 raises an exception indicating that the stack limit is reached. Such |
421 raises an exception indicating that the stack limit is reached. Such |
412 looping parser are not useful, because of ML's strict evaluation of |
422 looping parser are not useful, because of ML's strict evaluation of |
413 arguments. Therefore we need to delay the execution of the |
423 arguments. Therefore we need to delay the execution of the |
414 parser until an input is given. This can be done by adding the parsed |
424 parser until an input is given. This can be done by adding the parsed |
415 string as an explicit argument. |
425 string as an explicit argument. So the parser above should be implemented |
|
426 as follows. |
416 *} |
427 *} |
417 |
428 |
418 ML{*fun parse_basic s xs = |
429 ML{*fun parse_basic s xs = |
419 ($$ s >> Lf |
430 ($$ s >> Lf || $$ "(" |-- parse_tree s --| $$ ")") xs |
420 || $$ "(" |-- parse_tree s --| $$ ")") xs |
431 |
421 and parse_tree s xs = |
432 and parse_tree s xs = |
422 (parse_basic s --| $$ "," -- parse_tree s >> Br |
433 (parse_basic s --| $$ "," -- parse_tree s >> Br || parse_basic s) xs*} |
423 || parse_basic s) xs*} |
|
424 |
434 |
425 text {* |
435 text {* |
426 While the type of the parser is unchanged by the addition, its behaviour |
436 While the type of the parser is unchanged by the addition, its behaviour |
427 changed: with this version of the parser the execution is delayed until |
437 changed: with this version of the parser the execution is delayed until |
428 some string is applied for the argument @{text "xs"}. This gives us |
438 some string is applied for the argument @{text "xs"}. This gives us |