# HG changeset patch # User Christian Urban # Date 1229414924 0 # Node ID 126646f2aa88f1de1b6db471089f2f50e6c57015 # Parent 0b55402ae95e5745c14402efe2ad24367de3b011 added a para on Scan.unless and an exercise about scanning comments diff -r 0b55402ae95e -r 126646f2aa88 CookBook/Parsing.thy --- a/CookBook/Parsing.thy Mon Dec 15 10:48:27 2008 +0100 +++ b/CookBook/Parsing.thy Tue Dec 16 08:08:44 2008 +0000 @@ -92,10 +92,17 @@ Note how the result of consumed strings builds up on the left as nested pairs. - Parsers that explore - alternatives can be constructed using the function @{ML "(op ||)"}. For example, the - parser @{ML "(p || q)" for p q} returns the result of @{ML_text "p"}, in case it succeeds, - otherwise it returns the result of @{ML_text "q"}. For example + If, as in the previous example, one wants to parse a particular string, + then one should rather use the function @{ML Scan.this_string}: + + @{ML_response [display] "Scan.this_string \"hel\" (explode \"hello\")" + "(\"hel\", [\"l\", \"o\"])"} + + Parsers that explore alternatives can be constructed using the function @{ML + "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the + result of @{ML_text "p"}, in case it succeeds, otherwise it returns the + result of @{ML_text "q"}. For example + @{ML_response [display] "let @@ -240,6 +247,50 @@ However, this kind of manually wrapping needs to be done only very rarely in practise, because it is already done by the infrastructure for you. + The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any + string as in + + @{ML_response [display] +"let + val p = Scan.repeat (Scan.one Symbol.not_eof) + val input = (explode \"foo bar foo\") +in + Scan.finite Symbol.stopper p input +end" +"([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"} + + where the function @{ML Symbol.not_eof} ensures that we do not read beyond the + end of the input string. + + The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can + parse the input then the whole parser fails; if not, then the second is tried. Therefore + + @{ML_response_fake_both [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")" + "Exception FAIL raised"} + + fails, while + + @{ML_response [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")" + "(\"w\",[\"o\", \"r\", \"l\", \"d\"])"} + + succeeds. + + The functions @{ML Scan.repeat} and @{ML Scan.unless} can be combined to read any + input until a certain marker symbol is reached. In the example below the marker + symbol is a @{text [quotes] "*"}: + + @{ML_response [display] +"let + val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof)) + val input1 = (explode \"fooooo\") + val input2 = (explode \"foo*ooo\") +in + (Scan.finite Symbol.stopper p input1, + Scan.finite Symbol.stopper p input2) +end" +"(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []), + ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"} + After parsing succeeded, one nearly always wants to apply a function on the parsed items. This is done using the function @{ML "(p >> f)" for p f} which runs first the parser @{ML_text p} and upon successful completion applies the @@ -255,6 +306,14 @@ doubles the two parsed input strings. + \begin{exercise}\label{ex:scancmts} + Write a parser parses an input string so that any comment enclosed inside + @{text "(*\*)"} is enclosed inside @{text "(**\**)"} in the output + string. To enclose a string, you can use the function @{ML "enclose s1 s2 s" + for s1 s2 s} which produces the string @{ML "s1^s^s2" for s1 s2 s}. + \end{exercise} + + The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies the given parser to the second component of the pair and leaves the first component untouched. For example diff -r 0b55402ae95e -r 126646f2aa88 CookBook/Solutions.thy --- a/CookBook/Solutions.thy Mon Dec 15 10:48:27 2008 +0100 +++ b/CookBook/Solutions.thy Tue Dec 16 08:08:44 2008 +0000 @@ -23,4 +23,34 @@ HOLogic.mk_nat (HOLogic.dest_nat t1 + HOLogic.dest_nat t2) *} +text {* \solution{ex:scancmts} *} + +ML {* +val any = Scan.one (Symbol.not_eof); + +val scan_cmt = + let + val begin_cmt = Scan.this_string "(*" + val end_cmt = Scan.this_string "*)" + in + begin_cmt |-- Scan.repeat (Scan.unless end_cmt any) --| end_cmt + >> (enclose "(**" "**)" o implode) + end + +val scan_all = + Scan.finite Symbol.stopper (Scan.repeat (scan_cmt || any)) + >> implode #> fst +*} + +text {* + By using @{ML_text "#> fst"} in the last line, the function + @{ML scan_all} retruns a string instead of a pair. For example: + + @{ML_response [display] "scan_all (explode \"foo bar\")" "\"foo bar\""} + + @{ML_response [display] "scan_all (explode \"foo (*test*) bar (*test*)\")" + "\"foo (**test**) bar (**test**)\""} + +*} + end \ No newline at end of file diff -r 0b55402ae95e -r 126646f2aa88 cookbook.pdf Binary file cookbook.pdf has changed