diff -r 0b9fa606a746 -r 748d9c1a32fb CookBook/Parsing.thy --- a/CookBook/Parsing.thy Wed Feb 18 17:17:37 2009 +0000 +++ b/CookBook/Parsing.thy Thu Feb 19 01:09:16 2009 +0000 @@ -326,6 +326,12 @@ "s1 ^ s ^ s2" for s1 s2 s}. \end{exercise} + The function @{ML Scan.ahead} parses some input, but leaves the original + input unchanged. For example: + + @{ML_response [display,gray] + "Scan.ahead (Scan.this_string \"foo\") (explode \"foo\")" + "(\"foo\", [\"f\", \"o\", \"o\"])"} The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies the given parser to the second component of the pair and leaves the first component @@ -500,29 +506,78 @@ (FIXME: or give parser for numbers) + Whenever there is a possibility that the processing of user input can fail, + it is a good idea to give as much information about where the error + occured. For this Isabelle can attach positional information to tokens + and then thread this information up the processing chain. To see this, + modify the function @{ML filtered_input} described earlier to be *} - - - - -section {* Positional Information *} +ML{*fun filtered_input' str = + filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *} text {* + where we pretend the parsed string starts on line 7. An example is - @{ML OuterParse.position} +@{ML_response_fake [display,gray] +"filtered_input' \"foo \\n bar\"" +"[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), \), + Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), \)]"} + + in which the @{text [quotes] "\\n"} causes the second token to be in + line 8. + + Now by using the parser @{ML OuterParse.position} you can decode the positional + information and return it as part of the parsed input. For example + +@{ML_response_fake [display,gray] +"let + val input = (filtered_input' \"where\") +in + parse (OuterParse.position (OuterParse.$$$ \"where\")) input +end" +"((\"where\", {line=7, end_line=7}), [])"} + + \begin{readmore} + The functions related to positions are implemented in the file + @{ML_file "Pure/General/position.ML"}. + \end{readmore} *} section {* Parsing Inner Syntax *} -ML{*let - val input = OuterSyntax.scan Position.none "0" +text {* + There is usually no need to write your own parser for parsing inner syntax, that is + for terms and types: you can just call the pre-defined parsers. Terms can + be parsed using the function @{ML OuterParse.term}. For example: + +@{ML_response [display,gray] +"let + val input = OuterSyntax.scan Position.none \"foo\" in - OuterParse.prop input -end*} + OuterParse.term input +end" +"(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"} + + The function @{ML OuterParse.prop} is similar, except that it gives a different + error message, when parsing fails. Looking closer at the result string you will + have noticed that the parser not just returns the parsed string, but also some + encoded positional information. You can see this better if you replace + @{ML Position.none} by @{ML "(Position.line 42)"}, say. -text {* (FIXME funny output for a proposition) *} +@{ML_response [display,gray] +"let + val input = OuterSyntax.scan (Position.line 42) \"foo\" +in + OuterParse.term input +end" +"(\"\\^E\\^Ftoken\\^Fline=42\\^Fend_line=42\\^Efoo\\^E\\^F\\^E\", [])"} + + The positional information is stored so that code called later will be + able to give more precise error messages. + +*} section {* Parsing Specifications\label{sec:parsingspecs} *}