polished parser section
authorChristian Urban <urbanc@in.tum.de>
Thu, 19 Feb 2009 01:09:16 +0000 (2009-02-19)
changeset 125 748d9c1a32fb
parent 124 0b9fa606a746
child 126 fcc0e6e54dca
polished parser section
CookBook/Parsing.thy
cookbook.pdf
--- a/CookBook/Parsing.thy	Wed Feb 18 17:17:37 2009 +0000
+++ b/CookBook/Parsing.thy	Thu Feb 19 01:09:16 2009 +0000
@@ -326,6 +326,12 @@
   "s1 ^ s ^ s2" for s1 s2 s}.
   \end{exercise}
 
+  The function @{ML Scan.ahead} parses some input, but leaves the original
+  input unchanged. For example:
+
+  @{ML_response [display,gray]
+  "Scan.ahead (Scan.this_string \"foo\") (explode \"foo\")" 
+  "(\"foo\", [\"f\", \"o\", \"o\"])"} 
 
   The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies
   the given parser to the second component of the pair and leaves the  first component 
@@ -500,29 +506,78 @@
 
   (FIXME: or give parser for numbers)
 
+  Whenever there is a possibility that the processing of user input can fail, 
+  it is a good idea to give as much information about where the error 
+  occured. For this Isabelle can attach positional information to tokens
+  and then thread this information up the processing chain. To see this,
+  modify the function @{ML filtered_input} described earlier to be
 *}
 
-
-
-
-
-section {* Positional Information *}
+ML{*fun filtered_input' str = 
+       filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *}
 
 text {*
+  where we pretend the parsed string starts on line 7. An example is
 
-  @{ML OuterParse.position}
+@{ML_response_fake [display,gray]
+"filtered_input' \"foo \\n bar\""
+"[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), \<dots>),
+ Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), \<dots>)]"}
+
+  in which the @{text [quotes] "\\n"} causes the second token to be in 
+  line 8.
+
+  Now by using the parser @{ML OuterParse.position} you can decode the positional
+  information and return it as part of the parsed input. For example
+
+@{ML_response_fake [display,gray]
+"let
+  val input = (filtered_input' \"where\")
+in 
+  parse (OuterParse.position (OuterParse.$$$ \"where\")) input
+end"
+"((\"where\", {line=7, end_line=7}), [])"}
+
+  \begin{readmore}
+  The functions related to positions are implemented in the file
+  @{ML_file "Pure/General/position.ML"}.
+  \end{readmore}
 
 *}
 
 section {* Parsing Inner Syntax *}
 
-ML{*let 
-  val input = OuterSyntax.scan Position.none "0"
+text {*
+  There is usually no need to write your own parser for parsing inner syntax, that is 
+  for terms and  types: you can just call the pre-defined parsers. Terms can 
+  be parsed using the function @{ML OuterParse.term}. For example:
+
+@{ML_response [display,gray]
+"let 
+  val input = OuterSyntax.scan Position.none \"foo\"
 in 
-  OuterParse.prop input
-end*}
+  OuterParse.term input
+end"
+"(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"}
+
+  The function @{ML OuterParse.prop} is similar, except that it gives a different
+  error message, when parsing fails. Looking closer at the result string you will
+  have noticed that the parser not just returns the parsed string, but also some 
+  encoded positional information. You can see this better if you replace 
+  @{ML Position.none} by @{ML "(Position.line 42)"}, say.
 
-text {* (FIXME funny output for a proposition) *}
+@{ML_response [display,gray]
+"let 
+  val input = OuterSyntax.scan (Position.line 42) \"foo\"
+in 
+  OuterParse.term input
+end"
+"(\"\\^E\\^Ftoken\\^Fline=42\\^Fend_line=42\\^Efoo\\^E\\^F\\^E\", [])"}
+  
+  The positional information is stored so that code called later will be
+  able to give more precise error messages. 
+
+*}
 
 section {* Parsing Specifications\label{sec:parsingspecs} *}
 
Binary file cookbook.pdf has changed