CookBook/Parsing.thy
changeset 58 f3794c231898
parent 56 126646f2aa88
child 59 b5914f3c643c
--- a/CookBook/Parsing.thy	Tue Dec 16 17:37:39 2008 +0100
+++ b/CookBook/Parsing.thy	Tue Dec 16 17:28:05 2008 +0000
@@ -46,20 +46,20 @@
   @{ML_response [display] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
 
   This function will either succeed (as in the two examples above) or raise the exception 
-  @{ML_text "FAIL"} if no string can be consumed. For example trying to parse
+  @{text "FAIL"} if no string can be consumed. For example trying to parse
 
   @{ML_response_fake [display] "($$ \"x\") (explode \"world\")" 
                                "Exception FAIL raised"}
   
-  will raise the exception @{ML_text "FAIL"}.
+  will raise the exception @{text "FAIL"}.
   There are three exceptions used in the parsing combinators:
 
   \begin{itemize}
-  \item @{ML_text "FAIL"} is used to indicate that alternative routes of parsing 
+  \item @{text "FAIL"} is used to indicate that alternative routes of parsing 
   might be explored. 
-  \item @{ML_text "MORE"} indicates that there is not enough input for the parser. For example 
-  in @{ML_text "($$ \"h\") []"}.
-  \item @{ML_text "ABORT"} is the exception which is raised when a dead end is reached. 
+  \item @{text "MORE"} indicates that there is not enough input for the parser. For example 
+  in @{text "($$ \"h\") []"}.
+  \item @{text "ABORT"} is the exception which is raised when a dead end is reached. 
   It is used for example in the function @{ML "(op !!)"} (see below).
   \end{itemize}
 
@@ -69,7 +69,7 @@
   Slightly more general than the parser @{ML "(op $$)"} is the function @{ML
   Scan.one}, in that it takes a predicate as argument and then parses exactly
   one item from the input list satisfying this predicate. For example the
-  following parser either consumes an @{ML_text [quotes] "h"} or a @{ML_text
+  following parser either consumes an @{text [quotes] "h"} or a @{text
   [quotes] "w"}:
 
 
@@ -84,7 +84,7 @@
     "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
 
   Two parser can be connected in sequence by using the function @{ML "(op --)"}. 
-  For example parsing @{ML_text "h"}, @{ML_text "e"} and @{ML_text "l"} in this 
+  For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this 
   sequence can be achieved by
 
   @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")"
@@ -100,8 +100,8 @@
 
   Parsers that explore alternatives can be constructed using the function @{ML
   "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the
-  result of @{ML_text "p"}, in case it succeeds, otherwise it returns the
-  result of @{ML_text "q"}. For example
+  result of @{text "p"}, in case it succeeds, otherwise it returns the
+  result of @{text "q"}. For example
 
 
 @{ML_response [display] 
@@ -129,8 +129,8 @@
   "((\"e\", [\"l\", \"l\", \"o\"]),(\"h\", [\"l\", \"l\", \"o\"]))"}
 
   The parser @{ML "Scan.optional p x" for p x} returns the result of the parser 
-  @{ML_text "p"}, if it succeeds; otherwise it returns 
-  the default value @{ML_text "x"}. For example
+  @{text "p"}, if it succeeds; otherwise it returns 
+  the default value @{text "x"}. For example
 
 @{ML_response [display]
 "let 
@@ -155,8 +155,8 @@
 end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
 
   The function @{ML "(op !!)"} helps to produce appropriate error messages
-  during parsing. For example if one wants to parse that @{ML_text p} is immediately 
-  followed by @{ML_text q}, or start a completely different parser @{ML_text r},
+  during parsing. For example if one wants to parse that @{text p} is immediately 
+  followed by @{text q}, or start a completely different parser @{text r},
   one might write
 
   @{ML [display] "(p -- q) || r" for p q r}
@@ -164,31 +164,31 @@
   However, this parser is problematic for producing an appropriate error message, in case
   the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses with the parser
   above the information 
-  that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
-  which @{ML_text p} 
-  is present in the input, but not @{ML_text q}. That means @{ML "(p -- q)" for p q} will fail 
+  that @{text p} should be followed by @{text q}. To see this consider the case in
+  which @{text p} 
+  is present in the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail 
   and the 
-  alternative parser @{ML_text r} will be tried. However in many circumstance this will be the wrong
+  alternative parser @{text r} will be tried. However in many circumstance this will be the wrong
   parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then 
   caused by the
-  failure of @{ML_text r}, not by the absence of @{ML_text q} in the input. This kind of situation
+  failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation
   can be avoided by using the function @{ML "(op !!)"}. This function aborts the whole process of
   parsing in case of a failure and invokes an error message. For example if we invoke the parser
   
   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
 
-  on @{ML_text [quotes] "hello"}, the parsing succeeds
+  on @{text [quotes] "hello"}, the parsing succeeds
 
   @{ML_response [display] 
                 "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"hello\")" 
                 "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
 
-  but if we invoke it on @{ML_text [quotes] "world"}
+  but if we invoke it on @{text [quotes] "world"}
   
   @{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
                                "Exception ABORT raised"}
 
-  the parsing aborts and the error message @{ML_text "foo"} is printed out. In order to
+  the parsing aborts and the error message @{text "foo"} is printed out. In order to
   see the error message properly, we need to prefix the parser with the function 
   @{ML "Scan.error"}. For example
 
@@ -226,18 +226,18 @@
   
   yields the expected parsing. 
 
-  The function @{ML "Scan.repeat p" for p} will apply a parser @{ML_text p} as 
+  The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as 
   often as it succeeds. For example
   
   @{ML_response [display] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")" 
                 "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
   
   Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function
-  @{ML "Scan.repeat1"} is similar, but requires that the parser @{ML_text "p"} 
+  @{ML "Scan.repeat1"} is similar, but requires that the parser @{text "p"} 
   succeeds at least once.
 
-  Also note that the parser would have aborted with the exception @{ML_text MORE}, if
-  we had run it only on just @{ML_text [quotes] "hhhh"}. This can be avoided using
+  Also note that the parser would have aborted with the exception @{text MORE}, if
+  we had run it only on just @{text [quotes] "hhhh"}. This can be avoided using
   the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With
   them we can write
   
@@ -277,7 +277,7 @@
 
   The functions @{ML Scan.repeat} and @{ML Scan.unless} can be combined to read any
   input until a certain marker symbol is reached. In the example below the marker
-  symbol is a @{text [quotes] "*"}:
+  symbol is a @{text [quotes] "*"} which causes the parser to stop:
 
   @{ML_response [display]
 "let 
@@ -293,8 +293,8 @@
 
   After parsing succeeded, one nearly always wants to apply a function on the parsed 
   items. This is done using the function @{ML "(p >> f)" for p f} which runs 
-  first the parser @{ML_text p} and upon successful completion applies the 
-  function @{ML_text f} to the result. For example
+  first the parser @{text p} and upon successful completion applies the 
+  function @{text f} to the result. For example
 
 @{ML_response [display]
 "let 
@@ -307,10 +307,11 @@
   doubles the two parsed input strings.
  
   \begin{exercise}\label{ex:scancmts}
-  Write a parser parses an input string so that any comment enclosed inside
-  @{text "(*\<dots>*)"} is enclosed inside @{text "(**\<dots>**)"} in the output
-  string. To enclose a string, you can use the function @{ML "enclose s1 s2 s"
-  for s1 s2 s} which produces the string @{ML "s1^s^s2" for s1 s2 s}.
+  Write a parser that parses an input string so that any comment enclosed
+  inside @{text "(*\<dots>*)"} is replaced by a the same comment but enclosed inside
+  @{text "(**\<dots>**)"} in the output string. To enclose a string, you can use the
+  function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
+  "s1^s^s2" for s1 s2 s}.
   \end{exercise}
 
 
@@ -358,7 +359,7 @@
  Token (\<dots>,(Ident, \"world\"),\<dots>)]"}
 
   produces three tokens where the first and the last are identifiers, since
-  @{ML_text [quotes] "hello"} and @{ML_text [quotes] "world"} do not match any
+  @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
   other syntactic category.\footnote{Note that because of a possible a bug in
   the PolyML runtime system the result is printed as @{text "?"}, instead of
   the token.} The second indicates a space.
@@ -430,8 +431,8 @@
 "((\"|\",\"in\"),[])"}
 
   The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty 
-  list of items recognised by the parser @{ML_text p}, where the items being parsed
-  are separated by the string @{ML_text s}. For example
+  list of items recognised by the parser @{text p}, where the items being parsed
+  are separated by the string @{text s}. For example
 
 @{ML_response [display]
 "let 
@@ -442,9 +443,9 @@
 "([\"in\",\"in\",\"in\"],[\<dots>])"}
 
   @{ML "OuterParse.enum1"} works similarly, except that the parsed list must
-  be non-empty. Note that we had to add an @{ML_text [quotes] "foo"} at the end
+  be non-empty. Note that we had to add an @{text [quotes] "foo"} at the end
   of the parsed string, otherwise the parser would have consumed all tokens
-  and then failed with the exception @{ML_text "MORE"}. Like in the previous
+  and then failed with the exception @{text "MORE"}. Like in the previous
   section, we can avoid this exception using the wrapper @{ML
   Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML
   OuterLex.stopper}. We can write