--- a/CookBook/Parsing.thy Fri Nov 28 05:56:28 2008 +0100
+++ b/CookBook/Parsing.thy Sat Nov 29 21:20:18 2008 +0000
@@ -94,7 +94,7 @@
Parsers that explore
alternatives can be constructed using the function @{ML "(op ||)"}. For example, the
- parser @{ML_open "(p || q)" for p q} returns the result of @{ML_text "p"}, in case it succeeds,
+ parser @{ML "(p || q)" for p q} returns the result of @{ML_text "p"}, in case it succeeds,
otherwise it returns the result of @{ML_text "q"}. For example
@{ML_response [display]
@@ -121,7 +121,7 @@
end"
"((\"e\", [\"l\", \"l\", \"o\"]),(\"h\", [\"l\", \"l\", \"o\"]))"}
- The parser @{ML_open "Scan.optional p x" for p x} returns the result of the parser
+ The parser @{ML "Scan.optional p x" for p x} returns the result of the parser
@{ML_text "p"}, if it succeeds; otherwise it returns
the default value @{ML_text "x"}. For example
@@ -152,14 +152,14 @@
followed by @{ML_text q}, or start a completely different parser @{ML_text r},
one might write
- @{ML_open [display] "(p -- q) || r" for p q r}
+ @{ML [display] "(p -- q) || r" for p q r}
However, this parser is problematic for producing an appropriate error message, in case
- the parsing of @{ML_open "(p -- q)" for p q} fails. Because in that case one loses with the parser
+ the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses with the parser
above the information
that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
which @{ML_text p}
- is present in the input, but not @{ML_text q}. That means @{ML_open "(p -- q)" for p q} will fail
+ is present in the input, but not @{ML_text q}. That means @{ML "(p -- q)" for p q} will fail
and the
alternative parser @{ML_text r} will be tried. However in many circumstance this will be the wrong
parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then
@@ -191,7 +191,7 @@
This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"}
(FIXME: give reference to later place).
- Returning to our example of parsing @{ML_open "(p -- q) || r" for p q r}. If we want
+ Returning to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want
to generate the correct error message for p-followed-by-q, then
we have to write:
*}
@@ -219,7 +219,7 @@
yields the expected parsing.
- The function @{ML_open "Scan.repeat p" for p} will apply a parser @{ML_text p} as
+ The function @{ML "Scan.repeat p" for p} will apply a parser @{ML_text p} as
often as it succeeds. For example
@{ML_response [display] "Scan.repeat ($$ \"h\") (explode \"hhhhello\")"
@@ -241,7 +241,7 @@
in practise, because it is already done by the infrastructure for you.
After parsing succeeded, one nearly always wants to apply a function on the parsed
- items. This is done using the function @{ML_open "(p >> f)" for p f} which runs
+ items. This is done using the function @{ML "(p >> f)" for p f} which runs
first the parser @{ML_text p} and upon successful completion applies the
function @{ML_text f} to the result. For example
@@ -281,14 +281,17 @@
\end{readmore}
The structure @{ML_struct OuterLex} defines several kinds of token (for example
- @{ML "OuterLex.Ident"} for identifiers, @{ML "OuterLex.Keyword"} for keywords and
- @{ML "OuterLex.Command"} for commands). Some token parsers take into account the
+ @{ML "Ident" in OuterLex} for identifiers, @{ML "Keyword" in OuterLex} for keywords and
+ @{ML "Command" in OuterLex} for commands). Some token parsers take into account the
kind of token.
-
- We can generate a token list out of a string using the function @{ML
- "OuterSyntax.scan"}, which we give below @{ML "Position.none"} as argument
- since, at the moment, we are not interested in generating precise error
- messages. The following
+*}
+
+text {*
+ For the examples below, we can generate a token list out of a string using
+ the function @{ML "OuterSyntax.scan"}, which we give below @{ML
+ "Position.none"} as argument since, at the moment, we are not interested in
+ generating precise error messages. The following
+
@{ML_response_fake [display] "OuterSyntax.scan Position.none \"hello world\""
"[Token (\<dots>,(Ident, \"hello\"),\<dots>),
@@ -302,7 +305,7 @@
the token.} The second indicates a space.
Many parsing functions later on will require spaces, comments and the like
- to have been filtered out. In what follows below, we are going to use the
+ to have already been filtered out. So from now on we are going to use the
functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example
@@ -367,20 +370,20 @@
end"
"((\"|\",\"in\"),[])"}
- The parser @{ML_open "OuterParse.enum s p" for s p} parses a possibly empty
- list of items recognised by the parser @{ML_text p}, where the items are
- separated by the string @{ML_text s}. For example
+ The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty
+ list of items recognised by the parser @{ML_text p}, where the items being parsed
+ are separated by the string @{ML_text s}. For example
@{ML_response [display]
"let
- val input = filtered_input \"in | in | in end\"
+ val input = filtered_input \"in | in | in foo\"
in
(OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
end"
"([\"in\",\"in\",\"in\"],[\<dots>])"}
@{ML "OuterParse.enum1"} works similarly, except that the parsed list must
- be non-empty. Note that we had to add a @{ML_text [quotes] "end"} at the end
+ be non-empty. Note that we had to add an @{ML_text [quotes] "foo"} at the end
of the parsed string, otherwise the parser would have consumed all tokens
and then failed with the exception @{ML_text "MORE"}. Like in the previous
section, we can avoid this exception using the wrapper @{ML
@@ -391,10 +394,21 @@
"let
val input = filtered_input \"in | in | in\"
in
- Scan.finite OuterLex.stopper (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
+ Scan.finite OuterLex.stopper
+ (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
end"
"([\"in\",\"in\",\"in\"],[])"}
+ The following function will help us to run examples
+
+*}
+
+ML {*
+fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input
+*}
+
+text {*
+
The function @{ML "OuterParse.!!!"} can be used to force termination of the
parser in case of a dead end, just like @{ML "Scan.!!"} (see previous section),
except that the error message is fixed to be @{text [quotes] "Outer syntax error"}
@@ -405,14 +419,22 @@
val input = filtered_input \"in |\"
val parse_bar_then_in = OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\"
in
- Scan.error (OuterParse.!!! parse_bar_then_in) input
+ parse (OuterParse.!!! parse_bar_then_in) input
end"
"Exception ERROR \"Outer syntax error: keyword \"|\" expected,
but keyword in was found\" raised"
}
+ \begin{exercise}
+ A type-identifier, for example @{typ "'a"}, is a token of
+ kind @{ML "Keyword" in OuterLex} can be parsed
+
+ \end{exercise}
+
+
*}
+
section {* Positional Information *}
text {*