--- a/ProgTutorial/Parsing.thy Tue Sep 29 22:10:48 2009 +0200
+++ b/ProgTutorial/Parsing.thy Thu Oct 01 10:19:21 2009 +0200
@@ -20,14 +20,14 @@
\begin{readmore}
The library for writing parser combinators is split up, roughly, into two
- parts. The first part consists of a collection of generic parser combinators
+ parts: the first part consists of a collection of generic parser combinators
defined in the structure @{ML_struct Scan} in the file @{ML_file
"Pure/General/scan.ML"}. The second part of the library consists of
combinators for dealing with specific token types, which are defined in the
structure @{ML_struct OuterParse} in the file @{ML_file
- "Pure/Isar/outer_parse.ML"}. Specific parsers for packages are defined in
- @{ML_file "Pure/Isar/spec_parse.ML"}. Parsers for method arguments are
- defined in @{ML_file "Pure/Isar/args.ML"}.
+ "Pure/Isar/outer_parse.ML"}. In addition, specific parsers for packages are
+ defined in @{ML_file "Pure/Isar/spec_parse.ML"}. Parsers for method arguments
+ are defined in @{ML_file "Pure/Isar/args.ML"}.
\end{readmore}
*}
@@ -73,7 +73,7 @@
In the examples above we use the function @{ML_ind Symbol.explode}, instead of the
more standard library function @{ML_ind explode}, for obtaining an input list for
- the parser. The reason is that @{ML_ind Symbol.explode} is aware of character sequences,
+ the parser. The reason is that @{ML_ind Symbol.explode} is aware of character sequences,
for example @{text "\<foo>"}, that have a special meaning in Isabelle. To see
the difference consider
@@ -114,7 +114,7 @@
Note how the result of consumed strings builds up on the left as nested pairs.
If, as in the previous example, you want to parse a particular string,
- then you should use the function @{ML_ind this_string in Scan}:
+ then you can use the function @{ML_ind this_string in Scan}:
@{ML_response [display,gray]
"Scan.this_string \"hell\" (Symbol.explode \"hello\")"
@@ -176,16 +176,23 @@
(p input1, p input2)
end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
- The function @{ML_ind "!!"} helps to produce appropriate error messages
- for parsing. For example if you want to parse @{text p} immediately
+ The function @{ML_ind ahead in Scan} parses some input, but leaves the original
+ input unchanged. For example:
+
+ @{ML_response [display,gray]
+ "Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")"
+ "(\"foo\", [\"f\", \"o\", \"o\"])"}
+
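+ Since @{ML ahead in Scan} does not consume any input, a subsequent parser
+ still sees the complete input. For example (a small illustration using the
+ parsers from above):
+
+ @{ML_response [display,gray]
+ "(Scan.ahead ($$ \"f\") -- Scan.this_string \"foo\") (Symbol.explode \"foo\")"
+ "((\"f\", \"foo\"), [])"}
+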
+ The function @{ML_ind "!!"} helps with producing appropriate error messages
+ during parsing. For example, if you want to parse @{text p} immediately
followed by @{text q}, or start a completely different parser @{text r},
you might write:
@{ML [display,gray] "(p -- q) || r" for p q r}
- However, this parser is problematic for producing an appropriate error
- message, if the parsing of @{ML "(p -- q)" for p q} fails. Because in that
- case you lose the information that @{text p} should be followed by @{text q}.
+ However, this parser is problematic for producing a useful error
+ message if the parsing of @{ML "(p -- q)" for p q} fails, because with the
+ parser above you lose the information that @{text p} should be followed by @{text q}.
To see this assume that @{text p} is present in the input, but it is not
followed by @{text q}. That means @{ML "(p -- q)" for p q} will fail and
hence the alternative parser @{text r} will be tried. However, in many
@@ -222,8 +229,8 @@
(see Section~\ref{sec:newcommand} which explains this function in more detail).
Let us now return to our example of parsing @{ML "(p -- q) || r" for p q
- r}. If you want to generate the correct error message for
- @{text "p"}-followed-by-@{text "q"}, then you have to write:
+ r}. If you want to generate the correct error message for the failure
+ of parsing @{text "p"}-followed-by-@{text "q"}, then you have to write:
*}
ML{*fun p_followed_by_q p q r =
@@ -260,14 +267,14 @@
succeeds at least once.
Also note that the parser would have aborted with the exception @{text MORE}, if
- you had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using
+ you had run it with the string @{text [quotes] "hhhh"}. This can be avoided by using
the wrapper @{ML_ind finite in Scan} and the ``stopper-token''
@{ML_ind stopper in Symbol}. With them you can write:
@{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")"
"([\"h\", \"h\", \"h\", \"h\"], [])"}
- @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings;
+ The function @{ML stopper in Symbol} is the ``end-of-input'' indicator for parsing strings;
other stoppers need to be used when parsing, for example, tokens. However, this kind of
- manually wrapping is often already done by the surrounding infrastructure.
+ manual wrapping is often already done by the surrounding infrastructure.
@@ -325,8 +332,9 @@
@{ML_response [display,gray]
"let
fun double (x, y) = (x ^ x, y ^ y)
+ val parser = $$ \"h\" -- $$ \"e\"
in
- (($$ \"h\") -- ($$ \"e\") >> double) (Symbol.explode \"hello\")
+ (parser >> double) (Symbol.explode \"hello\")
end"
"((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"}
@@ -343,15 +351,6 @@
where the single-character strings in the parsed output are transformed
back into one string.
-
- (FIXME: move to an earlier place)
-
- The function @{ML_ind ahead in Scan} parses some input, but leaves the original
- input unchanged. For example:
-
- @{ML_response [display,gray]
- "Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")"
- "(\"foo\", [\"f\", \"o\", \"o\"])"}
The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. This function applies
the given parser to the second component of the pair and leaves the first component
@@ -415,8 +414,8 @@
@{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
other syntactic category. The second indicates a space.
- We can easily change what is recognised as a keyword with
- @{ML_ind keyword in OuterKeyword}. For example calling this function
+ We can easily change what is recognised as a keyword with the function
+ @{ML_ind keyword in OuterKeyword}. For example, calling it with
*}
ML{*val _ = OuterKeyword.keyword "hello"*}
@@ -517,20 +516,20 @@
end"
"([\"in\", \"in\", \"in\"], [\<dots>])"}
- @{ML_ind enum1 in OuterParse} works similarly, except that the parsed list must
- be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the
- end of the parsed string, otherwise the parser would have consumed all
- tokens and then failed with the exception @{text "MORE"}. Like in the
- previous section, we can avoid this exception using the wrapper @{ML
- Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML
- OuterLex.stopper}. We can write:
+ The function @{ML_ind enum1 in OuterParse} works similarly, except that the
+ parsed list must be non-empty. Note that we had to add a string @{text
+ [quotes] "foo"} at the end of the parsed string, otherwise the parser would
+ have consumed all tokens and then failed with the exception @{text
+ "MORE"}. Like in the previous section, we can avoid this exception using the
+ wrapper @{ML Scan.finite}. This time, however, we have to use the
+ ``stopper-token'' @{ML OuterLex.stopper}. We can write:
@{ML_response [display,gray]
"let
val input = filtered_input \"in | in | in\"
+ val p = OuterParse.enum \"|\" (OuterParse.$$$ \"in\")
in
- Scan.finite OuterLex.stopper
- (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
+ Scan.finite OuterLex.stopper p input
end"
"([\"in\", \"in\", \"in\"], [])"}
@@ -540,11 +539,10 @@
ML{*fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input *}
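+
+text {*
+ As a small check of this wrapper (an illustration only, reusing the parsers
+ from the previous example), we can write:
+
+ @{ML_response [display,gray]
+ "parse (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) (filtered_input \"in | in | in\")"
+ "([\"in\", \"in\", \"in\"], [])"}
+*}
+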
text {*
-
- The function @{ML_ind "!!!" in OuterParse} can be used to force termination of the
- parser in case of a dead end, just like @{ML "Scan.!!"} (see previous section).
- Except that the error message of @{ML "OuterParse.!!!"} is fixed to be
- @{text [quotes] "Outer syntax error"}
+ The function @{ML_ind "!!!" in OuterParse} can be used to force termination
+ of the parser in case of a dead end, just like @{ML "Scan.!!"} (see previous
+ section). A difference, however, is that the error message of @{ML
+ "OuterParse.!!!"} is fixed to be @{text [quotes] "Outer syntax error"}
together with a relatively precise description of the failure. For example:
@{ML_response_fake [display,gray]
@@ -569,8 +567,8 @@
Whenever there is a possibility that the processing of user input can fail,
it is a good idea to give all available information about where the error
occurred. For this Isabelle can attach positional information to tokens
- and then thread this information up the processing chain. To see this,
- modify the function @{ML filtered_input} described earlier to
+ and then thread this information up the ``processing chain''. To see this,
+ modify the function @{ML filtered_input}, described earlier, as follows
*}
ML{*fun filtered_input' str =
@@ -587,8 +585,8 @@
in which the @{text [quotes] "\\n"} causes the second token to be in
line 8.
- By using the parser @{ML OuterParse.position} you can access the token position
- and return it as part of the parser result. For example
+ By using the parser @{ML position in OuterParse} you can access the token
+ position and return it as part of the parser result. For example
@{ML_response_fake [display,gray]
"let
@@ -605,13 +603,34 @@
*}
+section {* Parsers for ML-Code (TBD) *}
+
text {*
- (FIXME: there are also handy parsers for ML-expressions and ML-files)
+ @{ML_ind ML_source in OuterParse}
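+
+ This parser is used, for example, later in this chapter in the
+ \isacommand{foobar\_prove} command, where it is combined with
+ @{ML_ind opt_thm_name in SpecParse} as in
+
+ @{ML [display,gray] "SpecParse.opt_thm_name \":\" -- OuterParse.ML_source"}
+
+ in order to parse an optional theorem name followed by some ML code.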
*}
section {* Context Parser (TBD) *}
text {*
+ @{ML_ind Args.context}
+*}
+(*
+ML {*
+let
+ val parser = Args.context -- Scan.lift Args.name_source
+
+ fun term_pat (ctxt, str) =
+ str |> Syntax.read_prop ctxt
+in
+ (parser >> term_pat) (Context.Proof @{context}, filtered_input "f (a::nat)")
+ |> fst
+end
+*}
+*)
+
+text {*
+ @{ML_ind Args.context}
+
- Used for example in \isacommand{attribute\_setup} and \isacommand{method\_setup}.
+ Context parsers are used, for example, in \isacommand{attribute\_setup}
+ and \isacommand{method\_setup}.
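+
+ A simple context parser is, for instance,
+
+ @{ML [display,gray] "Args.context -- Scan.lift Args.name_source"}
+
+ which gives you the context together with the input as an unparsed string.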
*}
@@ -622,7 +641,7 @@
text {*
There is usually no need to write your own parser for parsing inner syntax, that is
for terms and types: you can just call the predefined parsers. Terms can
- be parsed using the function @{ML_ind term in OuterParse}. For example:
+ be parsed using the function @{ML_ind term in OuterParse}. For example:
@{ML_response [display,gray]
"let
@@ -632,10 +651,10 @@
end"
"(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"}
- The function @{ML_ind prop in OuterParse} is similar, except that it gives a different
- error message, when parsing fails. As you can see, the parser not just returns
+ The function @{ML_ind prop in OuterParse} is similar, except that it gives a different
+ error message when parsing fails. As you can see, the parser returns not just
the parsed string, but also some encoded information. You can decode the
- information with the function @{ML_ind parse in YXML}. For example
+ information with the function @{ML_ind parse in YXML} in @{ML_struct YXML}. For example
@{ML_response [display,gray]
"YXML.parse \"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\""
@@ -954,16 +973,17 @@
@{text [display] "$ isabelle emacs -k foobar a_theory_file"}
If you now build a theory on top of @{text "Command.thy"},
- then the command \isacommand{foobar} can be used. You can just write
+ then you can use the command \isacommand{foobar}. You can just write
*}
foobar
text {*
but you will not see any action as we chose to implement this command to do
- nothing. A similarly procedure has to be done with any other new command, and
- also any new keyword that is introduced with @{ML_ind OuterKeyword.keyword}.
-
+ nothing. The point of this command is only to show how to interact with
+ ProofGeneral. A similar procedure is necessary for any other new command,
+ and also for any new keyword that is introduced with
+ @{ML_ind OuterKeyword.keyword}.
*}
ML{*val _ = OuterKeyword.keyword "blink" *}
@@ -1081,7 +1101,6 @@
end
val parser = SpecParse.opt_thm_name ":" -- OuterParse.ML_source
-
in
OuterSyntax.local_theory_to_proof "foobar_prove" "proving a proposition"
OuterKeyword.thy_goal (parser >> setup_proof)