ProgTutorial/Parsing.thy
changeset 572 438703674711
parent 569 f875a25aa72d
child 573 321e220a6baa
--- a/ProgTutorial/Parsing.thy	Fri May 17 11:21:09 2019 +0200
+++ b/ProgTutorial/Parsing.thy	Tue May 21 14:37:39 2019 +0200
@@ -69,9 +69,9 @@
   or raise the exception \<open>FAIL\<close> if no string can be consumed. For
   example trying to parse
 
-  @{ML_matchresult_fake [display,gray] 
+  @{ML_response [display,gray] 
   \<open>($$ "x") (Symbol.explode "world")\<close> 
-  \<open>Exception FAIL raised\<close>}
+  \<open>exception FAIL NONE raised\<close>}
   
   will raise the exception \<open>FAIL\<close>. The function @{ML_ind "$$" in Scan} will also
   fail if you try to consume more than a single character.
@@ -97,14 +97,13 @@
   sequences, for example \<open>\<foo>\<close>, that have a special meaning in
   Isabelle. To see the difference consider
 
-@{ML_matchresult_fake [display,gray]
-\<open>let 
-  val input = "\<foo> bar"
-in
-  (String.explode input, Symbol.explode input)
-end\<close>
-\<open>(["\\", "<", "f", "o", "o", ">", " ", "b", "a", "r"],
- ["\<foo>", " ", "b", "a", "r"])\<close>}
+@{ML_matchresult [display,gray]
+\<open>String.explode "\<foo> bar"\<close>
+\<open>[#"\\", #"<", #"f", #"o", #"o", #">", #" ", #"b", #"a", #"r"]\<close>}
+
+@{ML_matchresult [display,gray]
+\<open>Symbol.explode "\<foo> bar"\<close>
+\<open>["\<foo>", " ", "b", "a", "r"]\<close>}
 
   Slightly more general than the parser @{ML \<open>$$\<close>} is the function 
   @{ML_ind one in Scan}, in that it takes a predicate as argument and 
@@ -236,16 +235,16 @@
 
   but if you invoke it on @{text [quotes] "world"}
 
-  @{ML_matchresult_fake [display,gray] \<open>(!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "world")\<close>
-                               \<open>Exception ABORT raised\<close>}
+  @{ML_response [display,gray] \<open>(!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "world")\<close>
+                               \<open>exception ABORT fn raised\<close>}
 
   then the parsing aborts and the error message \<open>foo\<close> is printed. In order to
   see the error message properly, you need to prefix the parser with the function 
   @{ML_ind error in Scan}. For example:
 
-  @{ML_matchresult_fake [display,gray] 
+  @{ML_response [display,gray] 
   \<open>Scan.error (!! (fn _ => fn _ => "foo") ($$ "h")) (Symbol.explode "world")\<close>
-  \<open>Exception Error "foo" raised\<close>}
+  \<open>foo\<close>}
 
   This kind of ``prefixing'' to see the correct error message is usually done by wrappers 
   such as @{ML_ind local_theory in Outer_Syntax} 
@@ -269,8 +268,8 @@
   @{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and 
   the input @{text [quotes] "holle"} 
 
-  @{ML_matchresult_fake [display,gray] \<open>Scan.error (p_followed_by_q "h" "e" "w") (Symbol.explode "holle")\<close>
-                               \<open>Exception ERROR "h is not followed by e" raised\<close>} 
+  @{ML_response [display,gray] \<open>Scan.error (p_followed_by_q "h" "e" "w") (Symbol.explode "holle")\<close>
+                               \<open>h is not followed by e\<close>} 
 
   produces the correct error message. Running it with
  
@@ -318,9 +317,9 @@
 
   The function @{ML_ind unless in Scan} takes two parsers: if the first one can 
   parse the input, then the whole parser fails; if not, then the second is tried. Therefore
-  
-  @{ML_matchresult_fake_both [display,gray] \<open>Scan.unless ($$ "h") ($$ "w") (Symbol.explode "hello")\<close>
-                               \<open>Exception FAIL raised\<close>}
+
+  @{ML_response [display,gray] \<open>Scan.unless ($$ "h") ($$ "w") (Symbol.explode "hello")\<close>
+                               "exception FAIL NONE raised"}
 
   fails, while
 
@@ -501,11 +500,11 @@
   messages. The following code
 
 
-@{ML_matchresult_fake [display,gray] \<open>Token.explode 
+@{ML_response [display,gray] \<open>Token.explode 
   (Thy_Header.get_keywords' @{context}) Position.none "hello world"\<close>
-\<open>[Token (_,(Ident, "hello"),_), 
- Token (_,(Space, " "),_), 
- Token (_,(Ident, "world"),_)]\<close>}
+\<open>[Token (\<dots>(Ident, "hello"),\<dots>), 
+ Token (\<dots>(Space, " "),\<dots>), 
+ Token (\<dots>(Ident, "world"),\<dots>)]\<close>}
 
   produces three tokens where the first and the last are identifiers, since
   @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
@@ -523,24 +522,24 @@
 text \<open>
   then lexing @{text [quotes] "hello world"} will produce
 
-  @{ML_matchresult_fake [display,gray] \<open>Token.explode 
+  @{ML_response [display,gray] \<open>Token.explode
   (Thy_Header.get_keywords' @{context}) Position.none "hello world"\<close> 
-\<open>[Token (_,(Keyword, "hello"),_), 
- Token (_,(Space, " "),_), 
- Token (_,(Ident, "world"),_)]\<close>}
+\<open>[Token (\<dots>(Keyword, "hello"),\<dots>), 
+ Token (\<dots>(Space, " "),\<dots>), 
+ Token (\<dots>(Ident, "world"),\<dots>)]\<close>}
 
   Many parsing functions later on will require white space, comments and the like
   to have already been filtered out.  So from now on we are going to use the 
   functions @{ML filter} and @{ML_ind is_proper in Token} to do this. 
   For example:
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let
    val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none "hello world"
 in
    filter Token.is_proper input
 end\<close> 
-\<open>[Token (_,(Ident, "hello"), _), Token (_,(Ident, "world"), _)]\<close>}
+\<open>[Token (\<dots>(Keyword, "hello"), \<dots>), Token (\<dots>(Ident, "world"), \<dots>)]\<close>}
 
   For convenience we define the function:
 \<close>
@@ -552,11 +551,11 @@
 text \<open>
   If you now parse
 
-@{ML_matchresult_fake [display,gray] 
+@{ML_response [display,gray] 
 \<open>filtered_input "inductive | for"\<close> 
-\<open>[Token (_,(Command, "inductive"),_), 
- Token (_,(Keyword, "|"),_), 
- Token (_,(Keyword, "for"),_)]\<close>}
+\<open>[Token (\<dots>(Command, "inductive"),\<dots>), 
+ Token (\<dots>(Keyword, "|"),\<dots>), 
+ Token (\<dots>(Keyword, "for"),\<dots>)]\<close>}
 
   you obtain a list consisting of only one command and two keyword tokens.
   If you want to see which keywords and commands are currently known to Isabelle, 
@@ -637,15 +636,15 @@
   section). A difference, however, is that the error message of @{ML \<open>Parse.!!!\<close>} is fixed to be @{text [quotes] "Outer syntax error"}
   together with a relatively precise description of the failure. For example:
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let 
   val input = filtered_input "in |"
   val parse_bar_then_in = Parse.$$$ "|" -- Parse.$$$ "in"
 in 
   parse (Parse.!!! parse_bar_then_in) input 
 end\<close>
-\<open>Exception ERROR "Outer syntax error: keyword "|" expected, 
-but keyword in was found" raised\<close>
+\<open>Outer syntax error: keyword "|" expected, 
+but keyword in was found\<close>
 }
 
   \begin{exercise} (FIXME)
@@ -669,10 +668,10 @@
 text \<open>
   where we pretend the parsed string starts on line 7. An example is
 
-@{ML_matchresult_fake [display,gray]
-\<open>filtered_input' "foo \\n bar"\<close>
-\<open>[Token (("foo", ({line=7, end_line=7}, {line=7})), (Ident, "foo"), _),
- Token (("bar", ({line=8, end_line=8}, {line=8})), (Ident, "bar"), _)]\<close>}
+@{ML_response [display,gray]
+\<open>filtered_input' "foo \n bar"\<close>
+\<open>[Token (("foo", ({line=7, offset=1, end_offset=4}, {line=7, offset=4})), (Ident, "foo"),\<dots>),
+ Token (("bar", ({line=8, offset=7, end_offset=10}, {line=8, offset=10})), (Ident, "bar"),\<dots>)]\<close>}
 
   in which the @{text [quotes] "\\n"} causes the second token to be in 
   line 8.
@@ -680,13 +679,13 @@
   By using the parser @{ML position in Parse} you can access the token 
   position and return it as part of the parser result. For example
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let
   val input = filtered_input' "where"
 in 
   parse (Parse.position (Parse.$$$ "where")) input
 end\<close>
-\<open>(("where", {line=7, end_line=7}), [])\<close>}
+\<open>(("where", {line=7, offset=1, end_offset=6}), [])\<close>}
 
   \begin{readmore}
   The functions related to positions are implemented in the file
@@ -751,7 +750,7 @@
   for terms and  types: you can just call the predefined parsers. Terms can 
   be parsed using the function @{ML_ind term in Parse}. For example:
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let 
   val input = Token.explode (Thy_Header.get_keywords' @{context}) Position.none "foo"
 in 
@@ -767,13 +766,13 @@
   The result of the decoding is an XML-tree. You can see better what is going on if
   you replace @{ML Position.none} by @{ML \<open>Position.line 42\<close>}, say:
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let 
   val input = Token.explode (Thy_Header.get_keywords' @{context}) (Position.line 42) "foo"
 in 
   YXML.parse (fst (Parse.term input))
 end\<close>
-\<open>Elem ("token", [("line", "42"), ("end_line", "42")], [XML.Text "foo"])\<close>}
+\<open><input delimited="false" line="42" offset="1" end_offset="4">foo</input>\<close>}
 
   The positional information is stored as part of an XML-tree so that code 
   called later on will be able to give more precise error messages. 
@@ -847,13 +846,6 @@
     ((oddS,_),_)]), [])\<close>}
 \<close>
 
-ML \<open>let
-  val input = filtered_input 
-        "foo::\"int \<Rightarrow> bool\" and bar::nat (\"BAR\" 100) and blonk"
-in
-  parse Parse.vars input
-end\<close>
-
 text \<open>
   As you see, the result is a pair consisting of a list of
   variables with optional type-annotation and syntax-annotation, and a list of
@@ -866,16 +858,16 @@
   \<open>"int \<Rightarrow> bool"\<close> in order to properly escape the double quotes
   in the compound type.}
 
-@{ML_matchresult_fake [display,gray]
+@{ML_response [display,gray]
 \<open>let
   val input = filtered_input 
         "foo::\"int \<Rightarrow> bool\" and bar::nat (\"BAR\" 100) and blonk"
 in
   parse Parse.vars input
 end\<close>
-\<open>([(foo, SOME _, NoSyn), 
-  (bar, SOME _, Mixfix (Source {text="BAR",...}, [], 100, _)), 
-  (blonk, NONE, NoSyn)],[])\<close>}  
+\<open>([("foo", SOME \<dots>, NoSyn), 
+  ("bar", SOME \<dots>, Mixfix (Source {\<dots>text = "BAR"}, [], 100, \<dots>)), 
+  ("blonk", NONE, NoSyn)], [])\<close>}  
 \<close>
 
 text \<open>
@@ -953,11 +945,7 @@
 
 text \<open>
   Often new commands, for example for providing new definitional principles,
-  need to be implemented. While this is not difficult on the ML-level and for
-  jEdit, in order to be backwards compatible, new commands need also to be recognised 
-  by Proof-General. This results in some subtle configuration issues, which we will 
-  explain in the next section. Here we just describe how to define new commands
-  to work with jEdit.
+  need to be implemented. 
 
   Let us start with a ``silly'' command that does nothing at all. We
   shall name this command \isacommand{foobar}.  Before you can
@@ -1000,9 +988,7 @@
 
 text \<open>
   but of course you will not see anything since \isacommand{foobar} is
-  not intended to do anything.  Remember, however, that this only
-  works in jEdit. In order to enable also Proof-General recognise this
-  command, a keyword file needs to be generated (see next section).
+  not intended to do anything.
 
   As it stands, the command \isacommand{foobar} is not very useful. Let
   us refine it a bit next by letting it take a proposition as argument