ProgTutorial/Parsing.thy
changeset 230 8def50824320
parent 229 abc7f90188af
child 236 7b6d81ff9d9a
--- a/ProgTutorial/Parsing.thy	Tue Apr 07 17:04:39 2009 +0100
+++ b/ProgTutorial/Parsing.thy	Tue Apr 07 23:59:39 2009 +0100
@@ -2,7 +2,6 @@
 imports Base "Package/Simple_Inductive_Package"
 begin
 
-
 chapter {* Parsing *}
 
 text {*
@@ -359,9 +358,7 @@
 text {*
   The reason for using token parsers is that theory syntax, as well as the
   parsers for the arguments of proof methods, use the type @{ML_type
-  OuterLex.token} (which is identical to the type @{ML_type
-  OuterParse.token}).  However, there are also handy parsers for
-  ML-expressions and ML-files.
+  OuterLex.token}.
 
   \begin{readmore}
   The parser functions for the theory syntax are contained in the structure
@@ -369,17 +366,17 @@
   The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
   \end{readmore}
 
-  The structure @{ML_struct OuterLex} defines several kinds of tokens (for example 
-  @{ML "Ident" in OuterLex} for identifiers, @{ML "Keyword" in OuterLex} for keywords and
-  @{ML "Command" in OuterLex} for commands). Some token parsers take into account the 
-  kind of tokens.
-*}  
+  The structure @{ML_struct OuterLex} defines several kinds of tokens (for
+  example @{ML "Ident" in OuterLex} for identifiers, @{ML "Keyword" in
+  OuterLex} for keywords and @{ML "Command" in OuterLex} for commands). Some
+  token parsers take into account the kind of tokens. The first example shows
+  how to generate a token list out of a string using the function @{ML
+  "OuterSyntax.scan"}. It is given the argument @{ML "Position.none"} since,
+  at the moment, we are not interested in generating precise error
+  messages. The following code\footnote{Note that because of a possible bug in
+  the PolyML runtime system, the result is printed as @{text [quotes] "?"},
+  instead of the tokens.}
 
-text {*
-  The first example shows how to generate a token list out of a string using
-  the function @{ML "OuterSyntax.scan"}. It is given the argument @{ML "Position.none"}
-  since, at the moment, we are not interested in generating
-  precise error messages. The following code
 
 @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" 
 "[Token (\<dots>,(Ident, \"hello\"),\<dots>), 
@@ -388,9 +385,21 @@
 
   produces three tokens where the first and the last are identifiers, since
   @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
-  other syntactic category.\footnote{Note that because of a possible bug in
-  the PolyML runtime system, the result is printed as @{text [quotes] "?"}, instead of
-  the tokens.} The second indicates a space.
+  other syntactic category. The second indicates a space.
+
+  We can easily change what is recognised as a keyword with 
+  @{ML OuterKeyword.keyword}. For example calling this function 
+*}
+
+ML{*val _ = OuterKeyword.keyword "hello"*}
+
+text {*
+  then lexing @{text [quotes] "hello world"} will produce
+
+  @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" 
+"[Token (\<dots>,(Keyword, \"hello\"),\<dots>), 
+ Token (\<dots>,(Space, \" \"),\<dots>), 
+ Token (\<dots>,(Ident, \"world\"),\<dots>)]"}
 
   Many parsing functions later on will require spaces, comments and the like
   to have already been filtered out.  So from now on we are going to use the 
@@ -405,14 +414,12 @@
 "[Token (\<dots>,(Ident, \"hello\"), \<dots>), Token (\<dots>,(Ident, \"world\"), \<dots>)]"}
 
   For convenience we define the function:
-
 *}
 
 ML{*fun filtered_input str = 
   filter OuterLex.is_proper (OuterSyntax.scan Position.none str) *}
 
-text {*
-
+text {* 
   If you now parse
 
 @{ML_response_fake [display,gray] 
@@ -445,6 +452,18 @@
 end"
 "((\"where\",\<dots>), (\"|\",\<dots>))"}
 
+  Any non-keyword string can be parsed with the function @{ML OuterParse.reserved}.
+  For example:
+
+  @{ML_response [display,gray]
+"let 
+  val p = OuterParse.reserved \"bar\"
+  val input = filtered_input \"bar\"
+in
+  p input
+end"
+  "(\"bar\",[])"}
+
   Like before, you can sequentially connect parsers with @{ML "--"}. For example: 
 
 @{ML_response [display,gray]
@@ -485,7 +504,6 @@
 "([\"in\", \"in\", \"in\"], [])"}
 
   The following function will help to run examples.
-
 *}
 
 ML{*fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input *}
@@ -556,6 +574,10 @@
 
 *}
 
+text {*
+  (FIXME: there are also handy parsers for ML-expressions and ML-files)
+*}
+
 section {* Context Parser (TBD) *}
 
 text {*
@@ -901,10 +923,15 @@
 
   If you now build a theory on top of @{text "Command.thy"}, 
   then the command \isacommand{foobar} can be used. 
-  Similarly with any other new command. 
+  Similarly with any other new command, and also any new keyword that is 
+  introduced with
+*}
 
+ML{*val _ = OuterKeyword.keyword "blink" *}
 
-  At the moment \isacommand{foobar} is not very useful. Let us refine it a bit 
+text {*
+  At the moment the command \isacommand{foobar} is not very useful. Let us refine 
+  it a bit 
   next by letting it take a proposition as argument and printing this proposition 
   inside the tracing buffer. 
 
@@ -1002,7 +1029,7 @@
   \isacommand{done}
   \end{isabelle}
 
-  (FIXME read a name and show how to store theorems)
+  (FIXME: read a name and show how to store theorems)
 *}
 
 section {* Methods (TBD) *}