CookBook/Parsing.thy
changeset 65 c8e9a4f97916
parent 60 5b9c6010897b
child 66 d563f8ff6aa0
--- a/CookBook/Parsing.thy	Thu Jan 08 22:47:15 2009 +0000
+++ b/CookBook/Parsing.thy	Sat Jan 10 12:57:48 2009 +0000
@@ -95,8 +95,8 @@
   If, as in the previous example, one wants to parse a particular string, 
   then one should use the function @{ML Scan.this_string}:
 
-  @{ML_response [display] "Scan.this_string \"hel\" (explode \"hello\")"
-                          "(\"hel\", [\"l\", \"o\"])"}
+  @{ML_response [display] "Scan.this_string \"hell\" (explode \"hello\")"
+                          "(\"hell\", [\"o\"])"}
 
   Parsers that explore alternatives can be constructed using the function @{ML
   "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the
@@ -162,8 +162,7 @@
   @{ML [display] "(p -- q) || r" for p q r}
 
   However, this parser is problematic for producing an appropriate error message, in case
-  the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses with the parser
-  above the information 
+  the parsing of @{ML "(p -- q)" for p q} fails, because in that case one loses the information 
   that @{text p} should be followed by @{text q}. To see this consider the case in
   which @{text p} 
   is present in the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail 
@@ -172,8 +171,8 @@
   parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then 
   caused by the
   failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation
-  can be avoided by using the function @{ML "(op !!)"}. This function aborts the whole process of
-  parsing in case of a failure and invokes an error message. For example if we invoke the parser
+  can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of
+  parsing in case of a failure and prints an error message. For example, if we invoke the parser
   
   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
 
@@ -188,7 +187,7 @@
   @{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
                                "Exception ABORT raised"}
 
-  the parsing aborts and the error message @{text "foo"} is printed out. In order to
+  then the parsing aborts and the error message @{text "foo"} is printed out. In order to
   see the error message properly, we need to prefix the parser with the function 
   @{ML "Scan.error"}. For example
 
@@ -198,7 +197,7 @@
   This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} 
   (FIXME: give reference to later place). 
   
-  Returning to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want
+  Let us now return to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want
   to generate the correct error message for p-followed-by-q, then
   we have to write:
 *}
@@ -206,20 +205,21 @@
 ML {* 
 fun p_followed_by_q p q r =
   let 
-     val err = (fn _ => p ^ " is not followed by " ^ q)
+     val err_msg = (fn _ => p ^ " is not followed by " ^ q)
   in
-    (($$ p) -- (!! err ($$ q))) || (($$ r) -- ($$ r))
+    ($$ p -- (!! err_msg ($$ q))) || ($$ r -- $$ r)
   end
 *}
 
 
 text {*
-  Running this parser with
+  Running this parser with the arguments @{text [quotes] "h"}, @{text [quotes] "e"} and 
+  @{text [quotes] "w"}, and the input @{text [quotes] "holle"} 
 
   @{ML_response_fake [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")"
                                "Exception ERROR \"h is not followed by e\" raised"} 
 
-  gives the correct error message. Running it with
+  produces the correct error message. Running it with
  
   @{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")"
                           "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}
@@ -237,15 +237,16 @@
   succeeds at least once.
 
   Also note that the parser would have aborted with the exception @{text MORE}, if
-  we had run it only on just @{text [quotes] "hhhh"}. This can be avoided using
+  we had run it on just @{text [quotes] "hhhh"}. This can be avoided by using
   the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With
   them we can write
   
   @{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" 
                 "([\"h\", \"h\", \"h\", \"h\"], [])"}
 
-  However, this kind of manually wrapping needs to be done only very rarely
-  in practise, because it is already done by the infrastructure for you. 
+  @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings;
+  other stoppers need to be used when parsing tokens, for example. However, this kind of 
+  manual wrapping is often already done by the surrounding infrastructure. 
 
   The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any 
   string as in
@@ -260,7 +261,7 @@
 "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}
 
   where the function @{ML Symbol.not_eof} ensures that we do not read beyond the 
-  end of the input string.
+  end of the input string (i.e.~the stopper symbol).
 
   The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can 
   parse the input, then the whole parser fails; if not, then the second is tried. Therefore
@@ -291,8 +292,8 @@
 "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []),
  ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"}
 
-  After parsing succeeded, one nearly always wants to apply a function on the parsed 
-  items. This is done using the function @{ML "(p >> f)" for p f} which runs 
+  After parsing is done, one nearly always wants to apply a function to the parsed 
+  items. To do this, the function @{ML "(p >> f)" for p f} can be employed, which runs 
   first the parser @{text p} and upon successful completion applies the 
   function @{text f} to the result. For example
 
@@ -345,6 +346,7 @@
   This is because the parsers for the theory syntax, as well as the parsers for the 
   argument syntax of proof methods and attributes use the type 
   @{ML_type OuterLex.token} (which is identical to the type @{ML_type OuterParse.token}).
+  There are also parsers for ML-expressions and ML-files.
 
   \begin{readmore}
   The parser functions for the theory syntax are contained in the structure
@@ -389,7 +391,7 @@
 end" 
 "[Token (\<dots>,(Ident, \"hello\"), \<dots>), Token (\<dots>,(Ident, \"world\"), \<dots>)]"}
 
-  For convenience we are going to use the function
+  For convenience we define the function
 
 *}
 
@@ -409,7 +411,7 @@
  Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
 
   we obtain a list consisting of only a command and two keyword tokens.
-  If you want to see which keywords and commands are currently known, use
+  If you want to see which keywords and commands are currently known, type in
   the following (you might have to adjust the @{ML print_depth} in order to
   see the complete list):
 
@@ -497,7 +499,7 @@
 but keyword in was found\" raised"
 }
 
-  \begin{exercise}
+  \begin{exercise} (FIXME)
   A type-identifier, for example @{typ "'a"}, is a token of 
   kind @{ML "Keyword" in OuterLex}. It can be parsed using 
   the function @{ML OuterParse.type_ident}.
@@ -507,6 +509,9 @@
 *}
 
 
+
+
+
 section {* Positional Information *}
 
 text {*
@@ -535,6 +540,8 @@
   OuterParse.opt_target
 *}
 
+text {* (FIXME funny output for a proposition) *}
+
 ML {*
   OuterParse.opt_target --
   OuterParse.fixes -- 
@@ -544,7 +551,124 @@
 
 *}
 
-text {* (FIXME funny output for a proposition) *}
+ML {* OuterSyntax.command *}
+
+section {* New Commands and Keyword Files *}
+
+text {*
+
+  Often new commands, for example for providing a new definitional principle,
+  need to be programmed. While this is not too difficult to do on the
+  ML-level, new commands, in order to be useful, need to be recognised by
+  ProofGeneral. This results in some subtle configuration issues, which we
+  will explain in this section.
+
+
+  Let us start with a silly command, named \isacommand{foo}, which does nothing at all. 
+  It can be defined as
+
+@{ML [display]
+"let
+  val do_nothing = Scan.succeed (Toplevel.theory (fn x => x))
+  val flag = OuterKeyword.thy_decl
+in
+  OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing
+end"}
+
+  The crucial function that makes the command \isacommand{foo} known to
+  Isabelle is @{ML OuterSyntax.command}. It expects the name of a command, a
+  short description, a flag (which we will explain in more detail later) and a
+  parser for a top-level transition function (its purpose will also be explained
+  later on). Let us also assume we packaged this code into a separate
+  theory named @{text "Command"}, say, stored in a file named @{text "Command.thy"}. 
+
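+  Roughly speaking (spelling out the parser type explicitly), @{ML OuterSyntax.command} 
+  can be thought of as having the ML-type
+
+  @{text [display] 
+"string -> string -> OuterKeyword.T -> 
+  (OuterLex.token list -> 
+     (Toplevel.transition -> Toplevel.transition) * OuterLex.token list) -> unit"}
+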
+  We now need a keyword file that can be loaded by ProofGeneral in
+  order to recognise \isacommand{foo} as a command. Such a keyword file
+  can be generated on the command line with
+
+  @{text [display] "$ isabelle keywords -k foo some-log-files"}
+
+  The option @{text "-k foo"} indicates which postfix the keyword file will obtain. In
+  the case above the generated file is named @{text "isar-keywords-foo.el"}. This command
+  requires log files to be present (in order to extract the keywords from them). For 
+  our purposes it is sufficient to use the log files for the theories @{text "Pure"},
+  @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as the theory @{text "Command.thy"}
+  containing the new command. @{text Pure} and @{text HOL} are usually compiled during the 
+  installation of Isabelle; if not, this can be done conveniently using the build script from 
+  the distribution
+
+  @{text [display] 
+"$ ./build -m \"Pure\"
+$ ./build -m \"HOL\""}
+  
+  The @{text "Pure-ProofGeneral"} theory needs to be compiled with
+
+  @{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""}
+
+  For the theory containing the new command \isacommand{foo}, we first
+  create a ``managed'' subdirectory by
+
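+  @{text [display] "$ isabelle mkdir FooCommand"}
+
+  (FIXME: check that this is the right @{text "isabelle mkdir"} invocation)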
+
+  This creates files @{text "IsaMakefile"} and @{text "./FooCommand/ROOT.ML"}. We
+  need to copy the file @{text "Command.thy"} into the directory @{text "FooCommand"}
+  and add @{text "use_thy \"Command\";"} to @{text "./FooCommand/ROOT.ML"}.
+  Now we can compile the theory by just typing
+
+  @{text [display] "$ isabelle make"}
+
+  All these compilations created the log files that we are after. They are typically stored 
+  under the directory 
+  
+  @{text [display]  "~/.isabelle/heaps/\<dots>\<dots>/\<dots>\<dots>/log"}
+
+  where the dots stand for the name of the current Isabelle distribution and the 
+  hardware architecture. In this directory you should find the files
+
+  @{text [display] 
+"Pure.gz
+HOL.gz
+Pure-ProofGeneral.gz
+HOL-FooCommand.gz"} 
+
+  Let us store the name of this directory in the shell variable 
+  @{text "ISABELLE_LOGS"}, for example by typing
+
+  @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"}
+ 
+  We can now create the keyword file with
+
+@{text [display]
+"$ isabelle keywords -k foo 
+       $ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral.gz,HOL-FooCommand.gz}"}
+
+  The result is the file @{text "isar-keywords-foo.el"}, which needs to be 
+  copied into the directory @{text "~/.isabelle/etc"}. To make Isabelle use
+  this keyword file, we need to start it with the option @{text "-k foo"}:
+
+  @{text [display] "$ isabelle -k foo a-theory-file"}
+
+  If we now run the original code creating the command
+
+  @{ML [display]
+"let
+  val do_nothing = Scan.succeed (Toplevel.theory (fn x => x))
+  val flag = OuterKeyword.thy_decl
+in
+  OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing
+end"}
+
+  then we can finally make use of \isacommand{foo}! The same procedure 
+  applies to any other command you want to implement. 
+
+  In the example above, we built the theories on top of the HOL-logic. If you
+  target other logics, such as Nominal or ZF, then you need to change the
+  log files appropriately.
+
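+  To give an impression of what the parser argument of @{ML OuterSyntax.command} can be 
+  used for, here is a hypothetical variant of \isacommand{foo}, called @{text "bar"}, 
+  which reads a proposition and merely traces it (this is only a sketch; the names 
+  @{text "bar"} and @{text "trace_prop"} are made up):
+
+  @{ML [display]
+"let
+  val trace_prop = OuterParse.prop >> 
+       (fn str => Toplevel.theory (fn thy => (tracing str; thy)))
+in
+  OuterSyntax.command \"bar\" \"traces its argument\" OuterKeyword.thy_decl trace_prop
+end"}
+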
+  (FIXME Explain @{text "OuterKeyword.thy_decl"} and parser)
+  
+
+*}
+
 
 
 
@@ -1217,4 +1341,5 @@
 *}
 
 
+
 end 
\ No newline at end of file