isabelle-cookbook: comparison CookBook/Parsing.thy

equal deleted inserted replaced

-:9a6e5e0c4906
+:c8e9a4f97916
 Note how the result of consumed strings builds up on the left as nested pairs.
 If, as in the previous example, one wants to parse a particular string,
 then one should use the function @{ML Scan.this_string}:
-@{ML_response [display] "Scan.this_string \"hel\" (explode \"hello\")"
+@{ML_response [display] "Scan.this_string \"hell\" (explode \"hello\")"
-"(\"hel\", [\"l\", \"o\"])"}
+"(\"hell\", [\"o\"])"}
 Parsers that explore alternatives can be constructed using the function @{ML
 "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the
 result of @{text "p"}, in case it succeeds, otherwise it returns the
 result of @{text "q"}. For example
 one might write
 @{ML [display] "(p -- q) || r" for p q r}
 However, this parser is problematic for producing an appropriate error message, in case
-the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses with the parser
+the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses the information
-above the information
 that @{text p} should be followed by @{text q}. To see this consider the case in
 which @{text p}
 is present in the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail
 and the
 alternative parser @{text r} will be tried. However in many circumstance this will be the wrong
 parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then
 caused by the
 failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation
-can be avoided by using the function @{ML "(op !!)"}. This function aborts the whole process of
+can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of
-parsing in case of a failure and invokes an error message. For example if we invoke the parser
+parsing in case of a failure and prints an error message. For example if we invoke the parser
 @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
 on @{text [quotes] "hello"}, the parsing succeeds
 but if we invoke it on @{text [quotes] "world"}
 @{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
 "Exception ABORT raised"}
-the parsing aborts and the error message @{text "foo"} is printed out. In order to
+then the parsing aborts and the error message @{text "foo"} is printed out. In order to
 see the error message properly, we need to prefix the parser with the function
 @{ML "Scan.error"}. For example
 @{ML_response_fake [display] "Scan.error ((!! (fn _ => \"foo\") ($$ \"h\")))"
 "Exception Error \"foo\" raised"}
 This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"}
 (FIXME: give reference to later place).
-Returning to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want
+Let us now return to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want
 to generate the correct error message for p-followed-by-q, then
 we have to write:
 *}
 ML {*
 fun p_followed_by_q p q r =
 let
-val err = (fn _ => p ^ " is not followed by " ^ q)
+val err_msg = (fn _ => p ^ " is not followed by " ^ q)
 in
-(($$ p) -- (!! err ($$ q))) || (($$ r) -- ($$ r))
+($$ p -- (!! err_msg ($$ q))) || ($$ r -- $$ r)
 end
 *}
 text {*
-Running this parser with
+Running this parser with the @{text [quotes] "h"} and @{text [quotes] "e"}, and
+the input @{text [quotes] "holle"}
 @{ML_response_fake [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")"
 "Exception ERROR \"h is not followed by e\" raised"}
-gives the correct error message. Running it with
+produces the correct error message. Running it with
 @{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")"
 "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}
 yields the expected parsing.
 Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function
 @{ML "Scan.repeat1"} is similar, but requires that the parser @{text "p"}
 succeeds at least once.
 Also note that the parser would have aborted with the exception @{text MORE}, if
-we had run it only on just @{text [quotes] "hhhh"}. This can be avoided using
+we had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using
 the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With
 them we can write
 @{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")"
 "([\"h\", \"h\", \"h\", \"h\"], [])"}
-However, this kind of manually wrapping needs to be done only very rarely
+@{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings;
-in practise, because it is already done by the infrastructure for you.
+other stoppers need to be used when parsing token, for example. However, this kind of
+manually wrapping is often already done by the surrounding infrastructure.
 The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any
 string as in
 @{ML_response [display]
 Scan.finite Symbol.stopper p input
 end"
 "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}
 where the function @{ML Symbol.not_eof} ensures that we do not read beyond the
-end of the input string.
+end of the input string (i.e.~stopper symbol).
 The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can
 parse the input, then the whole parser fails; if not, then the second is tried. Therefore
 @{ML_response_fake_both [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")"
 Scan.finite Symbol.stopper p input2)
 end"
 "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []),
 ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"}
-After parsing succeeded, one nearly always wants to apply a function on the parsed
+After parsing is done, one nearly always wants to apply a function on the parsed
-items. This is done using the function @{ML "(p >> f)" for p f} which runs
+items. To do this the function @{ML "(p >> f)" for p f} can be employed, which runs
 first the parser @{text p} and upon successful completion applies the
 function @{text f} to the result. For example
 @{ML_response [display]
 "let
 text {*
 Most of the time, however, Isabelle developers have to deal with parsing tokens, not strings.
 This is because the parsers for the theory syntax, as well as the parsers for the
 argument syntax of proof methods and attributes use the type
 @{ML_type OuterLex.token} (which is identical to the type @{ML_type OuterParse.token}).
+There are also parsers for ML-expressions and ML-files.
 \begin{readmore}
 The parser functions for the theory syntax are contained in the structure
 @{ML_struct OuterParse} defined in the file @{ML_file  "Pure/Isar/outer_parse.ML"}.
 The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
 in
 filter OuterLex.is_proper input
 end"
 "[Token (\<dots>,(Ident, \"hello\"), \<dots>), Token (\<dots>,(Ident, \"world\"), \<dots>)]"}
-For convenience we are going to use the function
+For convenience we define the function
 *}
 ML {*
 fun filtered_input str =
 "[Token (\<dots>,(Command, \"inductive\"),\<dots>),
 Token (\<dots>,(Keyword, \"|\"),\<dots>),
 Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
 we obtain a list consisting of only a command and two keyword tokens.
-If you want to see which keywords and commands are currently known, use
+If you want to see which keywords and commands are currently known, type in
 the following (you might have to adjust the @{ML print_depth} in order to
 see the complete list):
 @{ML_response_fake [display]
 "let
 end"
 "Exception ERROR \"Outer syntax error: keyword \"|\" expected,
 but keyword in was found\" raised"
 }
-\begin{exercise}
+\begin{exercise} (FIXME)
 A type-identifier, for example @{typ "'a"}, is a token of
 kind @{ML "Keyword" in OuterLex}. It can be parsed using
 the function @{ML OuterParse.type_ident}.
 \end{exercise}
 *}
 section {* Positional Information *}
 text {*
 ML {*
 OuterParse.opt_target
 *}
+text {* (FIXME funny output for a proposition) *}
 ML {*
 OuterParse.opt_target --
 OuterParse.fixes --
 OuterParse.for_fixes --
 Scan.optional (OuterParse.$$$ "where" |--
 OuterParse.!!! (OuterParse.enum1 "|" (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []
 *}
-text {* (FIXME funny output for a proposition) *}
+ML {* OuterSyntax.command *}
+section {* New Commands and Keyword Files *}
+text {*
+Often new commands, for example for providing a new definitional principle,
+need to be programmed. While this is not too difficult to do on the
+ML-level, new commands, in order to be useful, need to be recognised by
+ProofGeneral. This results in some subtle configuration issues, which we
+will explain in this section.
+Let us start with a silly command, named \isacommand{foo}, which does nothing at all.
+It can be defined as
+@{ML [display]
+"let
+val do_nothing = Scan.succeed (Toplevel.theory (fn x => x))
+val flag = OuterKeyword.thy_decl
+in
+OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing
+end"}
+The crucial function that makes the command \isacommand{foo} known to
+Isabelle is @{ML OuterSyntax.command}.  It expects a name of a command, a
+short description, a flag (we will explain it later more thoroughly) and a
+parser for a top-level transition function (its purpose will also explained
+later on). Lets also assume we packaged this function into a separate
+theory named @{text "Command"}, say, and a file named @{text "Command.thy"}.
+We now need a keyword file, that can be loaded by ProofGeneral in
+order to recognise \isacommand{foo} as command. Such a keyword file
+can be generated with the command-line
+@{text [display] "$ isabelle keywords -k foo some-log-files"}
+The option @{text "-k foo"} indicates which postfix the keyword file will obtain. In
+the case above the generated file is named @{text "isar-keywords-foo.el"}. This command
+requires log files to be present (in order to extract the keywords from them). For
+our purposes it is sufficient to use the log files for the theories @{text "Pure"},
+@{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as the theory @{text "Command.thy"}
+containing the new command. @{text Pure} and @{text HOL} are usually compiled during the
+installation of Isabelle, if not this can be done conveniently using build script from
+the distribution
+@{text [display]
+"$ ./build -m \"Pure\"
+$ ./build -m \"HOL\""}
+The @{text "Pure-ProofGeneral"} theory needs to be compiled with
+@{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""}
+For the theory containing the new command \isacommand{foo}, we first
+create a ``managed'' subdirectory by
+This creates files @{text "IsaMakefile"} and @{text "./FooCommand/ROOT.ML"}. We
+need to copy the file @{text "Command.thy"} into the directory @{text "FooCommand"}
+and add @{text "use_thy \"Command\";"} to @{text "./FooCommand/ROOT.ML"}.
+Now we can compile the theory by just typing
+@{text [display] "$ isabelle make"}
+All these compilations created the log files that we are after. They are typically stored
+under the directory
+@{text [display]  "~/.isabelle/heaps/\<dots>\<dots>/\<dots>\<dots>/log"}
+where the dots stand for names of the current Isabelle distribution and
+hardware architecture. In it are the files
+@{text [display]
+"Pure.gz
+HOL.gz
+Pure-ProofGeneral.gz
+HOL-FooCommand.gz"}
+Let us assume the directory with these files is stored in the shell variable
+@{text "ISABELLE_LOGS"}, by typing for example
+@{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"}
+We can now create the keyword file with
+@{text [display]
+"$ isabelle keywords -k foo
+`$ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral,HOL-FooCommand.gz}`"}
+The result is the file @{text "isar-keywords-foo.el"}, which needs to be
+copied in the directory @{text "~/.isabelle/etc"}. To make Isabelle use
+this keyword file, we need to start it with the option @{text "-k foo"}:
+@{text [display] "isabelle -k foo a-theory-file"}
+If we now run the original code creating the command
+@{ML [display]
+"let
+val do_nothing = Scan.succeed (Toplevel.theory (fn x => x))
+val flag = OuterKeyword.thy_decl
+in
+OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing
+end"}
+then we can finally make use of \isacommand{foo}! Similarly
+with any other command you want to implement.
+In the example above, we built the theories on top of the HOL-logic. If you
+target other logics, such as Nominal or ZF, then you need to change the
+log files appropriately.
+(FIXME Explain @{text "OuterKeyword.thy_decl"} and parser)
+*}
 chapter {* Parsing *}
 *}
 end

changeset 65	c8e9a4f97916
parent 60	5b9c6010897b
child 66	d563f8ff6aa0