diff -r 9a6e5e0c4906 -r c8e9a4f97916 CookBook/Parsing.thy --- a/CookBook/Parsing.thy Thu Jan 08 22:47:15 2009 +0000 +++ b/CookBook/Parsing.thy Sat Jan 10 12:57:48 2009 +0000 @@ -95,8 +95,8 @@ If, as in the previous example, one wants to parse a particular string, then one should use the function @{ML Scan.this_string}: - @{ML_response [display] "Scan.this_string \"hel\" (explode \"hello\")" - "(\"hel\", [\"l\", \"o\"])"} + @{ML_response [display] "Scan.this_string \"hell\" (explode \"hello\")" + "(\"hell\", [\"o\"])"} Parsers that explore alternatives can be constructed using the function @{ML "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the @@ -162,8 +162,7 @@ @{ML [display] "(p -- q) || r" for p q r} However, this parser is problematic for producing an appropriate error message, in case - the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses with the parser - above the information + the parsing of @{ML "(p -- q)" for p q} fails. Because in that case one loses the information that @{text p} should be followed by @{text q}. To see this consider the case in which @{text p} is present in the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail @@ -172,8 +171,8 @@ parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then caused by the failure of @{text r}, not by the absence of @{text q} in the input. This kind of situation - can be avoided by using the function @{ML "(op !!)"}. This function aborts the whole process of - parsing in case of a failure and invokes an error message. For example if we invoke the parser + can be avoided when using the function @{ML "(op !!)"}. This function aborts the whole process of + parsing in case of a failure and prints an error message. For example if we invoke the parser @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"} @@ -188,7 +187,7 @@ @{ML_response_fake [display] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" "Exception ABORT raised"} - the parsing aborts and the error message @{text "foo"} is printed out. In order to + then the parsing aborts and the error message @{text "foo"} is printed out. In order to see the error message properly, we need to prefix the parser with the function @{ML "Scan.error"}. For example @@ -198,7 +197,7 @@ This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} (FIXME: give reference to later place). - Returning to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want + Let us now return to our example of parsing @{ML "(p -- q) || r" for p q r}. If we want to generate the correct error message for p-followed-by-q, then we have to write: *} @@ -206,20 +205,21 @@ ML {* fun p_followed_by_q p q r = let - val err = (fn _ => p ^ " is not followed by " ^ q) + val err_msg = (fn _ => p ^ " is not followed by " ^ q) in - (($$ p) -- (!! err ($$ q))) || (($$ r) -- ($$ r)) + ($$ p -- (!! err_msg ($$ q))) || ($$ r -- $$ r) end *} text {* - Running this parser with + Running this parser with the @{text [quotes] "h"} and @{text [quotes] "e"}, and + the input @{text [quotes] "holle"} @{ML_response_fake [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")" "Exception ERROR \"h is not followed by e\" raised"} - gives the correct error message. Running it with + produces the correct error message. Running it with @{ML_response [display] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"wworld\")" "((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"} @@ -237,15 +237,16 @@ succeeds at least once. Also note that the parser would have aborted with the exception @{text MORE}, if - we had run it only on just @{text [quotes] "hhhh"}. This can be avoided using + we had run it only on just @{text [quotes] "hhhh"}. This can be avoided by using the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With them we can write @{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" "([\"h\", \"h\", \"h\", \"h\"], [])"} - However, this kind of manually wrapping needs to be done only very rarely - in practise, because it is already done by the infrastructure for you. + @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; + other stoppers need to be used when parsing token, for example. However, this kind of + manually wrapping is often already done by the surrounding infrastructure. The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any string as in @@ -260,7 +261,7 @@ "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"} where the function @{ML Symbol.not_eof} ensures that we do not read beyond the - end of the input string. + end of the input string (i.e.~stopper symbol). The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can parse the input, then the whole parser fails; if not, then the second is tried. Therefore @@ -291,8 +292,8 @@ "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []), ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"} - After parsing succeeded, one nearly always wants to apply a function on the parsed - items. This is done using the function @{ML "(p >> f)" for p f} which runs + After parsing is done, one nearly always wants to apply a function on the parsed + items. To do this the function @{ML "(p >> f)" for p f} can be employed, which runs first the parser @{text p} and upon successful completion applies the function @{text f} to the result. For example @@ -345,6 +346,7 @@ This is because the parsers for the theory syntax, as well as the parsers for the argument syntax of proof methods and attributes use the type @{ML_type OuterLex.token} (which is identical to the type @{ML_type OuterParse.token}). + There are also parsers for ML-expressions and ML-files. \begin{readmore} The parser functions for the theory syntax are contained in the structure @@ -389,7 +391,7 @@ end" "[Token (\,(Ident, \"hello\"), \), Token (\,(Ident, \"world\"), \)]"} - For convenience we are going to use the function + For convenience we define the function *} @@ -409,7 +411,7 @@ Token (\,(Keyword, \"for\"),\)]"} we obtain a list consisting of only a command and two keyword tokens. - If you want to see which keywords and commands are currently known, use + If you want to see which keywords and commands are currently known, type in the following (you might have to adjust the @{ML print_depth} in order to see the complete list): @@ -497,7 +499,7 @@ but keyword in was found\" raised" } - \begin{exercise} + \begin{exercise} (FIXME) A type-identifier, for example @{typ "'a"}, is a token of kind @{ML "Keyword" in OuterLex}. It can be parsed using the function @{ML OuterParse.type_ident}. @@ -507,6 +509,9 @@ *} + + + section {* Positional Information *} text {* @@ -535,6 +540,8 @@ OuterParse.opt_target *} +text {* (FIXME funny output for a proposition) *} + ML {* OuterParse.opt_target -- OuterParse.fixes -- @@ -544,7 +551,124 @@ *} -text {* (FIXME funny output for a proposition) *} +ML {* OuterSyntax.command *} + +section {* New Commands and Keyword Files *} + +text {* + + Often new commands, for example for providing a new definitional principle, + need to be programmed. While this is not too difficult to do on the + ML-level, new commands, in order to be useful, need to be recognised by + ProofGeneral. This results in some subtle configuration issues, which we + will explain in this section. + + + Let us start with a silly command, named \isacommand{foo}, which does nothing at all. + It can be defined as + +@{ML [display] +"let + val do_nothing = Scan.succeed (Toplevel.theory (fn x => x)) + val flag = OuterKeyword.thy_decl +in + OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing +end"} + + The crucial function that makes the command \isacommand{foo} known to + Isabelle is @{ML OuterSyntax.command}. It expects a name of a command, a + short description, a flag (we will explain it later more thoroughly) and a + parser for a top-level transition function (its purpose will also explained + later on). Lets also assume we packaged this function into a separate + theory named @{text "Command"}, say, and a file named @{text "Command.thy"}. + + We now need a keyword file, that can be loaded by ProofGeneral in + order to recognise \isacommand{foo} as command. Such a keyword file + can be generated with the command-line + + @{text [display] "$ isabelle keywords -k foo some-log-files"} + + The option @{text "-k foo"} indicates which postfix the keyword file will obtain. In + the case above the generated file is named @{text "isar-keywords-foo.el"}. This command + requires log files to be present (in order to extract the keywords from them). For + our purposes it is sufficient to use the log files for the theories @{text "Pure"}, + @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as the theory @{text "Command.thy"} + containing the new command. @{text Pure} and @{text HOL} are usually compiled during the + installation of Isabelle, if not this can be done conveniently using build script from + the distribution + + @{text [display] +"$ ./build -m \"Pure\" +$ ./build -m \"HOL\""} + + The @{text "Pure-ProofGeneral"} theory needs to be compiled with + + @{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""} + + For the theory containing the new command \isacommand{foo}, we first + create a ``managed'' subdirectory by + + + This creates files @{text "IsaMakefile"} and @{text "./FooCommand/ROOT.ML"}. We + need to copy the file @{text "Command.thy"} into the directory @{text "FooCommand"} + and add @{text "use_thy \"Command\";"} to @{text "./FooCommand/ROOT.ML"}. + Now we can compile the theory by just typing + + @{text [display] "$ isabelle make"} + + All these compilations created the log files that we are after. They are typically stored + under the directory + + @{text [display] "~/.isabelle/heaps/\\/\\/log"} + + where the dots stand for names of the current Isabelle distribution and + hardware architecture. In it are the files + + @{text [display] +"Pure.gz +HOL.gz +Pure-ProofGeneral.gz +HOL-FooCommand.gz"} + + Let us assume the directory with these files is stored in the shell variable + @{text "ISABELLE_LOGS"}, by typing for example + + @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"} + + We can now create the keyword file with + +@{text [display] +"$ isabelle keywords -k foo + `$ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral,HOL-FooCommand.gz}`"} + + The result is the file @{text "isar-keywords-foo.el"}, which needs to be + copied in the directory @{text "~/.isabelle/etc"}. To make Isabelle use + this keyword file, we need to start it with the option @{text "-k foo"}: + + @{text [display] "isabelle -k foo a-theory-file"} + + If we now run the original code creating the command + + @{ML [display] +"let + val do_nothing = Scan.succeed (Toplevel.theory (fn x => x)) + val flag = OuterKeyword.thy_decl +in + OuterSyntax.command \"foo\" \"description of foo\" flag do_nothing +end"} + + then we can finally make use of \isacommand{foo}! Similarly + with any other command you want to implement. + + In the example above, we built the theories on top of the HOL-logic. If you + target other logics, such as Nominal or ZF, then you need to change the + log files appropriately. + + (FIXME Explain @{text "OuterKeyword.thy_decl"} and parser) + + +*} + @@ -1217,4 +1341,5 @@ *} + end \ No newline at end of file