(* Christian Urban <christian dot urban at kcl dot ac dot uk> *)
theory Parsing
imports Base "Helper/Command/Command" "Package/Simple_Inductive_Package"
begin

chapter {* Parsing\label{chp:parsing} *}

text {*
\begin{flushright}
{\em An important principle underlying the success and popularity of Unix\\ is
the philosophy of building on the work of others.} \\[1ex]
Tony Travis in an email about the\\ ``LINUX is obsolete'' debate
\end{flushright}

Isabelle distinguishes between \emph{outer} and \emph{inner}
syntax. Commands, such as \isacommand{definition}, \isacommand{inductive}
and so on, belong to the outer syntax, whereas terms, types and so on belong
to the inner syntax. For parsing inner syntax, Isabelle uses a rather
general and sophisticated algorithm, which is driven by priority
grammars. Parsers for outer syntax are built up by functional parsing
combinators. These combinators are a well-established technique for parsing,
which has, for example, been described in Paulson's classic ML-book
\cite{paulson-ml2}. Isabelle developers are usually concerned with writing
these outer syntax parsers, either for new definitional packages or for
calling methods with specific arguments.

\begin{readmore}
The library for writing parser combinators is split up, roughly, into two
parts: The first part consists of a collection of generic parser combinators
defined in the structure @{ML_struct Scan} in the file @{ML_file
"Pure/General/scan.ML"}. The second part of the library consists of
combinators for dealing with specific token types, which are defined in the
structure @{ML_struct Parse} in the file @{ML_file
"Pure/Isar/parse.ML"}. In addition, specific parsers for packages are
defined in @{ML_file "Pure/Isar/parse_spec.ML"}. Parsers for method arguments
are defined in @{ML_file "Pure/Isar/args.ML"}.
\end{readmore}

*}

section {* Building Generic Parsers *}

text {*

Let us first have a look at parsing strings using generic parsing
combinators. The function @{ML_ind "$$" in Scan} takes a string as argument and will
``consume'' this string from a given input list of strings. ``Consume'' in
this context means that it will return a pair consisting of this string and
the rest of the input list. For example:

@{ML_response [display,gray]
"($$ \"h\") (Symbol.explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}

@{ML_response [display,gray]
"($$ \"w\") (Symbol.explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}

The function @{ML "$$"} will either succeed (as in the two examples above)
or raise the exception @{text "FAIL"} if no string can be consumed. For
example trying to parse

@{ML_response_fake [display,gray]
"($$ \"x\") (Symbol.explode \"world\")"
"Exception FAIL raised"}

will raise the exception @{text "FAIL"}. There are three exceptions used in
the parsing combinators:

\begin{itemize}
\item @{text "FAIL"} is used to indicate that alternative routes of parsing
might be explored.
\item @{text "MORE"} indicates that there is not enough input for the parser. For example
in @{text "($$ \"h\") []"}.
\item @{text "ABORT"} is the exception that is raised when a dead end is reached.
It is used for example in the function @{ML "!!"} (see below).
\end{itemize}

However, note that these exceptions are private to the parser and cannot be accessed
by the programmer (for example to handle them).

In the examples above we use the function @{ML_ind explode in Symbol} from the
structure @{ML_struct Symbol}, instead of the more standard library function
@{ML_ind explode in String}, for obtaining an input list for the parser. The reason is
that @{ML explode in Symbol} is aware of character
sequences, for example @{text "\<foo>"}, that have a special meaning in
Isabelle. The function @{ML explode in String}, in contrast, splits the
string into individual characters. To see the difference consider

@{ML_response_fake [display,gray]
"let
  val input = \"\<foo> bar\"
in
  (String.explode input, Symbol.explode input)
end"
"([#\"\\\", #\"<\", #\"f\", #\"o\", #\"o\", #\">\", #\" \", #\"b\", #\"a\", #\"r\"],
 [\"\<foo>\", \" \", \"b\", \"a\", \"r\"])"}

Slightly more general than the parser @{ML "$$"} is the function
@{ML_ind one in Scan}, in that it takes a predicate as argument and
then parses exactly one item from the input list satisfying this
predicate. For example the
following parser either consumes an @{text [quotes] "h"} or a @{text
[quotes] "w"}:

@{ML_response [display,gray]
"let
  val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (hw input1, hw input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}

Two parsers can be connected in sequence by using the function @{ML_ind "--" in Scan}.
For example, you can parse @{text "h"}, @{text "e"} and @{text "l"} (in this
order) with:

@{ML_response [display,gray]
"($$ \"h\" -- $$ \"e\" -- $$ \"l\") (Symbol.explode \"hello\")"
"(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}

Note how the results of the consumed strings build up on the left as nested pairs.

If, as in the previous example, you want to parse a particular string,
then you can use the function @{ML_ind this_string in Scan}.

@{ML_response [display,gray]
"Scan.this_string \"hell\" (Symbol.explode \"hello\")"
"(\"hell\", [\"o\"])"}

Parsers that explore alternatives can be constructed using the function
@{ML_ind "||" in Scan}. The parser @{ML "(p || q)" for p q} returns the
result of @{text "p"} if it succeeds; otherwise it returns the
result of @{text "q"}. For example:

@{ML_response [display,gray]
"let
  val hw = $$ \"h\" || $$ \"w\"
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (hw input1, hw input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}

The functions @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} work like the sequencing
function for parsers, except that they discard the item being parsed by the
first (respectively second) parser. That is, the item being dropped is the
one that @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} ``point'' away from.
For example:

@{ML_response [display,gray]
"let
  val just_e = $$ \"h\" |-- $$ \"e\"
  val just_h = $$ \"h\" --| $$ \"e\"
  val input = Symbol.explode \"hello\"
in
  (just_e input, just_h input)
end"
"((\"e\", [\"l\", \"l\", \"o\"]), (\"h\", [\"l\", \"l\", \"o\"]))"}

The parser @{ML "Scan.optional p x" for p x} returns the result of the parser
@{text "p"}, if it succeeds; otherwise it returns
the default value @{text "x"}. For example:

@{ML_response [display,gray]
"let
  val p = Scan.optional ($$ \"h\") \"x\"
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (p input1, p input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}

The function @{ML_ind option in Scan} works similarly, except no default value can
be given. Instead, the result is wrapped as an @{text "option"}-type. For example:

@{ML_response [display,gray]
"let
  val p = Scan.option ($$ \"h\")
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (p input1, p input2)
end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
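
The behaviour of both functions can be understood in terms of the
alternative combinator. The following lines are only a sketch and not the
library code: the names @{text "my_optional"} and @{text "my_option"} are
hypothetical and introduced here for illustration, and the sketch assumes
that @{text "Scan.succeed"}, which returns a given value without consuming
any input, is available in your Isabelle version. The infix @{text ">>"},
explained later in this section, applies a function to the result of a parser.

```sml
(* Hypothetical re-definitions for illustration; they are not part of Scan. *)
fun my_optional p x = p || Scan.succeed x
fun my_option p = p >> SOME || Scan.succeed NONE

(* "world" does not start with "h", so the default value is returned *)
val r1 = my_optional ($$ "h") "x" (Symbol.explode "world")
(* r1 = ("x", ["w", "o", "r", "l", "d"]) *)

(* "hello" starts with "h", so the result is wrapped in SOME *)
val r2 = my_option ($$ "h") (Symbol.explode "hello")
(* r2 = (SOME "h", ["e", "l", "l", "o"]) *)
```

In both cases the second alternative is only tried when the first parser
fails, and it leaves the input untouched.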

The function @{ML_ind ahead in Scan} parses some input, but leaves the original
input unchanged. For example:

@{ML_response [display,gray]
"Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")"
"(\"foo\", [\"f\", \"o\", \"o\"])"}

The function @{ML_ind "!!" in Scan} helps with producing appropriate error messages
during parsing. For example if you want to parse @{text p} immediately
followed by @{text q}, or start a completely different parser @{text r},
you might write:

@{ML [display,gray] "(p -- q) || r" for p q r}

However, this parser is problematic for producing a useful error
message if the parsing of @{ML "(p -- q)" for p q} fails, because with the
parser above you lose the information that @{text p} should be followed by @{text q}.
To see this, assume that @{text p} is present in the input, but it is not
followed by @{text q}. That means @{ML "(p -- q)" for p q} will fail and
hence the alternative parser @{text r} will be tried. However, in many
circumstances this will be the wrong parser for the input ``@{text "p"}-followed-by-something''
and therefore will also fail. The error message is then caused by the failure
of @{text r}, not by the absence of @{text q} in the input. This kind of
situation can be avoided by using the function @{ML "!!"}. This function
aborts the whole process of parsing in case of a failure and prints an error
message. For example if you invoke the parser

@{ML [display,gray] "!! (fn _ => fn _ => \"foo\") ($$ \"h\")"}
*}
text {*
on @{text [quotes] "hello"}, the parsing succeeds

@{ML_response [display,gray]
"(!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"hello\")"
"(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}

but if you invoke it on @{text [quotes] "world"}

@{ML_response_fake [display,gray] "(!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
"Exception ABORT raised"}

then the parsing aborts and the error message @{text "foo"} is printed. In order to
see the error message properly, you need to prefix the parser with the function
@{ML_ind error in Scan}. For example:

@{ML_response_fake [display,gray]
"Scan.error (!! (fn _ => fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
"Exception ERROR \"foo\" raised"}

This ``prefixing'' is usually done by wrappers such as @{ML_ind local_theory in Outer_Syntax}
(see Section~\ref{sec:newcommand} which explains this function in more detail).

Let us now return to our example of parsing @{ML "(p -- q) || r" for p q
r}. If you want to generate the correct error message for failure
of parsing @{text "p"}-followed-by-@{text "q"}, then you have to write:
*}

ML %grayML{*fun p_followed_by_q p q r =
let
  val err_msg = fn _ => p ^ " is not followed by " ^ q
in
  ($$ p -- (!! (fn _ => err_msg) ($$ q))) || ($$ r -- $$ r)
end *}

text {*
Running this parser with the arguments
@{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and
the input @{text [quotes] "holle"}

@{ML_response_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"holle\")"
"Exception ERROR \"h is not followed by e\" raised"}

produces the correct error message. Running it with

@{ML_response [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"wworld\")"
"((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}

yields the expected parsing.

The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as
often as it succeeds. For example:

@{ML_response [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"hhhhello\")"
"([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}

Note that @{ML_ind repeat in Scan} stores the parsed items in a list. The function
@{ML_ind repeat1 in Scan} is similar, but requires that the parser @{text "p"}
succeeds at least once.
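
The difference between the two functions shows up on input that does not
start with the item to be repeated. The lines below are a sketch of this
behaviour; the names @{text "ok"}, @{text "none"} and @{text "failed"} are
introduced here only for illustration. Since the exception @{text "FAIL"}
cannot be handled directly, the failure of the second function is observed
through @{text "Scan.option"}.

```sml
(* repeat1 behaves like repeat when the parser succeeds at least once *)
val ok = Scan.repeat1 ($$ "h") (Symbol.explode "hhhhello")
(* ok = (["h", "h", "h", "h"], ["e", "l", "l", "o"]) *)

(* with no leading "h", repeat succeeds and returns the empty list *)
val none = Scan.repeat ($$ "h") (Symbol.explode "ello")
(* none = ([], ["e", "l", "l", "o"]) *)

(* repeat1 fails on the same input; Scan.option turns the failure into NONE *)
val failed = Scan.option (Scan.repeat1 ($$ "h")) (Symbol.explode "ello")
(* failed = (NONE, ["e", "l", "l", "o"]) *)
```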

Also note that the parser from the @{ML_ind repeat in Scan} example above would
have aborted with the exception @{text MORE} if
you had run it on the string @{text [quotes] "hhhh"}. This can be avoided by using
the wrapper @{ML_ind finite in Scan} and the ``stopper-token''
@{ML_ind stopper in Symbol}. With them you can write:

@{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")"
"([\"h\", \"h\", \"h\", \"h\"], [])"}

The stopper @{ML stopper in Symbol} is the ``end-of-input'' indicator for parsing strings;
other stoppers need to be used when parsing, for example, tokens. However, this kind of
manual wrapping is often already done by the surrounding infrastructure.

The function @{ML_ind repeat in Scan} can be used with @{ML_ind one in Scan} to read any
string as in

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.one Symbol.not_eof)
  val input = Symbol.explode \"foo bar foo\"
in
  Scan.finite Symbol.stopper p input
end"
"([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}

where the function @{ML_ind not_eof in Symbol} ensures that we do not read beyond the
end of the input string (i.e.~the stopper symbol).

The function @{ML_ind unless in Scan} takes two parsers: if the first one can
parse the input, then the whole parser fails; if not, then the second is tried. Therefore

@{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"hello\")"
"Exception FAIL raised"}

fails, while

@{ML_response [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"world\")"
"(\"w\",[\"o\", \"r\", \"l\", \"d\"])"}

succeeds.

The functions @{ML_ind repeat in Scan} and @{ML_ind unless in Scan} can
be combined to read any input until a certain marker symbol is reached. In the
example below the marker symbol is a @{text [quotes] "*"}.

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof))
  val input1 = Symbol.explode \"fooooo\"
  val input2 = Symbol.explode \"foo*ooo\"
in
  (Scan.finite Symbol.stopper p input1,
   Scan.finite Symbol.stopper p input2)
end"
"(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []),
 ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"}
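
If the marker itself should also be consumed, the parser can be followed by a
parser for the marker whose result is discarded. The following is a sketch of
one possible way to do this; the name @{text "until_star"} is hypothetical and
only used for this illustration.

```sml
(* read everything up to the first "*", then consume and drop the "*" itself *)
val until_star =
  Scan.repeat (Scan.unless ($$ "*") (Scan.one Symbol.not_eof)) --| $$ "*"

val marked = Scan.finite Symbol.stopper until_star (Symbol.explode "foo*ooo")
(* marked = (["f", "o", "o"], ["o", "o", "o"]) *)
```

Note that, unlike the parser in the example above, this variant fails on
input that contains no marker at all, because the final parser for the marker
then has nothing to consume.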

After parsing is done, you almost always want to apply a function to the parsed
items. One way to do this is the function @{ML_ind ">>" in Scan} where
@{ML "(p >> f)" for p f} first runs
the parser @{text p} and upon successful completion applies the
function @{text f} to the result. For example

@{ML_response [display,gray]
"let
  fun double (x, y) = (x ^ x, y ^ y)
  val parser = $$ \"h\" -- $$ \"e\"
in
  (parser >> double) (Symbol.explode \"hello\")
end"
"((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"}

doubles the two parsed input strings; or

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.one Symbol.not_eof)
  val input = Symbol.explode \"foo bar foo\"
in
  Scan.finite Symbol.stopper (p >> implode) input
end"
"(\"foo bar foo\",[])"}

where the single-character strings in the parsed output are transformed
back into one string.
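
The same function is typically used to flatten the nested pairs that the
sequencing combinator builds up. A small sketch, in which the name
@{text "hel"} is made up for this example:

```sml
(* the three results arrive as ((x, y), z) and are joined into one string *)
val hel = $$ "h" -- $$ "e" -- $$ "l" >> (fn ((x, y), z) => x ^ y ^ z)

val joined = hel (Symbol.explode "hello")
(* joined = ("hel", ["l", "o"]) *)
```

No parentheses are needed around the sequence of parsers, since the
sequencing combinator binds more tightly than the function application
combinator.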

The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. This function applies
the given parser to the second component of the pair and leaves the first component
untouched. For example

@{ML_response [display,gray]
"Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")"
"((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}

\footnote{\bf FIXME: In which situations is @{text "lift"} useful? Give examples.}
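
One situation where lifting appears to be useful is when a parser that only
looks at the input has to be combined with a parser that also depends on some
additional state, for instance parsers that thread a context alongside the
token list, as is commonly done for method arguments and attributes. The
following lines are only a sketch under that assumption: the helper
@{text "get_state"} is hypothetical and merely returns the first component of
the pair as its result.

```sml
(* Hypothetical helper: returns the state component as the parse result. *)
fun get_state (st, xs) = (st, (st, xs))

(* the lifted parser consumes "h"; get_state then reads the untouched state *)
val lifted = (Scan.lift ($$ "h") -- get_state) (1, Symbol.explode "hello")
(* lifted = (("h", 1), (1, ["e", "l", "l", "o"])) *)
```

Without the lifting, the two parsers could not be sequenced, because they
would expect inputs of different types.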

Be careful with recursive parsers. Suppose you want to read strings separated by
commas and by parentheses into a tree data structure; for example, generating
the tree corresponding to the string @{text [quotes] "(A, A), (A, A)"} where
the @{text "A"}s will be the leaves. We assume the trees are represented by the
datatype:
*}

ML %grayML{*datatype tree =
    Lf of string
  | Br of tree * tree*}
+ − 386
+ − 387
text {*
Since nested parentheses should be treated in a meaningful way---for example
the string @{text [quotes] "((A))"} should be read into a single
leaf---you might implement the following parser.
*}

ML %grayML{*fun parse_basic s =
  $$ s >> Lf || $$ "(" |-- parse_tree s --| $$ ")"

and parse_tree s =
  parse_basic s --| $$ "," -- parse_tree s >> Br || parse_basic s*}

text {*
This parser corresponds to the grammar:

\begin{center}
\begin{tabular}{lcl}
@{text "<Basic>"} & @{text "::="} & @{text "<String> | (<Tree>)"}\\
@{text "<Tree>"} & @{text "::="} & @{text "<Basic>, <Tree> | <Basic>"}\\
\end{tabular}
\end{center}

The parameter @{text "s"} is the string that is recognised as a leaf. The
parser @{ML parse_basic} reads either a leaf or a tree enclosed in
parentheses. The parser @{ML parse_tree} reads either a pair of trees
separated by a comma, or acts like @{ML parse_basic}. Unfortunately,
because of the mutual recursion, this parser will immediately run into a
loop, even if it is called without any input. For example

@{ML_response_fake_both [display, gray]
"parse_tree \"A\""
"*** Exception- TOPLEVEL_ERROR raised"}

raises an exception indicating that the stack limit is reached. Such a
looping parser is not useful. The loop is caused by ML's strict evaluation of
arguments. Therefore we need to delay the execution of the
parser until an input is given. This can be done by adding the parsed
string as an explicit argument. So the parser above should be implemented
as follows.
*}

ML %grayML{*fun parse_basic s xs =
  ($$ s >> Lf || $$ "(" |-- parse_tree s --| $$ ")") xs

and parse_tree s xs =
  (parse_basic s --| $$ "," -- parse_tree s >> Br || parse_basic s) xs*}

text {*
While the type of the parser is unchanged by the addition, its behaviour
changed: with this version of the parser the execution is delayed until
an input is supplied for the argument @{text "xs"}. This gives us
exactly the parser we wanted. An example is as follows:

@{ML_response [display, gray]
"let
  val input = Symbol.explode \"(A,((A))),A\"
in
  Scan.finite Symbol.stopper (parse_tree \"A\") input
end"
"(Br (Br (Lf \"A\", Lf \"A\"), Lf \"A\"), [])"}
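
The requirement stated earlier, that nested parentheses around a leaf are
read into a single leaf, can be checked with the delayed parser. The following
is a minimal sketch that reuses only functions already introduced above; the
input string is chosen purely for illustration:

@{ML_response [display, gray]
"let
  val input = Symbol.explode \"((A))\"
in
  Scan.finite Symbol.stopper (parse_tree \"A\") input
end"
"(Lf \"A\", [])"}

Both pairs of parentheses are consumed by @{ML parse_basic}, so no
@{text "Br"}-node is built.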


\begin{exercise}\label{ex:scancmts}
Write a parser that parses an input string so that any comment enclosed
within @{text "(*\<dots>*)"} is replaced by the same comment but enclosed within
@{text "(**\<dots>**)"} in the output string. To enclose a string, you can use the
function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
"s1 ^ s ^ s2" for s1 s2 s}. Hint: To simplify the task ignore the proper
nesting of comments.
\end{exercise}
*}

section {* Parsing Theory Syntax *}

text {*
Most of the time, however, Isabelle developers have to deal with parsing
tokens, not strings. These token parsers have the type:
*}

ML %grayML{*type 'a parser = Token.T list -> 'a * Token.T list*}

text {*
{\bf REDO!!}


The reason for using token parsers is that theory syntax, as well as the
parsers for the arguments of proof methods, use the type @{ML_type
Token.T}.

\begin{readmore}
The parser functions for the theory syntax are contained in the structure
@{ML_struct Parse} defined in the file @{ML_file "Pure/Isar/parse.ML"}.
The definition for tokens is in the file @{ML_file "Pure/Isar/token.ML"}.
\end{readmore}

The structure @{ML_struct Token} defines several kinds of tokens (for
example @{ML_ind Ident in Token} for identifiers, @{ML Keyword in
Token} for keywords and @{ML_ind Command in Token} for commands). Some
token parsers take into account the kind of tokens. The first example shows
how to generate a token list out of a string using the function
@{ML_ind scan in Outer_Syntax}. It is given the argument
@{ML "Position.none"} since,
at the moment, we are not interested in generating precise error
messages. The following code


@{ML_response_fake [display,gray] "Outer_Syntax.scan Position.none \"hello world\""
"[Token (\<dots>,(Ident, \"hello\"),\<dots>),
 Token (\<dots>,(Space, \" \"),\<dots>),
 Token (\<dots>,(Ident, \"world\"),\<dots>)]"}

produces three tokens where the first and the last are identifiers, since
@{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
other syntactic category. The second indicates a space.

We can easily change what is recognised as a keyword with the function
@{ML_ind define in Keyword}. For example calling it with
*}

ML %grayML{*val _ = Keyword.define ("hello", NONE) *}

text {*
then lexing @{text [quotes] "hello world"} will produce

@{ML_response_fake [display,gray] "Outer_Syntax.scan Position.none \"hello world\""
"[Token (\<dots>,(Keyword, \"hello\"),\<dots>),
 Token (\<dots>,(Space, \" \"),\<dots>),
 Token (\<dots>,(Ident, \"world\"),\<dots>)]"}

Many parsing functions later on will require white space, comments and the like
to have already been filtered out. So from now on we are going to use the
functions @{ML filter} and @{ML_ind is_proper in Token} to do this.
For example:

@{ML_response_fake [display,gray]
"let
  val input = Outer_Syntax.scan Position.none \"hello world\"
in
  filter Token.is_proper input
end"
"[Token (\<dots>,(Ident, \"hello\"), \<dots>), Token (\<dots>,(Ident, \"world\"), \<dots>)]"}

For convenience we define the function:
*}

ML %grayML{*fun filtered_input str =
  filter Token.is_proper (Outer_Syntax.scan Position.none str) *}

text {*
If you now parse

@{ML_response_fake [display,gray]
"filtered_input \"inductive | for\""
"[Token (\<dots>,(Command, \"inductive\"),\<dots>),
 Token (\<dots>,(Keyword, \"|\"),\<dots>),
 Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}

you obtain a list consisting of only one command and two keyword tokens.
If you want to see which keywords and commands are currently known to Isabelle,
use the function @{ML_ind get_lexicons in Keyword}:

@{ML_response_fake [display,gray]
"let
  val (keywords, commands) = Keyword.get_lexicons ()
in
  (Scan.dest_lexicon commands, Scan.dest_lexicon keywords)
end"
"([\"}\", \"{\", \<dots>], [\"\<rightleftharpoons>\", \"\<leftharpoondown>\", \<dots>])"}

You might have to adjust the @{ML_ind print_depth} in order to
see the complete list.

The parser @{ML_ind "$$$" in Parse} parses a single keyword. For example:

@{ML_response [display,gray]
"let
  val input1 = filtered_input \"where for\"
  val input2 = filtered_input \"| in\"
in
  (Parse.$$$ \"where\" input1, Parse.$$$ \"|\" input2)
end"
"((\"where\",\<dots>), (\"|\",\<dots>))"}

Any non-keyword string can be parsed with the function @{ML_ind reserved in Parse}.
For example:

@{ML_response [display,gray]
"let
  val p = Parse.reserved \"bar\"
  val input = filtered_input \"bar\"
in
  p input
end"
"(\"bar\",[])"}

As before, you can sequentially connect parsers with @{ML "--"}. For example:

@{ML_response [display,gray]
"let
  val input = filtered_input \"| in\"
in
  (Parse.$$$ \"|\" -- Parse.$$$ \"in\") input
end"
"((\"|\", \"in\"), [])"}

The parser @{ML "Parse.enum s p" for s p} parses a possibly empty
list of items recognised by the parser @{text p}, where the items being parsed
are separated by the string @{text s}. For example:

@{ML_response [display,gray]
"let
  val input = filtered_input \"in | in | in foo\"
in
  (Parse.enum \"|\" (Parse.$$$ \"in\")) input
end"
"([\"in\", \"in\", \"in\"], [\<dots>])"}

The function @{ML_ind enum1 in Parse} works similarly, except that the
parsed list must be non-empty. Note that we had to add a string @{text
[quotes] "foo"} at the end of the parsed string, otherwise the parser would
have consumed all tokens and then failed with the exception @{text
"MORE"}. As in the previous section, we can avoid this exception using the
wrapper @{ML Scan.finite}. This time, however, we have to use the
``stopper-token'' @{ML Token.stopper}. We can write:

@{ML_response [display,gray]
"let
  val input = filtered_input \"in | in | in\"
  val p = Parse.enum \"|\" (Parse.$$$ \"in\")
in
  Scan.finite Token.stopper p input
end"
"([\"in\", \"in\", \"in\"], [])"}
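
The same wrapper works for @{ML enum1 in Parse}. As a small sketch, using
only the parsers introduced above, the non-empty variant accepts the same
input and yields the same result:

@{ML_response [display,gray]
"let
  val input = filtered_input \"in | in | in\"
  val p = Parse.enum1 \"|\" (Parse.$$$ \"in\")
in
  Scan.finite Token.stopper p input
end"
"([\"in\", \"in\", \"in\"], [])"}

The two parsers differ only on input that contains no item at all: there
@{ML enum in Parse} succeeds with the empty list, whereas
@{ML enum1 in Parse} fails.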

The following function will help to run examples.
*}

ML %grayML{*fun parse p input = Scan.finite Token.stopper (Scan.error p) input *}

text {*
The function @{ML_ind "!!!" in Parse} can be used to force termination
of the parser in case of a dead end, just like @{ML "Scan.!!"} (see previous
section). A difference, however, is that the error message of @{ML
"Parse.!!!"} is fixed to be @{text [quotes] "Outer syntax error"}
together with a relatively precise description of the failure. For example:

@{ML_response_fake [display,gray]
"let
  val input = filtered_input \"in |\"
  val parse_bar_then_in = Parse.$$$ \"|\" -- Parse.$$$ \"in\"
in
  parse (Parse.!!! parse_bar_then_in) input
end"
"Exception ERROR \"Outer syntax error: keyword \"|\" expected,
but keyword in was found\" raised"
}

\begin{exercise} (FIXME)
A type-identifier, for example @{typ "'a"}, is a token of
kind @{ML_ind TypeIdent in Token}. It can be parsed using
the function @{ML type_ident in Parse}.
\end{exercise}

(FIXME: or give parser for numbers)

Whenever there is a possibility that the processing of user input can fail,
it is a good idea to give all available information about where the error
occurred. For this Isabelle can attach positional information to tokens
and then thread this information up the ``processing chain''. To see this,
modify the function @{ML filtered_input}, described earlier, as follows
*}

ML %grayML{*fun filtered_input' str =
  filter Token.is_proper (Outer_Syntax.scan (Position.line 7) str) *}

text {*
where we pretend the parsed string starts on line 7. An example is

@{ML_response_fake [display,gray]
"filtered_input' \"foo \\n bar\""
"[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), \<dots>),
 Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), \<dots>)]"}

in which the @{text [quotes] "\\n"} causes the second token to be in
line 8.

By using the parser @{ML position in Parse} you can access the token
position and return it as part of the parser result. For example

@{ML_response_fake [display,gray]
"let
  val input = filtered_input' \"where\"
in
  parse (Parse.position (Parse.$$$ \"where\")) input
end"
"((\"where\", {line=7, end_line=7}), [])"}

\begin{readmore}
The functions related to positions are implemented in the file
@{ML_file "Pure/General/position.ML"}.
\end{readmore}

\begin{exercise}\label{ex:contextfree}
Write a parser for the context-free grammar representing arithmetic
expressions with addition and multiplication. As usual, multiplication
binds stronger than addition, and both of them nest to the right.
The context-free grammar is defined as:

\begin{center}
\begin{tabular}{lcl}
@{text "<Basic>"} & @{text "::="} & @{text "<Number> | (<Expr>)"}\\
@{text "<Factor>"} & @{text "::="} & @{text "<Basic> * <Factor> | <Basic>"}\\
@{text "<Expr>"} & @{text "::="} & @{text "<Factor> + <Expr> | <Factor>"}\\
\end{tabular}
\end{center}

Hint: Be careful with recursive parsers.
\end{exercise}
*}

section {* Parsers for ML-Code (TBD) *}

text {*
@{ML_ind ML_source in Parse}
*}

section {* Context Parser (TBD) *}

text {*
@{ML_ind Args.context}
*}
(*
ML {*
let
  val parser = Args.context -- Scan.lift Args.name_source

  fun term_pat (ctxt, str) =
    str |> Syntax.read_prop ctxt
in
  (parser >> term_pat) (Context.Proof @{context}, filtered_input "f (a::nat)")
  |> fst
end
*}
*)

text {*
@{ML_ind Args.context}

Used for example in \isacommand{attribute\_setup} and \isacommand{method\_setup}.
*}

section {* Argument and Attribute Parsers (TBD) *}

section {* Parsing Inner Syntax *}

text {*
There is usually no need to write your own parser for parsing inner syntax, that is
for terms and types: you can just call the predefined parsers. Terms can
be parsed using the function @{ML_ind term in Parse}. For example:

@{ML_response [display,gray]
"let
  val input = Outer_Syntax.scan Position.none \"foo\"
in
  Parse.term input
end"
"(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"}

The function @{ML_ind prop in Parse} is similar, except that it gives a different
error message when parsing fails. As you can see, the parser returns not just
the parsed string, but also some encoded information. You can decode the
information with the function @{ML_ind parse in YXML} in @{ML_struct YXML}. For example

@{ML_response_fake [display,gray]
"YXML.parse \"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\""
"Text \"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\""}

The result of the decoding is an XML-tree. You can see better what is going on if
you replace @{ML Position.none} by @{ML "Position.line 42"}, say:

@{ML_response_fake [display,gray]
"let
  val input = Outer_Syntax.scan (Position.line 42) \"foo\"
in
  YXML.parse (fst (Parse.term input))
end"
"Elem (\"token\", [(\"line\", \"42\"), (\"end_line\", \"42\")], [XML.Text \"foo\"])"}

The positional information is stored as part of an XML-tree so that code
called later on will be able to give more precise error messages.

\begin{readmore}
The functions to do with input and output of XML and YXML are defined
in @{ML_file "Pure/PIDE/xml.ML"} and @{ML_file "Pure/PIDE/yxml.ML"}.
\end{readmore}

FIXME:
@{ML_ind parse_term in Syntax} @{ML_ind check_term in Syntax}
@{ML_ind parse_typ in Syntax} @{ML_ind check_typ in Syntax}
@{ML_ind read_term in Syntax} @{ML_ind read_term in Syntax}


*}

section {* Parsing Specifications\label{sec:parsingspecs} *}

text {*
There are a number of special purpose parsers that help with parsing
specifications of function definitions, inductive predicates and so on. In
Chapter~\ref{chp:package}, for example, we will need to parse specifications
for inductive predicates of the form:
*}


simple_inductive
  even and odd
where
  even0: "even 0"
| evenS: "odd n \<Longrightarrow> even (Suc n)"
| oddS: "even n \<Longrightarrow> odd (Suc n)"

text {*
For this we are going to use the parser:
*}

ML %linenosgray{*val spec_parser =
     Parse.fixes --
     Scan.optional
       (Parse.$$$ "where" |--
          Parse.!!!
            (Parse.enum1 "|"
               (Parse_Spec.opt_thm_name ":" -- Parse.prop))) []*}

text {*
Note that the parser must not parse the keyword \simpleinductive, even if it is
meant to process definitions as shown above. The parser of the keyword
will be given by the infrastructure that will eventually call @{ML spec_parser}.


To see what the parser returns, let us parse the string corresponding to the
definition of @{term even} and @{term odd}:

@{ML_response [display,gray]
"let
  val input = filtered_input
     (\"even and odd \" ^
      \"where \" ^
      \"  even0[intro]: \\\"even 0\\\" \" ^
      \"| evenS[intro]: \\\"odd n \<Longrightarrow> even (Suc n)\\\" \" ^
      \"| oddS[intro]:  \\\"even n \<Longrightarrow> odd (Suc n)\\\"\")
in
  parse spec_parser input
end"
"(([(even, NONE, NoSyn), (odd, NONE, NoSyn)],
   [((even0,\<dots>), \"\\^E\\^Ftoken\\^Eeven 0\\^E\\^F\\^E\"),
    ((evenS,\<dots>), \"\\^E\\^Ftoken\\^Eodd n \<Longrightarrow> even (Suc n)\\^E\\^F\\^E\"),
    ((oddS,\<dots>), \"\\^E\\^Ftoken\\^Eeven n \<Longrightarrow> odd (Suc n)\\^E\\^F\\^E\")]), [])"}

As you see, the result is a pair consisting of a list of
variables with optional type-annotation and syntax-annotation, and a list of
rules where every rule optionally has a name and an attribute.
121
+ − 848
426
+ − 849
The function @{ML_ind "fixes" in Parse} in Line 2 of the parser reads an
186
371e4375c994
made the Ackermann function example safer and included suggestions from MW
Christian Urban <urbanc@in.tum.de>
diff
changeset
+ − 850
\isacommand{and}-separated
124
+ − 851
list of variables that can include optional type annotations and syntax translations.
121
+ − 852
For example:\footnote{Note that in the code we need to write
+ − 853
@{text "\\\"int \<Rightarrow> bool\\\""} in order to properly escape the double quotes
+ − 854
in the compound type.}
+ − 855
+ − 856
@{ML_response [display,gray]
"let
  val input = filtered_input 
        \"foo::\\\"int \<Rightarrow> bool\\\" and bar::nat (\\\"BAR\\\" 100) and blonk\"
in
  parse Parse.fixes input
end"
"([(foo, SOME \"\\^E\\^Ftoken\\^Eint \<Rightarrow> bool\\^E\\^F\\^E\", NoSyn), 
  (bar, SOME \"\\^E\\^Ftoken\\^Enat\\^E\\^F\\^E\", Mixfix (\"BAR\", [], 100)), 
  (blonk, NONE, NoSyn)],[])"}  
*}

text {*
  Whenever types are given, they are stored in the @{ML SOME}s. The types are
  not yet used to type the variables: this must be done by type-inference later
  on. Since types are part of the inner syntax they are strings with some
  encoded information (see previous section). If a mixfix-syntax is
  present for a variable, then it is stored in the
  @{ML Mixfix} data structure; no syntax translation is indicated by @{ML_ind NoSyn in Syntax}.

  \begin{readmore}
  The data structures for mixfix annotations are implemented in 
  @{ML_file "Pure/Syntax/mixfix.ML"} and @{ML_file "Pure/Syntax/syntax.ML"}.
  \end{readmore}

  Lines 3 to 7 in the function @{ML spec_parser} implement the parser for a
  list of introduction rules, that is, propositions with theorem annotations
  such as rule names and attributes. The introduction rules are propositions
  parsed by @{ML_ind prop in Parse}. However, they can include an optional
  theorem name plus some attributes. For example

@{ML_response [display,gray] "let 
  val input = filtered_input \"foo_lemma[intro,dest!]:\"
  val ((name, attrib), _) = parse (Parse_Spec.thm_name \":\") input 
in 
  (name, map Args.dest_src attrib)
end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"}

  The function @{ML_ind opt_thm_name in Parse_Spec} is the ``optional'' variant of
  @{ML_ind thm_name in Parse_Spec}. Theorem names can contain attributes. The name 
  has to end with @{text [quotes] ":"}; see the argument of 
  the function @{ML Parse_Spec.opt_thm_name} in Line 7.

  \begin{readmore}
  Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"} 
  and @{ML_file "Pure/Isar/args.ML"}.
  \end{readmore}
*}
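
text {*
  To get a feeling for the combinators used in Lines 4 to 6 of @{ML spec_parser},
  it can help to try them on a smaller input. The following is only a sketch: it
  assumes the helper functions @{text filtered_input} and @{text parse} from the
  earlier examples, and that @{text "where"} and @{text "|"} are known keywords
  (which the examples above rely on as well). The names @{text names_parser} and
  @{text names_result} are made up for this illustration.
*}

ML %grayML{*
(* hypothetical mini-parser: "where" followed by a "|"-separated list of names *)
val names_parser =
  Parse.$$$ "where" |-- Parse.!!! (Parse.enum1 "|" Parse.name)

(* should yield the list of names paired with the remaining input *)
val names_result = parse names_parser (filtered_input "where foo | bar | baz")*}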

text_raw {*
  \begin{exercise}
  Have a look at how the parser @{ML Parse_Spec.where_alt_specs} is implemented
  in file @{ML_file "Pure/Isar/parse_spec.ML"}. This parser corresponds
  to the ``where-part'' of the introduction rules given above. Below
  we paraphrase the code of @{ML_ind where_alt_specs in Parse_Spec} adapted to our
  purposes. 
  \begin{isabelle}
*}
ML %linenosgray{*val spec_parser' = 
     Parse.fixes -- 
     Scan.optional
     (Parse.$$$ "where" |-- 
        Parse.!!! 
          (Parse.enum1 "|" 
             ((Parse_Spec.opt_thm_name ":" -- Parse.prop) --| 
                  Scan.option (Scan.ahead (Parse.name || 
                  Parse.$$$ "[") -- 
                  Parse.!!! (Parse.$$$ "|"))))) [] *}
text_raw {*
  \end{isabelle}
  The two parsers look very similar (although @{ML spec_parser} accepts some
  input that @{ML spec_parser'} rejects), but if you look closely, you can notice 
  an additional ``tail'' (Lines 8 to 10) in @{ML spec_parser'}. What is the purpose of 
  this additional ``tail''?
  \end{exercise}
*}

text {*
  (FIXME: @{ML Parse.type_args}, @{ML Parse.typ}, @{ML Parse.opt_mixfix})
*}
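
text {*
  Until this part is written, the following sketch may give a first impression of
  two of these parsers. It assumes that they can be tried with @{text parse} and
  @{text filtered_input} in the same way as @{ML Parse.fixes} above; the inputs are
  taken from that example, and the names @{text typ_result} and 
  @{text mixfix_result} are made up.
*}

ML %grayML{*
(* a type is inner syntax, hence the additional quotes around it *)
val typ_result = parse Parse.typ (filtered_input "\"int \<Rightarrow> bool\"")

(* a mixfix annotation, as in the declaration of bar above *)
val mixfix_result = parse Parse.opt_mixfix (filtered_input "(\"BAR\" 100)")*}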


section {* New Commands\label{sec:newcommand} *}


text {*
  Often new commands, for example for providing new definitional principles,
  need to be implemented. While this is not difficult on the ML-level and for
  jEdit, in order to be backwards compatible, new commands also need to be recognised 
  by Proof-General. This results in some subtle configuration issues, which we will 
  explain in the next section. Here we just describe how to define new commands
  to work with jEdit.

  Let us start with a ``silly'' command that does nothing at all. We
  shall name this command \isacommand{foobar}.  Before you can
  implement any new command, you have to ``announce'' it in the
  \isacommand{keywords}-section of your theory header. For \isacommand{foobar}
  we need to write something like

  \begin{graybox}
  \isacommand{theory}~@{text Foo}\\
  \isacommand{imports}~@{text Main}\\
  \isacommand{keywords} @{text [quotes] "foobar"} @{text "::"} @{text "thy_decl"}\\
  ...
  \end{graybox}

  whereby @{ML_ind "thy_decl" in Keyword} indicates the kind of the
  command.  Another possible kind is @{text "thy_goal"}, or you can
  also omit the kind entirely, in which case you declare a keyword
  (something that is part of a command).

  Now you can implement \isacommand{foobar} as follows.
*}

ML %grayML{*let
  val do_nothing = Scan.succeed (Local_Theory.background_theory I)
in
  Outer_Syntax.local_theory @{command_spec "foobar"} 
    "description of foobar" 
      do_nothing
end *}

text {*
  The crucial function @{ML_ind local_theory in Outer_Syntax} expects
  the name for the command, a kind indicator, a short description and 
  a parser producing a local theory transition (explained later). For the
  name and the kind, you can use the ML-antiquotation @{text "@{command_spec ...}"}.
  You can now write in your theory 
*}

foobar

text {* 
  but of course you will not see anything since \isacommand{foobar} is
  not intended to do anything.  Remember, however, that this only
  works in jEdit. To enable Proof-General to recognise this
  command as well, a keyword file needs to be generated (see next section).

  As it stands, the command \isacommand{foobar} is not very useful.  Let
  us refine it a bit next by letting it take a proposition as argument
  and printing this proposition inside the tracing buffer. We announce
  the command \isacommand{foobar\_trace} in the theory header as

  \begin{graybox}
  \isacommand{keywords} @{text [quotes] "foobar_trace"} @{text "::"} @{text "thy_decl"}
  \end{graybox}

  The crucial part of a command is the function that determines the
  behaviour of the command. In the code above we used a
  ``do-nothing''-function, which because of the parser @{ML_ind succeed in Scan}
  does not parse any argument, but immediately returns the simple
  function @{ML "Local_Theory.background_theory I"}. We can replace
  this code by a function that first parses a proposition (using the
  parser @{ML Parse.prop}), then prints out some tracing information
  (using the function @{text trace_prop}) and finally does
  nothing. For this you can write:
*}

ML %grayML{*let
  fun trace_prop str = 
     Local_Theory.background_theory (fn thy => (tracing str; thy))
in
  Outer_Syntax.local_theory @{command_spec "foobar_trace"} 
    "traces a proposition" 
      (Parse.prop >> trace_prop)
end *}

text {*
  This command can now be used to 
  see the proposition in the tracing buffer.  
*}

foobar_trace "True \<and> False"
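
text {*
  To see what @{text trace_prop} receives, you can run @{ML Parse.prop} on its own.
  This is a sketch, again assuming the helper functions @{text parse} and
  @{text filtered_input} from the parser examples earlier in this chapter; the
  name @{text prop_result} is made up. As in those examples, the result should be 
  a string carrying the encoded inner-syntax token, not yet a term.
*}

ML %grayML{*
(* the proposition must be quoted, exactly as in the theory text *)
val prop_result = parse Parse.prop (filtered_input "\"True \<and> False\"")*}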

text {*
  Note that so far we used @{ML_ind thy_decl in Keyword} as the kind
  indicator for the new command.  This means that the command finishes as soon as
  the arguments are processed. Examples of this kind of command are
  \isacommand{definition} and \isacommand{declare}.  In other cases, commands
  are expected to parse some arguments, for example a proposition, and then
  ``open up'' a proof in order to prove the proposition (for example
  \isacommand{lemma}) or prove some other properties (for example
  \isacommand{function}). To achieve this kind of behaviour, you have to use
  the kind indicator @{ML_ind thy_goal in Keyword} and the function @{ML
  "local_theory_to_proof" in Outer_Syntax} to set up the command. 
  Below we show the command \isacommand{foobar\_goal} which takes a
  proposition as argument and then starts a proof in order to prove
  it. Therefore, we need to announce this command in the header 
  as @{text "thy_goal"}.

  \begin{graybox}
  \isacommand{keywords} @{text [quotes] "foobar_goal"} @{text "::"} @{text "thy_goal"}
  \end{graybox}

  Then we can write:
*}

ML%linenosgray{*let
  fun goal_prop str ctxt =
    let
      val prop = Syntax.read_prop ctxt str
    in
      Proof.theorem NONE (K I) [[(prop, [])]] ctxt
    end
in
  Outer_Syntax.local_theory_to_proof @{command_spec "foobar_goal"}
    "proves a proposition" 
      (Parse.prop >> goal_prop)
end *}

text {*
  The function @{text goal_prop} in Lines 2 to 7 takes a string (the proposition to be
  proved) and a context as arguments.  The context is necessary in order to be able to use
  @{ML_ind read_prop in Syntax}, which converts a string into a proper proposition.
  In Line 6 the function @{ML_ind theorem in Proof} starts the proof for the
  proposition. Its argument @{ML NONE} stands for a locale (which we chose to
  omit); the argument @{ML "(K I)"} stands for a function that determines what
  should be done with the theorem once it is proved (we chose to just forget
  about it).

  If you now type \isacommand{foobar\_goal}~@{text [quotes] "True \<and> True"},
  you obtain the following proof state:
*}

foobar_goal "True \<and> True"
txt {*
  \begin{minipage}{\textwidth}
  @{subgoals [display]}
  \end{minipage}\medskip

  and can prove the proposition as follows.
*}
apply(rule conjI)
apply(rule TrueI)+
done
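
text {*
  The conversion from a string into a proposition, as done by 
  @{ML_ind read_prop in Syntax} inside @{text goal_prop}, can also be tried in
  isolation. The following sketch uses the antiquotation @{text "@{context}"} in
  place of the context that the command infrastructure supplies; the names 
  @{text prop} and @{text prop_str} are chosen for this illustration only.
*}

ML %grayML{*
(* read the string as a proposition, that is a term of type prop *)
val prop = Syntax.read_prop @{context} "True \<and> True"

(* turn the term back into a string and send it to the tracing buffer *)
val prop_str = Syntax.string_of_term @{context} prop
val _ = tracing prop_str*}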

text {*
  The last command we describe here is
  \isacommand{foobar\_prove}. Like \isacommand{foobar\_goal}, its purpose is
  to take a proposition and open a corresponding proof-state that
  allows us to give a proof for it. However, unlike
  \isacommand{foobar\_goal}, the proposition will be given as an
  ML-value. Such a command is quite useful during development
  when you generate a goal on the ML-level and want to see
  whether it is provable. In addition we want to allow the proved 
  proposition to have a name that can be referenced later on.

  The first problem for \isacommand{foobar\_prove} is to parse some
  text as ML-source and then interpret it as an Isabelle term using
  the ML-runtime.  For the parsing part, we can use the function
  @{ML_ind "ML_source" in Parse} in the structure @{ML_struct
  Parse}. For running the ML-interpreter we need the following
  scaffolding code.
*}

ML %grayML{*
structure Result = Proof_Data 
  (type T = unit -> term
   fun init thy () = error "Result")

val result_cookie = (Result.get, Result.put, "Result.put") *}

text {*
  With this in place, we can implement the code for \isacommand{foobar\_prove} 
  as follows.
*}

ML %linenosgray{*let
   fun after_qed thm_name thms lthy =
        Local_Theory.note (thm_name, (flat thms)) lthy |> snd
  
   fun setup_proof (thm_name, (txt, pos)) lthy =
   let
     val trm = Code_Runtime.value lthy result_cookie ("", txt)
   in
     Proof.theorem NONE (after_qed thm_name) [[(trm, [])]] lthy
   end

   val parser = Parse_Spec.opt_thm_name ":" -- Parse.ML_source
in
   Outer_Syntax.local_theory_to_proof @{command_spec "foobar_prove"} 
     "proving a proposition" 
       (parser >> setup_proof)
end*}

text {*
  In Line 12, we implement a parser that first reads in an optional lemma name (terminated 
  by ``:'') and then some ML-code. The function in Lines 5 to 10 takes the ML-text
  and lets the ML-runtime evaluate it using the function @{ML_ind value in Code_Runtime}
  in the structure @{ML_struct Code_Runtime}.  Once the ML-text has been turned into a term, 
  the function @{ML theorem in Proof} opens a corresponding proof-state. This function takes the
  function @{text "after_qed"} as argument, whose purpose is to store the theorem
  (once it is proven) under the given name @{text "thm_name"}.

  You can now define a term, for example
*}

ML %grayML{*val prop_true = @{prop "True"}*}

text {*
  and give it a proof using \isacommand{foobar\_prove}:
*}

foobar_prove test: prop_true
apply(rule TrueI)
done

text {*
  Finally you can test whether the lemma has been stored under the given name.
  
  \begin{isabelle}
  \isacommand{thm}~@{text "test"}\\
  @{text "> "}~@{thm TrueI}
  \end{isabelle}

  While this is everything you have to do for a new command when using jEdit, 
  things are not as simple when using Emacs and Proof-General. We explain the details 
  next.
*}


section {* Proof-General and Keyword Files *}

text {*
  In order to use a new command in Emacs and Proof-General, you need a keyword
  file that can be loaded by Proof-General. To keep things simple we take as
  running example the command \isacommand{foobar} from the previous section. 

  A keyword file can be generated with the command-line:

  @{text [display] "$ isabelle keywords -k foobar some_log_files"}

  The option @{text "-k foobar"} determines the postfix in the name of the keyword 
  file. In the case above the file will be named @{text
  "isar-keywords-foobar.el"}. This command requires log files to be
  present (in order to extract the keywords from them). To generate these log
  files, you first need to package the code above into a separate theory file named
  @{text "Command.thy"}, say (see Figure~\ref{fig:commandtheory} for the
  complete code).


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  \begin{figure}[t]
  \begin{graybox}\small
  \isacommand{theory}~@{text Command}\\
  \isacommand{imports}~@{text Main}\\
  \isacommand{keywords} @{text [quotes] "foobar"} @{text "::"} @{text "thy_decl"}\\
  \isacommand{begin}\\
  \isacommand{ML}~@{text "\<verbopen>"}\\
  @{ML
"let
   val do_nothing = Scan.succeed (Local_Theory.background_theory I)
in
   Outer_Syntax.local_theory @{command_spec \"foobar\"} 
     \"description of foobar\" 
       do_nothing
end"}\\
  @{text "\<verbclose>"}\\
  \isacommand{end}
  \end{graybox}
  \caption{This file can be used to generate a log file. This log file in turn can
  be used to generate a keyword file containing the command \isacommand{foobar}.
  \label{fig:commandtheory}}
  \end{figure}
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  For our purposes it is sufficient to use the log files of the theories
  @{text "Pure"}, @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as
  the log file for the theory @{text "Command.thy"}, which contains the new
  \isacommand{foobar}-command. If you target other logics besides HOL, such
  as Nominal or ZF, then you need to adapt the log files appropriately.
  
  @{text Pure} and @{text HOL} are usually compiled during the installation of
  Isabelle. So log files for them should already be available. If not, then
  they can be conveniently compiled with the help of the build-script from the Isabelle
  distribution.

  @{text [display] 
"$ ./build -m \"Pure\"
$ ./build -m \"HOL\""}
  
  The @{text "Pure-ProofGeneral"} theory needs to be compiled with:

  @{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""}

  For the theory @{text "Command.thy"}, you first need to create a ``managed'' subdirectory 
  with:

  @{text [display] "$ isabelle mkdir FoobarCommand"}

  This generates a directory containing the files: 

  @{text [display] 
"./IsaMakefile
./FoobarCommand/ROOT.ML
./FoobarCommand/document
./FoobarCommand/document/root.tex"}


  You need to copy the file @{text "Command.thy"} into the directory @{text "FoobarCommand"}
  and add the line 

  @{text [display] "no_document use_thy \"Command\";"} 
  
  to the file @{text "./FoobarCommand/ROOT.ML"}. You can now compile the theory by just typing:

  @{text [display] "$ isabelle make"}

  If the compilation succeeds, you have finally created all the necessary log files. 
  They are stored in the directory 
  
  @{text [display]  "~/.isabelle/heaps/Isabelle2012/polyml-5.2.1_x86-linux/log"}

  or something similar depending on your Isabelle distribution and architecture.
  One quick way to assign this directory to a shell variable is by typing

  @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"}
 
  on the Unix prompt. If you now type @{text "ls $ISABELLE_LOGS"}, then the 
  directory should include the files:

  @{text [display] 
"Pure.gz
HOL.gz
Pure-ProofGeneral.gz
HOL-FoobarCommand.gz"} 

  From them you can create the keyword files. Assuming the name 
  of the directory is in  @{text "$ISABELLE_LOGS"},
  then the Unix command for creating the keyword file is:

@{text [display]
"$ isabelle keywords -k foobar 
   $ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral.gz,HOL-FoobarCommand.gz}"}

  The result is the file @{text "isar-keywords-foobar.el"}. It should contain
  the string @{text "foobar"} twice.\footnote{To see whether things are fine,
  check that @{text "grep foobar"} on this file returns something non-empty.}
  This keyword file needs to be copied into the directory @{text
  "~/.isabelle/etc"}. To make Proof-General aware of it, you have to start
  Isabelle with the option @{text "-k foobar"}, that is:

  @{text [display] "$ isabelle emacs -k foobar a_theory_file"}

  If you now build a theory on top of @{text "Command.thy"}, 
  then you can use the command \isacommand{foobar} 
  in Proof-General.

  A similar procedure has to be followed for any 
  other new command, and also any new keyword that is introduced with 
  the function @{ML_ind define in Keyword}. For example:
*}

ML %grayML{*val _ = Keyword.define ("blink", NONE) *}


text {*
  Also if the kind of a command changes, from @{text "thy_decl"} to 
  @{text "thy_goal"} say, you need to recreate the keyword file.
*}

text {*
  {\bf TBD below}
*}

(*
ML {*
structure TacticData = ProofDataFun
(
  type T = thm list -> tactic;
  fun init _ = undefined;
)

val set_tactic = TacticData.put;
*}

ML {*
  TacticData.get @{context}
*}

ML {* Method.set_tactic  *}
ML {* fun tactic (facts: thm list) : tactic = (atac 1) *}
ML {* Context.map_proof *}
ML {* ML_Context.expression *}
ML {* METHOD *}


ML {* 
fun myexpression pos bind body txt =
let
  val _ = tracing ("bind)" ^ bind)
  val _ = tracing ("body)" ^ body)
  val _ = tracing ("txt)"  ^ txt)
  val _ = tracing ("result) " ^ "Context.set_thread_data (SOME (let " ^ bind ^ " = " ^ txt ^ " in " ^ body ^
      " end (ML_Context.the_generic_context ())));")
in
  ML_Context.exec (fn () => ML_Context.eval false pos
    ("Context.set_thread_data (SOME (let " ^ bind ^ " = " ^ txt ^ " in " ^ body ^
      " end (ML_Context.the_generic_context ())));"))
end
*}


ML {*
fun ml_tactic (txt, pos) ctxt =
let
  val ctxt' = ctxt |> Context.proof_map
      (myexpression pos
        "fun tactic (facts: thm list) : tactic"
        "Context.map_proof (Method.set_tactic tactic)" txt);
in 
  Context.setmp_thread_data (SOME (Context.Proof ctxt)) (TacticData.get ctxt')
end;
*}

ML {*
fun tactic3 (txt, pos) ctxt = 
  let
    val _ = tracing ("1) " ^ txt )
  in 
   METHOD (ml_tactic (txt, pos) ctxt; K (atac 1))
  end
*}

setup {*
Method.setup (Binding.name "tactic3") (Scan.lift (Parse.position Args.name) 
  >> tactic3)
   "ML tactic as proof method"
*}

lemma "A \<Longrightarrow> A"
apply(tactic3 {* (atac 1)  *})
done

ML {* 
(ML_Context.the_generic_context ())
*}

ML {*
Context.set_thread_data;
ML_Context.the_generic_context
*}

lemma "A \<Longrightarrow> A"
ML_prf {*
Context.set_thread_data (SOME (let fun tactic (facts: thm list) : tactic = (atac 1) in Context.map_proof (Method.set_tactic tactic) end (ML_Context.the_generic_context ())));
*}

ML {*
Context.set_thread_data (SOME ((let fun tactic (facts: thm list) : tactic = (atac 1) in 3 end) (ML_Context.the_generic_context ())));
*}

ML {*
Context.set_thread_data (SOME (let 
  fun tactic (facts: thm list) : tactic = (atac 1) 
in 
  Context.map_proof (Method.set_tactic tactic) 
end 
  (ML_Context.the_generic_context ())));
*}


ML {*
let 
  fun tactic (facts: thm list) : tactic = atac
in
  Context.map_proof (Method.set_tactic tactic)
end *}

end *}

ML {* Toplevel.program (fn () => 
(ML_Context.expression Position.none "val plus : int" "3 + 4" "1" (Context.Proof @{context})))*}


ML {*
fun ml_tactic (txt, pos) ctxt =
  let
    val ctxt' = ctxt |> Context.proof_map
      (ML_Context.expression pos
        "fun tactic (facts: thm list) : tactic"
        "Context.map_proof (Method.set_tactic tactic)" txt);
  in Context.setmp_thread_data (SOME (Context.Proof ctxt)) (TacticData.get ctxt') end;

*}

ML {*
Context.set_thread_data (SOME (let fun tactic (facts: thm list) : tactic = (atac 1) in Context.map_proof (Method.set_tactic tactic) end (ML_Context.the_generic_context ())));
*}
*)

section {* Methods (TBD) *}

text {*
  (FIXME: maybe move to after the tactic section)

  Methods are central to Isabelle. They are what you invoke, for example,
  with \isacommand{apply}. To print out all currently known methods you can use the 
  Isabelle command:

  \begin{isabelle}
  \isacommand{print\_methods}\\
  @{text "> methods:"}\\
  @{text ">   -:  do nothing (insert current facts only)"}\\
  @{text ">   HOL.default:  apply some intro/elim rule (potentially classical)"}\\
  @{text ">   ..."}
  \end{isabelle}

  An example of a very simple method is:
*}

method_setup %gray foo = 
 {* Scan.succeed
      (K (SIMPLE_METHOD ((etac @{thm conjE} THEN' rtac @{thm conjI}) 1))) *}
         "foo method for conjE and conjI"

text {*
  It defines the method @{text foo}, which takes no arguments (therefore the
  parser @{ML Scan.succeed}) and only applies a single tactic, namely the tactic which 
  applies @{thm [source] conjE} and then @{thm [source] conjI}. The function 
  @{ML_ind SIMPLE_METHOD in Method}
  turns such a tactic into a method. The method @{text "foo"} can be used as follows
*}

lemma shows "A \<and> B \<Longrightarrow> C \<and> D"
apply(foo)
txt {*
  where it results in the goal state

  \begin{minipage}{\textwidth}
  @{subgoals}
  \end{minipage} *}
(*<*)oops(*>*)
method_setup test = {*
  Scan.lift (Scan.succeed (K Method.succeed)) *} {* method that always succeeds *}

lemma "True"
apply(test)
oops
method_setup joker = {*
  Scan.lift (Scan.succeed (fn ctxt => Method.cheating true ctxt)) *} {* cheating method that accepts any goal *}

lemma "False"
apply(joker)
oops
text {* If the flag given to @{text "Method.cheating"} is @{text true}, then the method @{text joker} always works. *}
ML {* atac *}

method_setup first_atac = {*
  Scan.lift (Scan.succeed (K (SIMPLE_METHOD (atac 1)))) *} {* solves the first subgoal by assumption *}

ML {* HEADGOAL *}

lemma "A \<Longrightarrow> A"
apply(first_atac)
oops
method_setup my_atac = {*
  Scan.lift (Scan.succeed (K (SIMPLE_METHOD' atac))) *} {* solves the first subgoal by assumption, using SIMPLE_METHOD' *}

lemma "A \<Longrightarrow> A"
apply(my_atac)
oops
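
text {*
  The following is only a minimal sketch and not part of the original
  development. Assuming that @{text "SIMPLE_METHOD'"} applies its argument to
  the first subgoal, as the method @{text my_atac} above suggests, the method
  @{text foo} could also be set up without the explicit goal index. The name
  @{text foo_alt} is hypothetical and chosen only for illustration.
*}

method_setup foo_alt = {*
  (* sketch under the stated assumption; foo_alt is a hypothetical name *)
  Scan.lift (Scan.succeed
    (K (SIMPLE_METHOD' (etac @{thm conjE} THEN' rtac @{thm conjI})))) *}
  {* variant of foo using SIMPLE_METHOD' *}

lemma shows "A \<and> B \<Longrightarrow> C \<and> D"
apply(foo_alt)
oops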
(*
ML {* SIMPLE_METHOD *}
ML {* METHOD *}
ML {* K (SIMPLE_METHOD ((etac @{thm conjE} THEN' rtac @{thm conjI}) 1)) *}
ML {* Scan.succeed *}
*)
ML {* resolve_tac *}

method_setup myrule =
  {* Scan.lift (Scan.succeed (K (METHOD (fn thms => resolve_tac thms 1)))) *}
  {* resolves the first subgoal with the chained facts *}

lemma
  assumes a: "A \<Longrightarrow> B \<Longrightarrow> C"
  shows "C"
using a
apply(myrule)
oops
text {*
  (********************************************************)
  (FIXME: explain a version of rule-tac)
*}
end