theory Parsing
imports Base "Helper/Command/Command" "Package/Simple_Inductive_Package"
begin

(*<*)
setup {*
open_file_with_prelude
  "Parsing_Code.thy"
  ["theory Parsing",
   "imports Base \"Package/Simple_Inductive_Package\"",
   "begin"]
*}
(*>*)

chapter {* Parsing *}

text {*
Isabelle distinguishes between \emph{outer} and \emph{inner}
syntax. Commands, such as \isacommand{definition}, \isacommand{inductive}
and so on, belong to the outer syntax, whereas terms, types and so on belong
to the inner syntax. For parsing inner syntax, Isabelle uses a rather
general and sophisticated algorithm, which is driven by priority
grammars. Parsers for outer syntax are built up by functional parsing
combinators. These combinators are a well-established technique for parsing,
which has, for example, been described in Paulson's classic ML-book
\cite{paulson-ml2}. Isabelle developers are usually concerned with writing
these outer syntax parsers, either for new definitional packages or for
calling methods with specific arguments.

\begin{readmore}
The library for writing parser combinators is split up, roughly, into two
parts: The first part consists of a collection of generic parser combinators
defined in the structure @{ML_struct Scan} in the file @{ML_file
"Pure/General/scan.ML"}. The second part of the library consists of
combinators for dealing with specific token types, which are defined in the
structure @{ML_struct OuterParse} in the file @{ML_file
"Pure/Isar/outer_parse.ML"}. In addition, specific parsers for packages are
defined in @{ML_file "Pure/Isar/spec_parse.ML"}. Parsers for method arguments
are defined in @{ML_file "Pure/Isar/args.ML"}.
\end{readmore}

*}

section {* Building Generic Parsers *}

text {*

Let us first have a look at parsing strings using generic parsing
combinators. The function @{ML_ind "$$" in Scan} takes a string as argument and will
``consume'' this string from a given input list of strings. ``Consume'' in
this context means that it will return a pair consisting of this string and
the rest of the input list. For example:

@{ML_response [display,gray]
"($$ \"h\") (Symbol.explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}

@{ML_response [display,gray]
"($$ \"w\") (Symbol.explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}

The function @{ML "$$"} will either succeed (as in the two examples above)
or raise the exception @{text "FAIL"} if no string can be consumed. For
example, trying to parse

@{ML_response_fake [display,gray]
"($$ \"x\") (Symbol.explode \"world\")"
"Exception FAIL raised"}

will raise the exception @{text "FAIL"}. There are three exceptions used in
the parsing combinators:

\begin{itemize}
\item @{text "FAIL"} is used to indicate that alternative routes of parsing
might be explored.
\item @{text "MORE"} indicates that there is not enough input for the parser, for example
in @{text "($$ \"h\") []"}.
\item @{text "ABORT"} is the exception that is raised when a dead end is reached.
It is used, for example, in the function @{ML "!!"} (see below).
\end{itemize}

However, note that these exceptions are private to the parser and cannot be accessed
by the programmer (for example to handle them).

In the examples above we use the function @{ML_ind explode in Symbol} from the
structure @{ML_struct Symbol}, instead of the more standard library function
@{ML_ind explode}, for obtaining an input list for the parser. The reason is
that @{ML_ind explode} in @{ML_struct Symbol} is aware of character
sequences, for example @{text "\<foo>"}, that have a special meaning in
Isabelle. To see the difference consider

@{ML_response_fake [display,gray]
"let
  val input = \"\<foo> bar\"
in
  (explode input, Symbol.explode input)
end"
"([\"\\\", \"<\", \"f\", \"o\", \"o\", \">\", \" \", \"b\", \"a\", \"r\"],
 [\"\<foo>\", \" \", \"b\", \"a\", \"r\"])"}

Slightly more general than the parser @{ML "$$"} is the function
@{ML_ind one in Scan}, in that it takes a predicate as argument and
then parses exactly one item from the input list satisfying this predicate.
For example, the following parser consumes either an @{text [quotes] "h"} or
a @{text [quotes] "w"}:

@{ML_response [display,gray]
"let
  val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (hw input1, hw input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}

Two parsers can be connected in sequence by using the function @{ML_ind "--" in Scan}.
For example, you can parse @{text "h"}, @{text "e"} and @{text "l"} (in this
order) with:

@{ML_response [display,gray]
"($$ \"h\" -- $$ \"e\" -- $$ \"l\") (Symbol.explode \"hello\")"
"(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}

Note how the result of consumed strings builds up on the left as nested pairs.
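
The nesting follows the grouping of the parsers: @{text "--"} groups to the
left, which is why the pairs above nest to the left. As a small sketch of the
opposite case, explicit parentheses around the last two parsers give pairs
nested to the right:

@{ML_response [display,gray]
"($$ \"h\" -- ($$ \"e\" -- $$ \"l\")) (Symbol.explode \"hello\")"
"((\"h\", (\"e\", \"l\")), [\"l\", \"o\"])"}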

If, as in the previous example, you want to parse a particular string,
then you can use the function @{ML_ind this_string in Scan}.

@{ML_response [display,gray]
"Scan.this_string \"hell\" (Symbol.explode \"hello\")"
"(\"hell\", [\"o\"])"}

Parsers that explore alternatives can be constructed using the function
@{ML_ind "||" in Scan}. The parser @{ML "(p || q)" for p q} returns the
result of @{text "p"} if it succeeds; otherwise it returns the
result of @{text "q"}. For example:

@{ML_response [display,gray]
"let
  val hw = $$ \"h\" || $$ \"w\"
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (hw input1, hw input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}

The functions @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} work like the sequencing
function for parsers, except that they discard the item being parsed by the
first (respectively second) parser. That means the item being dropped is the
one that @{ML_ind "|--" in Scan} and @{ML_ind "--|" in Scan} ``point'' away from.
For example:

@{ML_response [display,gray]
"let
  val just_e = $$ \"h\" |-- $$ \"e\"
  val just_h = $$ \"h\" --| $$ \"e\"
  val input = Symbol.explode \"hello\"
in
  (just_e input, just_h input)
end"
"((\"e\", [\"l\", \"l\", \"o\"]), (\"h\", [\"l\", \"l\", \"o\"]))"}
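
Both functions can be combined to strip delimiters around an item. A minimal
sketch (the name @{text "in_parens"} is made up for this illustration):

@{ML_response [display,gray]
"let
  val in_parens = $$ \"(\" |-- $$ \"h\" --| $$ \")\"
in
  in_parens (Symbol.explode \"(h)\")
end"
"(\"h\", [])"}

Only the result of the middle parser is kept; both parentheses are consumed
but dropped.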

The parser @{ML "Scan.optional p x" for p x} returns the result of the parser
@{text "p"}, if it succeeds; otherwise it returns
the default value @{text "x"}. For example:

@{ML_response [display,gray]
"let
  val p = Scan.optional ($$ \"h\") \"x\"
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (p input1, p input2)
end"
"((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}

The function @{ML_ind option in Scan} works similarly, except no default value can
be given. Instead, the result is wrapped as an @{text "option"}-type. For example:

@{ML_response [display,gray]
"let
  val p = Scan.option ($$ \"h\")
  val input1 = Symbol.explode \"hello\"
  val input2 = Symbol.explode \"world\"
in
  (p input1, p input2)
end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}

The function @{ML_ind ahead in Scan} parses some input, but leaves the original
input unchanged. For example:

@{ML_response [display,gray]
"Scan.ahead (Scan.this_string \"foo\") (Symbol.explode \"foo\")"
"(\"foo\", [\"f\", \"o\", \"o\"])"}

The function @{ML_ind "!!" in Scan} helps with producing appropriate error messages
during parsing. For example, if you want to parse @{text p} immediately
followed by @{text q}, or start a completely different parser @{text r},
you might write:

@{ML [display,gray] "(p -- q) || r" for p q r}

However, this parser is problematic for producing a useful error
message if the parsing of @{ML "(p -- q)" for p q} fails, because with the
parser above you lose the information that @{text p} should be followed by @{text q}.
To see this, assume that @{text p} is present in the input, but it is not
followed by @{text q}. That means @{ML "(p -- q)" for p q} will fail and
hence the alternative parser @{text r} will be tried. However, in many
circumstances this will be the wrong parser for the input ``@{text "p"}-followed-by-something''
and therefore will also fail. The error message is then caused by the failure
of @{text r}, not by the absence of @{text q} in the input. This kind of
situation can be avoided by using the function @{ML "!!"}. This function
aborts the whole process of parsing in case of a failure and prints an error
message. For example, if you invoke the parser

@{ML [display,gray] "!! (fn _ => \"foo\") ($$ \"h\")"}

on @{text [quotes] "hello"}, the parsing succeeds

@{ML_response [display,gray]
"(!! (fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"hello\")"
"(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}

but if you invoke it on @{text [quotes] "world"}

@{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
"Exception ABORT raised"}

then the parsing aborts and the error message @{text "foo"} is printed. In order to
see the error message properly, you need to prefix the parser with the function
@{ML_ind error in Scan}. For example:

@{ML_response_fake [display,gray]
"Scan.error (!! (fn _ => \"foo\") ($$ \"h\")) (Symbol.explode \"world\")"
"Exception ERROR \"foo\" raised"}

This ``prefixing'' is usually done by wrappers such as @{ML_ind local_theory in OuterSyntax}
(see Section~\ref{sec:newcommand} which explains this function in more detail).
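
The difference between @{text "FAIL"} and @{text "ABORT"} becomes visible as soon
as @{ML "!!"} is used inside an alternative. As a sketch built from the parsers
above: the parser @{ML "$$ \"h\" || $$ \"w\""} succeeds on @{text [quotes] "world"},
because @{text "||"} reacts to @{text "FAIL"} by trying its second argument. Once
the first alternative is wrapped with @{ML "!!"}, its failure is turned into
@{text "ABORT"}, which @{text "||"} does not intercept, so the second
alternative is never tried:

@{ML_response_fake [display,gray]
"(!! (fn _ => \"foo\") ($$ \"h\") || $$ \"w\") (Symbol.explode \"world\")"
"Exception ABORT raised"}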

Let us now return to our example of parsing @{ML "(p -- q) || r" for p q
r}. If you want to generate the correct error message for failure
of parsing @{text "p"}-followed-by-@{text "q"}, then you have to write:
*}

ML{*fun p_followed_by_q p q r =
let
  val err_msg = fn _ => p ^ " is not followed by " ^ q
in
  ($$ p -- (!! err_msg ($$ q))) || ($$ r -- $$ r)
end *}

text {*
Running this parser with the arguments
@{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and
the input @{text [quotes] "holle"}

@{ML_response_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"holle\")"
"Exception ERROR \"h is not followed by e\" raised"}

produces the correct error message. Running it with

@{ML_response [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (Symbol.explode \"wworld\")"
"((\"w\", \"w\"), [\"o\", \"r\", \"l\", \"d\"])"}

yields the expected parsing.

The function @{ML "Scan.repeat p" for p} will apply a parser @{text p} as
often as it succeeds. For example:

@{ML_response [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"hhhhello\")"
"([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}

Note that @{ML_ind repeat in Scan} stores the parsed items in a list. The function
@{ML_ind repeat1 in Scan} is similar, but requires that the parser @{text "p"}
succeeds at least once.
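
The two functions differ only on input where @{text p} does not succeed at
all. As a small sketch with the parser from the example above:
@{ML "Scan.repeat"} then returns the empty list and leaves the input
untouched, whereas @{ML "Scan.repeat1"} fails.

@{ML_response [display,gray] "Scan.repeat ($$ \"h\") (Symbol.explode \"world\")"
"([], [\"w\", \"o\", \"r\", \"l\", \"d\"])"}

@{ML_response_fake [display,gray] "Scan.repeat1 ($$ \"h\") (Symbol.explode \"world\")"
"Exception FAIL raised"}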

Also note that the parser @{ML "Scan.repeat ($$ \"h\")"} would have raised the
exception @{text MORE} if you had run it on the string @{text [quotes] "hhhh"}. This
can be avoided by using the wrapper @{ML_ind finite in Scan} and the ``stopper-token''
@{ML_ind stopper in Symbol}. With them you can write:

@{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (Symbol.explode \"hhhh\")"
"([\"h\", \"h\", \"h\", \"h\"], [])"}

The stopper @{ML stopper in Symbol} is the ``end-of-input'' indicator for parsing strings;
other stoppers need to be used when parsing, for example, tokens. However, this kind of
manual wrapping is often already done by the surrounding infrastructure.

The function @{ML_ind repeat in Scan} can be used with @{ML_ind one in Scan} to read any
string as in

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.one Symbol.not_eof)
  val input = Symbol.explode \"foo bar foo\"
in
  Scan.finite Symbol.stopper p input
end"
"([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"}

where the function @{ML_ind not_eof in Symbol} ensures that we do not read beyond the
end of the input string (i.e.~stopper symbol).

The function @{ML_ind unless in Scan} takes two parsers: if the first one can
parse the input, then the whole parser fails; if not, then the second is tried. Therefore

@{ML_response_fake_both [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"hello\")"
"Exception FAIL raised"}

fails, while

@{ML_response [display,gray] "Scan.unless ($$ \"h\") ($$ \"w\") (Symbol.explode \"world\")"
"(\"w\",[\"o\", \"r\", \"l\", \"d\"])"}

succeeds.

The functions @{ML_ind repeat in Scan} and @{ML_ind unless in Scan} can
be combined to read any input until a certain marker symbol is reached. In the
example below the marker symbol is a @{text [quotes] "*"}.

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof))
  val input1 = Symbol.explode \"fooooo\"
  val input2 = Symbol.explode \"foo*ooo\"
in
  (Scan.finite Symbol.stopper p input1,
   Scan.finite Symbol.stopper p input2)
end"
"(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []),
 ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"}

After parsing is done, you almost always want to apply a function to the parsed
items. One way to do this is the function @{ML_ind ">>" in Scan} where
@{ML "(p >> f)" for p f} first runs
the parser @{text p} and upon successful completion applies the
function @{text f} to the result. For example

@{ML_response [display,gray]
"let
  fun double (x, y) = (x ^ x, y ^ y)
  val parser = $$ \"h\" -- $$ \"e\"
in
  (parser >> double) (Symbol.explode \"hello\")
end"
"((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"}

doubles the two parsed input strings; or

@{ML_response [display,gray]
"let
  val p = Scan.repeat (Scan.one Symbol.not_eof)
  val input = Symbol.explode \"foo bar foo\"
in
  Scan.finite Symbol.stopper (p >> implode) input
end"
"(\"foo bar foo\",[])"}

where the single-character strings in the parsed output are transformed
back into one string.
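
The functions introduced so far combine naturally. As an illustration (a
sketch only: the name @{text digits} and the choice of @{ML "Symbol.is_digit"}
as predicate are assumptions made for this example), the following parser
reads a non-empty sequence of digits and returns them as a single string:

@{ML_response [display,gray]
"let
  val digits = Scan.repeat1 (Scan.one Symbol.is_digit) >> implode
in
  digits (Symbol.explode \"123abc\")
end"
"(\"123\", [\"a\", \"b\", \"c\"])"}

The trailing @{text [quotes] "abc"} matters here: on an input consisting of
digits only, the parser would run out of input and raise @{text MORE} unless it
is wrapped with @{ML "Scan.finite"} as described above.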

The function @{ML_ind lift in Scan} takes a parser and a pair as arguments. This function applies
the given parser to the second component of the pair and leaves the first component
untouched. For example

@{ML_response [display,gray]
"Scan.lift ($$ \"h\" -- $$ \"e\") (1, Symbol.explode \"hello\")"
"((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}

(FIXME: In which situations is this useful? Give examples.)
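
One situation where lifting is needed, offered here as a sketch and not as a
definitive answer to the question above: some parsers thread a state through
the parsing process and therefore work on pairs consisting of a state and an
input list. Ordinary parsers must be lifted before they can be combined with
such parsers. In the example below the state is a counter. The stateful parser
@{text "count_h"} is a name made up for this illustration; it is built with
@{ML "Scan.depend"}, a function that is not discussed in this chapter.

@{ML_response_fake [display,gray]
"let
  val count_h = Scan.depend (fn n => $$ \"h\" >> (fn x => (n + 1, x)))
  val p = Scan.lift ($$ \"x\") |-- Scan.repeat count_h
in
  p (0, Symbol.explode \"xhhello\")
end"
"([\"h\", \"h\"], (2, [\"e\", \"l\", \"l\", \"o\"]))"}

The lifted parser consumes the @{text [quotes] "x"} without touching the
counter, while each successful run of @{text "count_h"} increases it by one.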

\begin{exercise}\label{ex:scancmts}
Write a parser that parses an input string so that any comment enclosed
within @{text "(*\<dots>*)"} is replaced by the same comment but enclosed within
@{text "(**\<dots>**)"} in the output string. To enclose a string, you can use the
function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
"s1 ^ s ^ s2" for s1 s2 s}. Hint: To simplify the task ignore the proper
nesting of comments.
\end{exercise}
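
As a quick illustration of @{ML enclose} on its own (independent of any
solution to the exercise, with delimiters chosen arbitrarily):

@{ML_response [display,gray]
"enclose \"[\" \"]\" \"foo\""
"\"[foo]\""}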
*}

section {* Parsing Theory Syntax *}

text {*
Most of the time, however, Isabelle developers have to deal with parsing
tokens, not strings. These token parsers have the type:
*}

ML{*type 'a parser = OuterLex.token list -> 'a * OuterLex.token list*}
+ − 395
+ − 396
text {*
149
+ − 397
The reason for using token parsers is that theory syntax, as well as the
128
+ − 398
parsers for the arguments of proof methods, use the type @{ML_type
230
8def50824320
added material about OuterKeyword.keyword and OuterParse.reserved
Christian Urban <urbanc@in.tum.de>
diff
changeset
+ − 399
OuterLex.token}.
42
+ − 400
+ − 401
\begin{readmore}
40
+ − 402
The parser functions for the theory syntax are contained in the structure
42
+ − 403
@{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}.
+ − 404
The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
+ − 405
\end{readmore}

The structure @{ML_struct OuterLex} defines several kinds of tokens (for
example @{ML_ind Ident in OuterLex} for identifiers, @{ML Keyword in
OuterLex} for keywords and @{ML_ind Command in OuterLex} for commands). Some
token parsers take the kind of a token into account. The first example shows
how to generate a token list out of a string using the function
@{ML_ind scan in OuterSyntax}. It is given the argument
@{ML "Position.none"} since,
at the moment, we are not interested in generating precise error
messages. The following code\footnote{Note that because of a possible bug in
the PolyML runtime system, the result is printed as @{text [quotes] "?"},
instead of the tokens.}


@{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\""
"[Token (\<dots>,(Ident, \"hello\"),\<dots>),
 Token (\<dots>,(Space, \" \"),\<dots>),
 Token (\<dots>,(Ident, \"world\"),\<dots>)]"}

produces three tokens where the first and the last are identifiers, since
@{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
other syntactic category. The second indicates a space.

We can easily change what is recognised as a keyword with the function
@{ML_ind keyword in OuterKeyword}. For example, after calling it with
*}

ML{*val _ = OuterKeyword.keyword "hello"*}

text {*
lexing @{text [quotes] "hello world"} will produce

@{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\""
"[Token (\<dots>,(Keyword, \"hello\"),\<dots>),
 Token (\<dots>,(Space, \" \"),\<dots>),
 Token (\<dots>,(Ident, \"world\"),\<dots>)]"}

Many parsing functions later on will require white space, comments and the like
to have already been filtered out. So from now on we are going to use the
functions @{ML filter} and @{ML_ind is_proper in OuterLex} to do this.
For example:

@{ML_response_fake [display,gray]
"let
  val input = OuterSyntax.scan Position.none \"hello world\"
in
  filter OuterLex.is_proper input
end"
"[Token (\<dots>,(Ident, \"hello\"), \<dots>), Token (\<dots>,(Ident, \"world\"), \<dots>)]"}

For convenience we define the function:
*}

ML{*fun filtered_input str =
  filter OuterLex.is_proper (OuterSyntax.scan Position.none str) *}

text {*
If you now parse

@{ML_response_fake [display,gray]
"filtered_input \"inductive | for\""
"[Token (\<dots>,(Command, \"inductive\"),\<dots>),
 Token (\<dots>,(Keyword, \"|\"),\<dots>),
 Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}

you obtain a list consisting of one command token and two keyword tokens.
If you want to see which keywords and commands are currently known to Isabelle,
type:

@{ML_response_fake [display,gray]
"let
  val (keywords, commands) = OuterKeyword.get_lexicons ()
in
  (Scan.dest_lexicon commands, Scan.dest_lexicon keywords)
end"
"([\"}\", \"{\", \<dots>], [\"\<rightleftharpoons>\", \"\<leftharpoondown>\", \<dots>])"}

You might have to adjust the @{ML_ind print_depth} in order to
see the complete list.

The parser @{ML_ind "$$$" in OuterParse} parses a single keyword. For example:

@{ML_response [display,gray]
"let
  val input1 = filtered_input \"where for\"
  val input2 = filtered_input \"| in\"
in
  (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2)
end"
"((\"where\",\<dots>), (\"|\",\<dots>))"}

Any non-keyword string can be parsed with the function @{ML_ind reserved in OuterParse}.
For example:

@{ML_response [display,gray]
"let
  val p = OuterParse.reserved \"bar\"
  val input = filtered_input \"bar\"
in
  p input
end"
"(\"bar\",[])"}

As before, you can sequentially connect parsers with @{ML "--"}. For example:

@{ML_response [display,gray]
"let
  val input = filtered_input \"| in\"
in
  (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input
end"
"((\"|\", \"in\"), [])"}

The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty
list of items recognised by the parser @{text p}, where the items being parsed
are separated by the string @{text s}. For example:

@{ML_response [display,gray]
"let
  val input = filtered_input \"in | in | in foo\"
in
  (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
end"
"([\"in\", \"in\", \"in\"], [\<dots>])"}

The function @{ML_ind enum1 in OuterParse} works similarly, except that the
parsed list must be non-empty. Note that we had to add a string @{text
[quotes] "foo"} at the end of the parsed string, otherwise the parser would
have consumed all tokens and then failed with the exception @{text
"MORE"}. As in the previous section, we can avoid this exception using the
wrapper @{ML Scan.finite}. This time, however, we have to use the
``stopper-token'' @{ML OuterLex.stopper}. We can write:

@{ML_response [display,gray]
"let
  val input = filtered_input \"in | in | in\"
  val p = OuterParse.enum \"|\" (OuterParse.$$$ \"in\")
in
  Scan.finite OuterLex.stopper p input
end"
"([\"in\", \"in\", \"in\"], [])"}

The following function will help to run examples.
*}

ML{*fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input *}
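
text {*
As a quick check of @{ML parse}, here is a sketch that reruns the
enumeration example through it. The names @{text "parse_ins"} and
@{text "ins"} are made up for this illustration. Since @{ML parse} only
wraps the parser additionally in @{ML Scan.error}, we would expect the same
result as with @{ML Scan.finite} alone, namely the three keywords together
with an empty list of remaining tokens.
*}

ML{*(* hypothetical helper: parses a "|"-separated list of the keyword "in" *)
fun parse_ins str =
  parse (OuterParse.enum "|" (OuterParse.$$$ "in")) (filtered_input str)

(* the stopper token is added by parse, so no trailing "foo" is needed *)
val ins = parse_ins "in | in | in" *}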

text {*
The function @{ML_ind "!!!" in OuterParse} can be used to force termination
of the parser in case of a dead end, just like @{ML "Scan.!!"} (see previous
section). A difference, however, is that the error message of @{ML
"OuterParse.!!!"} is fixed to be @{text [quotes] "Outer syntax error"}
together with a relatively precise description of the failure. For example:

@{ML_response_fake [display,gray]
"let
  val input = filtered_input \"in |\"
  val parse_bar_then_in = OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\"
in
  parse (OuterParse.!!! parse_bar_then_in) input
end"
"Exception ERROR \"Outer syntax error: keyword \"|\" expected,
but keyword in was found\" raised"
}

\begin{exercise} (FIXME)
A type-identifier, for example @{typ "'a"}, is a token of
kind @{ML_ind TypeIdent in OuterLex}. It can be parsed using
the function @{ML type_ident in OuterParse}.
\end{exercise}

(FIXME: or give parser for numbers)

Whenever there is a possibility that the processing of user input can fail,
it is a good idea to give all available information about where the error
occurred. For this purpose, Isabelle can attach positional information to tokens
and then thread this information up the ``processing chain''. To see this,
modify the function @{ML filtered_input}, described earlier, as follows
*}

ML{*fun filtered_input' str =
  filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *}

text {*
where we pretend the parsed string starts on line 7. An example is

@{ML_response_fake [display,gray]
"filtered_input' \"foo \\n bar\""
"[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), \<dots>),
 Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), \<dots>)]"}

in which the @{text [quotes] "\\n"} causes the second token to be in
line 8.

By using the parser @{ML position in OuterParse} you can access the token
position and return it as part of the parser result. For example

@{ML_response_fake [display,gray]
"let
  val input = filtered_input' \"where\"
in
  parse (OuterParse.position (OuterParse.$$$ \"where\")) input
end"
"((\"where\", {line=7, end_line=7}), [])"}

\begin{readmore}
The functions related to positions are implemented in the file
@{ML_file "Pure/General/position.ML"}.
\end{readmore}

*}
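
text {*
The position returned by @{ML position in OuterParse} can, for instance, be
turned into a string and attached to a message. The following is only a
sketch under assumptions: the name @{text "where_at"} is made up, and we
assume that @{ML Position.str_of} renders a position as a (possibly empty)
string in the way it is used for error messages elsewhere in Isabelle.
*}

ML{*(* hypothetical helper: parses the keyword "where" and reports its position *)
fun where_at str =
let
  val ((kw, pos), _) =
    parse (OuterParse.position (OuterParse.$$$ "where")) (filtered_input' str)
in
  "found " ^ quote kw ^ Position.str_of pos
end *}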

section {* Parsers for ML-Code (TBD) *}

text {*
@{ML_ind ML_source in OuterParse}
*}

section {* Context Parser (TBD) *}

text {*
@{ML_ind Args.context}
*}
(*
ML {*
let
  val parser = Args.context -- Scan.lift Args.name_source

  fun term_pat (ctxt, str) =
    str |> Syntax.read_prop ctxt
in
  (parser >> term_pat) (Context.Proof @{context}, filtered_input "f (a::nat)")
  |> fst
end
*}
*)

text {*
@{ML_ind Args.context}

Used for example in \isacommand{attribute\_setup} and \isacommand{method\_setup}.
*}

section {* Argument and Attribute Parsers (TBD) *}

section {* Parsing Inner Syntax *}

text {*
There is usually no need to write your own parser for parsing inner syntax, that is
for terms and types: you can just call the predefined parsers. Terms can
be parsed using the function @{ML_ind term in OuterParse}. For example:

@{ML_response [display,gray]
"let
  val input = OuterSyntax.scan Position.none \"foo\"
in
  OuterParse.term input
end"
"(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"}

The function @{ML_ind prop in OuterParse} is similar, except that it gives a different
error message when parsing fails. As you can see, the parser returns not just
the parsed string, but also some encoded information. You can decode the
information with the function @{ML_ind parse in YXML} in @{ML_struct YXML}. For example

@{ML_response [display,gray]
"YXML.parse \"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\""
"XML.Elem (\"token\", [], [XML.Text \"foo\"])"}

The result of the decoding is an XML-tree. You can see better what is going on if
you replace @{ML Position.none} by @{ML "Position.line 42"}, say:

@{ML_response [display,gray]
"let
  val input = OuterSyntax.scan (Position.line 42) \"foo\"
in
  YXML.parse (fst (OuterParse.term input))
end"
"XML.Elem (\"token\", [(\"line\", \"42\"), (\"end_line\", \"42\")], [XML.Text \"foo\"])"}

The positional information is stored as part of an XML-tree so that code
called later on will be able to give more precise error messages.

\begin{readmore}
The functions to do with input and output of XML and YXML are defined
in @{ML_file "Pure/General/xml.ML"} and @{ML_file "Pure/General/yxml.ML"}.
\end{readmore}

FIXME:
@{ML_ind parse_term in Syntax} @{ML_ind check_term in Syntax}
@{ML_ind parse_typ in Syntax} @{ML_ind check_typ in Syntax}


*}
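
text {*
The functions @{ML parse_term in Syntax} and @{ML check_term in Syntax}
are typically applied in sequence to the string delivered by a parser such
as @{ML term in OuterParse}. The following is a sketch only: the name
@{text "read_parsed_term"} is made up for this illustration, and the input
string is assumed to form a single token of the outer syntax, so compound
terms have to be enclosed in double quotes.
*}

ML{*(* hypothetical helper: from a string of outer syntax to a type-checked term *)
fun read_parsed_term ctxt str =
let
  (* the parser yields the term as a string with encoded position information *)
  val (raw_term, _) = parse OuterParse.term (filtered_input str)
in
  Syntax.check_term ctxt (Syntax.parse_term ctxt raw_term)
end *}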

section {* Parsing Specifications\label{sec:parsingspecs} *}

text {*
There are a number of special-purpose parsers that help with parsing
specifications of function definitions, inductive predicates and so on. In
Chapter~\ref{chp:package}, for example, we will need to parse specifications
for inductive predicates of the form:
*}

simple_inductive
  even and odd
where
  even0: "even 0"
| evenS: "odd n \<Longrightarrow> even (Suc n)"
| oddS: "even n \<Longrightarrow> odd (Suc n)"


text {*
For this we are going to use the parser:
*}

ML %linenosgray{*val spec_parser =
     OuterParse.fixes --
     Scan.optional
       (OuterParse.$$$ "where" |--
          OuterParse.!!!
            (OuterParse.enum1 "|"
               (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []*}

text {*
Note that the parser must not parse the keyword \simpleinductive, even if it is
meant to process definitions as shown above. The parser for the keyword
is provided by the infrastructure that will eventually call @{ML spec_parser}.


To see what the parser returns, let us parse the string corresponding to the
definition of @{term even} and @{term odd}:

@{ML_response [display,gray]
"let
  val input = filtered_input
     (\"even and odd \" ^
      \"where \" ^
      \" even0[intro]: \\\"even 0\\\" \" ^
      \"| evenS[intro]: \\\"odd n \<Longrightarrow> even (Suc n)\\\" \" ^
      \"| oddS[intro]: \\\"even n \<Longrightarrow> odd (Suc n)\\\"\")
in
  parse spec_parser input
end"
"(([(even, NONE, NoSyn), (odd, NONE, NoSyn)],
   [((even0,\<dots>), \"\\^E\\^Ftoken\\^Eeven 0\\^E\\^F\\^E\"),
    ((evenS,\<dots>), \"\\^E\\^Ftoken\\^Eodd n \<Longrightarrow> even (Suc n)\\^E\\^F\\^E\"),
    ((oddS,\<dots>), \"\\^E\\^Ftoken\\^Eeven n \<Longrightarrow> odd (Suc n)\\^E\\^F\\^E\")]), [])"}

As you see, the result is a pair consisting of a list of
variables with optional type-annotation and syntax-annotation, and a list of
rules, each of which optionally has a name and attributes.

The function @{ML_ind "fixes" in OuterParse} in Line 2 of the parser reads an
\isacommand{and}-separated
list of variables that can include optional type annotations and syntax translations.
For example:\footnote{Note that in the code we need to write
@{text "\\\"int \<Rightarrow> bool\\\""} in order to properly escape the double quotes
in the compound type.}

@{ML_response [display,gray]
"let
  val input = filtered_input
        \"foo::\\\"int \<Rightarrow> bool\\\" and bar::nat (\\\"BAR\\\" 100) and blonk\"
in
  parse OuterParse.fixes input
end"
"([(foo, SOME \"\\^E\\^Ftoken\\^Eint \<Rightarrow> bool\\^E\\^F\\^E\", NoSyn),
  (bar, SOME \"\\^E\\^Ftoken\\^Enat\\^E\\^F\\^E\", Mixfix (\"BAR\", [], 100)),
  (blonk, NONE, NoSyn)],[])"}
*}
+ − 777
121
+ − 778
text {*
156
+ − 779
Whenever types are given, they are stored in the @{ML SOME}s. The types are
+ − 780
not yet used to type the variables: this must be done by type-inference later
149
+ − 781
on. Since types are part of the inner syntax they are strings with some
241
+ − 782
encoded information (see previous section). If a mixfix-syntax is
344
+ − 783
present for a variable, then it is stored in the @{ML_ind Mixfix} data structure;
+ − 784
no syntax translation is indicated by @{ML_ind NoSyn}.
121
+ − 785
+ − 786
\begin{readmore}
241
+ − 787
The data structure for mixfix annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}.
121
+ − 788
\end{readmore}
+ − 789
186
371e4375c994
made the Ackermann function example safer and included suggestions from MW
Christian Urban <urbanc@in.tum.de>
diff
changeset
+ − 790
Lines 3 to 7 in the function @{ML spec_parser} implement the parser for a
219
+ − 791
list of introduction rules, that is propositions with theorem annotations
+ − 792
such as rule names and attributes. The introduction rules are propositions
344
+ − 793
parsed by @{ML_ind prop in OuterParse}. However, they can include an optional
219
+ − 794
theorem name plus some attributes. For example
121
+ − 795
+ − 796
@{ML_response [display,gray] "let
+ − 797
val input = filtered_input \"foo_lemma[intro,dest!]:\"
+ − 798
val ((name, attrib), _) = parse (SpecParse.thm_name \":\") input
+ − 799
in
+ − 800
(name, map Args.dest_src attrib)
+ − 801
end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"}
+ − 802
344
+ − 803
  The function @{ML_ind opt_thm_name in SpecParse} is the ``optional'' variant of
  @{ML_ind thm_name in SpecParse}. Theorem names can contain attributes. The name
  has to end with @{text [quotes] ":"}; see the argument of
  the function @{ML SpecParse.opt_thm_name} in Line 7.
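
  The idea behind an ``optional'' variant of a parser can be illustrated
  independently of Isabelle. The following is only a minimal sketch in plain
  Standard ML, not the code of \texttt{SpecParse}: parsers are modelled as
  functions on character lists, and the names \texttt{FAIL}, \texttt{digit}
  and \texttt{optional} are assumptions made for this illustration.

  \begin{verbatim}
  (* Illustration only: a parser maps input to (result, rest). *)
  exception FAIL

  (* parse a single digit character *)
  fun digit (c :: cs) = if Char.isDigit c then (c, cs) else raise FAIL
    | digit [] = raise FAIL

  (* "optional" variant: return a default instead of failing *)
  fun optional p default input = p input handle FAIL => (default, input)

  val r1 = optional digit #"0" (String.explode "7x")  (* digit present *)
  val r2 = optional digit #"0" (String.explode "x")   (* digit absent *)
  \end{verbatim}

  In the first case the digit is consumed; in the second case the default is
  returned and the input is left untouched.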

  \begin{readmore}
  Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"}
  and @{ML_file "Pure/Isar/args.ML"}.
  \end{readmore}
*}

text_raw {*
  \begin{exercise}
  Have a look at how the parser @{ML SpecParse.where_alt_specs} is implemented
  in the file @{ML_file "Pure/Isar/spec_parse.ML"}. This parser corresponds
  to the ``where-part'' of the introduction rules given above. Below
  we paraphrase the code of @{ML_ind where_alt_specs in SpecParse} adapted to our
  purposes.
  \begin{isabelle}
*}
ML %linenosgray{*val spec_parser' =
     OuterParse.fixes --
     Scan.optional
     (OuterParse.$$$ "where" |--
        OuterParse.!!!
          (OuterParse.enum1 "|"
             ((SpecParse.opt_thm_name ":" -- OuterParse.prop) --|
                  Scan.option (Scan.ahead (OuterParse.name ||
                  OuterParse.$$$ "[") --
                  OuterParse.!!! (OuterParse.$$$ "|"))))) [] *}
text_raw {*
  \end{isabelle}
  The two parsers accept almost the same input (@{ML spec_parser} accepts
  some input that is refuted by @{ML spec_parser'}), but if you look closely,
  you will notice an additional ``tail'' (Lines 8 to 10) in @{ML spec_parser'}.
  What is the purpose of this additional ``tail''?
  \end{exercise}
*}

text {*
  (FIXME: @{ML OuterParse.type_args}, @{ML OuterParse.typ}, @{ML OuterParse.opt_mixfix})
*}

section {* New Commands and Keyword Files\label{sec:newcommand} *}

text {*
  Often new commands, for example for providing new definitional principles,
  need to be implemented. While this is not difficult on the ML-level,
  new commands, in order to be useful, need to be recognised by
  ProofGeneral. This results in some subtle configuration issues, which we
  will explain in this section.

  To keep things simple, let us start with a ``silly'' command that does nothing
  at all. We shall name this command \isacommand{foobar}. On the ML-level it can be
  defined as:
*}

ML{*let
  val do_nothing = Scan.succeed (LocalTheory.theory I)
  val kind = OuterKeyword.thy_decl
in
  OuterSyntax.local_theory "foobar" "description of foobar" kind do_nothing
end *}

text {*
  The crucial function @{ML_ind local_theory in OuterSyntax} expects a name for the command, a
  short description, a kind indicator (which we will explain more thoroughly later) and a
  parser producing a local theory transition (its purpose will also be explained
  later).

  While this is everything you have to do on the ML-level, you need a keyword
  file that can be loaded by ProofGeneral. This is to enable ProofGeneral to
  recognise \isacommand{foobar} as a command. Such a keyword file can be
  generated on the command line with:

  @{text [display] "$ isabelle keywords -k foobar some_log_files"}

  The option @{text "-k foobar"} determines the suffix used in the name of the
  keyword file. In the case above the file will be named @{text
  "isar-keywords-foobar.el"}. This command requires log files to be
  present (in order to extract the keywords from them). To generate these log
  files, you first need to package the code above into a separate theory file named
  @{text "Command.thy"}, say; see Figure~\ref{fig:commandtheory} for the
  complete code.


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  \begin{figure}[t]
  \begin{graybox}\small
  \isacommand{theory}~@{text Command}\\
  \isacommand{imports}~@{text Main}\\
  \isacommand{begin}\\
  \isacommand{ML}~@{text "\<verbopen>"}\\
  @{ML
"let
  val do_nothing = Scan.succeed (LocalTheory.theory I)
  val kind = OuterKeyword.thy_decl
in
  OuterSyntax.local_theory \"foobar\" \"description of foobar\" kind do_nothing
end"}\\
  @{text "\<verbclose>"}\\
  \isacommand{end}
  \end{graybox}
  \caption{This file can be used to generate a log file. This log file in turn can
  be used to generate a keyword file containing the command \isacommand{foobar}.
  \label{fig:commandtheory}}
  \end{figure}
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  For our purposes it is sufficient to use the log files of the theories
  @{text "Pure"}, @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as
  the log file for the theory @{text "Command.thy"}, which contains the new
  \isacommand{foobar}-command. If you target other logics besides HOL, such
  as Nominal or ZF, then you need to adapt the log files appropriately.

  @{text Pure} and @{text HOL} are usually compiled during the installation of
  Isabelle, so log files for them should already be available. If not, then
  they can be conveniently compiled with the help of the build-script from the Isabelle
  distribution.

  @{text [display]
"$ ./build -m \"Pure\"
$ ./build -m \"HOL\""}

  The @{text "Pure-ProofGeneral"} theory needs to be compiled with:

  @{text [display] "$ ./build -m \"Pure-ProofGeneral\" \"Pure\""}

  For the theory @{text "Command.thy"}, you first need to create a ``managed'' subdirectory
  with:

  @{text [display] "$ isabelle mkdir FoobarCommand"}

  This generates a directory containing the files:

  @{text [display]
"./IsaMakefile
./FoobarCommand/ROOT.ML
./FoobarCommand/document
./FoobarCommand/document/root.tex"}

  You need to copy the file @{text "Command.thy"} into the directory @{text "FoobarCommand"}
  and add the line

  @{text [display] "no_document use_thy \"Command\";"}

  to the file @{text "./FoobarCommand/ROOT.ML"}. You can now compile the theory by just typing:

  @{text [display] "$ isabelle make"}

  If the compilation succeeds, you have finally created all the necessary log files.
  They are stored in the directory

  @{text [display] "~/.isabelle/heaps/Isabelle2009/polyml-5.2.1_x86-linux/log"}

  or something similar depending on your Isabelle distribution and architecture.
  One quick way to assign this directory to a shell variable is by typing

  @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"}

  on the Unix prompt. If you now type @{text "ls $ISABELLE_LOGS"}, then the
  directory should include the files:

  @{text [display]
"Pure.gz
HOL.gz
Pure-ProofGeneral.gz
HOL-FoobarCommand.gz"}

  From them you can create the keyword files. Assuming the name
  of the directory is in @{text "$ISABELLE_LOGS"},
  then the Unix command for creating the keyword file is:

@{text [display]
"$ isabelle keywords -k foobar
   $ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral.gz,HOL-FoobarCommand.gz}"}

  The result is the file @{text "isar-keywords-foobar.el"}. It should contain
  the string @{text "foobar"} twice.\footnote{To see whether things are fine,
  check that @{text "grep foobar"} on this file returns something non-empty.}
  This keyword file needs to be copied into the directory @{text
  "~/.isabelle/etc"}. To make ProofGeneral aware of it, you have to start
  Isabelle with the option @{text "-k foobar"}, that is:

  @{text [display] "$ isabelle emacs -k foobar a_theory_file"}

  If you now build a theory on top of @{text "Command.thy"},
  then you can use the command \isacommand{foobar}. You can just write
*}

foobar

text {*
  but you will not see any action as we chose to implement this command to do
  nothing. The point of this command is only to show the procedure of how
  to interact with ProofGeneral. A similar procedure has to be followed for any
  other new command, and also for any new keyword that is introduced with
  the function @{ML_ind keyword in OuterKeyword}. For example:
*}

ML{*val _ = OuterKeyword.keyword "blink" *}

text {*
  At the moment the command \isacommand{foobar} is not very useful. Let us
  refine it a bit next by letting it take a proposition as argument and
  printing this proposition inside the tracing buffer.

  The crucial part of a command is the function that determines the behaviour
  of the command. In the code above we used a ``do-nothing''-function, which
  because of @{ML_ind succeed in Scan} does not parse any argument, but immediately
  returns the simple function @{ML "LocalTheory.theory I"}. We can
  replace this code by a function that first parses a proposition (using the
  parser @{ML OuterParse.prop}), then prints out the tracing
  information (using a new function @{text trace_prop}) and
  finally does nothing. For this you can write:
*}

ML{*let
  fun trace_prop str =
     LocalTheory.theory (fn ctxt => (tracing str; ctxt))

  val kind = OuterKeyword.thy_decl
in
  OuterSyntax.local_theory "foobar_trace" "traces a proposition"
    kind (OuterParse.prop >> trace_prop)
end *}

text {*
  The command is now \isacommand{foobar\_trace} and can be used to
  see the proposition in the tracing buffer.
*}

foobar_trace "True \<and> False"

text {*
  Note that so far we used @{ML_ind thy_decl in OuterKeyword} as the kind
  indicator for the command. This means that the command finishes as soon as
  the arguments are processed. Examples of this kind of command are
  \isacommand{definition} and \isacommand{declare}. In other cases, commands
  are expected to parse some arguments, for example a proposition, and then
  ``open up'' a proof in order to prove the proposition (for example
  \isacommand{lemma}) or prove some other properties (for example
  \isacommand{function}). To achieve this kind of behaviour, you have to use
  the kind indicator @{ML_ind thy_goal in OuterKeyword} and the function @{ML
  "local_theory_to_proof" in OuterSyntax} to set up the command. Note,
  however, that once you change the ``kind'' of a command from @{ML thy_decl in
  OuterKeyword} to @{ML thy_goal in OuterKeyword}, the keyword file needs
  to be re-created!

  Below we show the command \isacommand{foobar\_goal} which takes a
  proposition as argument and then starts a proof in order to prove
  it. Therefore in Line 9, we set the kind indicator to @{ML thy_goal in
  OuterKeyword}.
*}

ML%linenosgray{*let
  fun goal_prop str lthy =
    let
      val prop = Syntax.read_prop lthy str
    in
      Proof.theorem_i NONE (K I) [[(prop,[])]] lthy
    end

  val kind = OuterKeyword.thy_goal
in
  OuterSyntax.local_theory_to_proof "foobar_goal" "proves a proposition"
    kind (OuterParse.prop >> goal_prop)
end *}

text {*
  The function @{text goal_prop} in Lines 2 to 7 takes a string (the proposition to be
  proved) and a context as arguments. The context is necessary in order to be able to use
  @{ML_ind read_prop in Syntax}, which converts a string into a proper proposition.
  In Line 6 the function @{ML_ind theorem_i in Proof} starts the proof for the
  proposition. Its argument @{ML NONE} stands for a locale (which we chose to
  omit); the argument @{ML "(K I)"} stands for a function that determines what
  should be done with the theorem once it is proved (we chose to just forget
  about it). Line 12 contains the parser for the proposition.
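
  Why \texttt{K I} ``forgets'' the theorem can be seen from the definitions
  of the two combinators. The following is a small sketch in plain Standard
  ML that can be run outside Isabelle; the combinators are redefined here,
  and the string list and the integer are merely hypothetical stand-ins for
  the proved theorems and the local theory.

  \begin{verbatim}
  (* Illustration only: the combinators I and K, redefined for
     use outside Isabelle. *)
  fun I x = x          (* identity *)
  fun K x = fn _ => x  (* constant function *)

  (* K I ignores its first argument and returns its second
     argument unchanged. *)
  val after_qed : string list -> int -> int = K I
  val result = after_qed ["thm1", "thm2"] 42
  \end{verbatim}

  The first argument (the stand-in for the theorems) is discarded and the
  second one (the stand-in for the local theory) is passed on as it is.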

  If you now type \isacommand{foobar\_goal}~@{text [quotes] "True \<and> True"},
  you obtain the following proof state
*}

foobar_goal "True \<and> True"
txt {*
  \begin{minipage}{\textwidth}
  @{subgoals [display]}
  \end{minipage}\medskip

  and can prove the proposition as follows.
*}
apply(rule conjI)
apply(rule TrueI)+
done

text {*
  {\bf TBD below}

  (FIXME: read a name and show how to store theorems; see @{ML_ind note in LocalTheory})

*}

ML_val{*val r = Unsynchronized.ref (NONE:(unit -> term) option)*}
ML{*let
   fun after_qed thm_name thms lthy =
        LocalTheory.note Thm.theoremK (thm_name, (flat thms)) lthy |> snd

   fun setup_proof (thm_name, (txt, pos)) lthy =
   let
     val trm = ML_Context.evaluate lthy true ("r", r) txt
   in
     Proof.theorem_i NONE (after_qed thm_name) [[(trm,[])]] lthy
   end

   val parser = SpecParse.opt_thm_name ":" -- OuterParse.ML_source
in
   OuterSyntax.local_theory_to_proof "foobar_prove" "proving a proposition"
     OuterKeyword.thy_goal (parser >> setup_proof)
end*}

foobar_prove test: {* @{prop "True"} *}
apply(rule TrueI)
done

(*
+ − 1132
ML {*
+ − 1133
structure TacticData = ProofDataFun
+ − 1134
(
+ − 1135
type T = thm list -> tactic;
+ − 1136
fun init _ = undefined;
+ − 1137
);
+ − 1138
+ − 1139
val set_tactic = TacticData.put;
+ − 1140
*}
+ − 1141
+ − 1142
ML {*
+ − 1143
TacticData.get @{context}
+ − 1144
*}
+ − 1145
+ − 1146
ML {* Method.set_tactic *}
+ − 1147
ML {* fun tactic (facts: thm list) : tactic = (atac 1) *}
+ − 1148
ML {* Context.map_proof *}
+ − 1149
ML {* ML_Context.expression *}
+ − 1150
ML {* METHOD *}
+ − 1151
+ − 1152
+ − 1153
ML {*
+ − 1154
fun myexpression pos bind body txt =
+ − 1155
let
+ − 1156
val _ = tracing ("bind)" ^ bind)
+ − 1157
val _ = tracing ("body)" ^ body)
+ − 1158
val _ = tracing ("txt)" ^ txt)
+ − 1159
val _ = tracing ("result) " ^ "Context.set_thread_data (SOME (let " ^ bind ^ " = " ^ txt ^ " in " ^ body ^
+ − 1160
" end (ML_Context.the_generic_context ())));")
+ − 1161
in
+ − 1162
ML_Context.exec (fn () => ML_Context.eval false pos
+ − 1163
("Context.set_thread_data (SOME (let " ^ bind ^ " = " ^ txt ^ " in " ^ body ^
+ − 1164
" end (ML_Context.the_generic_context ())));"))
+ − 1165
end
+ − 1166
*}
319
+ − 1167
+ − 1168
322
+ − 1169
ML {*
+ − 1170
fun ml_tactic (txt, pos) ctxt =
+ − 1171
let
+ − 1172
val ctxt' = ctxt |> Context.proof_map
+ − 1173
(myexpression pos
+ − 1174
"fun tactic (facts: thm list) : tactic"
+ − 1175
"Context.map_proof (Method.set_tactic tactic)" txt);
+ − 1176
in
+ − 1177
Context.setmp_thread_data (SOME (Context.Proof ctxt)) (TacticData.get ctxt')
+ − 1178
end;
+ − 1179
*}
+ − 1180
+ − 1181
ML {*
+ − 1182
fun tactic3 (txt, pos) ctxt =
+ − 1183
let
+ − 1184
val _ = tracing ("1) " ^ txt )
+ − 1185
in
+ − 1186
METHOD (ml_tactic (txt, pos) ctxt; K (atac 1))
+ − 1187
end
+ − 1188
*}
+ − 1189
+ − 1190
setup {*
+ − 1191
Method.setup (Binding.name "tactic3") (Scan.lift (OuterParse.position Args.name)
+ − 1192
>> tactic3)
+ − 1193
"ML tactic as proof method"
+ − 1194
*}
+ − 1195
+ − 1196
lemma "A \<Longrightarrow> A"
+ − 1197
apply(tactic3 {* (atac 1) *})
+ − 1198
done
+ − 1199
+ − 1200
ML {*
+ − 1201
(ML_Context.the_generic_context ())
+ − 1202
*}
+ − 1203
+ − 1204
ML {*
+ − 1205
Context.set_thread_data;
+ − 1206
ML_Context.the_generic_context
+ − 1207
*}
+ − 1208
+ − 1209
lemma "A \<Longrightarrow> A"
+ − 1210
ML_prf {*
+ − 1211
Context.set_thread_data (SOME (let fun tactic (facts: thm list) : tactic = (atac 1) in Context.map_proof (Method.set_tactic tactic) end (ML_Context.the_generic_context ())));
+ − 1212
*}
+ − 1213
+ − 1214
ML {*
+ − 1215
Context.set_thread_data (SOME ((let fun tactic (facts: thm list) : tactic = (atac 1) in 3 end) (ML_Context.the_generic_context ())));
+ − 1216
*}
+ − 1217
+ − 1218
ML {*
+ − 1219
Context.set_thread_data (SOME (let
+ − 1220
fun tactic (facts: thm list) : tactic = (atac 1)
+ − 1221
in
+ − 1222
Context.map_proof (Method.set_tactic tactic)
+ − 1223
end
+ − 1224
(ML_Context.the_generic_context ())));
+ − 1225
*}
+ − 1226
+ − 1227
+ − 1228
ML {*
+ − 1229
let
+ − 1230
fun tactic (facts: thm list) : tactic = atac
+ − 1231
in
+ − 1232
Context.map_proof (Method.set_tactic tactic)
+ − 1233
end *}
+ − 1234
+ − 1235
end *}
+ − 1236
+ − 1237
ML {* Toplevel.program (fn () =>
+ − 1238
(ML_Context.expression Position.none "val plus : int" "3 + 4" "1" (Context.Proof @{context})))*}
+ − 1239
+ − 1240
+ − 1241
ML {*
+ − 1242
fun ml_tactic (txt, pos) ctxt =
+ − 1243
let
+ − 1244
val ctxt' = ctxt |> Context.proof_map
+ − 1245
(ML_Context.expression pos
+ − 1246
"fun tactic (facts: thm list) : tactic"
+ − 1247
"Context.map_proof (Method.set_tactic tactic)" txt);
+ − 1248
in Context.setmp_thread_data (SOME (Context.Proof ctxt)) (TacticData.get ctxt') end;
+ − 1249
+ − 1250
*}
+ − 1251
+ − 1252
ML {*
+ − 1253
Context.set_thread_data (SOME (let fun tactic (facts: thm list) : tactic = (atac 1) in Context.map_proof (Method.set_tactic tactic) end (ML_Context.the_generic_context ())));
+ − 1254
*}
+ − 1255
*)

section {* Methods (TBD) *}

text {*
  (FIXME: maybe move to after the tactic section)

  Methods are central to Isabelle. They are what you use, for example,
  in \isacommand{apply}. To print out all currently known methods you can use the
  Isabelle command:

  \begin{isabelle}
  \isacommand{print\_methods}\\
  @{text "> methods:"}\\
  @{text "> -: do nothing (insert current facts only)"}\\
  @{text "> HOL.default: apply some intro/elim rule (potentially classical)"}\\
  @{text "> ..."}
  \end{isabelle}

  An example of a very simple method is:
*}

method_setup %gray foo =
 {* Scan.succeed
      (K (SIMPLE_METHOD ((etac @{thm conjE} THEN' rtac @{thm conjI}) 1))) *}
         "foo method for conjE and conjI"

text {*
  It defines the method @{text foo}, which takes no arguments (therefore the
  parser @{ML Scan.succeed}) and only applies a single tactic, namely the tactic which
  applies @{thm [source] conjE} and then @{thm [source] conjI}. The function
  @{ML_ind SIMPLE_METHOD in Method}
  turns such a tactic into a method. The method @{text "foo"} can be used as follows
*}

lemma shows "A \<and> B \<Longrightarrow> C \<and> D"
apply(foo)
txt {*
  where it results in the goal state

  \begin{minipage}{\textwidth}
  @{subgoals}
  \end{minipage} *}
(*<*)oops(*>*)

(*
ML {* SIMPLE_METHOD *}
ML {* METHOD *}
ML {* K (SIMPLE_METHOD ((etac @{thm conjE} THEN' rtac @{thm conjI}) 1)) *}
ML {* Scan.succeed *}
*)

text {*
  (FIXME: explain a version of rule-tac)
*}

(*<*)
(* THIS IS AN OLD VERSION OF THE PARSING CHAPTER BY JEREMY DAWSON *)
chapter {* Parsing *}

text {*

  Lots of Standard ML code is given in this document, for various reasons,
  including:
  \begin{itemize}
  \item direct quotation of code found in the Isabelle source files,
  or simplified versions of such code
  \item identifiers found in the Isabelle source code, with their types
  (or specialisations of their types)
  \item code examples, which can be run by the reader, to help illustrate the
  behaviour of functions found in the Isabelle source code
  \item ancillary functions, not from the Isabelle source code,
  which enable the reader to run relevant code examples
  \item type abbreviations, which help explain the uses of certain functions
  \end{itemize}

*}

section {* Parsing Isar input *}

text {*

  The typical parsing function has the type
  \texttt{'src -> 'res * 'src}, with input
  of type \texttt{'src}, returning a result
  of type \texttt{'res}, which is (or is derived from) the first part of the
  input, and also returning the remainder of the input.
  (In the common case, when it is clear what the ``remainder of the input''
  means, we will just say that the function ``returns'' the
  value of type \texttt{'res}).
  An exception is raised if an appropriate value
  cannot be produced from the input.
  A range of exceptions can be used to identify different reasons
  for the failure of a parse.

  This contrasts with the standard parsing function in Standard ML,
  which is of type
  \texttt{type ('res, 'src) reader = 'src -> ('res * 'src) option};
  (for example, \texttt{List.getItem} and \texttt{Substring.getc}).
  However, much of the discussion at
  FIX file:/home/jeremy/html/ml/SMLBasis/string-cvt.html
  is relevant.

  Naturally one may convert between the two different sorts of parsing functions
  as follows:
  \begin{verbatim}
  open StringCvt ;
  type ('res, 'src) ex_reader = 'src -> 'res * 'src ;
  (* ex_reader : ('res, 'src) reader -> ('res, 'src) ex_reader *)
  fun ex_reader rdr src = Option.valOf (rdr src) ;
  (* reader : ('res, 'src) ex_reader -> ('res, 'src) reader *)
  fun reader exrdr src = SOME (exrdr src) handle _ => NONE ;
  \end{verbatim}
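
  As a small usage sketch (the input lists are chosen arbitrarily, and the
  two conversion functions are repeated so that the example can be run on
  its own), the Basis Library function \texttt{List.getItem} can be turned
  into an exception-raising parser and back again:

  \begin{verbatim}
  (* option-returning reader -> exception-raising parser *)
  fun ex_reader rdr src = Option.valOf (rdr src)
  (* exception-raising parser -> option-returning reader *)
  fun reader exrdr src = SOME (exrdr src) handle _ => NONE

  (* succeeds: first item and remainder of the list *)
  val a = ex_reader List.getItem [1, 2, 3]
  (* the exception raised on empty input is turned back into NONE *)
  val b = reader (ex_reader List.getItem) ([] : int list)
  \end{verbatim}

  Note that \texttt{handle \_} catches every exception, so the reason for
  the failure is lost in the conversion back to a reader.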

*}

section{* The \texttt{Scan} structure *}

text {*
  The source file is \texttt{src/Pure/General/scan.ML}.
  This structure provides functions for using and combining parsing functions
  of the type \texttt{'src -> 'res * 'src}.
  Three exceptions are used:
  \begin{verbatim}
  exception MORE of string option;  (*need more input (prompt)*)
  exception FAIL of string option;  (*try alternatives (reason of failure)*)
  exception ABORT of string;        (*dead end*)
  \end{verbatim}
  Many functions in this structure (generally those with names composed of
  symbols) are declared as infix.

  Some functions from that structure are
  \begin{verbatim}
  |-- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
      'src -> 'res2 * 'src''
  --| : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
      'src -> 'res1 * 'src''
  -- : ('src -> 'res1 * 'src') * ('src' -> 'res2 * 'src'') ->
      'src -> ('res1 * 'res2) * 'src''
  ^^ : ('src -> string * 'src') * ('src' -> string * 'src'') ->
      'src -> string * 'src''
  \end{verbatim}
  These functions parse a result off the input source twice.

  \texttt{|--} and \texttt{--|}
  return the second result and the first result, respectively
  (the other result is discarded).

  \texttt{--} returns both.

  \verb|^^| returns the result of concatenating the two results
  (which must be strings).

  Note how, although the types
  \texttt{'src}, \texttt{'src'} and \texttt{'src''} will normally be the same,
  the types as shown help suggest the behaviour of the functions.
\begin{verbatim}
+ − 1416
:-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
+ − 1417
'src -> ('res1 * 'res2) * 'src''
+ − 1418
:|-- : ('src -> 'res1 * 'src') * ('res1 -> 'src' -> 'res2 * 'src'') ->
+ − 1419
'src -> 'res2 * 'src''
+ − 1420
\end{verbatim}
+ − 1421
These are similar to \texttt{|--} and \texttt{--|},
+ − 1422
except that the second parsing function can depend on the result of the first.
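
  A typical use of such dependent sequencing is a length-prefixed input,
  where the first result determines how much the second parser consumes.
  The following sketch in plain Standard ML is a simplified
  re-implementation and not the code of \texttt{scan.ML}; the helpers
  \texttt{digit} and \texttt{chars} are hypothetical.

  \begin{verbatim}
  (* Illustration only: dependent sequencing over char lists. *)
  exception FAIL

  infix 5 :-- :|--

  (* the second parser is built from the first result *)
  fun (p :-- q) src =
    let
      val (x, src') = p src
      val (y, src'') = q x src'
    in ((x, y), src'') end

  fun (p :|-- q) src =
    let val ((_, y), rest) = (p :-- q) src in (y, rest) end

  (* hypothetical helper: a single digit as an integer *)
  fun digit (c :: cs) =
        if Char.isDigit c
        then (Char.ord c - Char.ord #"0", cs)
        else raise FAIL
    | digit [] = raise FAIL

  (* hypothetical helper: exactly n characters as a string *)
  fun chars n src =
    if n <= List.length src
    then (String.implode (List.take (src, n)), List.drop (src, n))
    else raise FAIL

  val r1 = (digit :-- chars) (String.explode "2abc")
  val r2 = (digit :|-- chars) (String.explode "2abc")
  \end{verbatim}

  Here the digit \texttt{2} tells \texttt{chars} to consume two characters,
  so the character \texttt{c} remains as unparsed input.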
+ − 1423
\begin{verbatim}
+ − 1424
>> : ('src -> 'res1 * 'src') * ('res1 -> 'res2) -> 'src -> 'res2 * 'src'
+ − 1425
|| : ('src -> 'res_src) * ('src -> 'res_src) -> 'src -> 'res_src
+ − 1426
\end{verbatim}
+ − 1427
\texttt{p >> f} applies a function \texttt{f} to the result of a parse.
+ − 1428
+ − 1429
\texttt{||} tries a second parsing function if the first one
+ − 1430
fails by raising an exception of the form \texttt{FAIL \_}.
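
As an illustration, here is a minimal sketch of how these two operators
combine, again as a toy model in plain Standard ML and not the library code.
The datatype \texttt{item} and the parsers \texttt{digit} and \texttt{letter}
are invented for the example; the fixities are assumptions chosen so that
\texttt{>>} binds more tightly than \texttt{||}.
\begin{verbatim}
(* Toy model over char lists; a sketch, not the code in scan.ML. *)
infix 3 >> ;
infixr 0 || ;

exception FAIL of string option ;

fun digit [] = raise FAIL NONE
  | digit (c :: cs) =
      if Char.isDigit c then (c, cs)
      else raise FAIL (SOME "digit expected") ;

fun letter [] = raise FAIL NONE
  | letter (c :: cs) =
      if Char.isAlpha c then (c, cs)
      else raise FAIL (SOME "letter expected") ;

(* apply f to the result, leaving the remaining input alone *)
fun (p >> f) src = let val (x, src') = p src in (f x, src') end ;

(* try q on the original input, but only after a FAIL from p *)
fun (p || q) src = p src handle FAIL _ => q src ;

datatype item = Num of int | Sym of char ;

val item =
  digit >> (fn c => Num (Char.ord c - Char.ord #"0")) || letter >> Sym ;

val num = item (String.explode "7x") ;  (* (Num 7, [#"x"]) *)
val sym = item (String.explode "x7") ;  (* (Sym #"x", [#"7"]) *)
\end{verbatim}
Note that the second alternative starts again from the original input, since
the failed first alternative has consumed nothing.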
+ − 1431
+ − 1432
\begin{verbatim}
+ − 1433
succeed : 'res -> ('src -> 'res * 'src) ;
+ − 1434
fail : ('src -> 'res_src) ;
+ − 1435
!! : ('src * string option -> string) ->
+ − 1436
('src -> 'res_src) -> ('src -> 'res_src) ;
+ − 1437
\end{verbatim}
+ − 1438
\texttt{succeed r} returns \texttt{r}, with the input unchanged.
+ − 1439
\texttt{fail} always fails, raising exception \texttt{FAIL NONE}.
+ − 1440
\texttt{!! f} only affects the failure mode, turning a failure that
+ − 1441
raises \texttt{FAIL \_} into a failure that raises \texttt{ABORT ...}.
+ − 1442
This is used to prevent recovery from the failure ---
+ − 1443
thus, in \texttt{!! parse1 || parse2}, if \texttt{parse1} fails,
+ − 1444
it won't recover by trying \texttt{parse2}.
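
The difference between a recoverable and an unrecoverable failure can be seen
in the following sketch. It is a toy model in plain Standard ML with strings
as input items; \texttt{word} and \texttt{err} are hypothetical helpers, and
the definitions only follow the types and the description given above.
\begin{verbatim}
(* Toy model over string lists; a sketch, not the code in scan.ML. *)
infixr 0 || ;

exception FAIL of string option ;
exception ABORT of string ;

fun (p || q) src = p src handle FAIL _ => q src ;

(* turn a recoverable FAIL into an ABORT; err builds the message *)
fun !! err p src =
  p src handle FAIL msg => raise ABORT (err (src, msg)) ;

(* accept exactly the item w *)
fun word w [] = raise FAIL NONE
  | word w (x :: xs) = if x = w then (x, xs) else raise FAIL NONE ;

fun err (_, _) = "committed alternative failed" ;

val plain = word "a" || word "b" ;
val cut = !! err (word "a") || word "b" ;

(* the plain version recovers and tries the second alternative *)
val recovered = plain ["b"] ;            (* ("b", []) *)
(* the cut version aborts instead of trying word "b" *)
val aborted = (cut ["b"] ; false) handle ABORT _ => true ;
\end{verbatim}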
+ − 1445
+ − 1446
\begin{verbatim}
+ − 1447
one : ('si -> bool) -> ('si list -> 'si * 'si list) ;
+ − 1448
some : ('si -> 'res option) -> ('si list -> 'res * 'si list) ;
+ − 1449
\end{verbatim}
+ − 1450
These require the input to be a list of items:
+ − 1451
they fail, raising \texttt{MORE NONE} if the list is empty.
+ − 1452
On other failures they raise \texttt{FAIL NONE}.
+ − 1453
+ − 1454
\texttt{one p} takes the first
+ − 1455
item from the list if it satisfies \texttt{p}, otherwise fails.
+ − 1456
+ − 1457
\texttt{some f} takes the first
+ − 1458
item from the list and applies \texttt{f} to it, failing if this returns
+ − 1459
\texttt{NONE}.
+ − 1460
+ − 1461
\begin{verbatim}
+ − 1462
many : ('si -> bool) -> 'si list -> 'si list * 'si list ;
+ − 1463
\end{verbatim}
+ − 1464
\texttt{many p} takes items from the input until it encounters one
+ − 1465
which does not satisfy \texttt{p}. If it reaches the end of the input
+ − 1466
it fails, raising \texttt{MORE NONE}.
+ − 1467
+ − 1468
\texttt{many1} (with the same type) fails if the first item
+ − 1469
does not satisfy \texttt{p}.
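
The following sketch models these item parsers in plain Standard ML. It
follows the behaviour described above but is not taken from the library
sources; \texttt{digit\_value} is a helper invented for the example. It also
shows why the input usually needs a terminating item: without one,
\texttt{many} runs off the end of the list.
\begin{verbatim}
(* Toy model; a sketch, not the code in scan.ML. *)
exception MORE of string option ;
exception FAIL of string option ;

fun one _ [] = raise MORE NONE
  | one pred (x :: xs) = if pred x then (x, xs) else raise FAIL NONE ;

fun some _ [] = raise MORE NONE
  | some f (x :: xs) =
      (case f x of SOME y => (y, xs) | NONE => raise FAIL NONE) ;

fun many _ [] = raise MORE NONE
  | many pred (lst as x :: xs) =
      if pred x
      then let val (ys, rest) = many pred xs in (x :: ys, rest) end
      else ([], lst) ;

fun many1 pred src =
  let val (x, src') = one pred src
      val (xs, src'') = many pred src'
  in (x :: xs, src'') end ;

fun digit_value c =
  if Char.isDigit c then SOME (Char.ord c - Char.ord #"0") else NONE ;

val seven = some digit_value (String.explode "7x") ;     (* (7, [#"x"]) *)
val digits = many Char.isDigit (String.explode "42;") ;
                                          (* ([#"4", #"2"], [#";"]) *)
val nothing = many Char.isDigit (String.explode "x") ;   (* ([], [#"x"]) *)

(* no terminating item: many reaches the end of the list *)
val ran_out =
  (many Char.isDigit (String.explode "42") ; false) handle MORE _ => true ;
(* many1 insists on at least one matching item *)
val rejected =
  (many1 Char.isDigit (String.explode "x") ; false) handle FAIL _ => true ;
\end{verbatim}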
+ − 1470
+ − 1471
\begin{verbatim}
+ − 1472
option : ('src -> 'res * 'src) -> ('src -> 'res option * 'src)
+ − 1473
optional : ('src -> 'res * 'src) -> 'res -> ('src -> 'res * 'src)
+ − 1474
\end{verbatim}
+ − 1475
\texttt{option f} gives the result \texttt{SOME r} if the parser \texttt{f}
succeeds with result \texttt{r}, and the result \texttt{NONE} if \texttt{f}
raises \texttt{FAIL \_}.
+ − 1479
+ − 1480
\texttt{optional}: if parser \texttt{f} fails by raising \texttt{FAIL \_},
+ − 1481
\texttt{optional f default} provides the result \texttt{default}.
+ − 1482
+ − 1483
\begin{verbatim}
+ − 1484
repeat : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
+ − 1485
repeat1 : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
+ − 1486
bulk : ('src -> 'res * 'src) -> 'src -> 'res list * 'src
+ − 1487
\end{verbatim}
+ − 1488
\texttt{repeat f} repeatedly parses an item off the remaining input until
+ − 1489
\texttt{f} fails with \texttt{FAIL \_}.
+ − 1490
+ − 1491
\texttt{repeat1} is as for \texttt{repeat}, but requires at least one
+ − 1492
successful parse.
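
A minimal sketch of \texttt{option}, \texttt{optional}, \texttt{repeat} and
\texttt{repeat1}, written as a toy model in plain Standard ML from the
descriptions above (the library definitions may differ in detail, and the
parser \texttt{digit} is invented for the example):
\begin{verbatim}
(* Toy model over char lists; a sketch, not the code in scan.ML. *)
exception FAIL of string option ;

fun option p src =
  (let val (x, src') = p src in (SOME x, src') end)
  handle FAIL _ => (NONE, src) ;

fun optional p default src = p src handle FAIL _ => (default, src) ;

(* collect results until p fails; the failing attempt consumes nothing *)
fun repeat p src =
  case (SOME (p src) handle FAIL _ => NONE) of
    NONE => ([], src)
  | SOME (x, src') =>
      let val (xs, src'') = repeat p src' in (x :: xs, src'') end ;

fun repeat1 p src =
  let val (x, src') = p src
      val (xs, src'') = repeat p src'
  in (x :: xs, src'') end ;

fun digit [] = raise FAIL NONE
  | digit (c :: cs) =
      if Char.isDigit c then (c, cs) else raise FAIL NONE ;

val input = String.explode "a" ;
val digits = repeat digit (String.explode "12a") ;
                                        (* ([#"1", #"2"], [#"a"]) *)
val missing = option digit input ;      (* (NONE, [#"a"]) *)
val default = optional digit #"0" input ;  (* (#"0", [#"a"]) *)
val rejected = (repeat1 digit input ; false) handle FAIL _ => true ;
\end{verbatim}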
+ − 1493
+ − 1494
\begin{verbatim}
+ − 1495
lift : ('src -> 'res * 'src) -> ('ex * 'src -> 'res * ('ex * 'src))
+ − 1496
\end{verbatim}
+ − 1497
\texttt{lift} changes the source type of a parser by putting in an extra
+ − 1498
component \texttt{'ex}, which is ignored in the parsing.
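
The definition suggested by this type is short. The sketch below is a toy
model in plain Standard ML, written from the type shown above and not copied
from the library; the parser \texttt{digit} and the string used as extra
component are placeholders.
\begin{verbatim}
(* Toy model; a sketch, not the code in scan.ML. *)
exception FAIL of string option ;

(* thread an extra component through a parser that knows nothing of it *)
fun lift p (ex, src) =
  let val (x, src') = p src in (x, (ex, src')) end ;

fun digit [] = raise FAIL NONE
  | digit (c :: cs) =
      if Char.isDigit c then (c, cs) else raise FAIL NONE ;

val lifted = lift digit ("context", String.explode "7x") ;
(* (#"7", ("context", [#"x"])) *)
\end{verbatim}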
+ − 1499
+ − 1500
The \texttt{Scan} structure also provides the type \texttt{lexicon},
+ − 1501
HOW DO THEY WORK ?? TO BE COMPLETED
+ − 1502
\begin{verbatim}
+ − 1503
dest_lexicon: lexicon -> string list ;
+ − 1504
make_lexicon: string list list -> lexicon ;
+ − 1505
empty_lexicon: lexicon ;
+ − 1506
extend_lexicon: string list list -> lexicon -> lexicon ;
+ − 1507
merge_lexicons: lexicon -> lexicon -> lexicon ;
+ − 1508
is_literal: lexicon -> string list -> bool ;
+ − 1509
literal: lexicon -> string list -> string list * string list ;
+ − 1510
\end{verbatim}
+ − 1511
Two lexicons, for the commands and keywords, are stored and can be retrieved
+ − 1512
by:
+ − 1513
\begin{verbatim}
+ − 1514
val (command_lexicon, keyword_lexicon) = OuterSyntax.get_lexicons () ;
+ − 1515
val commands = Scan.dest_lexicon command_lexicon ;
+ − 1516
val keywords = Scan.dest_lexicon keyword_lexicon ;
+ − 1517
\end{verbatim}
+ − 1518
*}
+ − 1519
+ − 1520
section{* The \texttt{OuterLex} structure *}
+ − 1521
+ − 1522
text {*
+ − 1523
The source file is @{text "src/Pure/Isar/outer_lex.ML"}.
+ − 1524
In some other source files its name is abbreviated:
+ − 1525
\begin{verbatim}
+ − 1526
structure T = OuterLex;
+ − 1527
\end{verbatim}
+ − 1528
This structure defines the type \texttt{token}.
+ − 1529
(The types
+ − 1530
\texttt{OuterLex.token},
+ − 1531
\texttt{OuterParse.token} and
+ − 1532
\texttt{SpecParse.token} are all the same).
+ − 1533
+ − 1534
Input text is split up into tokens, and the input source type for many parsing
+ − 1535
functions is \texttt{token list}.
+ − 1536
+ − 1537
The datatype definition (which is not published in the signature) is
4
+ − 1538
\begin{verbatim}
+ − 1539
datatype token = Token of Position.T * (token_kind * string);
+ − 1540
\end{verbatim}
+ − 1541
but here are some runnable examples for viewing tokens:
+ − 1542
+ − 1543
*}
+ − 1544
ML{*
val toks = OuterSyntax.scan Position.none
  "theory,imports;begin x.y.z apply ?v1 ?'a 'a -- || 44 simp (* xx *) { * fff * }" ;
*}

ML{*
print_depth 20 ;
*}

ML{*
map OuterLex.text_of toks ;
*}

ML{*
val proper_toks = filter OuterLex.is_proper toks ;
*}

ML{*
map OuterLex.kind_of proper_toks
*}

ML{*
map OuterLex.unparse proper_toks ;
*}

ML{*
OuterLex.stopper
*}
+ − 1576
+ − 1577
text {*
+ − 1578
+ − 1579
The function \texttt{is\_proper : token -> bool} identifies tokens which are
not white space or comments: many parsing functions require white space and
comments to have been filtered out.
+ − 1582
+ − 1583
There is a special end-of-file token:
+ − 1584
\begin{verbatim}
+ − 1585
val (tok_eof : token, is_eof : token -> bool) = T.stopper ;
+ − 1586
(* end of file token *)
+ − 1587
\end{verbatim}
+ − 1588
+ − 1589
*}
+ − 1590
+ − 1591
section {* The \texttt{OuterParse} structure *}
+ − 1592
+ − 1593
text {*
+ − 1594
The source file is \texttt{src/Pure/Isar/outer\_parse.ML}.
+ − 1595
In some other source files its name is abbreviated:
+ − 1596
\begin{verbatim}
+ − 1597
structure P = OuterParse;
+ − 1598
\end{verbatim}
+ − 1599
Here the parsers use \texttt{token list} as the input source type.
+ − 1600
+ − 1601
Some of the parsers simply select the first token, provided that it is of the
right kind (as returned by \texttt{T.kind\_of}): these are
\texttt{command, keyword, short\_ident, long\_ident, sym\_ident, term\_var,
type\_ident, type\_var, number, string, alt\_string, verbatim, sync, eof}.
Others select the first token, provided that it is one of several kinds
(e.g., \texttt{name, xname, text, typ}).
+ − 1607
+ − 1608
\begin{verbatim}
+ − 1609
type 'a tlp = token list -> 'a * token list ; (* token list parser *)
+ − 1610
$$$ : string -> string tlp
+ − 1611
nat : int tlp ;
+ − 1612
maybe : 'a tlp -> 'a option tlp ;
+ − 1613
\end{verbatim}
+ − 1614
+ − 1615
\texttt{\$\$\$ s} returns the first token,
+ − 1616
if it equals \texttt{s} \emph{and} \texttt{s} is a keyword.
+ − 1617
+ − 1618
\texttt{nat} returns the first token, if it is a number, and evaluates it.
+ − 1619
+ − 1620
\texttt{maybe}: if \texttt{p} returns \texttt{r},
+ − 1621
then \texttt{maybe p} returns \texttt{SOME r} ;
+ − 1622
if the first token is an underscore, it returns \texttt{NONE}.
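
To make the behaviour of these three parsers concrete, here is a toy model in
plain Standard ML. The datatype \texttt{token} below is a drastic
simplification invented for the example and is unrelated to the real
\texttt{OuterLex.token}; \texttt{keyword} plays the role of
\texttt{\$\$\$}.
\begin{verbatim}
(* Toy model with a made-up token type; not the code in outer_parse.ML. *)
exception FAIL of string option ;

datatype token = Keyword of string | Number of string | EOF ;
type 'a tlp = token list -> 'a * token list ;

(* accept the keyword s, in the manner of $$$ s *)
fun keyword s (Keyword k :: toks) =
      if k = s then (s, toks) else raise FAIL NONE
  | keyword _ _ = raise FAIL NONE ;

(* accept a number token and evaluate it *)
fun nat (Number n :: toks) =
      (case Int.fromString n of
         SOME i => (i, toks)
       | NONE => raise FAIL NONE)
  | nat _ = raise FAIL NONE ;

(* an underscore stands for a missing value *)
fun maybe p toks =
  (let val (_, rest) = keyword "_" toks in (NONE, rest) end)
  handle FAIL _ =>
    let val (x, rest) = p toks in (SOME x, rest) end ;

val present = maybe nat [Number "44", EOF] ;  (* (SOME 44, [EOF]) *)
val absent = maybe nat [Keyword "_", EOF] ;   (* (NONE, [EOF]) *)
\end{verbatim}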
+ − 1623
+ − 1624
A few examples:
+ − 1625
\begin{verbatim}
+ − 1626
P.list : 'a tlp -> 'a list tlp ; (* likewise P.list1 *)
P.and_list : 'a tlp -> 'a list tlp ; (* likewise P.and_list1 *)
val toks : token list = OuterSyntax.scan Position.none "44 ,_, 66,77" ;
val proper_toks = List.filter T.is_proper toks ;
P.list P.nat toks ; (* OK, doesn't recognize white space *)
P.list P.nat proper_toks ; (* fails, doesn't recognize what follows ',' *)
P.list (P.maybe P.nat) proper_toks ; (* fails, end of input *)
P.list (P.maybe P.nat) (proper_toks @ [tok_eof]) ; (* OK *)
val toks : token list =
  OuterSyntax.scan Position.none "44 and 55 and 66 and 77" ;
P.and_list P.nat (List.filter T.is_proper toks @ [tok_eof]) ; (* ??? *)
+ − 1636
\end{verbatim}
+ − 1637
+ − 1638
The following code helps run examples:
+ − 1639
\begin{verbatim}
+ − 1640
fun parse_str tlp str =
  let
    (* scan the string, drop white space and comments, append the stopper *)
    val toks : token list = OuterSyntax.scan Position.none str ;
    val proper_toks = List.filter T.is_proper toks @ [tok_eof] ;
    val (res, rem_toks) = tlp proper_toks ;
    (* show the unconsumed tokens as a string *)
    val rem_str = String.concat
      (Library.separate " " (List.map T.unparse rem_toks)) ;
  in (res, rem_str) end ;
+ − 1647
\end{verbatim}
+ − 1648
+ − 1649
Some examples from \texttt{src/Pure/Isar/outer\_parse.ML}
+ − 1650
\begin{verbatim}
+ − 1651
val type_args =
+ − 1652
type_ident >> Library.single ||
+ − 1653
$$$ "(" |-- !!! (list1 type_ident --| $$$ ")") ||
+ − 1654
Scan.succeed [];
+ − 1655
\end{verbatim}
+ − 1656
There are three ways parsing a list of type arguments can succeed.
+ − 1657
The first line reads a single type argument, and turns it into a singleton
+ − 1658
list.
+ − 1659
The second line reads ``\texttt{(}'', and then the remainder, ignoring the
``\texttt{(}'';
the remainder consists of a list of type identifiers (at least one),
and then a ``\texttt{)}'' which is also ignored.
+ − 1662
The \texttt{!!!} ensures that if the parsing proceeds this far and then fails,
+ − 1663
it won't try the third line (see the description of \texttt{Scan.!!}).
+ − 1664
The third line consumes no input and returns the empty list.
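
The three alternatives can be tried out on a toy model. The sketch below is
plain Standard ML with strings standing in for tokens; it is not the code of
\texttt{OuterParse}. The names \texttt{tok} (in place of \texttt{\$\$\$}) and
\texttt{cut} (in place of \texttt{!!!}) are invented, the fixities are
assumptions, and the combinators are simplified re-implementations.
\begin{verbatim}
(* Toy model over string lists; a sketch, not the code in outer_parse.ML. *)
infix 5 -- |-- --| ;
infix 3 >> ;
infixr 0 || ;

exception FAIL of string option ;
exception ABORT of string ;

fun (p -- q) src =
  let val (x, src') = p src
      val (y, src'') = q src'
  in ((x, y), src'') end ;
fun (p |-- q) src = let val ((_, y), src') = (p -- q) src in (y, src') end ;
fun (p --| q) src = let val ((x, _), src') = (p -- q) src in (x, src') end ;
fun (p >> f) src = let val (x, src') = p src in (f x, src') end ;
fun (p || q) src = p src handle FAIL _ => q src ;
fun succeed x src = (x, src) ;

(* stand-in for !!!: a failure inside p becomes unrecoverable *)
fun cut p src = p src handle FAIL _ => raise ABORT "syntax error" ;

fun repeat p src =
  case (SOME (p src) handle FAIL _ => NONE) of
    NONE => ([], src)
  | SOME (x, src') =>
      let val (xs, src'') = repeat p src' in (x :: xs, src'') end ;

(* tokens are plain strings here *)
fun tok s [] = raise FAIL NONE
  | tok s (t :: ts) = if t = s then (t, ts) else raise FAIL NONE ;
fun type_ident [] = raise FAIL NONE
  | type_ident (t :: ts) =
      if String.isPrefix "'" t then (t, ts) else raise FAIL NONE ;

fun list1 p = p -- repeat (tok "," |-- p) >> op :: ;

val type_args =
  type_ident >> (fn x => [x]) ||
  tok "(" |-- cut (list1 type_ident --| tok ")") ||
  succeed [] ;

val many_args = type_args ["(", "'a", ",", "'b", ")", "ntyp"] ;
val one_arg = type_args ["'a", "ntyp"] ;
val no_args = type_args ["ntyp"] ;
(* after "(" the parser is committed: a missing ")" is an ABORT *)
val committed =
  (type_args ["(", "'a", "ntyp"] ; false) handle ABORT _ => true ;
\end{verbatim}
In the last example the third alternative would succeed on the original
input, but it is never tried because the failure occurs after the cut.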
+ − 1665
+ − 1666
\begin{verbatim}
+ − 1667
fun triple2 (x, (y, z)) = (x, y, z);
+ − 1668
val arity = xname -- ($$$ "::" |-- !!! (
+ − 1669
Scan.optional ($$$ "(" |-- !!! (list1 sort --| $$$ ")")) []
+ − 1670
-- sort)) >> triple2;
+ − 1671
\end{verbatim}
+ − 1672
The parser \texttt{arity} reads a typename $t$, then ``\texttt{::}'' (which is
+ − 1673
ignored), then optionally a list $ss$ of sorts and then another sort $s$.
+ − 1674
The result $(t, (ss, s))$ is transformed by \texttt{triple2} to $(t, ss, s)$.
+ − 1675
The second line reads the optional list of sorts:
+ − 1676
it reads first ``\texttt{(}'' and last ``\texttt{)}'', which are both ignored,
+ − 1677
and between them a comma-separated list of sorts.
+ − 1678
If this list is absent, the default \texttt{[]} provides the list of sorts.
+ − 1679
+ − 1680
\begin{verbatim}
+ − 1681
parse_str P.type_args "('a, 'b) ntyp" ;
+ − 1682
parse_str P.type_args "'a ntyp" ;
+ − 1683
parse_str P.type_args "ntyp" ;
+ − 1684
parse_str P.arity "ty :: tycl" ;
+ − 1685
parse_str P.arity "ty :: (tycl1, tycl2) tycl" ;
+ − 1686
\end{verbatim}
+ − 1687
+ − 1688
*}
+ − 1689
+ − 1690
section {* The \texttt{SpecParse} structure *}
+ − 1691
+ − 1692
text {*
+ − 1693
The source file is \texttt{src/Pure/Isar/spec\_parse.ML}.
+ − 1694
This structure contains token list parsers for more complicated values.
+ − 1695
For example,
+ − 1696
\begin{verbatim}
+ − 1697
open SpecParse ;
+ − 1698
attrib : Attrib.src tok_rdr ;
+ − 1699
attribs : Attrib.src list tok_rdr ;
+ − 1700
opt_attribs : Attrib.src list tok_rdr ;
+ − 1701
xthm : (thmref * Attrib.src list) tok_rdr ;
+ − 1702
xthms1 : (thmref * Attrib.src list) list tok_rdr ;
+ − 1703
+ − 1704
parse_str attrib "simp" ;
+ − 1705
parse_str opt_attribs "hello" ;
+ − 1706
val (ass, "") = parse_str attribs "[standard, xxxx, simp, intro, OF sym]" ;
+ − 1707
map Args.dest_src ass ;
+ − 1708
val (asrc, "") = parse_str attrib "THEN trans [THEN sym]" ;
+ − 1709
+ − 1710
parse_str xthm "mythm [attr]" ;
+ − 1711
parse_str xthms1 "thm1 [attr] thms2" ;
+ − 1712
\end{verbatim}
+ − 1713
+ − 1714
As you can see, attributes are described using types of the \texttt{Args}
+ − 1715
structure, described below.
+ − 1716
*}
+ − 1717
+ − 1718
section{* The \texttt{Args} structure *}
+ − 1719
+ − 1720
text {*
+ − 1721
The source file is \texttt{src/Pure/Isar/args.ML}.
+ − 1722
The primary type of this structure is the \texttt{src} datatype;
its single constructor is not published in the signature, but
\texttt{Args.src} and \texttt{Args.dest\_src}
are in fact the constructor and destructor functions.
+ − 1726
Note that the types \texttt{Attrib.src} and \texttt{Method.src}
+ − 1727
are in fact \texttt{Args.src}.
+ − 1728
+ − 1729
\begin{verbatim}
+ − 1730
src : (string * Args.T list) * Position.T -> Args.src ;
+ − 1731
dest_src : Args.src -> (string * Args.T list) * Position.T ;
+ − 1732
Args.pretty_src : Proof.context -> Args.src -> Pretty.T ;
+ − 1733
fun pr_src ctxt src = Pretty.string_of (Args.pretty_src ctxt src) ;
+ − 1734
+ − 1735
val thy = ML_Context.the_context () ;
+ − 1736
val ctxt = ProofContext.init thy ;
+ − 1737
map (pr_src ctxt) ass ;
+ − 1738
\end{verbatim}
+ − 1739
+ − 1740
So an \texttt{Args.src} consists of the first word, then a list of further
+ − 1741
``arguments'', of type \texttt{Args.T}, with information about position in the
+ − 1742
input.
+ − 1743
\begin{verbatim}
+ − 1744
(* how an Args.src is parsed *)
+ − 1745
P.position : 'a tlp -> ('a * Position.T) tlp ;
+ − 1746
P.arguments : Args.T list tlp ;
+ − 1747
+ − 1748
val parse_src : Args.src tlp =
+ − 1749
P.position (P.xname -- P.arguments) >> Args.src ;
+ − 1750
\end{verbatim}
+ − 1751
+ − 1752
\begin{verbatim}
+ − 1753
val ((first_word, args), pos) = Args.dest_src asrc ;
+ − 1754
map Args.string_of args ;
+ − 1755
\end{verbatim}
+ − 1756
+ − 1757
The \texttt{Args} structure contains more parsers and parser transformers
+ − 1758
for which the input source type is \texttt{Args.T list}. For example,
+ − 1759
\begin{verbatim}
+ − 1760
type 'a atlp = Args.T list -> 'a * Args.T list ;
+ − 1761
open Args ;
+ − 1762
nat : int atlp ; (* also Args.int *)
+ − 1763
thm_sel : PureThy.interval list atlp ;
+ − 1764
list : 'a atlp -> 'a list atlp ;
+ − 1765
attribs : (string -> string) -> Args.src list atlp ;
+ − 1766
opt_attribs : (string -> string) -> Args.src list atlp ;
+ − 1767
+ − 1768
(* parse_atl_str : 'a atlp -> (string -> 'a * string) ;
+ − 1769
given an Args.T list parser, to get a string parser *)
+ − 1770
fun parse_atl_str atlp str =
+ − 1771
let val (ats, rem_str) = parse_str P.arguments str ;
+ − 1772
val (res, rem_ats) = atlp ats ;
+ − 1773
in (res, String.concat (Library.separate " "
+ − 1774
(List.map Args.string_of rem_ats @ [rem_str]))) end ;
+ − 1775
+ − 1776
parse_atl_str Args.int "-1-," ;
+ − 1777
parse_atl_str (Scan.option Args.int) "x1-," ;
+ − 1778
parse_atl_str Args.thm_sel "(1-,4,13-22)" ;
+ − 1779
+ − 1780
val (ats as atsrc :: _, "") = parse_atl_str (Args.attribs I)
+ − 1781
"[THEN trans [THEN sym], simp, OF sym]" ;
+ − 1782
\end{verbatim}
+ − 1783
+ − 1784
From here, an attribute is interpreted using \texttt{Attrib.attribute}.
+ − 1785
+ − 1786
\texttt{Args} has a large number of functions which parse an \texttt{Args.src}
+ − 1787
and also refer to a generic context.
+ − 1788
Note the use of \texttt{Scan.lift} for this.
+ − 1789
(as does \texttt{Attrib} - RETHINK THIS)
+ − 1790
+ − 1791
(\texttt{Args.syntax} shown below has type specialised)
+ − 1792
+ − 1793
\begin{verbatim}
+ − 1794
type ('res, 'src) parse_fn = 'src -> 'res * 'src ;
+ − 1795
type 'a cgatlp = ('a, Context.generic * Args.T list) parse_fn ;
+ − 1796
Scan.lift : 'a atlp -> 'a cgatlp ;
+ − 1797
term : term cgatlp ;
+ − 1798
typ : typ cgatlp ;
+ − 1799
+ − 1800
Args.syntax : string -> 'res cgatlp -> src -> ('res, Context.generic) parse_fn ;
+ − 1801
Attrib.thm : thm cgatlp ;
+ − 1802
Attrib.thms : thm list cgatlp ;
+ − 1803
Attrib.multi_thm : thm list cgatlp ;
+ − 1804
+ − 1805
(* parse_cgatl_str : 'a cgatlp -> (string -> 'a * string) ;
+ − 1806
given a (Context.generic * Args.T list) parser, to get a string parser *)
+ − 1807
fun parse_cgatl_str cgatlp str =
+ − 1808
let
+ − 1809
(* use the current generic context *)
+ − 1810
val generic = Context.Theory thy ;
+ − 1811
val (ats, rem_str) = parse_str P.arguments str ;
+ − 1812
(* ignore any change to the generic context *)
+ − 1813
val (res, (_, rem_ats)) = cgatlp (generic, ats) ;
+ − 1814
in (res, String.concat (Library.separate " "
+ − 1815
(List.map Args.string_of rem_ats @ [rem_str]))) end ;
+ − 1816
\end{verbatim}
+ − 1817
*}
+ − 1818
+ − 1819
section{* Attributes, and the \texttt{Attrib} structure *}
+ − 1820
+ − 1821
text {*
+ − 1822
The type \texttt{attribute} is declared in \texttt{src/Pure/thm.ML}.
+ − 1823
The source file for the \texttt{Attrib} structure is
+ − 1824
\texttt{src/Pure/Isar/attrib.ML}.
+ − 1825
Most attributes use a theorem to change a generic context (for example,
+ − 1826
by declaring that the theorem should be used, by default, in simplification),
+ − 1827
or change a theorem (which most often involves referring to the current
+ − 1828
theory).
+ − 1829
The functions \texttt{Thm.rule\_attribute} and
+ − 1830
\texttt{Thm.declaration\_attribute} create attributes of these kinds.
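
The two kinds of attribute can be pictured with a toy model. In the sketch
below, which is plain Standard ML and not the code of \texttt{Thm}, a string
stands in for a theorem and a list of strings for the generic context; only
the shapes of the two functions follow the types of
\texttt{Thm.rule\_attribute} and \texttt{Thm.declaration\_attribute} shown
in this section. The attributes \texttt{symmetric} and \texttt{simp\_add} are
invented examples.
\begin{verbatim}
(* Toy model: stand-ins for Context.generic and thm, not the real types. *)
type context = string list ;   (* e.g. the declared simplification rules *)
type thm = string ;
type attribute = context * thm -> context * thm ;

(* changes the theorem, may consult the context *)
fun rule_attribute (f : context -> thm -> thm) : attribute =
  fn (ctxt, th) => (ctxt, f ctxt th) ;

(* changes the context, using the theorem *)
fun declaration_attribute (f : thm -> context -> context) : attribute =
  fn (ctxt, th) => (f th ctxt, th) ;

val symmetric : attribute =
  rule_attribute (fn _ => fn th => th ^ " [symmetric]") ;
val simp_add : attribute =
  declaration_attribute (fn th => fn ctxt => th :: ctxt) ;

(* apply both attributes in turn to an empty context and a theorem *)
val result = simp_add (symmetric ([], "foo")) ;
(* (["foo [symmetric]"], "foo [symmetric]") *)
\end{verbatim}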
+ − 1831
+ − 1832
\begin{verbatim}
+ − 1833
type attribute = Context.generic * thm -> Context.generic * thm;
+ − 1834
type 'a trf = 'a -> 'a ; (* transformer of a given type *)
+ − 1835
Thm.rule_attribute : (Context.generic -> thm -> thm) -> attribute ;
+ − 1836
Thm.declaration_attribute : (thm -> Context.generic trf) -> attribute ;
+ − 1837
+ − 1838
Attrib.print_attributes : theory -> unit ;
+ − 1839
Attrib.pretty_attribs : Proof.context -> src list -> Pretty.T list ;
+ − 1840
+ − 1841
List.app Pretty.writeln (Attrib.pretty_attribs ctxt ass) ;
+ − 1842
\end{verbatim}
+ − 1843
+ − 1844
An attribute is stored in a theory as indicated by:
+ − 1845
\begin{verbatim}
+ − 1846
Attrib.add_attributes :
+ − 1847
(bstring * (src -> attribute) * string) list -> theory trf ;
+ − 1848
(*
+ − 1849
Attrib.add_attributes [("THEN", THEN_att, "resolution with rule")] ;
+ − 1850
*)
+ − 1851
\end{verbatim}
+ − 1852
where the first and third arguments are name and description of the attribute,
+ − 1853
and the second is a function which parses the attribute input text
+ − 1854
(including the attribute name, which has necessarily already been parsed).
+ − 1855
Here, \texttt{THEN\_att} is a function declared in the code for the
+ − 1856
structure \texttt{Attrib}, but not published in its signature.
+ − 1857
The source file \texttt{src/Pure/Isar/attrib.ML} shows the use of
+ − 1858
\texttt{Attrib.add\_attributes} to add a number of attributes.
+ − 1859
+ − 1860
\begin{verbatim}
+ − 1861
FullAttrib.THEN_att : src -> attribute ;
+ − 1862
FullAttrib.THEN_att atsrc (generic, ML_Context.thm "sym") ;
+ − 1863
FullAttrib.THEN_att atsrc (generic, ML_Context.thm "all_comm") ;
+ − 1864
\end{verbatim}
+ − 1865
+ − 1866
\begin{verbatim}
+ − 1867
Attrib.syntax : attribute cgatlp -> src -> attribute ;
+ − 1868
Attrib.no_args : attribute -> src -> attribute ;
+ − 1869
\end{verbatim}
+ − 1870
When this is called as \texttt{syntax scan src (gc, th)}
+ − 1871
the generic context \texttt{gc} is used
+ − 1872
(and potentially changed to \texttt{gc'})
+ − 1873
by \texttt{scan} in parsing to obtain an attribute \texttt{attr} which would
+ − 1874
then be applied to \texttt{(gc', th)}.
+ − 1875
The source for parsing the attribute is the arguments part of \texttt{src},
+ − 1876
which must all be consumed by the parse.
+ − 1877
+ − 1878
For example, for \texttt{Attrib.no\_args attr src}, the attribute parser
+ − 1879
simply returns \texttt{attr}, requiring that the arguments part of
+ − 1880
\texttt{src} must be empty.
+ − 1881
+ − 1882
Some examples from \texttt{src/Pure/Isar/attrib.ML}, modified:
+ − 1883
\begin{verbatim}
+ − 1884
fun rot_att_n n (gc, th) = (gc, rotate_prems n th) ;
+ − 1885
rot_att_n : int -> attribute ;
+ − 1886
val rot_arg = Scan.lift (Scan.optional Args.int 1 : int atlp) : int cgatlp ;
+ − 1887
val rotated_att : src -> attribute =
+ − 1888
Attrib.syntax (rot_arg >> rot_att_n : attribute cgatlp) ;
+ − 1889
+ − 1890
val THEN_arg : int cgatlp = Scan.lift
+ − 1891
(Scan.optional (Args.bracks Args.nat : int atlp) 1 : int atlp) ;
+ − 1892
+ − 1893
Attrib.thm : thm cgatlp ;
+ − 1894
+ − 1895
THEN_arg -- Attrib.thm : (int * thm) cgatlp ;
+ − 1896
+ − 1897
fun THEN_att_n (n, tht) (gc, th) = (gc, th RSN (n, tht)) ;
+ − 1898
THEN_att_n : int * thm -> attribute ;
+ − 1899
+ − 1900
val THEN_att : src -> attribute = Attrib.syntax
+ − 1901
(THEN_arg -- Attrib.thm >> THEN_att_n : attribute cgatlp);
+ − 1902
\end{verbatim}
+ − 1903
The functions I've called \texttt{rot\_arg} and \texttt{THEN\_arg}
+ − 1904
read an optional argument, which for \texttt{rotated} is an integer,
+ − 1905
and for \texttt{THEN} is a natural enclosed in square brackets;
+ − 1906
the default, if the argument is absent, is 1 in each case.
+ − 1907
Functions \texttt{rot\_att\_n} and \texttt{THEN\_att\_n} turn these into
+ − 1908
attributes, where \texttt{THEN\_att\_n} also requires a theorem, which is
+ − 1909
parsed by \texttt{Attrib.thm}.
+ − 1910
Infix operators \texttt{--} and \texttt{>>} are in the structure \texttt{Scan}.
+ − 1911
+ − 1912
*}
+ − 1913
+ − 1914
section{* Methods, and the \texttt{Method} structure *}
+ − 1915
+ − 1916
text {*
+ − 1917
The source file is \texttt{src/Pure/Isar/method.ML}.
+ − 1918
The type \texttt{method} is defined by the datatype declaration
+ − 1919
\begin{verbatim}
+ − 1920
(* datatype method = Meth of thm list -> cases_tactic; *)
+ − 1921
RuleCases.NO_CASES : tactic -> cases_tactic ;
+ − 1922
\end{verbatim}
+ − 1923
In fact \texttt{RAW\_METHOD\_CASES} (below) is exactly the constructor
+ − 1924
\texttt{Meth}.
+ − 1925
A \texttt{cases\_tactic} is an elaborated version of a tactic.
+ − 1926
\texttt{NO\_CASES tac} is a \texttt{cases\_tactic} which consists of the
tactic \texttt{tac} without any further case information.
+ − 1928
For further details see the description of structure \texttt{RuleCases} below.
+ − 1929
The list of theorems to be passed to a method consists of the current
+ − 1930
\emph{facts} in the proof.
+ − 1931
+ − 1932
\begin{verbatim}
+ − 1933
RAW_METHOD : (thm list -> tactic) -> method ;
+ − 1934
METHOD : (thm list -> tactic) -> method ;
+ − 1935
+ − 1936
SIMPLE_METHOD : tactic -> method ;
+ − 1937
SIMPLE_METHOD' : (int -> tactic) -> method ;
+ − 1938
SIMPLE_METHOD'' : ((int -> tactic) -> tactic) -> (int -> tactic) -> method ;
+ − 1939
+ − 1940
RAW_METHOD_CASES : (thm list -> cases_tactic) -> method ;
+ − 1941
METHOD_CASES : (thm list -> cases_tactic) -> method ;
+ − 1942
\end{verbatim}
+ − 1943
A method is, in its simplest form, a tactic; applying the method is to apply
+ − 1944
the tactic to the current goal state.
+ − 1945
+ − 1946
Applying \texttt{RAW\_METHOD tacf} creates a tactic by applying
\texttt{tacf} to the current \emph{facts}, and applying that tactic to the
goal state.
+ − 1949
+ − 1950
\texttt{METHOD} is similar but also first applies
+ − 1951
\texttt{Goal.conjunction\_tac} to all subgoals.
+ − 1952
+ − 1953
\texttt{SIMPLE\_METHOD tac} inserts the facts into all subgoals and then
applies \texttt{tac}.
+ − 1955
+ − 1956
\texttt{SIMPLE\_METHOD' tacf} inserts the facts and then
+ − 1957
applies \texttt{tacf} to subgoal 1.
+ − 1958
+ − 1959
\texttt{SIMPLE\_METHOD'' quant tacf} does this for subgoal(s) selected by
+ − 1960
\texttt{quant}, which may be, for example,
+ − 1961
\texttt{ALLGOALS} (all subgoals),
+ − 1962
\texttt{TRYALL} (try all subgoals, failure is OK),
+ − 1963
\texttt{FIRSTGOAL} (try subgoals until it succeeds once),
+ − 1964
\texttt{(fn tacf => tacf 4)} (subgoal 4), etc
16
+ − 1965
(see the \texttt{Tactical} structure, FIXME) %%\cite[Chapter 4]{ref}).
4
+ − 1966
+ − 1967
A method is stored in a theory as indicated by:
+ − 1968
\begin{verbatim}
+ − 1969
Method.add_method :
+ − 1970
(bstring * (src -> Proof.context -> method) * string) -> theory trf ;
+ − 1971
( *
+ − 1972
* )
+ − 1973
\end{verbatim}
+ − 1974
where the first and third arguments are name and description of the method,
+ − 1975
and the second is a function which parses the method input text
+ − 1976
(including the method name, which has necessarily already been parsed).
+ − 1977
+ − 1978
Here, \texttt{xxx} is a function declared in the code for the
+ − 1979
structure \texttt{Method}, but not published in its signature.
+ − 1980
The source file \texttt{src/Pure/Isar/method.ML} shows the use of
+ − 1981
\texttt{Method.add\_method} to add a number of methods.
240
+ − 1982
*}
4
+ − 1983
75
+ − 1984
(*>*)
220
+ − 1985
end