61 in @{ML_text "($$ \"h\") []"}. |
61 in @{ML_text "($$ \"h\") []"}. |
62 \item @{ML_text "ABORT"} is the exception which is raised when a dead end is reached. |
62 \item @{ML_text "ABORT"} is the exception which is raised when a dead end is reached. |
63 It is used for example in the function @{ML "(op !!)"} (see below). |
63 It is used for example in the function @{ML "(op !!)"} (see below). |
64 \end{itemize} |
64 \end{itemize} |
65 |
65 |
66 (FIXME: do the exception need to be explained, because the user cannot use them from ``outside''?) |
66 However, note that these exception private to the parser and cannot be accessed |
67 |
67 by the programmer (for example to handle them). |
68 Slightly more general than the parser @{ML "(op $$)"} is the function @{ML Scan.one}, in that it |
68 |
69 takes a predicate as argument and then parses exactly one item from the input list |
69 Slightly more general than the parser @{ML "(op $$)"} is the function @{ML |
70 satisfying this prediate. For example the following parser either consumes an |
70 Scan.one}, in that it takes a predicate as argument and then parses exactly |
71 @{ML_text [quotes] "h"} or a @{ML_text [quotes] "w"}: |
71 one item from the input list satisfying this prediate. For example the |
|
72 following parser either consumes an @{ML_text [quotes] "h"} or a @{ML_text |
|
73 [quotes] "w"}: |
|
74 |
72 |
75 |
73 @{ML_response [display] |
76 @{ML_response [display] |
74 "let |
77 "let |
75 val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\") |
78 val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\") |
76 val input1 = (explode \"hello\") |
79 val input1 = (explode \"hello\") |
129 val input2 = (explode \"world\") |
132 val input2 = (explode \"world\") |
130 in |
133 in |
131 (p input1, p input2) |
134 (p input1, p input2) |
132 end" |
135 end" |
133 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} |
136 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} |
|
137 |
|
138 The function @{ML Scan.option} works similarly, except no default value can |
|
139 be given and in the failure case this parser does nothing. |
134 |
140 |
135 The function @{ML "(op !!)"} helps to produce appropriate error messages |
141 The function @{ML "(op !!)"} helps to produce appropriate error messages |
136 during parsing. For example if one wants to parse that @{ML_text p} is immediately |
142 during parsing. For example if one wants to parse that @{ML_text p} is immediately |
137 followed by @{ML_text q}, or start a completely different parser @{ML_text r}, |
143 followed by @{ML_text q}, or start a completely different parser @{ML_text r}, |
138 one might write |
144 one might write |
212 |
218 |
213 Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function |
219 Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function |
214 @{ML "Scan.repeat1"} is similar, but requires that the parser @{ML_text "p"} |
220 @{ML "Scan.repeat1"} is similar, but requires that the parser @{ML_text "p"} |
215 succeeds at least once. |
221 succeeds at least once. |
216 |
222 |
|
223 Also note that the parser would have aborted with the exception @{ML_text MORE}, if |
|
224 we had run it only on just @{ML_text [quotes] "hhhh"}. This can be awoided using |
|
225 the wrapper @{ML Scan.finite} and the ``stopper-token'' @{ML Symbol.stopper}. With |
|
226 them we can write |
|
227 |
|
228 @{ML_response [display] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" |
|
229 "([\"h\", \"h\", \"h\", \"h\"], [])"} |
|
230 |
|
231 However, the Isabelle-develloper rarely needs to do this kind of wrapping manually. |
|
232 |
217 After parsing succeeded, one nearly always wants to apply a function on the parsed |
233 After parsing succeeded, one nearly always wants to apply a function on the parsed |
218 items. This is done using the function @{ML_open "(p >> f)" for p f} which runs |
234 items. This is done using the function @{ML_open "(p >> f)" for p f} which runs |
219 first the parser @{ML_text p} and upon successful completion applies the |
235 first the parser @{ML_text p} and upon successful completion applies the |
220 function @{ML_text f} to the result. For example |
236 function @{ML_text f} to the result. For example |
221 |
237 |
262 We can generate a token list using the function @{ML "OuterSyntax.scan"}, which we give |
278 We can generate a token list using the function @{ML "OuterSyntax.scan"}, which we give |
263 below @{ML "Position.none"} as argument since, at the moment, we are not interested in |
279 below @{ML "Position.none"} as argument since, at the moment, we are not interested in |
264 generating precise error messages. The following\footnote{There is something funny |
280 generating precise error messages. The following\footnote{There is something funny |
265 going on with the pretty printing of the result token list.} |
281 going on with the pretty printing of the result token list.} |
266 |
282 |
267 @{ML_response [display] "OuterSyntax.scan Position.none \"hello world\"" |
283 @{ML_response_fake [display] "OuterSyntax.scan Position.none \"hello world\"" |
268 "[OuterLex.Token (\<dots>,(OuterLex.Ident, \"hello\"),\<dots>), |
284 "[Token (\<dots>,(OuterLex.Ident, \"hello\"),\<dots>), |
269 OuterLex.Token (\<dots>,(OuterLex.Space, \" \"),\<dots>), |
285 Token (\<dots>,(OuterLex.Space, \" \"),\<dots>), |
270 OuterLex.Token (\<dots>,(OuterLex.Ident, \"world\"),\<dots>)]"} |
286 Token (\<dots>,(OuterLex.Ident, \"world\"),\<dots>)]"} |
271 |
287 |
272 produces three tokens where the first and the last are identifiers, since |
288 produces three tokens where the first and the last are identifiers, since |
273 @{ML_text [quotes] "hello"} and @{ML_text [quotes] "world"} do not match |
289 @{ML_text [quotes] "hello"} and @{ML_text [quotes] "world"} do not match |
274 any other category. The second indicates a space. Many parsing functions |
290 any other category. The second indicates a space. Many parsing functions |
275 later on will require spaces, comments and the like to have been filtered out. |
291 later on will require spaces, comments and the like to have been filtered out. |
276 If we parse |
292 If we parse |
277 |
293 |
278 @{ML_response [display] "OuterSyntax.scan Position.none \"inductive|for\"" |
294 @{ML_response_fake [display] "OuterSyntax.scan Position.none \"inductive|for\"" |
279 "[OuterLex.Token (\<dots>,(OuterLex.Command, \"inductive\"),\<dots>), |
295 "[Token (\<dots>,(OuterLex.Command, \"inductive\"),\<dots>), |
280 OuterLex.Token (\<dots>,(OuterLex.Keyword, \"|\"),\<dots>), |
296 Token (\<dots>,(OuterLex.Keyword, \"|\"),\<dots>), |
281 OuterLex.Token (\<dots>,(OuterLex.Keyword, \"for\"),\<dots>)]"} |
297 Token (\<dots>,(OuterLex.Keyword, \"for\"),\<dots>)]"} |
282 |
298 |
283 we obtain a list consiting of only command and keyword tokens. |
299 we obtain a list consiting of only command and keyword tokens. |
284 If you want to see which keywords and commands are currently known, use |
300 If you want to see which keywords and commands are currently known, use |
285 the following (you might have to adjust the @{ML print_depth} in order to |
301 the following (you might have to adjust the @{ML print_depth} in order to |
286 see the complete list): |
302 see the complete list): |
314 end" |
330 end" |
315 "((\"|\",\"in\"),[])"} |
331 "((\"|\",\"in\"),[])"} |
316 |
332 |
317 The parser @{ML_open "OuterParse.enum s p" for s p} parses a possibly empty |
333 The parser @{ML_open "OuterParse.enum s p" for s p} parses a possibly empty |
318 list of items recognised by the parser @{ML_text p}, where the items are |
334 list of items recognised by the parser @{ML_text p}, where the items are |
319 separated by @{ML_text s}. For example |
335 separated by the string @{ML_text s}. For example |
320 |
336 |
321 @{ML_response [display] |
337 @{ML_response [display] |
322 "let |
338 "let |
323 val input = OuterSyntax.scan Position.none \"in|in|in\\n\" |
339 val input = OuterSyntax.scan Position.none \"in|in|in\\n\" |
324 in |
340 in |
325 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
341 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
326 end" |
342 end" |
327 "([\"in\",\"in\",\"in\"],[\<dots>])"} |
343 "([\"in\",\"in\",\"in\"],[\<dots>])"} |
328 |
344 |
|
345 @{ML "OuterParse.enum1"} works similarly, except that the parsed list must |
|
346 be non-empty. |
|
347 |
329 Note that we had to add a @{ML_text [quotes] "\\n"} at the end of the parsed |
348 Note that we had to add a @{ML_text [quotes] "\\n"} at the end of the parsed |
330 string, otherwise the parser would have consumed all tokens and then failed with |
349 string, otherwise the parser would have consumed all tokens and then failed with |
331 the exception @{ML_text "MORE"}. @{ML "OuterParse.enum1"} works similarly, |
350 the exception @{ML_text "MORE"}. Like in the previous section, we can avoid |
332 except that the parsed list must be non-empty. |
351 this exception using the wrapper @{ML Scan.finite}. This time, however, we |
333 |
352 have to use the ``stopper-token'' @{ML OuterLex.stopper}. We can write |
334 *} |
353 |
335 |
354 @{ML_response [display] |
336 text {* FIXME explain @{ML "OuterParse.!!!"} *} |
355 "let |
|
356 val input = OuterSyntax.scan Position.none \"in|in|in\" |
|
357 in |
|
358 Scan.finite OuterLex.stopper (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
|
359 end" |
|
360 "([\"in\",\"in\",\"in\"],[])"} |
|
361 |
|
362 The function @{ML "OuterParse.!!!"} can be used to force termination of the |
|
363 parser in case of a dead end, just like @{ML "Scan.!!"} (see previous section), |
|
364 except that the error message is fixed to be @{text [quotes] "Outer syntax error"} |
|
365 with a relatively precise description of the failure. For example: |
|
366 |
|
367 @{ML_response_fake [display] |
|
368 "let |
|
369 val input = OuterSyntax.scan Position.none \"in|\" |
|
370 val parse_bar_then_in = OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\" |
|
371 in |
|
372 Scan.error (OuterParse.!!! parse_bar_then_in) input |
|
373 end" |
|
374 "Exception ERROR \"Outer syntax error: keyword \"|\" expected, |
|
375 but keyword in was found\" raised" |
|
376 } |
|
377 |
|
378 *} |
|
379 |
|
380 |
|
381 ML {* |
|
382 let |
|
383 val input = filter OuterLex.is_proper (OuterSyntax.scan Position.none "(in foo)") |
|
384 in |
|
385 OuterParse.target input |
|
386 end |
|
387 *} |
|
388 |
|
389 section {* Positional Information *} |
|
390 |
|
391 text {* |
|
392 |
|
393 @{ML OuterParse.position} |
|
394 |
|
395 *} |
|
396 |
|
397 ML {* |
|
398 OuterParse.position |
|
399 *} |
|
400 |
337 |
401 |
338 section {* Parsing Inner Syntax *} |
402 section {* Parsing Inner Syntax *} |
339 |
403 |
340 ML {* |
404 ML {* |
341 let |
405 let |