90 @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" |
90 @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" |
91 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
91 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
92 |
92 |
93 Note how the result of consumed strings builds up on the left as nested pairs. |
93 Note how the result of consumed strings builds up on the left as nested pairs. |
94 |
94 |
95 Parsers that explore |
95 If, as in the previous example, one wants to parse a particular string, |
96 alternatives can be constructed using the function @{ML "(op ||)"}. For example, the |
96 then one should rather use the function @{ML Scan.this_string}: |
97 parser @{ML "(p || q)" for p q} returns the result of @{ML_text "p"}, in case it succeeds, |
97 |
98 otherwise it returns the result of @{ML_text "q"}. For example |
98 @{ML_response [display] "Scan.this_string \"hel\" (explode \"hello\")" |
|
99 "(\"hel\", [\"l\", \"o\"])"} |
|
100 |
|
101 Parsers that explore alternatives can be constructed using the function @{ML |
|
102 "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the |
|
103 result of @{ML_text "p"}, in case it succeeds, otherwise it returns the |
|
104 result of @{ML_text "q"}. For example |
|
105 |
99 |
106 |
100 @{ML_response [display] |
107 @{ML_response [display] |
101 "let |
108 "let |
102 val hw = ($$ \"h\") || ($$ \"w\") |
109 val hw = ($$ \"h\") || ($$ \"w\") |
103 val input1 = (explode \"hello\") |
110 val input1 = (explode \"hello\") |
238 "([\"h\", \"h\", \"h\", \"h\"], [])"} |
245 "([\"h\", \"h\", \"h\", \"h\"], [])"} |
239 |
246 |
240 However, this kind of manually wrapping needs to be done only very rarely |
247 However, this kind of manually wrapping needs to be done only very rarely |
241 in practise, because it is already done by the infrastructure for you. |
248 in practise, because it is already done by the infrastructure for you. |
242 |
249 |
|
250 The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any |
|
251 string as in |
|
252 |
|
253 @{ML_response [display] |
|
254 "let |
|
255 val p = Scan.repeat (Scan.one Symbol.not_eof) |
|
256 val input = (explode \"foo bar foo\") |
|
257 in |
|
258 Scan.finite Symbol.stopper p input |
|
259 end" |
|
260 "([\"f\", \"o\", \"o\", \" \", \"b\", \"a\", \"r\", \" \", \"f\", \"o\", \"o\"], [])"} |
|
261 |
|
262 where the function @{ML Symbol.not_eof} ensures that we do not read beyond the |
|
263 end of the input string. |
|
264 |
|
265 The function @{ML "Scan.unless p q" for p q} takes two parsers: if the first one can |
|
266 parse the input then the whole parser fails; if not, then the second is tried. Therefore |
|
267 |
|
268 @{ML_response_fake_both [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"hello\")" |
|
269 "Exception FAIL raised"} |
|
270 |
|
271 fails, while |
|
272 |
|
273 @{ML_response [display] "Scan.unless ($$ \"h\") ($$ \"w\") (explode \"world\")" |
|
274 "(\"w\",[\"o\", \"r\", \"l\", \"d\"])"} |
|
275 |
|
276 succeeds. |
|
277 |
|
278 The functions @{ML Scan.repeat} and @{ML Scan.unless} can be combined to read any |
|
279 input until a certain marker symbol is reached. In the example below the marker |
|
280 symbol is a @{text [quotes] "*"}: |
|
281 |
|
282 @{ML_response [display] |
|
283 "let |
|
284 val p = Scan.repeat (Scan.unless ($$ \"*\") (Scan.one Symbol.not_eof)) |
|
285 val input1 = (explode \"fooooo\") |
|
286 val input2 = (explode \"foo*ooo\") |
|
287 in |
|
288 (Scan.finite Symbol.stopper p input1, |
|
289 Scan.finite Symbol.stopper p input2) |
|
290 end" |
|
291 "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []), |
|
292 ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"} |
|
293 |
243 After parsing succeeded, one nearly always wants to apply a function on the parsed |
294 After parsing succeeded, one nearly always wants to apply a function on the parsed |
244 items. This is done using the function @{ML "(p >> f)" for p f} which runs |
295 items. This is done using the function @{ML "(p >> f)" for p f} which runs |
245 first the parser @{ML_text p} and upon successful completion applies the |
296 first the parser @{ML_text p} and upon successful completion applies the |
246 function @{ML_text f} to the result. For example |
297 function @{ML_text f} to the result. For example |
247 |
298 |
253 end" |
304 end" |
254 "((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"} |
305 "((\"hh\", \"ee\"), [\"l\", \"l\", \"o\"])"} |
255 |
306 |
256 doubles the two parsed input strings. |
307 doubles the two parsed input strings. |
257 |
308 |
|
309 \begin{exercise}\label{ex:scancmts} |
|
310 Write a parser parses an input string so that any comment enclosed inside |
|
311 @{text "(*\<dots>*)"} is enclosed inside @{text "(**\<dots>**)"} in the output |
|
312 string. To enclose a string, you can use the function @{ML "enclose s1 s2 s" |
|
313 for s1 s2 s} which produces the string @{ML "s1^s^s2" for s1 s2 s}. |
|
314 \end{exercise} |
|
315 |
|
316 |
258 The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies |
317 The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies |
259 the given parser to the second component of the pair and leaves the first component |
318 the given parser to the second component of the pair and leaves the first component |
260 untouched. For example |
319 untouched. For example |
261 |
320 |
262 @{ML_response [display] |
321 @{ML_response [display] |