changeset 108 | 8bea3f74889d |
parent 105 | f49dc7e96235 |
child 114 | 13fd0a83d3c3 |
107:258ce361ba1b | 108:8bea3f74889d |
---|---|
35 section {* Building Generic Parsers *} |
35 section {* Building Generic Parsers *} |
36 |
36 |
37 text {* |
37 text {* |
38 |
38 |
39 Let us first have a look at parsing strings using generic parsing combinators. |
39 Let us first have a look at parsing strings using generic parsing combinators. |
40 The function @{ML "(op $$)"} takes a string as argument and will ``consume'' this string from |
40 The function @{ML "$$"} takes a string as argument and will ``consume'' this string from |
41 a given input list of strings. ``Consume'' in this context means that it will |
41 a given input list of strings. ``Consume'' in this context means that it will |
42 return a pair consisting of this string and the rest of the input list. |
42 return a pair consisting of this string and the rest of the input list. |
43 For example: |
43 For example: |
44 |
44 |
45 @{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} |
45 @{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} |
59 \item @{text "FAIL"} is used to indicate that alternative routes of parsing |
59 \item @{text "FAIL"} is used to indicate that alternative routes of parsing |
60 might be explored. |
60 might be explored. |
61 \item @{text "MORE"} indicates that there is not enough input for the parser. For example |
61 \item @{text "MORE"} indicates that there is not enough input for the parser. For example |
62 in @{text "($$ \"h\") []"}. |
62 in @{text "($$ \"h\") []"}. |
63 \item @{text "ABORT"} is the exception that is raised when a dead end is reached. |
63 \item @{text "ABORT"} is the exception that is raised when a dead end is reached. |
64 It is used for example in the function @{ML "(op !!)"} (see below). |
64 It is used for example in the function @{ML "!!"} (see below). |
65 \end{itemize} |
65 \end{itemize} |
66 |
66 |
67 However, note that these exceptions are private to the parser and cannot be accessed |
67 However, note that these exceptions are private to the parser and cannot be accessed |
68 by the programmer (for example to handle them). |
68 by the programmer (for example to handle them). |
69 |
69 |
70 Slightly more general than the parser @{ML "(op $$)"} is the function @{ML |
70 Slightly more general than the parser @{ML "$$"} is the function @{ML |
71 Scan.one}, in that it takes a predicate as argument and then parses exactly |
71 Scan.one}, in that it takes a predicate as argument and then parses exactly |
72 one item from the input list satisfying this predicate. For example the |
72 one item from the input list satisfying this predicate. For example the |
73 following parser either consumes an @{text [quotes] "h"} or a @{text |
73 following parser either consumes an @{text [quotes] "h"} or a @{text |
74 [quotes] "w"}: |
74 [quotes] "w"}: |
75 |
75 |
82 in |
82 in |
83 (hw input1, hw input2) |
83 (hw input1, hw input2) |
84 end" |
84 end" |
85 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} |
85 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} |
86 |
86 |
87 Two parser can be connected in sequence by using the function @{ML "(op --)"}. |
87 Two parser can be connected in sequence by using the function @{ML "--"}. |
88 For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this |
88 For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this |
89 sequence you can achieve by: |
89 sequence you can achieve by: |
90 |
90 |
91 @{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" |
91 @{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" |
92 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
92 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
98 |
98 |
99 @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" |
99 @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" |
100 "(\"hell\", [\"o\"])"} |
100 "(\"hell\", [\"o\"])"} |
101 |
101 |
102 Parsers that explore alternatives can be constructed using the function @{ML |
102 Parsers that explore alternatives can be constructed using the function @{ML |
103 "(op ||)"}. For example, the parser @{ML "(p || q)" for p q} returns the |
103 "||"}. For example, the parser @{ML "(p || q)" for p q} returns the |
104 result of @{text "p"}, in case it succeeds, otherwise it returns the |
104 result of @{text "p"}, in case it succeeds, otherwise it returns the |
105 result of @{text "q"}. For example: |
105 result of @{text "q"}. For example: |
106 |
106 |
107 |
107 |
108 @{ML_response [display,gray] |
108 @{ML_response [display,gray] |
113 in |
113 in |
114 (hw input1, hw input2) |
114 (hw input1, hw input2) |
115 end" |
115 end" |
116 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} |
116 "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} |
117 |
117 |
118 The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing function |
118 The functions @{ML "|--"} and @{ML "--|"} work like the sequencing function |
119 for parsers, except that they discard the item being parsed by the first (respectively second) |
119 for parsers, except that they discard the item being parsed by the first (respectively second) |
120 parser. For example: |
120 parser. For example: |
121 |
121 |
122 @{ML_response [display,gray] |
122 @{ML_response [display,gray] |
123 "let |
123 "let |
153 val input2 = (explode \"world\") |
153 val input2 = (explode \"world\") |
154 in |
154 in |
155 (p input1, p input2) |
155 (p input1, p input2) |
156 end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} |
156 end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} |
157 |
157 |
158 The function @{ML "(op !!)"} helps to produce appropriate error messages |
158 The function @{ML "!!"} helps to produce appropriate error messages |
159 during parsing. For example if you want to parse that @{text p} is immediately |
159 during parsing. For example if you want to parse that @{text p} is immediately |
160 followed by @{text q}, or start a completely different parser @{text r}, |
160 followed by @{text q}, or start a completely different parser @{text r}, |
161 you might write: |
161 you might write: |
162 |
162 |
163 @{ML [display,gray] "(p -- q) || r" for p q r} |
163 @{ML [display,gray] "(p -- q) || r" for p q r} |
169 the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail |
169 the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail |
170 and the alternative parser @{text r} will be tried. However in many |
170 and the alternative parser @{text r} will be tried. However in many |
171 circumstance this will be the wrong parser for the input ``p-followed-by-q'' |
171 circumstance this will be the wrong parser for the input ``p-followed-by-q'' |
172 and therefore will also fail. The error message is then caused by the |
172 and therefore will also fail. The error message is then caused by the |
173 failure of @{text r}, not by the absence of @{text q} in the input. This |
173 failure of @{text r}, not by the absence of @{text q} in the input. This |
174 kind of situation can be avoided when using the function @{ML "(op !!)"}. |
174 kind of situation can be avoided when using the function @{ML "!!"}. |
175 This function aborts the whole process of parsing in case of a |
175 This function aborts the whole process of parsing in case of a |
176 failure and prints an error message. For example if you invoke the parser |
176 failure and prints an error message. For example if you invoke the parser |
177 |
177 |
178 |
178 |
179 @{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"} |
179 @{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"} |
187 but if you invoke it on @{text [quotes] "world"} |
187 but if you invoke it on @{text [quotes] "world"} |
188 |
188 |
189 @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" |
189 @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")" |
190 "Exception ABORT raised"} |
190 "Exception ABORT raised"} |
191 |
191 |
192 then the parsing aborts and the error message @{text "foo"} is printed out. In order to |
192 then the parsing aborts and the error message @{text "foo"} is printed. In order to |
193 see the error message properly, we need to prefix the parser with the function |
193 see the error message properly, we need to prefix the parser with the function |
194 @{ML "Scan.error"}. For example: |
194 @{ML "Scan.error"}. For example: |
195 |
195 |
196 @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))" |
196 @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))" |
197 "Exception Error \"foo\" raised"} |
197 "Exception Error \"foo\" raised"} |
428 in |
428 in |
429 (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2) |
429 (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2) |
430 end" |
430 end" |
431 "((\"where\",\<dots>),(\"|\",\<dots>))"} |
431 "((\"where\",\<dots>),(\"|\",\<dots>))"} |
432 |
432 |
433 Like before, you can sequentially connect parsers with @{ML "(op --)"}. For example: |
433 Like before, you can sequentially connect parsers with @{ML "--"}. For example: |
434 |
434 |
435 @{ML_response [display,gray] |
435 @{ML_response [display,gray] |
436 "let |
436 "let |
437 val input = filtered_input \"| in\" |
437 val input = filtered_input \"| in\" |
438 in |
438 in |
569 indicated by @{ML NoSyn}. |
569 indicated by @{ML NoSyn}. |
570 |
570 |
571 (FIXME: should for-fixes take any syntax annotation?) |
571 (FIXME: should for-fixes take any syntax annotation?) |
572 |
572 |
573 @{ML OuterParse.for_fixes} is an ``optional'' that prefixes |
573 @{ML OuterParse.for_fixes} is an ``optional'' that prefixes |
574 @{ML OuterParse.fixes} with the comman \isacommand{for}. |
574 @{ML OuterParse.fixes} with the command \isacommand{for}. |
575 (FIXME give an example and explain more) |
575 (FIXME give an example and explain more) |
576 |
576 |
577 @{ML_response [display,gray] |
577 @{ML_response [display,gray] |
578 "let |
578 "let |
579 val input = filtered_input |
579 val input = filtered_input |