-------------- Reviews on submission 104 -------------

-------------------- review 1 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 4 (OK paper, but I will not champion it.)
REVIEWER'S CONFIDENCE: 2 (medium)
----------------------- REVIEW --------------------

MAIN PART

The paper describes an original approach to the derivation of parsers from grammars. The approach is based on the observation that context-free grammars are closed under Brzozowski derivatives, and such derivatives can be easily computed. The parser described hence parses a string by lazily computing the derivatives of the grammar for each character encountered. Quite surprisingly, such a parser seems very easy to encode, especially in a language that supports streams and lazy computations, can parse every grammar, and is quite efficient.
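
To make the core idea concrete for the committee, here is a sketch of my
own (in Scala, like the paper's implementation; none of this is the
authors' code) of the regular-language case that the paper generalizes:
recognition is just nullability after repeated derivation.

  // Reviewer's sketch, not the paper's code.
  sealed trait Re
  case object Empty extends Re                 // the empty language
  case object Eps extends Re                   // the language { "" }
  case class Chr(c: Char) extends Re
  case class Alt(l: Re, r: Re) extends Re
  case class Cat(l: Re, r: Re) extends Re
  case class Star(r: Re) extends Re

  def nullable(r: Re): Boolean = r match {
    case Empty | Chr(_) => false
    case Eps | Star(_)  => true
    case Alt(l, r)      => nullable(l) || nullable(r)
    case Cat(l, r)      => nullable(l) && nullable(r)
  }

  def derive(r: Re, c: Char): Re = r match {
    case Empty | Eps => Empty
    case Chr(d)      => if (c == d) Eps else Empty
    case Alt(l, r)   => Alt(derive(l, c), derive(r, c))
    case Cat(l, r)   => if (nullable(l)) Alt(Cat(derive(l, c), r), derive(r, c))
                        else Cat(derive(l, c), r)
    case Star(s)     => Cat(derive(s, c), Star(s))
  }

  def matches(r: Re, s: String): Boolean = nullable(s.foldLeft(r)(derive))

The paper's contribution is to lift nullable and derive from regular
expressions to context-free grammars, where the derivative must be
computed lazily because it can refer to itself.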

I found the paper very interesting. The studied problem is very important, and it is surprising to meet original results in a field that has been explored so much. I enjoyed the first pages of the paper very much; they are extremely crisp and clear. Unfortunately, around page 8, the paper starts getting more and more obscure; section 7 ends abruptly, and section 8, which is probably the most important one, is quite obscure. In the end I still think that the paper should be accepted, but I really hope that the authors will be able to beef section 8 up to the point of making it readable for me.

Detailed remarks for the authors only

Example 1: "the parse tree is:" => "the parse tree of '()' is:"

In general: you should write <B||B->\epsilon> rather than <B|B->\epsilon> (this is not quantum theory, after all...)

The non-deterministic state machine: the language in \Sigma is a language over A', not over A, right? You should say that explicitly.

In the rules for bras and kets, the consequences should look like

(L, w) =>^{\bra{n}} (D_{\bra{n}}(L), w)

I think you forgot the derivative, right?

The parse-string machine pushes a bra when it meets one, but never pops it. You can easily adjust the last rule of section 7 to pop it, but the real question is: why do you push it?

The end of section 7 is too abrupt. You have described a nice pair of abstract machines, but now you should tell me something a bit more concrete about the amount of nondeterminism involved, and how much it would hurt me in practice...

Page 10: the monotonicity for parser combinators should look like:

forall p in \tau_X(w), exists p' in \tau_X(w') such that fst(p) is a prefix of fst(p')
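
In LaTeX (writing \mathrm{fst} for the first projection; this rendering is
mine, so please check it against your intent):

\forall p \in \tau_X(w).\; \exists p' \in \tau_X(w').\;
  \mathrm{fst}(p) \sqsubseteq \mathrm{fst}(p')

where \sqsubseteq here is the prefix ordering.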

Section 8 is obscure. It is too terse. The equations look right and convincing, but I was totally unable to form a mental picture of how this machinery actually works. I understand that such a combination of non-determinism and laziness creates code that tends to work in a mysterious way, but the section is still extremely unsatisfactory.

Section 9: a bit unreadable, at least for a non-Scala programmer.

PROS AND CONS

+ interesting solution for an interesting problem
+ elegant
+ innovative
+ first 8 pages are a pleasure to read
- crucial sections are almost unreadable

REBUTTAL

-------------------- review 2 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 2 (Weak paper, though I will not fight strongly against it.)
REVIEWER'S CONFIDENCE: 4 (expert)
----------------------- REVIEW --------------------

MAIN PART

The authors describe how one can use Brzozowski-style derivatives to
implement parser combinators. They claim that their technique leads to
simple implementations, and this seems to be the case. They also claim
that their implementation is efficient, but give no solid evidence for
this (they added a special combinator to get decent numbers from their
single reported experiment). Another claim is that their parser
combinators can handle left recursion, but they give no evidence for
this.

The authors seem to be unaware that Danielsson presented essentially
the same technique at TYPES 2009 ("Mixing Induction and Coinduction").
However, Danielsson's draft paper does not try to sell the technique,
only using it as a vehicle to implement provably total parser
combinators, and he does not address efficiency of parsing, so there
could be a place for both papers.

The authors seem to claim that their implementation is a lot more
efficient than other parser combinator libraries:

  "[...] they are easy to implement, and easy to implement as
  libraries (like top-down parsers), yet they are efficient and
  expressive (like bottom-up parsers)."

However, they have not made any head-to-head comparisons, and they
have not cited a single paper discussing efficient parser combinators.
Generally there is a serious lack of references to related work on
parser combinators.

Section 7 discusses another parsing technique (also based on
derivatives). However, I found this section to be poorly written and
hard to follow. I suggest that it should be removed.

If the claims about efficiency had been followed up with substantial
evidence, then I would have recommended that the paper should be
published. As it stands I do not make such a recommendation.

Detailed comments:

Section 2:
"This technique was lost to the “sands of time†until [paper from
+ − 79
2009]" A quick web search will tell you that a large number of papers
+ − 80
published before 2009 have used Brzozowski derivatives (or
+ − 81
generalisations), so this comment is incorrect. See for instance the
+ − 82
work of Rutten, including "Automata and Coinduction; an exercise in
+ − 83
coalgebra" and "Automata, Power Series, and Coinduction: Taking Input
+ − 84
Derivatives Seriously".

Section 6:

The expansion of D_x(List) would be easier to follow if you added a
step between the second and third lines.

"the null string terminal" and "the empty nonterminal" may not exist
in N, so they may have to be added to N'.

One could be led to believe that the last "and:" only applies when
m = 0.

"The empty nonterminal arises frequently with derivatives, and if this
reduction is applied aggressively, the grammar will remain a
manageable size." You give no proper motivation for this statement. In
fact, your discussion of the "special closure construct" in Section 9
seems to imply that this statement is false. I suggest that you remove
this statement.

Section 7:

The machine in this section does not make sense to me. First of all
you do not state what the machine should accomplish; my guess is that
you intend

((L, s), <>) ->* ((L', <>), <t>)

to hold iff t is a parse tree which witnesses the membership of s in
L. Furthermore you give no reason (such as a proof) to believe that
the machine satisfies this specification. For instance, you push
nonterminals onto the stack, but you never remove them (I suspect that
the last rule should remove a nonterminal), and it seems strange that
the state is unchanged in the rules involving "first".

I think the paper would be better if Section 7, which is not central
to the paper, was omitted.

"This approach is not well suited to typed languages, since a literal
implementation of the parsing machine must use a generic parse-tree
type, or else violate type-safety with casts for reductions." I
suspect that you could do better with a dependently typed language, so
I suggest that you remove this sentence.

Example 1: "the parse tree is:" I think you mean "one parse tree is:".

"The parsing machine demonstrates that, even without reduction rules
associated with the grammar, we can use derivatives to construct a
concrete syntax tree." I do not follow this sentence. What kind of
reduction rules are you referring to?

Section 8:

The parser combinators that you give seem quite outdated to me. The
functional programming community moved away from this set of
combinators more than a decade ago, in favour of combinators based on
monads or applicative functors. Please motivate this choice.
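
For concreteness, the kind of interface I have in mind looks roughly as
follows (a sketch of my own in Scala, not the authors' API): a parser
maps the input to the possible (value, remaining input) pairs, and
sequencing is flatMap, so grammars compose with for-comprehensions.

  // Reviewer's sketch, not the paper's API.
  case class P[A](run: List[Char] => List[(A, List[Char])]) {
    def flatMap[B](f: A => P[B]): P[B] =
      P(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
    def map[B](f: A => B): P[B] =
      P(in => run(in).map { case (a, rest) => (f(a), rest) })
  }

  def pure[A](a: A): P[A] = P(in => List((a, in)))

  def char(c: Char): P[Char] = P {
    case h :: t if h == c => List((c, t))
    case _                => Nil
  }

  // e.g. sequencing two characters:
  val ab: P[(Char, Char)] =
    for { a <- char('a'); b <- char('b') } yield (a, b)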

You state two properties that parser combinators must satisfy, but you
never really make any use of these properties. Are these properties
necessary for your development? Is your definition of a parser
combinator a function which satisfies these properties?

"A parser combinator is monotonic iff..." I assume that the second
instance of \sqsubseteq should be read pointwise.

"the meaning of an input expression w <- A^∗ is contained within
tau_N(w)" In fact it is the one and only (complete) result, right?

"D_c(c) = eps -> \w.{(c, w)}" This does not make sense, the second
argument of -> is not a parser.

Section 9:

Your parsers accept infinite streams as input. However, parseFull
clearly fails to terminate for infinite input. Why do you not restrict
the input to finite lists?

Your parsers return potentially infinite streams of results. However,
your parse function uses "append", which (essentially) throws away its
second argument if the first is infinite. Wouldn't it be better to
return a fair interleaving of the two streams?
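
Such an interleaving is a one-liner; a sketch of my own in Scala
(interleave is my name, not part of the paper's code):

  // Reviewer's sketch: alternate elements so an infinite first
  // stream cannot starve the second.
  def interleave[A](xs: Stream[A], ys: Stream[A]): Stream[A] =
    xs match {
      case Stream.Empty => ys
      case x #:: rest   => x #:: interleave(ys, rest)
    }

Because #:: is lazy in its tail, this is productive even when both
streams are infinite.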

In internalDerive you seem to have renamed a and b to first and
second, respectively.

"the longest possible parse first" Why is this relevant for
performance? If you return all possible parses, then the order in
which they are returned should not matter, and for unambiguous parsers
at most one result is returned anyway. I can imagine that your
optimisation matters in the (atypical?) case of an ambiguous parse, if
you are not interested in inspecting all the results, but only want to
see that there is more than one.

The sentence before the table seems to be unfinished.

I assume that "43 m" should be "43 ms".

"These results suggest that derivative-based parser combinators scale
well enough to be useful in practice." To me this result suggests that
I may have to repeatedly add new special-case combinators (like your
closure operation) to ensure decent performance.

Section 10:

"derivative-based parsing defies the classical taxonomies", "Top-down
methods", "back-tracking"... Are you not aware of parser combinator
libraries which use breadth-first techniques (which your combinators
also do)? See, for instance, Claessen's "Parallel Parsing Processes".

"these methods have not been able to handle left-recursive grammars"
You should check out Lickman's MSc thesis "Parsing With Fixed Points"
and "Parser Combinators for Ambiguous Left-Recursive Grammars" by
Frost et al.

"Philosophically, neither derivative-based method “feels” like a
top-down parsing method." To me your combinator implementation does
feel like a (breadth-first) top-down method.

You claim (implicitly) that your parser combinators can handle left
recursive grammars. However, I doubt that you can handle grammars of
the form "n ::= n". If you can handle such grammars, please explain
how you do it. Otherwise it would be interesting to know exactly what
kind of left recursion you support.

"they are efficient and expressive (like bottom-up parsers)" This
seems to imply that there is a language which can be parsed using
bottom-up techniques, but not using top-down techniques. Can you
verify this claim?

You may want to cite Frost and Szydlowski, who claim worst-case cubic
time for parsing ("Memoizing purely functional top-down backtracking
language processors"). Swierstra's work on efficient parser
combinators is also relevant (many papers).

References:

Improve formatting.

PROS AND CONS

+ Simple technique for implementing parser combinators.
- The technique has been presented before.
- No substantial evidence for claims of efficiency.
- Section 7 does not make sense to me (but could be removed).

REBUTTAL

-------------------- review 3 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 2 (Weak paper, though I will not fight strongly against it.)
REVIEWER'S CONFIDENCE: 2 (medium)
----------------------- REVIEW --------------------

MAIN PART

The paper proposes a new parsing technique based on "derivatives",
inspired by the derivatives of regular expressions and extended to the
derivatives of context-free grammars.

Because context-free grammars may be ambiguous, they propose to compute the
set of all parse trees, using lazy evaluation to share and delay the
computation. They present their framework and an implementation in Scala.
They claim that their technique is efficient in practice.

EVALUATION

Parsing techniques are old and well-established, although the current status
of the theory is still unsatisfactory: grammars are still a burden to write,
and any advance on this topic is welcome. However, the topic must be
addressed very seriously.

Given this, this work is potentially interesting as a possible way to
improve the state of the art, but not in its current
presentation, as the paper should not just _claim_ that using derivatives
for parsing is feasible, but _demonstrate_ that writing grammars with
combinators is at least both as convenient and as efficient as using
existing technology.

The first aspect is not discussed at all in the paper. As for efficiency,
which is claimed, the benchmarks are really small, probably not
representative of existing grammars, and the results are not compared with
other parsing techniques.

Other questions arise: what happens when the grammar is highly ambiguous?
Do you intend to reject such grammars, or to pick the meaning of the thing
you are parsing at random?

The overall structure of the paper is not very clear either. You should
give us a road map at the beginning and make it clear how the different
sections fit together.

OTHER COMMENTS

Sections 7 and 8 are the key sections of the paper for understanding your
proposal. However, I have problems with both.

I don't understand why your parsing is decomposed into the production of an
intermediate parse string that is then consumed by a machine that
produces a syntax tree. Since the parse string is a linear representation
of the parse tree, why can't you work with the parse tree itself, which
is more structured than the parse string? Why can't you directly extract
a syntax tree from the parse tree? Could you run the algorithm in Section
7 on an example?

In section 8, you compute the derivatives of parser combinators, but you
could instead compute the grammar defined by the parser combinators and
compute its derivative in one step. Why is this better or worse than
computing it directly?

- One line above section 2: "could [to<-] modify"
- Section 2, Line 10: "derivative---[they<-??] critical property"
- One line above Section 7: "remain [<-of] a manageable size."

- Computation of the derivative, above theorem 1:
  You could explain the derivatives, especially for rules n -> s1 ... sm,
  and perhaps give an example; one possible rendering follows.
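
In the notation of Section 6 as I understand it (so please correct me if
this misreads the construction): for each production n -> s1 ... sm and
each i >= 0 such that s1 ... si are all nullable, the derivative grammar
gets a production

  D_c(n) -> D_c(s_{i+1}) s_{i+2} ... s_m

For instance, for the balanced-parentheses grammar L -> "(" L ")" L | eps,
taking c = "(" gives only the i = 0 case (since "(" is not nullable):

  D_"("(L) -> D_"("("(") L ")" L = eps L ")" L,  i.e.  D_"("(L) -> L ")" L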

- Last two lines above section 8: "The parsing machine demonstrates...
  a concrete syntax tree".
  I don't really understand what you (intend to) mean.

- Section 8.1, Line 1: "ability [the<-to] combine them"
- End of section 8.1:
  You should state and comment on the fact that these definitions are
  recursive.
- First line of section 8.4: "to be able [<-to] parse"
- Section 8.4: How do you use these inequations? Why do you give this list of
  formulas without any comment?

- Section 8.5: what are the lambda-terms on the right of --> ?

-------------------- review 4 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 1 (Serious problems. I will argue to reject this paper.)
REVIEWER'S CONFIDENCE: 3 (high)
----------------------- REVIEW --------------------

MAIN PART

SUMMARY

The paper proposes a basic algorithm for determining whether a sentence s is a
member of a context-free language L: (i) if the sentence is empty, determine
whether L contains the empty string (a decidable problem); (ii) if the
sentence is of the form c.s, determine whether the suffix s is a member of the
derivative c^{-1}.L. Because the derivative of a context-free language can be
effectively constructed, this is an algorithm.
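
(In code, this basic algorithm is literally a fold; the following generic
sketch is mine, not the paper's implementation:

  // Reviewer's sketch, not the paper's code.
  trait LangOps[L] {
    def nullable(l: L): Boolean   // step (i): does L contain the empty string?
    def derive(l: L, c: Char): L  // step (ii): construct c^{-1}.L
  }

  def member[L](l: L, s: List[Char])(implicit ops: LangOps[L]): Boolean =
    s match {
      case Nil     => ops.nullable(l)
      case c :: cs => member(ops.derive(l, c), cs)
    }

Everything of substance lies in implementing derive and nullable for a
given representation of languages.)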

The contribution of the paper is in generalizing this basic algorithm in two
ways: (i) instead of just determining membership, the parsing algorithm can be
made to return a concrete syntax tree, or a semantic value; (ii) instead of
viewing the derivative as an operation that accepts a monolithic grammar and
produces a monolithic grammar, one can view it as an operation that accepts a
parser combinator and produces a parser combinator; this means that the
parsing algorithm is amenable to a modular, combinator-based implementation.

EVALUATION

The introduction of the paper is somewhat irritating, for two reasons. First,
it promises a lot ("simplicity, feasibility, and ubiquity"), which the paper
does not clearly deliver. Second, it criticizes the most popular parsing
methods (LL and LR) on the grounds that the constraints that they impose, in
order to rule out ambiguous grammars, are difficult to grasp. I can accept
this criticism. But the authors do not offer any solution to the ambiguity
problem: their algorithm produces a parse forest. This leaves the programmer
faced with the task of writing and debugging disambiguation filters, an
error-prone task. At least, an LR parser generator is able (in principle, and
some implementations actually do this) to accurately explain why the grammar
seems ambiguous. It can produce a sentence that is a prefix of the fringes of
two distinct, valid parse trees.

The first claimed contribution of the paper is the remark that the derivative
of a context-free language can be effectively constructed. (A proof appears in
section 6.) This is in fact not a contribution. This property of context-free
languages is well-known, at least in the formal language community (perhaps
not in the programming language community). The derivative of a formal language
is also known as its left quotient, and written c^{-1}L. See, for instance, the
following paper, which exploits this property intensively:

Jean Berstel and Luc Boasson. Towards an algebraic theory of context-free
languages. Fundamenta Informaticae, 25(3):217-239, 1996.

The remainder of the paper contains two distinct developments. In section 7, a
version of the derivative-based parsing algorithm is developed, which produces
concrete syntax trees. Then, in sections 8 and 9, another version is developed,
which produces programmer-defined semantic values, and is combinator-based.
It is not clear to me why these two distinct developments co-exist. In my eyes,
the approach of sections 8 and 9 seems superior, because it is more flexible
(it is not limited to producing concrete syntax trees) and modular.

The idea that the notion of a derivative can be generalized to parser
combinators is interesting, and worthy of thought. To some extent, section 8.2
is where the paper really begins! Section 8 provides a mathematical view of
the problem, while section 9 describes part of a Scala implementation.

This material is interesting, but can be criticized on several grounds:

1. The complexity analysis of the algorithm is absent. The algorithm seems
very expensive in the worst case: indeed, the effective construction of a
derivative in section 6 causes a quadratic blow-up of the size of the grammar,
and this operation is iterated at every character of the input stream! In
fact, section 9.1 mentions that a naive implementation of the algorithm is
unusable (exponentially slow), and that special support for the Kleene star
had to be added. However, there is no argument that this patch is sufficient
for the algorithm to become usable. A worst-case analysis of the space and
time complexity is needed, and, if this worst-case complexity is bad (say,
exponential), then a formal or semi-formal argument is needed why the behavior
of the algorithm remains tolerable in practice. In light of the efficiency
claim found in the introduction, this is indispensable!

2. The description of the implementation could be improved. In particular,
memoization and sharing seem to play a crucial role. It seems that two
attempts to compute the derivative of a parser combinator should return the
same object. This includes the particular case where the second attempt is in
fact nested within the first one (i.e., it is a recursive call). For instance,
if the grammar contains the single production L -> LM, a call to L.derive(t)
will (indirectly) cause another call to L.derive(t). What happens then? By
which mechanism is non-termination avoided? This is not clearly described.
The paper contains four lines of text that seem to concern this topic (p.15,
"It is important to note [...] immediately suspended."), but I find them
obscure. I note that there seems to be a distinction between derive and
internalDerive, which is not explained in the paper; my guess is that the
former may be a memoizing version of the latter, using a table indexed by
tokens. Is this the case?
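
To make my guess explicit, the mechanism I imagine is the following (a
reconstruction of my own in Scala, not the paper's code): derive caches a
lazy proxy per token, and the proxy is installed before internalDerive
runs, so the nested call L.derive(t) finds the cache entry instead of
diverging.

  // Reviewer's guesswork, not the paper's code.
  import scala.collection.mutable

  abstract class Parser {
    private val cache = mutable.Map[Char, Parser]()

    // Memoizing entry point: one (lazy) derivative per token.
    def derive(t: Char): Parser =
      cache.getOrElseUpdate(t, new Delayed(this, t))

    // Per-combinator derivative rule, supplied by subclasses.
    def internalDerive(t: Char): Parser
  }

  // Proxy that delays the real derivative until it is demanded.
  class Delayed(p: Parser, t: Char) extends Parser {
    lazy val forced: Parser = p.internalDerive(t)
    def internalDerive(c: Char): Parser = forced.derive(c)
  }

If something like this is what happens, please say so explicitly in the
paper.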

3. The introduction claims "ubiquity", that is, if I understand correctly, it
claims that the method is very easy to implement as a library. This may be
true, but the paper lacks a convincing argument why this is so. In particular,
how is this approach superior to the other parsing combinator libraries in
existence? The paper lacks a good comparison with other combinator-based
parsing techniques.

It seems that there might be a connection between the technique presented in
the paper and intersection parsing (see e.g. chapter 13 of the excellent book
"Parsing techniques", by Grune and Jacobs). Computing the derivative of a
context free language with respect to a terminal symbol c is closely related
to computing its intersection with the regular language c(.*). Intersection
parsing, as described by Grune and Jacobs, computes the intersection of the
grammar with the input (which is a string, so a regular language!) and
simplifies the resulting grammar, which can then be viewed as a parse
tree. Could one view the approach presented in the paper as an incremental
(and modular) way of doing intersection parsing?
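
(For what it is worth, the basic identity connecting the two notions is

  L \cap c A^* = c (c^{-1} L),   where c^{-1} L = { w | c w \in L },

so taking successive derivatives along the input is closely related to
intersecting L with the input string, one character at a time.)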

In conclusion, although the paper contains an interesting idea, its
investigation is not sufficiently thorough, and a clear case in favor
of this parsing technique is missing.

DETAILED COMMENTS

The authors state in section 7 that "a literal implementation [...] must use a
generic parse-tree type, or else violate type safety with casts for
reductions". Strictly speaking, this depends on which type system one has in
mind. If the type system is ML, for instance, the claim is probably true. If
the type system has GADTs, however, it is quite possible that the discipline
of the parsing machine can be encoded in the types. This has been done, for an
LR parsing machine, and for an arbitrary but fixed grammar, in the following
paper:

François Pottier and Yann Régis-Gianas. Towards efficient, typed LR parsers.
In ACM Workshop on ML, volume 148(2) of Electronic Notes in Theoretical
Computer Science, pages 155-180, 2006.

To the best of my knowledge, both Haskell and Scala have GADTs, so the authors
might wish to check this out.

p.5, "for any class of decidable languages closed under the derivative,
derivatives may be used to determine membership". I am not sure what is meant
by "decidable" here. If it is meant that membership in the language is
decidable, then this is a tautology. Perhaps one should understand that
membership *of the empty word* in the language is decidable, since effectively
that is all that is needed?

p.5, "the derivative is closed under regular operations": unclear formulation.

p.7, before theorem 1: please indicate the interval over which the index i
ranges. My guess is 0 \leq i < m. The particular case where m = 0 seems
redundant: when m = 0, there is no suitable index i, so no production is
generated, which means D_c(n) is empty. In fact, the notation \emptyset
in the right-hand side of a production does not make sense, since the
right-hand side of a production is supposed to be a sequence of symbols.

p.7, "the empty nonterminal": this has not been defined. I assume this
explains the use of \emptyset above. Define it.

p.8, example 1: "for the grammar [...] of balanced parentheses [...] the parse
tree is": I assume this should be: "the parse tree for the string '()'".

p.8, "A parse string is [...] containing both nonterminal markers and
reduction markers". Please indicate at this point that a parse string also
contains elements of the alphabet A. In fact, the definition of the alphabet
A' should be moved to this point.

p.8, "we construct a new grammar [...] over parse strings": please change
this to "a new grammar over the alphabet A'".

p.8, the definition of \Sigma mentions \mathbb{L}: I guess this should be
\mathbb{L}_{A'}.

p.9, "a second rule for producing bras": is the rule correct? In the
right-hand side of the conclusion, instead of L, I would have expected the
derivative of L with respect to <n|. (The rule, as it stands, clearly causes
an infinite stream of <n|'s to be produced!) Similarly for the third rule.

p.9, "the parsing machine reduces": one might wish to state the invariant
of the parsing machine which guarantees that reduction can never fail (by
lack of material on the stack).

p.9, "popular as of late": please provide citation(s).

p.10, the statement of the monotonicity condition is somewhat unclear: the
assertion \tau_X(w) \sqsubseteq \tau_X(w') compares two sets of pairs. An
ordering over pairs is defined, but an ordering over sets of pairs is not
defined.

p.11, the authors might want to spend more space explaining the definition of
the derivative of a parser combinator -- why is it the right definition? and
how does it relate to the definition of the derivative of a formal language?
Can one recover the latter as a degenerate case of the former?
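
(One candidate answer, assuming as in section 8 that a parser \tau maps an
input to a set of (tree, remainder) pairs; this formulation is mine, to be
checked against the paper's definition:

  D_c(\tau)(w) = { (t, w') \in \tau(c w) | w' is a suffix of w }

i.e., keep exactly the results that consume the leading c. Erasing the
tree component then recovers the language derivative as the degenerate
case.)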

p.12, section 8.3: the parser combinator \epsilon is used, but has not been
defined. What is its type? In particular, what semantic value does it produce?
(A unit value?) What is the meaning of the notation \lambda\epsilon. ...

p.12, equation (3): the elegance afforded by the use of \delta in the
definition of the derivative of a concatenation (p.5) seems to have been lost
here.

p.13, "in practice, the cost [...] as they are derived". I don't know what
this means. Please be more precise and include a formal complexity claim.

p.14, "parse trees of type A" should really be referred to as "semantic
values", since they may not be trees at all -- they are programmer-defined.

p.14, "its implementation follows eq.2": actually, eq.2 corresponds only
to the "else" clause of the code, so the text is badly worded. Same problem
with eq.1 a few lines later.

p.15, the distinction between "derive" and "internalDerive" is not explained.

p.15, the derivative of union "short-circuits if it finds either arm is
empty": is this important, say, to ensure termination? or is it just a minor
optimization?

p.16, the fact that a naïve treatment of the Kleene star required exponential
time, and the fact that right recursion is significantly more efficient than
left recursion, are worrisome. These problems are perhaps partially masked by
the introduction of "the special closure construct", but how do we know that
they are gone?

p.16, the test suite consists of just one grammar and a set of randomly
generated inputs. This seems far from representative of actual usage...

p.17, "the cost of that compilation is amortized across the entire parsing
process". I don't know what is meant by "compilation" here, nor do I know
why its cost is amortized. Please explain.

p.18, "these combinators are slowly transforming and unfolding the grammar
itself into a parse tree": this seems to be an interesting intuition. Why
not try to make it more formal?

TYPOS

p.2, "could to modify"
p.2, "they critical"
p.6, "attempt interpret"
p.10, "Operations or combinators"
p.10, "the remainders of the input"
p.12, "is base case"
p.13, "able parse"
p.15, "short-circuiting if the first element is nullable": should this be: "is
not nullable"?
p.16, "43 m" -> "43 ms"
p.16, "surpising"
p.17, "they extent"

PROS AND CONS

+ interesting and possibly novel idea
- no time/space complexity analysis
- implementation not clearly described
- case in favor of the idea not clearly made
- comparison with related work should be more thorough

REBUTTAL

No specific questions.