-------------- Reviews on submission 104 -------------

-------------------- review 1 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 4 (OK paper, but I will not champion it.)
REVIEWER'S CONFIDENCE: 2 (medium)
----------------------- REVIEW --------------------

MAIN PART

The paper describes an original approach to the derivation of parsers from grammars. The approach is based on the observation that context-free grammars are closed under Brzozowski derivatives, and such derivatives can be easily computed. The parser described hence parses a string by lazily computing the derivatives of the grammar for each character encountered. Quite surprisingly, such a parser seems very easy to encode, especially in a language that supports streams and lazy computations, can parse every grammar, and is quite efficient.
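
To make the core idea concrete for the committee, here is a sketch of my
own (in Scala, like the paper's implementation; none of this is the
authors' code) of the regular-language case that the paper generalizes:
recognition is just nullability after repeated derivation.

  // Reviewer's sketch, not the paper's code.
  sealed trait Re
  case object Empty extends Re                 // the empty language
  case object Eps extends Re                   // the language { "" }
  case class Chr(c: Char) extends Re
  case class Alt(l: Re, r: Re) extends Re
  case class Cat(l: Re, r: Re) extends Re
  case class Star(r: Re) extends Re

  def nullable(r: Re): Boolean = r match {
    case Empty | Chr(_) => false
    case Eps | Star(_)  => true
    case Alt(l, r)      => nullable(l) || nullable(r)
    case Cat(l, r)      => nullable(l) && nullable(r)
  }

  def derive(r: Re, c: Char): Re = r match {
    case Empty | Eps => Empty
    case Chr(d)      => if (c == d) Eps else Empty
    case Alt(l, r)   => Alt(derive(l, c), derive(r, c))
    case Cat(l, r)   => if (nullable(l)) Alt(Cat(derive(l, c), r), derive(r, c))
                        else Cat(derive(l, c), r)
    case Star(s)     => Cat(derive(s, c), Star(s))
  }

  def matches(r: Re, s: String): Boolean = nullable(s.foldLeft(r)(derive))

The paper's contribution is to lift nullable and derive from regular
expressions to context-free grammars, where the derivative must be
computed lazily because it can refer to itself.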

I found the paper very interesting. The studied problem is very important, and it is surprising to meet original results in a field that has been explored so much. I enjoyed the first pages of the paper very much; they are extremely crisp and clear. Unfortunately, around page 8, the paper starts getting more and more obscure; section 7 ends abruptly, and section 8, which is probably the most important one, is quite obscure. In the end I still think that the paper should be accepted, but I really hope that the authors will be able to beef section 8 up to the point of making it readable for me.

Detailed remarks for the authors only

Example 1: "the parse tree is:" => "the parse tree of '()' is:"

In general: you should write <B||B->\epsilon> rather than <B|B->\epsilon> (this is not quantum theory, after all...)

The non-deterministic state machine: the language in \Sigma is a language over A', not over A, right? You should say that explicitly.

In the rules for bras and kets, the consequences should look like

(L, w) =>^{\bra{n}} (D_{\bra{n}}(L), w)

I think you forgot the derivative, right?

The parse-string machine pushes a bra when it meets one, but never pops it. You can easily adjust the last rule of section 7 to pop it, but the real question is: why do you push it?

The end of section 7 is too abrupt. You have described a nice pair of abstract machines, but now you should tell me something a bit more concrete about the amount of nondeterminism involved, and how much it would hurt me in practice...

Page 10: the monotonicity for parser combinators should look like:

forall p in \tau_X(w), exists p' in \tau_X(w') such that fst(p) is a prefix of fst(p')
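
In LaTeX (writing \mathrm{fst} for the first projection; this rendering is
mine, so please check it against your intent):

\forall p \in \tau_X(w).\; \exists p' \in \tau_X(w').\;
  \mathrm{fst}(p) \sqsubseteq \mathrm{fst}(p')

where \sqsubseteq here is the prefix ordering.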

Section 8 is obscure. It is too terse. The equations look right and convincing, but I was totally unable to form a mental picture of how this machinery actually works. I understand that such a combination of non-determinism and laziness creates code that tends to work in a mysterious way, but the section is still extremely unsatisfactory.

Section 9: a bit unreadable, at least for a non-Scala programmer.

PROS AND CONS

+ interesting solution for an interesting problem
+ elegant
+ innovative
+ first 8 pages are a pleasure to read
- crucial sections are almost unreadable

REBUTTAL

-------------------- review 2 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 2 (Weak paper, though I will not fight strongly against it.)
REVIEWER'S CONFIDENCE: 4 (expert)
----------------------- REVIEW --------------------

MAIN PART

The authors describe how one can use Brzozowski-style derivatives to
implement parser combinators. They claim that their technique leads to
simple implementations, and this seems to be the case. They also claim
that their implementation is efficient, but give no solid evidence for
this (they added a special combinator to get decent numbers from their
single reported experiment). Another claim is that their parser
combinators can handle left recursion, but they give no evidence for
this.

The authors seem to be unaware that Danielsson presented essentially
the same technique at TYPES 2009 ("Mixing Induction and Coinduction").
However, Danielsson's draft paper does not try to sell the technique,
only using it as a vehicle to implement provably total parser
combinators, and he does not address efficiency of parsing, so there
could be a place for both papers.

The authors seem to claim that their implementation is a lot more
efficient than other parser combinator libraries:

  "[...] they are easy to implement, and easy to implement as
  libraries (like top-down parsers), yet they are efficient and
  expressive (like bottom-up parsers)."

However, they have not made any head-to-head comparisons, and they
have not cited a single paper discussing efficient parser combinators.
Generally there is a serious lack of references to related work on
parser combinators.

Section 7 discusses another parsing technique (also based on
derivatives). However, I found this section to be poorly written and
hard to follow. I suggest that it should be removed.

If the claims about efficiency had been followed up with substantial
evidence, then I would have recommended that the paper should be
published. As it stands I do not make such a recommendation.

Detailed comments:

Section 2:
"This technique was lost to the “sands of time†until [paper from
+ − 79
2009]" A quick web search will tell you that a large number of papers
+ − 80
published before 2009 have used Brzozowski derivatives (or
+ − 81
generalisations), so this comment is incorrect. See for instance the
+ − 82
work of Rutten, including "Automata and Coinduction; an exercise in
+ − 83
coalgebra" and "Automata, Power Series, and Coinduction: Taking Input
+ − 84
Derivatives Seriously".

Section 6:

The expansion of D_x(List) would be easier to follow if you added a
step between the second and third lines.

"the null string terminal" and "the empty nonterminal" may not exist
in N, so they may have to be added to N'.

One could be led to believe that the last "and:" only applies when
m = 0.

"The empty nonterminal arises frequently with derivatives, and if this
reduction is applied aggressively, the grammar will remain a
manageable size." You give no proper motivation for this statement. In
fact, your discussion of the "special closure construct" in Section 9
seems to imply that this statement is false. I suggest that you remove
this statement.

Section 7:

The machine in this section does not make sense to me. First of all
you do not state what the machine should accomplish; my guess is that
you intend

((L, s), <>) ->* ((L', <>), <t>)

to hold iff t is a parse tree which witnesses the membership of s in
L. Furthermore you give no reason (such as a proof) to believe that
the machine satisfies this specification. For instance, you push
nonterminals onto the stack, but you never remove them (I suspect that
the last rule should remove a nonterminal), and it seems strange that
the state is unchanged in the rules involving "first".

I think the paper would be better if Section 7, which is not central
to the paper, was omitted.

"This approach is not well suited to typed languages, since a literal
implementation of the parsing machine must use a generic parse-tree
type, or else violate type-safety with casts for reductions." I
suspect that you could do better with a dependently typed language, so
I suggest that you remove this sentence.

Example 1: "the parse tree is:" I think you mean "one parse tree is:".

"The parsing machine demonstrates that, even without reduction rules
associated with the grammar, we can use derivatives to construct a
concrete syntax tree." I do not follow this sentence. What kind of
reduction rules are you referring to?

Section 8:

The parser combinators that you give seem quite outdated to me. The
functional programming community moved away from this set of
combinators more than a decade ago, in favour of combinators based on
monads or applicative functors. Please motivate this choice.
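
For concreteness, the kind of interface I have in mind looks roughly as
follows (a sketch of my own in Scala, not the authors' API): a parser
maps the input to the possible (value, remaining input) pairs, and
sequencing is flatMap, so grammars compose with for-comprehensions.

  // Reviewer's sketch, not the paper's API.
  case class P[A](run: List[Char] => List[(A, List[Char])]) {
    def flatMap[B](f: A => P[B]): P[B] =
      P(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
    def map[B](f: A => B): P[B] =
      P(in => run(in).map { case (a, rest) => (f(a), rest) })
  }

  def pure[A](a: A): P[A] = P(in => List((a, in)))

  def char(c: Char): P[Char] = P {
    case h :: t if h == c => List((c, t))
    case _                => Nil
  }

  // e.g. sequencing two characters:
  val ab: P[(Char, Char)] =
    for { a <- char('a'); b <- char('b') } yield (a, b)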

You state two properties that parser combinators must satisfy, but you
never really make any use of these properties. Are these properties
necessary for your development? Is your definition of a parser
combinator a function which satisfies these properties?

"A parser combinator is monotonic iff..." I assume that the second
instance of \sqsubseteq should be read pointwise.

"the meaning of an input expression w <- A^∗ is contained within
tau_N(w)" In fact it is the one and only (complete) result, right?

"D_c(c) = eps -> \w.{(c, w)}" This does not make sense, the second
argument of -> is not a parser.

Section 9:

Your parsers accept infinite streams as input. However, parseFull
clearly fails to terminate for infinite input. Why do you not restrict
the input to finite lists?

Your parsers return potentially infinite streams of results. However,
your parse function uses "append", which (essentially) throws away its
second argument if the first is infinite. Wouldn't it be better to
return a fair interleaving of the two streams?
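
Such an interleaving is a one-liner; a sketch of my own in Scala
(interleave is my name, not part of the paper's code):

  // Reviewer's sketch: alternate elements so an infinite first
  // stream cannot starve the second.
  def interleave[A](xs: Stream[A], ys: Stream[A]): Stream[A] =
    xs match {
      case Stream.Empty => ys
      case x #:: rest   => x #:: interleave(ys, rest)
    }

Because #:: is lazy in its tail, this is productive even when both
streams are infinite.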

In internalDerive you seem to have renamed a and b to first and
second, respectively.

"the longest possible parse first" Why is this relevant for
performance? If you return all possible parses, then the order in
which they are returned should not matter, and for unambiguous parsers
at most one result is returned anyway. I can imagine that your
optimisation matters in the (atypical?) case of an ambiguous parse, if
you are not interested in inspecting all the results, but only want to
see that there is more than one.

The sentence before the table seems to be unfinished.

I assume that "43 m" should be "43 ms".

"These results suggest that derivative-based parser combinators scale
well enough to be useful in practice." To me this result suggests that
I may have to repeatedly add new special-case combinators (like your
closure operation) to ensure decent performance.

Section 10:

"derivative-based parsing defies the classical taxonomies", "Top-down
methods", "back-tracking"... Are you not aware of parser combinator
libraries which use breadth-first techniques (which your combinators
also do)? See, for instance, Claessen's "Parallel Parsing Processes".

"these methods have not been able to handle left-recursive grammars"
You should check out Lickman's MSc thesis "Parsing With Fixed Points"
and "Parser Combinators for Ambiguous Left-Recursive Grammars" by
Frost et al.

"Philosophically, neither derivative-based method “feels” like a
top-down parsing method." To me your combinator implementation does
feel like a (breadth-first) top-down method.

You claim (implicitly) that your parser combinators can handle left
recursive grammars. However, I doubt that you can handle grammars of
the form "n ::= n". If you can handle such grammars, please explain
how you do it. Otherwise it would be interesting to know exactly what
kind of left recursion you support.

"they are efficient and expressive (like bottom-up parsers)" This
seems to imply that there is a language which can be parsed using
bottom-up techniques, but not using top-down techniques. Can you
verify this claim?

You may want to cite Frost and Szydlowski, who claim worst-case cubic
time for parsing ("Memoizing purely functional top-down backtracking
language processors"). Swierstra's work on efficient parser
combinators is also relevant (many papers).

References:

Improve formatting.

PROS AND CONS

+ Simple technique for implementing parser combinators.
- The technique has been presented before.
- No substantial evidence for claims of efficiency.
- Section 7 does not make sense to me (but could be removed).

REBUTTAL

-------------------- review 3 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 2 (Weak paper, though I will not fight strongly against it.)
REVIEWER'S CONFIDENCE: 2 (medium)
----------------------- REVIEW --------------------

MAIN PART

The paper proposes a new parsing technique based on "derivatives",
inspired by the derivatives of regular expressions and extended to the
derivatives of context-free grammars.

Because context-free grammars may be ambiguous, they propose to compute the
set of all parse trees, using lazy evaluation to share and delay the
computation. They present their framework and an implementation in Scala.
They claim that their technique is efficient in practice.

EVALUATION

Parsing techniques are old and well-established, although the current status
of the theory is still unsatisfactory: grammars are still a burden to write,
and any advance on this topic is welcome. However, the topic must be
addressed very seriously.

Given this, this work is potentially interesting as a possible way to
improve the state of the art, but not in its current
presentation, as the paper should not just _claim_ that using derivatives
for parsing is feasible, but _demonstrate_ that writing grammars with
combinators is at least both as convenient and as efficient as using
existing technology.

The first aspect is not discussed at all in the paper. As for efficiency,
which is claimed, the benchmarks are really small, probably not
representative of existing grammars, and the results are not compared with
other parsing techniques.

Other questions arise: what happens when the grammar is highly ambiguous?
Do you intend to reject such grammars, or to pick the meaning of the thing
you are parsing at random?

The overall structure of the paper is not very clear either. You should
give us a road map at the beginning and make it clear how the different
sections fit together.

OTHER COMMENTS

Sections 7 and 8 are the key sections of the paper for understanding your
proposal. However, I have problems with both.

I don't understand why your parsing is decomposed into the production of an
intermediate parse string that is then consumed by a machine that
produces a syntax tree. Since the parse string is a linear representation
of the parse tree, why can't you work with the parse tree itself, which
is more structured than the parse string? Why can't you directly extract
a syntax tree from the parse tree? Could you run the algorithm in Section
7 on an example?

In section 8, you compute the derivatives of parser combinators, but you
could instead compute the grammar defined by the parser combinators and
compute its derivative in one step. Why is this better or worse than
computing it directly?

- One line above section 2: "could [to<-] modify"
- Section 2, Line 10: "derivative---[they<-??] critical property"
- One line above Section 7: "remain [<-of] a manageable size."

- Computation of the derivative, above theorem 1:
  You could explain the derivatives, especially for rules n -> s1 ... sm,
  and perhaps give an example; one possible rendering follows.
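
In the notation of Section 6 as I understand it (so please correct me if
this misreads the construction): for each production n -> s1 ... sm and
each i >= 0 such that s1 ... si are all nullable, the derivative grammar
gets a production

  D_c(n) -> D_c(s_{i+1}) s_{i+2} ... s_m

For instance, for the balanced-parentheses grammar L -> "(" L ")" L | eps,
taking c = "(" gives only the i = 0 case (since "(" is not nullable):

  D_"("(L) -> D_"("("(") L ")" L = eps L ")" L,  i.e.  D_"("(L) -> L ")" L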

- Last two lines above section 8: "The parsing machine demonstrates...
  a concrete syntax tree".
  I don't really understand what you (intend to) mean.

- Section 8.1, Line 1: "ability [the<-to] combine them"
- End of section 8.1:
  You should state and comment on the fact that these definitions are
  recursive.
- First line of section 8.4: "to be able [<-to] parse"
- Section 8.4: How do you use these inequations? Why do you give this list of
  formulas without any comment?

- Section 8.5: what are the lambda-terms on the right of --> ?

-------------------- review 4 --------------------

PAPER: 104
TITLE: Parsing with derivatives

OVERALL RATING: 1 (Serious problems. I will argue to reject this paper.)
REVIEWER'S CONFIDENCE: 3 (high)
----------------------- REVIEW --------------------

MAIN PART

SUMMARY

The paper proposes a basic algorithm for determining whether a sentence s is a
member of a context-free language L: (i) if the sentence is empty, determine
whether L contains the empty string (a decidable problem); (ii) if the
sentence is of the form c.s, determine whether the suffix s is a member of the
derivative c^{-1}.L. Because the derivative of a context-free language can be
effectively constructed, this is an algorithm.
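
(In code, this basic algorithm is literally a fold; the following generic
sketch is mine, not the paper's implementation:

  // Reviewer's sketch, not the paper's code.
  trait LangOps[L] {
    def nullable(l: L): Boolean   // step (i): does L contain the empty string?
    def derive(l: L, c: Char): L  // step (ii): construct c^{-1}.L
  }

  def member[L](l: L, s: List[Char])(implicit ops: LangOps[L]): Boolean =
    s match {
      case Nil     => ops.nullable(l)
      case c :: cs => member(ops.derive(l, c), cs)
    }

Everything of substance lies in implementing derive and nullable for a
given representation of languages.)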

The contribution of the paper is in generalizing this basic algorithm in two
ways: (i) instead of just determining membership, the parsing algorithm can be
made to return a concrete syntax tree, or a semantic value; (ii) instead of
viewing the derivative as an operation that accepts a monolithic grammar and
produces a monolithic grammar, one can view it as an operation that accepts a
parser combinator and produces a parser combinator; this means that the
parsing algorithm is amenable to a modular, combinator-based implementation.

EVALUATION

The introduction of the paper is somewhat irritating, for two reasons. First,
it promises a lot ("simplicity, feasibility, and ubiquity"), which the paper
does not clearly deliver. Second, it criticizes the most popular parsing
methods (LL and LR) on the grounds that the constraints that they impose, in
order to rule out ambiguous grammars, are difficult to grasp. I can accept
this criticism. But the authors do not offer any solution to the ambiguity
problem: their algorithm produces a parse forest. This leaves the programmer
faced with the task of writing and debugging disambiguation filters, an
error-prone task. At least, an LR parser generator is able (in principle, and
some implementations actually do this) to accurately explain why the grammar
seems ambiguous. It can produce a sentence that is a prefix of the fringes of
two distinct, valid parse trees.

The first claimed contribution of the paper is the remark that the derivative
of a context-free language can be effectively constructed. (A proof appears in
section 6.) This is in fact not a contribution. This property of context-free
languages is well-known, at least in the formal language community (perhaps
not in the programming language community). The derivative of a formal language
is also known as its left quotient, and written c^{-1}L. See, for instance, the
following paper, which exploits this property intensively:

Jean Berstel and Luc Boasson. Towards an algebraic theory of context-free
languages. Fundamenta Informaticae, 25(3):217-239, 1996.

The remainder of the paper contains two distinct developments. In section 7, a
version of the derivative-based parsing algorithm is developed, which produces
concrete syntax trees. Then, in sections 8 and 9, another version is developed,
which produces programmer-defined semantic values, and is combinator-based.
It is not clear to me why these two distinct developments co-exist. In my eyes,
the approach of sections 8 and 9 seems superior, because it is more flexible
(it is not limited to producing concrete syntax trees) and modular.

The idea that the notion of a derivative can be generalized to parser
combinators is interesting, and worthy of thought. To some extent, section 8.2
is where the paper really begins! Section 8 provides a mathematical view of
the problem, while section 9 describes part of a Scala implementation.

This material is interesting, but can be criticized on several grounds:

1. The complexity analysis of the algorithm is absent. The algorithm seems
very expensive in the worst case: indeed, the effective construction of a
derivative in section 6 causes a quadratic blow-up of the size of the grammar,
and this operation is iterated at every character of the input stream! In
fact, section 9.1 mentions that a naive implementation of the algorithm is
unusable (exponentially slow), and that special support for the Kleene star
had to be added. However, there is no argument that this patch is sufficient
for the algorithm to become usable. A worst-case analysis of the space and
time complexity is needed, and, if this worst-case complexity is bad (say,
exponential), then a formal or semi-formal argument is needed why the behavior
of the algorithm remains tolerable in practice. In light of the efficiency
claim found in the introduction, this is indispensable!

2. The description of the implementation could be improved. In particular,
memoization and sharing seem to play a crucial role. It seems that two
attempts to compute the derivative of a parser combinator should return the
same object. This includes the particular case where the second attempt is in
fact nested within the first one (i.e., it is a recursive call). For instance,
if the grammar contains the single production L -> LM, a call to L.derive(t)
will (indirectly) cause another call to L.derive(t). What happens then? By
which mechanism is non-termination avoided? This is not clearly described.
The paper contains four lines of text that seem to concern this topic (p.15,
"It is important to note [...] immediately suspended."), but I find them
obscure. I note that there seems to be a distinction between derive and
internalDerive, which is not explained in the paper; my guess is that the
former may be a memoizing version of the latter, using a table indexed by
tokens. Is this the case?
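
To make my guess explicit, the mechanism I imagine is the following (a
reconstruction of my own in Scala, not the paper's code): derive caches a
lazy proxy per token, and the proxy is installed before internalDerive
runs, so the nested call L.derive(t) finds the cache entry instead of
diverging.

  // Reviewer's guesswork, not the paper's code.
  import scala.collection.mutable

  abstract class Parser {
    private val cache = mutable.Map[Char, Parser]()

    // Memoizing entry point: one (lazy) derivative per token.
    def derive(t: Char): Parser =
      cache.getOrElseUpdate(t, new Delayed(this, t))

    // Per-combinator derivative rule, supplied by subclasses.
    def internalDerive(t: Char): Parser
  }

  // Proxy that delays the real derivative until it is demanded.
  class Delayed(p: Parser, t: Char) extends Parser {
    lazy val forced: Parser = p.internalDerive(t)
    def internalDerive(c: Char): Parser = forced.derive(c)
  }

If something like this is what happens, please say so explicitly in the
paper.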

3. The introduction claims "ubiquity", that is, if I understand correctly, it
claims that the method is very easy to implement as a library. This may be
true, but the paper lacks a convincing argument why this is so. In particular,
how is this approach superior to the other parsing combinator libraries in
existence? The paper lacks a good comparison with other combinator-based
parsing techniques.

It seems that there might be a connection between the technique presented in
the paper and intersection parsing (see e.g. chapter 13 of the excellent book
"Parsing techniques", by Grune and Jacobs). Computing the derivative of a
context free language with respect to a terminal symbol c is closely related
to computing its intersection with the regular language c(.*). Intersection
parsing, as described by Grune and Jacobs, computes the intersection of the
grammar with the input (which is a string, so a regular language!) and
simplifies the resulting grammar, which can then be viewed as a parse
tree. Could one view the approach presented in the paper as an incremental
(and modular) way of doing intersection parsing?
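
(For what it is worth, the basic identity connecting the two notions is

  L \cap c A^* = c (c^{-1} L),   where c^{-1} L = { w | c w \in L },

so taking successive derivatives along the input is closely related to
intersecting L with the input string, one character at a time.)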

In conclusion, although the paper contains an interesting idea, its
investigation is not sufficiently thorough, and a clear case in favor
of this parsing technique is missing.

DETAILED COMMENTS

The authors state in section 7 that "a literal implementation [...] must use a
generic parse-tree type, or else violate type safety with casts for
reductions". Strictly speaking, this depends on which type system one has in
mind. If the type system is ML, for instance, the claim is probably true. If
the type system has GADTs, however, it is quite possible that the discipline
of the parsing machine can be encoded in the types. This has been done, for an
LR parsing machine, and for an arbitrary but fixed grammar, in the following
paper:

François Pottier and Yann Régis-Gianas. Towards efficient, typed LR parsers.
In ACM Workshop on ML, volume 148(2) of Electronic Notes in Theoretical
Computer Science, pages 155-180, 2006.

To the best of my knowledge, both Haskell and Scala have GADTs, so the authors
might wish to check this out.

p.5, "for any class of decidable languages closed under the derivative,
derivatives may be used to determine membership". I am not sure what is meant
by "decidable" here. If it is meant that membership in the language is
decidable, then this is a tautology. Perhaps one should understand that
membership *of the empty word* in the language is decidable, since effectively
that is all that is needed?

p.5, "the derivative is closed under regular operations": unclear formulation.

p.7, before theorem 1: please indicate the interval over which the index i
ranges. My guess is 0 \leq i < m. The particular case where m = 0 seems
redundant: when m = 0, there is no suitable index i, so no production is
generated, which means D_c(n) is empty. In fact, the notation \emptyset
in the right-hand side of a production does not make sense, since the
right-hand side of a production is supposed to be a sequence of symbols.

p.7, "the empty nonterminal": this has not been defined. I assume this
explains the use of \emptyset above. Define it.

p.8, example 1: "for the grammar [...] of balanced parentheses [...] the parse
tree is": I assume this should be: "the parse tree for the string '()'".

p.8, "A parse string is [...] containing both nonterminal markers and
reduction markers". Please indicate at this point that a parse string also
contains elements of the alphabet A. In fact, the definition of the alphabet
A' should be moved to this point.

p.8, "we construct a new grammar [...] over parse strings": please change
this to "a new grammar over the alphabet A'".

p.8, the definition of \Sigma mentions \mathbb{L}: I guess this should be
\mathbb{L}_{A'}.

p.9, "a second rule for producing bras": is the rule correct? In the
right-hand side of the conclusion, instead of L, I would have expected the
derivative of L with respect to <n|. (The rule, as it stands, clearly causes
an infinite stream of <n|'s to be produced!) Similarly for the third rule.

p.9, "the parsing machine reduces": one might wish to state the invariant
of the parsing machine which guarantees that reduction can never fail (by
lack of material on the stack).

p.9, "popular as of late": please provide citation(s).

p.10, the statement of the monotonicity condition is somewhat unclear: the
assertion \tau_X(w) \sqsubseteq \tau_X(w') compares two sets of pairs. An
ordering over pairs is defined, but an ordering over sets of pairs is not
defined.

p.11, the authors might want to spend more space explaining the definition of
the derivative of a parser combinator -- why is it the right definition? and
how does it relate to the definition of the derivative of a formal language?
Can one recover the latter as a degenerate case of the former?
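
(One candidate answer, assuming as in section 8 that a parser \tau maps an
input to a set of (tree, remainder) pairs; this formulation is mine, to be
checked against the paper's definition:

  D_c(\tau)(w) = { (t, w') \in \tau(c w) | w' is a suffix of w }

i.e., keep exactly the results that consume the leading c. Erasing the
tree component then recovers the language derivative as the degenerate
case.)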

p.12, section 8.3: the parser combinator \epsilon is used, but has not been
defined. What is its type? In particular, what semantic value does it produce?
(A unit value?) What is the meaning of the notation \lambda\epsilon. ...

p.12, equation (3): the elegance afforded by the use of \delta in the
definition of the derivative of a concatenation (p.5) seems to have been lost
here.

p.13, "in practice, the cost [...] as they are derived". I don't know what
this means. Please be more precise and include a formal complexity claim.

p.14, "parse trees of type A" should really be referred to as "semantic
values", since they may not be trees at all -- they are programmer-defined.

p.14, "its implementation follows eq.2": actually, eq.2 corresponds only
to the "else" clause of the code, so the text is badly worded. Same problem
with eq.1 a few lines later.

p.15, the distinction between "derive" and "internalDerive" is not explained.

p.15, the derivative of union "short-circuits if it finds either arm is
empty": is this important, say, to ensure termination? or is it just a minor
optimization?

p.16, the fact that a naïve treatment of the Kleene star required exponential
time, and the fact that right recursion is significantly more efficient than
left recursion, are worrisome. These problems are perhaps partially masked by
the introduction of "the special closure construct", but how do we know that
they are gone?

p.16, the test suite consists of just one grammar and a set of randomly
generated inputs. This seems far from representative of actual usage...

p.17, "the cost of that compilation is amortized across the entire parsing
process". I don't know what is meant by "compilation" here, nor do I know
why its cost is amortized. Please explain.

p.18, "these combinators are slowly transforming and unfolding the grammar
itself into a parse tree": this seems to be an interesting intuition. Why
not try to make it more formal?

TYPOS

p.2, "could to modify"
p.2, "they critical"
p.6, "attempt interpret"
p.10, "Operations or combinators"
p.10, "the remainders of the input"
p.12, "is base case"
p.13, "able parse"
p.15, "short-circuiting if the first element is nullable": should this be: "is
not nullable"?
p.16, "43 m" -> "43 ms"
p.16, "surpising"
p.17, "they extent"

PROS AND CONS

+ interesting and possibly novel idea
- no time/space complexity analysis
- implementation not clearly described
- case in favor of the idea not clearly made
- comparison with related work should be more thorough

REBUTTAL

No specific questions.