nominal2: LMCS-Review@502b5f02edaf



> Referee no 1:
>
>  * The paper can be accepted for Logical Methods in Computer Science 
> after minor
> revisions
>
> NUMBER    : LMCS-2011-675
> TITLE     : General Bindings and Alpha-Equivalence in Nominal Isabelle
> AUTHOR(S) : Christian Urban, Cezary Kaliszyk
>
> Recommendation: The paper can be accepted for Logical Methods in
> Computer Science after minor revisions.
>
> The work reported is very good, but the presentation of the paper can
> be improved.
>
> This paper continues a line of work called "Nominal Isabelle" carried
> out by the first author and his colleagues for many years.  The goal
> of this work is to support formal (machine checked) reasoning about
> languages with binding.  With the theoretical foundation of "nominal
> logic" developed by Pitts and colleagues, these authors and their
> co-workers have developed a package to support such reasoning in the
> Isabelle proof tool for Higher Order Logic.  This toolkit has been
> widely used, and although the technology sometimes shows through
> (e.g. explicit name swapping required in arguments) it is a very good
> package.
>
> Up to now, this package has supported single binders such as \lambda.
> Multiple simultaneous binding (e.g. letrec) had to be coded using
> iterated single binders.  Not only is this coding hard to reason
> about, it often isn't a correct representation of the intended
> language.  This paper describes a new version of the Isabelle package,
> "Nominal2", supporting binding of sets and lists of names in the
> Isabelle/HOL system.
>
> The amount of work involved is immense, and the first author
> especially has shown real commitment to continuing development of both
> theory and working tools. Everything provided in this package is
> claimed to be a definitional extension of HOL: no assumptions or
> built-in changes to the logic.  For all of these reasons, this is very
> good work.
>
> However, I recommend improvement of the presentation of the paper
> before it is accepted by LMCS.  While the motivation for the work of
> this paper is clear to anyone who has tried to formalize such
> reasoning, it is not explained in the paper.  E.g. on p.1 "However,
> Nominal Isabelle has fared less well in a formalisation of the
> algorithm W [...]."  But there is no analysis in the paper of what was
> hard in algorithm W coded with single binders, or explanation of how
> it would be done in the new system reported in this paper showing why
> the new approach works better in practice.  Although this example is
> one of the main motivations given for the work, there is apparently no
> formalization of algorithm W in the library of examples that comes
> with Nominal2 described in this paper.  I think that should be
> provided.  Similarly for the second motivating example (on p.2 "The
> need of iterating single binders is also one reason why Nominal
> Isabelle and similar theorem provers that only provide mechanisms for
> binding single variables have not fared extremely well with the more
> advanced tasks in the POPLmark challenge [2], because also there one
> would like to bind multiple variables at once.").
>
> The new Isabelle package "Nominal2", described in this paper, is not
> ready for users without a lot of hand-holding from the Nominal2
> developers.  This paper would have more impact if interested users
> could try the tool without so much difficulty.
>
> A few more specific points:
>
> Bottom of p.7: I don't understand the paragraph containing equations
> (2.4) and (2.5).
>
> Bottom of p.9: The parameters R and fa of the alpha equivalence
> relation are dropped in the examples, so the examples are not clear.
> I think there is a typo in the first example: "It can be easily
> checked that ({x,y},x->y) and ({y,x},y->x) are alpha-equivalent [...]"
> Did you mean "({x,y},x->y) and ({y,x},x->y) are alpha-equivalent"?
>
>
> Referee no 2:
>
>  * The paper can be accepted for Logical Methods in Computer Science 
> after minor
> revisions
>
> General comments
>
> This paper describes a new implementation of the nominal_datatype package
> within the Isabelle/HL theorem prover. This implementation is more modular
> than previous versions, because it relies on (I think) three non-trivial
> independent packages, namely the datatype package, the function package, and
> the quotient package. This implementation is also more powerful than previous
> versions, because it deals with abstractions that bind multiple names 
> at once,
> and because it offers two variants of these abstractions (baptised "set" and
> "set+") where certain structural equivalence laws, namely the exchange of two
> binders and the elimination/introduction of a vacuous binder, are built
> directly into the alpha-equivalence relation.
>
> Overall, I like the paper because it describes a useful piece of software,
> because the architecture of this software is quite non-trivial and well
> designed, and because the paper is written in a very understandable style.
> For these reasons, I believe the paper should be accepted. I do have a series
> of questions and suggestions for potential improvements and would be happy to
> review a revised version of the paper if the editor sees fit.
>
> My main criticisms of the paper are:
>
> * The definition of the "nominal signature" language is not completely clear.
>   The general format at the beginning of section 4 is very clear, but is in
>   fact too general: not everything that can be written in this format makes
>   sense. The authors then walk the reader through a series of 
> examples of what
>   is *forbidden* (with informal explanations why these examples are
>   forbidden), but in the end, a positive definition of what is *permitted*
>   seems to be missing.
>
> * The authors have isolated an important building block, the notion of
>   (multiple-name) abstraction (in Section 3). (Actually, there are three
>   variants of it.) This is good: it makes the whole construction modular
>   and helps simplify what follows. I don't know if this will make sense
>   for the authors, but I would like them to go further in this direction:
>   identify more elementary building blocks ("combinators", if you will),
>   study their properties in isolation, and in the end combine them to
>   obtain a very simple explanation of the "nominal signature" format
>   that is exposed to the user. In the present state of the paper, the
>   design of the "nominal signature" format seems somewhat ad hoc: the
>   format of the "binds" clauses is subject to several restrictions;
>   there seems to be a distinction between "binders" and ordinary
>   "terms"; there is a distinction between "recursive" and "non-recursive"
>   binders, and a distinction between "shallow" and "deep" binders. If
>   one could identify a small number of elementary building blocks and
>   explain/motivate the design of the surface specification language in
>   terms of these elementary notions, the paper might become all the more
>   compelling.
>
>   In the present state of the paper, I think the *implementation* of the
>   nominal package is very useful for the end user, but the *theory* that is
>   presented in this paper is still a bit cumbersome: the definitions of free
>   atoms, alpha-equivalence, etc. presented on pages 16-20 are understandable
>   but not compelling by their simplicity.
>
> * I do not quite understand the treatment of the finiteness restriction.
>   I understand that things must have finite support so as to allow picking
>   atoms outside of their support. But finiteness side conditions seem to
>   appear pretty early and in unexpected places; e.g. I would expect the
>   support of a set of atoms "as" to be equal to "as", regardless of whether
>   "as" is finite or infinite. This could be clarified.
>
> * The choice of abstraction "style" is limited to three built-in forms (list,
>   set, and set+). Perhaps one could make this user-extensible. After 
> all, very
>   few properties seem to be required of the basic abstraction forms, 
> so why not
>   let the user define new ones?
>
> * One may argue that the set-abstractions are an attempt to kill two birds
>   with one stone. On the one hand, we take the quotient raw terms modulo a
>   standard notion of alpha-equivalence; on the other hand, at the same time,
>   we take the quotient modulo a notion of structural equivalence (permutation
>   of binders, removal or introduction of vacuous binders). One could argue
>   that dealing with structural equivalence should be left to the 
> user, because
>   in general the structural equivalence axioms that the user needs can be
>   arbitrarily complex and application-specific. There are object languages,
>   for instance, where abstractions commute with pairs: binding a name in a
>   pair is the same as binding a name within each of the pair components.
>   (This is the case in first-order logic where forall distributes over
>   conjunction.) Thus, one may fear that in many cases, the set and set+
>   abstractions will not be sufficiently powerful to encode the desired
>   structural equivalence, and the user will need to explicitly define 
> a notion
>   of structural equivalence anyway. I don't think that the paper provides
>   convincing evidence that set and set+ abstractions are useful. (That said,
>   they don't cost much, so why not include them? Sure.)
>
> * Here is little challenge related to set-abstractions. Could you explain how
>   to define the syntax of an object language with a construct like this:
>
>     let x1 = t1 and ... and xn = tn in t
>
>   where the xi's are bound in t (this is a non-recursive multiple-let form)
>   and the order of the definitions does not matter (that is, "let x1 = t1
>   and x2 = t2 in t" is alpha-equivalent to "let x2 = t2 and x1 = t1 in t")?
>   Can you use a set-abstraction to achieve this? I am guessing that this
>   might be possible, if one represents the definitions "x1 = t1 and ..."
>   using a set of pairs (or a map of names to terms) as opposed to a list
>   of pairs. I think that the system should at the very least allow encoding
>   this example, otherwise set-abstractions will not be very useful in
>   practice.
>
> Detailed comments
>
> [Written while I was reading, so sometimes I ask a question whose 
> answer comes
> a bit later in the paper.]
>
> p.2, "this leads to a rather clumsy formalisation of W". Could you explain
> why? Although I can understand why in some circumstances it is desirable to
> have a notion of alpha-equivalence that includes re-ordering binders, 
> I am not
> sure that the ML type system (or its inference algorithm) is a good
> illustration. If one examines the typing rules of Core ML, one finds that
> their premises involve a notion of equality between *types* (for 
> instance, the
> function application rule requires that the types of the formal and actual
> arguments match) but do not involve any notion of equality between *type
> schemes*. Type schemes are constructed and eliminated; they are never 
> compared
> with one another. For this reason, it is not clear that a notion of
> alpha-equivalence for type schemes is required at all, let alone that it must
> allow re-ordering binders and/or disregarding vacuous binders.
>
> p.3, "let the user chose" -> "choose"
>
> p.5, I am not sure what you mean by "automatic proofs". Do you mean
> automatically-generated proof scripts, or proofs performed automatically by a
> decision procedure, or ... ?
>
> p.5, "adaption"
>
> p.5, it seems strange to use the symbol "+" for composition, a 
> non-commutative
> operation.
>
> Equation (2.2) is unfamiliar to me. I am used to seeing "supp x" defined as
> the least set L such that for every permutation pi, if pi fixes L, then pi
> fixes x. I assume that the two definitions are equivalent? Is there a reason
> why you prefer this one?
>
> Proposition 2.3, item (i) is not very easy to read, because text and math
> are mixed and "as" happens to be an English word. More importantly, could
> you explain why the hypothesis "finite as" is needed? The proposition seems
> intuitively true if we remove this hypothesis: it states exactly that 
> "supp x"
> is the least set that supports x (this is actually the definition of "supp"
> that I expected, as mentioned above).
>
> p.8, "equivariant functions have empty support". I suppose the converse is
> true, i.e. "functions that have empty support are equivariant". If this is
> correct, please say so.
>
> p.8, "we used extensively Property 2.1". You mean "Proposition 2.1". Perhaps
> it would be good to choose distinct numbers for inline equations and for
> propositions.
>
> p.8, "we identify four conditions: (i) [...] x and y need to have the same
> set of free atoms". You seem to be saying that fa(x) and fa(y) should be
> equal. But this is too strong; I suppose you mean fa(x) \ as = fa(y) \ bs.
> Please clarify. (Definition 3.1 indeed clarifies this, but I believe that
> the text that precedes it is a bit confusing.)
>
> p.9, it seems to me that alpha-equivalence for Set+ bindings (Definition 3.3)
> is in a sense the most general of the three notions presented here. Indeed,
> alpha-equivalence for Set bindings can be defined in terms of it, as follows:
>
>   (as, x) =_{Set} (bs, y)
>   if and only if
>   (as, (as, x)) =_{Set+} (bs, (bs, y))
>
> That is, I am comparing abstractions whose body has type "atom set * beta".
> The comparison of the set components forces condition (iv) of Definition 3.1.
> Similarly, alpha-equivalence for List bindings can be defined in terms of it,
> as follows:
>
>   (as, x) =_{List} (bs, y)
>   if and only if
>   (set as, (as, x)) =_{Set+} (set bs, (bs, y))
>
> That is, I am comparing abstractions whose body has type "atom list * beta".
> Am I correct to think that one can do this? If so, could this help eliminate
> some redundancy in the paper or in the implementation? And, for a 
> more radical
> suggestion, could one decide to expose only Set+ equality to the programmer,
> and let him/her explicitly encode Set/List equality where desired?
>
> p.10, "in these relation"
>
> p.10, isn't equation (3.3) a *definition* of the action of permutations
> on the newly defined quotient type "beta abs_{set}"?
>
> p.11, why do you need to "assume that x has finite support" in order to
> obtain property 3.4? It seems to me that this fact should also hold for
> an x with infinite support. Same remark in a couple of places further
> down on this page. You note that "supp bs = bs" holds "for every finite
> set of atoms bs". Is it *not* the case that this also holds for infinite
> sets? If so, what *is* the support of an infinite set of atoms? Why not
> adopt a definition of support that validates "supp bs = bs" for *every*
> set of atoms bs? Is there a difficulty due to the fact that what you
> call a "permutation" is in a fact "a permutation with finite support"?
> I think it would be good to motivate your technical choices and clarify
> exactly where/why a finite support assumption is required.
>
> p.11, "The other half is a bit more involved." I would suggest removing
> this rather scary sentence. The proof actually appears very simple and
> elegant to me.
>
> p.12, "mutual recursive" -> "mutually recursive"
>
> p.12, does the tool support parameterized data type definitions? If so,
> please mention it, otherwise explain whether there is a difficulty (e.g.
> the parameters would need to come with a notion of permutation).
>
> p.12, "Interestingly, [...] will make a difference [...]". At this 
> point, upon
> first reading, this is not "interesting" but rather frustrating, because it
> does not sound natural: my understanding would be very much simplified if
> "binds ... in t u" was equivalent to "binds ... in t, binds ... in 
> u". Because
> a forward pointer is missing, I cannot find immediately where this is
> explained, and this problem hinders my reading of the beginning of section 5.
>
> p.13, the type of sets now seems to be "fset" whereas it was "set"
> previously.
>
> p.13, the type of atoms now seems to be "name", whereas it was previously
> "atom". The remark on the last line of page 13 leads me to understand that
> "name" refers to one specific sort of atoms, whereas "atom" refers to an
> atom of any sort (right?). The function "atom" converts one to the other;
> but what is its type (is it overloaded?).
>
> p.13, you distinguish shallow binders (binds x in ...) and deep binders
> (binds bn(x) in ...). I would hope that a shallow binder is just syntactic
> sugar for a deep binder where "bn" is the "singleton list" or "singleton
> set" function. Is this the case? If not, why not? If yes, perhaps you could
> remove all mentions to shallow binders in section 5.
>
> p.14, "we cannot have more than one binding function for a deep binder".  You
> exclude "binds bn_1(p) bn_2(p) in t". Couldn't this be accepted and
> interpreted as "binds bn_1(p) \cup bn_2(p) in t"? (I guess it does not matter
> much either way.)
>
> p.14, you also exclude "binds bn1(p) in t1, binds bn2(p) in t2". Two
> questions. First, a clarification: if bn1 and bn2 are the same function, is
> this allowed or excluded? Second, I don't understand why you need this
> restriction, that is, why you are trying to prevent an atom to be "bound and
> free at the same time" (bound in one sub-term and free in another). I 
> mean, in
> the case of single binders, you seem to allow "binds x y in t1, binds 
> y in t2"
> (at least, you have not stated that you disallow this). There, occurrences of
> x in t1 are considered bound, whereas occurrences of x in t2 are considered
> free; is this correct? If so, why not allow "binds bn1(p) in t1, binds bn2(p)
> in t2", which seems to be of a similar nature? Is this a somewhat ad hoc
> restriction that simplifies your implementation work, or is there really a
> deep reason why accepting this clause would not make sense?
>
> p.14, example 4.4, the restriction that you impose here seems to rule out
> an interesting and potentially useful pattern, namely telescopes. A telescope
> is a list of binders, where each binder scopes over the rest of the 
> telescope,
> and in addition all of the names introduced by the telescope are considered
> bound by the telescope in some separate term. I am thinking of 
> something along
> the following lines:
>
>   nominal_datatype trm =
>   | Var name
>   | Let tele::telescope body::trm  binds bn(tele) in body
>   | ...
>
>   and telescope =
>   | TNil
>   | TCons x::name rhs::trm rest::telescope  binds x in rest
>
>   binder bn::telescope => atom list
>   where bn (TNil) = []
>       | bn (TCons x rhs rest) = [ atom x ] @ bn(rest)
>
> You write that "if we would permit bn to return y, then it would not be
> respectful and therefore cannot be lifted to alpha-equated lambda-terms". I
> can see why there is a problem: if "x" is considered bound (therefore
> anonymous) in the telescope "TCons x rhs rest", then it cannot possibly be
> returned by a (well-behaved) function "bn". I think that the answer to this
> problem should be: we must pick an appropriate notion of alpha-equivalence
> for telescopes, and this notion of alpha-equivalence must *not* consider x
> as anonymous in "TCons x rhs rest". Instead, x must be considered free in
> this telescope. The telescopes "TCons x rhs TNil" and "TCons y rhs TNil"
> must be considered distinct. Of course we could achieve this effect just by
> removing the clause "binds x in rest", but this would lead to a notion of
> alpha-equivalence for "Let" terms which is not the desired one: when writing
> "let (x1 = t1; x2 = t2) in t", we would like x1 to be bound in t2, and this
> will not be the case if we omit "binds x in rest" in the above definition.
> I conclude that your design (which seems very reasonable) cannot currently
> express telescopes. It would be nice if you could explicitly discuss this
> issue. Is it conceivable that an extension of your system could deal with
> telescopes? Other researchers have proposed approaches that can deal with
> them (I am thinking e.g. of ``Binders Unbound'' by Weirich et al.).
>
> Here is another general question. How would you declare a nominal data type
> for ML patterns? Informally, the syntax of patterns is:
>
>   p ::=
>     x                                                    (variable)
>   | (p, p)      where bn(p1) and bn(p2) are disjoint     (pair)
>   | (p | p)     where bn(p1) = bn(p2)                    (disjunction)
>   | ...
>
> In the case of a pair (or conjunction) pattern, one usually requires that the
> two components bind disjoint sets of names, whereas in the case of a
> disjunction pattern, one requires that the two components bind exactly the
> same sets of names. How would you deal with this? I imagine that one could
> just omit these two side conditions in the definition of the nominal data
> type, and deal with them separately by defining a well-formedness predicate.
> One question: in the definition of the "term" data type, at the point where
> one writes "binds bn(p) in t", which variant of the "binds" keyword would one
> use: "binds", "binds(set)", or "binds(set+)"? Does it make any difference,
> considering that a pattern can have multiple occurrences of a name in binding
> position? It would be interesting if you could explain how you would handle
> this example.
>
> Another interesting (perhaps even more tricky) example is the syntax of the
> join-calculus. In terms of binding, it is really quite subtle and worth a
> look.
>
> p.15, just before section 5, I note that the completion process does *not*
> produce any clause of the form "binds ... in x" (in the Lam case). One could
> have expected it to produce "binds x in x", for instance. One could imagine
> that, for *every* constructor argument t, there is a clause of the 
> form "binds
> .. in t". Here, you adopt a different approach: you seem to be partitioning
> the constructor arguments in two categories, the "terms" (which after
> completion appear in the right-hand side of exactly one "binds" clause) and
> the "binders" (which appear in the left-hand side of at least one "binds"
> clause). Please clarify whether this is indeed the case. (You have 
> presented a
> series of data type definitions that you forbid, but in the end, you should
> present a succinct summary of what is allowed.) Also, I seem to understand
> that the following definition is forbidden:
>
>   nominal_datatype trm =
>   | Foo t1::trm t2::trm  binds bn(t1) in t2, binds bn(t2) in t1
>
> (for some definition of "bn"). This would be forbidden because t1 and t2 are
> used both as "terms" and as "binders" (both on the left-hand and right-hand
> side of a "binds" clause). As far as I can see, however, you have not
> explicitly forbidden this situation. So, is it forbidden or allowed? Please
> clarify.
>
> If there is indeed a partition between "terms" and "binders", please justify
> why things must be so. I can think of a more general and more symmetric
> approach, where instead of writing "binds bn(p) in t" and considering that "p
> is a binder" and "t is a term", one would write "binds bn(p) in p t" and
> consider that p and t play a priori symmetric roles: the only difference
> between them stems from the fact that we collect the bound names 
> inside p, but
> not inside t. (I am not suggesting that the user should write this, but that
> the user syntax could be desugared down to something like this if this makes
> the theory simpler.) Ah, but I guess that if one were to follow this path,
> then one would need a way of distinguishing recursive versus non-recursive
> binders. I guess I see why your design makes sense, but perhaps you should
> better explain that it is a compromise between several other possible designs
> (``alphacaml'', ``binders unbound'', etc. are examples of other designs) and
> how you reached this particular point in the design space.
>
> OK, now I see that, since you allow ``recursive binders'', there is not a
> partition between ``terms'' and ``binders''. A recursive binder appears both
> on the left- and right-hand sides of a binds clause. Do you require that it
> appears on the left- and right-hand sides of *the same* binds clause, or do
> you allow the above example ("binds bn(t1) in t2, binds bn(t2) in t1")? If
> you do allow it, then I suppose t1 is viewed as a (non-recursive) binder in
> the first clause, while t2 is viewed as a (non-recursive) binder in the
> second clause. This would be kind of weird, and (I imagine) will not lead
> to a reasonable notion of alpha-equivalence. I am hoping to find out later
> in the paper.
>
> p.17, "we have to add in (5.3) the set [...]". It is not very clear whether
> you are suggesting that equation 5.3 is incomplete and something should be
> added to it, or equation 5.3 is fine and you are referring to B' which is
> there already. I suppose the latter.
>
> p.17, "for each of the arguments we calculate the free atoms as 
> follows": this
> definition relies on the fact that "rhs" must be of a specific *syntactic*
> form (unions of expressions of the form "constant set" or "recursive call").
> For instance, "rhs" cannot contain the expression "my_empty_set z_i", where
> "my_empty_set" is a user-defined function that always returns the empty set;
> otherwise the third bullet would apply and we would end up considering "z_i"
> as neither free nor bound. You have mentioned near the top of page 15 that
> binding functions "can only return" certain results. You should clarify that
> you are not restricting just *the values* that these functions can 
> return, but
> the *syntactic form* of these functions.
>
> p.23, "We call these conditions as": not really grammatical.
>
> p.23, "cases lemmas": I suppose this means an elimination principle?
>
> p.23, "Note that for the term constructors" -> "constructor"
>
> p.26, "avoid, or being fresh for" -> "avoid, or are fresh for"
>
> p.30, "Second, it covers cases of binders depending on other binders,
> which just do no not make sense [...]". I am curious why the designers
> of Ott thought that these cases make sense and you don't. Perhaps this
> point would deserve an example and a deeper discussion?
>
> p.30, at last, here is the discussion of "binds ... in s t" versus
> "binds ... in s, binds ... in t". I see that the difference in the
> two interpretations boils down to an abstraction whose body is a pair,
> versus a pair of abstractions. It is indeed interesting to note that
> these notions coincide for single-name abstractions, and for list
> abstractions, but not for set and set+ abstractions.
>
> p.32, "It remains to be seen whether properties like [...] allow us
> to support more interesting binding functions." Could you clarify
> what you mean? Do you mean (perhaps) that fa_bn(x) could be defined
> as fa_ty(x) \ bn(x), regardless of the definition of bn(x), instead
> of by induction over x? Do you mean something else?
>
> The example in Figures 1 and 2 do not seem very interesting to me.  It
> involves single binders and flat lists of binders. Not much subtlety going on
> here. I think this example could be reduced in size without losing 
> anything in
> terms of content. And perhaps a trickier example could be added (I have two
> suggestions, which I mentioned above already: ML with conjunction and
> disjunction patterns; and the join-calculus).
>
>
>
>
>
>
>
author	Christian Urban <urbanc@in.tum.de>
	Thu, 29 Dec 2011 12:40:36 +0000
changeset 3099	502b5f02edaf
child 3102	5b5ade6bc889
permissions	-rw-r--r--