Slides: “On the atoms of linguistic computation”

Posted on 10/14/2020

I’ve posted the slides for a guest seminar I gave recently as part of the More Advanced Syntax graduate course at MIT.

These slides represent my latest thinking (as of Oct 2020, anyway) about the question of how syntax interfaces with morpho-phonology and with semantics.

For those of you who are well-versed in some of these questions and are in a rush, here’s the tl;dr version: it’s “Nanosyntax-style spanning meets the ‘three lists’ architecture of Distributed Morphology.” But it’s not some arbitrary mix-and-match of these two pieces of grammatical architecture. Arguments are provided that this is actually the right way to proceed.

Relevant background reading:

New paper: “Taxonomies of case and ontologies of case”

Posted on 09/23/2020

I’ve posted a pre-print of a paper of mine that’s set to appear in an edited volume. The paper is titled Taxonomies of case and ontologies of case. It is a theoretical review paper of sorts, and it has several intertwined goals:

  1. To show what a system of configurational case assignment would look like when formulated in current syntactic terms (rather than the GB terms in which it was originally proposed, e.g. in Marantz’s 1991 paper).
  2. To show that given (1), the proposal in Baker’s (2015) book, to add case-assignment-under-phi-agreement to a configurational case system, is an empirically vacuous one. Everything it can account for can also be accounted for under a purely configurational system as construed in (1), with no appeal whatsoever to phi-features within the theory of case.
  3. To argue that the system in (1) is therefore sufficient to account for case, cross-linguistically. It is also necessary, in the sense that theories with no dependent-case component are unable to serve as general theories of case.
  4. To remind ourselves that one cannot argue against (3) by, e.g., presenting a language in which the-case-pretheoretically-identified-as-‘accusative’ doesn’t conform to the predictions of dependent case. That would only work if descriptive labels like ‘accusative’ were guaranteed to carve out a natural class of grammatical phenomena, but there is no reason to believe that they do.

The paper can be downloaded here.

(Backup link in case lingbuzz is down: here.)

Posted on 07/25/2020

This is not so much a blog post as it is a collection of things that I think deserve your attention. As you will see, it is quite a self-serving list, in that several of these works provide evidence in favor of claims that I have also been arguing for. But hey, it’s my blog, right? 😊

  1. Pavel Rudnev has a paper set to appear in Glossa arguing against approaches to anaphoric binding in terms of phi-Agree, and in favor of an encapsulation-based account of the Anaphor Agreement Effect, of the kind I have argued for as well. (More converging evidence, with a twist, comes from the work of Rafael Abramovitz on the AAE in Koryak.)
  2. Recent work by Susi Wurmbrand & Magdalena Lohninger on clausal complementation, showing (among other things) that the semantics of clausal complements cannot be read directly off the syntax. Instead, the syntax of a given language will determine which complementation options a given verb in that language will have (subject to an implicational hierarchy that Wurmbrand & Lohninger uncover, but, importantly, underdetermined by the semantics). The semantics then has to map the possible readings of a given complement onto what these syntactically-prescribed structural possibilities happen to be. As readers of this blog know, this is entirely in line with what we find in other empirical domains. My slogan for this has been: “Meaning contrasts are not generated by syntax, they are parasitic on the contrasts syntax happens to make available.” (Not so pithy, I know. But still, this flies in the face of standard wisdom in the Montagovian tradition, so I think it’s worth hammering this point home.)
  3. Pavel Rudnev again! This time, in a paper that’s already available for “early view” in Linguistic Inquiry. The paper provides an argument based on agreement in Avar in favor of restricting phi-agreement to Downward Agree (a.k.a. Upward Valuation; Diercks, Koppen & Putnam 2019, as well as various papers of mine, some of them co-authored with Maria Polinsky).

V-NYI 2020

Posted on 07/15/2020

The 2020 edition of the NY ‑ St. Petersburg Institute of Linguistics, Cognition, and Culture (NYI) will take place virtually, during the last two weeks of July!

Together with Asia Pietraszko (University of Rochester), I’ll be teaching a course called Words and other things: what do you need to list in your head?

For more information, including the course description, please see my Teaching & Advising page.

· · · · · · · · · · · · · · · · · · · · 

The course is now complete. Asia and I are happy to share the course materials with any interested individuals. If you are interested, drop me a line.

Suyoung Bae defends!

Posted on 07/03/2020

I am proud to announce that Suyoung Bae, my second-ever PhD advisee (co-advised with Howard Lasnik), has defended her thesis!

The thesis is an investigation of Korean amwu-. Suyoung shows that this negation-dependent expression is neither a Negative-Polarity Item (NPI) nor a Negative-Concord Item (NCI), but instead, a third type of negation-dependent item, whose distribution is governed by purely syntactic factors: constituency, the restrictions on A-movement, and the restrictions on long head movement. On the way, she makes novel observations about “radical reconstruction” in Korean (spoiler: it’s not always radical!) and consequently, about long-distance scrambling in Korean (it can’t be “PF movement”), how Cyclic Linearization constrains subextraction from complex noun phrases, and more.

I hope Suyoung makes the thesis available on lingbuzz once she has filed it – in the meantime, if you are interested, please get in touch with her!

Congrats, Suyoung!

Published in Glossa: “Functional structure in the noun phrase: revisiting Hebrew nominals”

Posted on 07/02/2020

My paper, “Functional structure in the noun phrase: revisiting Hebrew nominals,” has been published in Glossa. This paper revisits Ritter’s (1991) findings about nominals in Hebrew in light of recent proposals that nominal phrases lack outer functional structure and are instead projections of the noun (Bruening 2010, 2020 and Bruening, Dinh & Kim 2018). Ritter’s findings, though originally put forth to argue for an additional functional projection (NumP) between DP and NP, can be seen to provide even stronger evidence that the nominal phrase is not a projection of the noun itself. You can read more about this on my research page.

The published version is freely available for download here. (Yay for Open Access!)

If you prefer a pre-print, that is still available here.

	Author = {Preminger, Omer},
	Doi = {10.5334/gjgl.1244},
	Journal = {Glossa},
	Number = {1},
	Pages = {68},
	Title = {Functional structure in the noun phrase: revisiting {H}ebrew nominals},
	Volume = {5},
	Year = {2020}}
Posted on 06/18/2020

For a while now, I have been pondering the prospects of a realizational/interpretive theory in which spellout to PF and spellout to LF involve separate collections of rules, and where individual spellout rules crucially map from sets of syntactic terminals to exponents or to meaning primitives. (For a given spellout rule to be applicable, the set of nodes in its input specification must appear contiguously in the input structure. In this post, I’m not going to concentrate on how contiguity is defined at each of the interfaces. Email me if you would like some details on my in-progress thinking on these matters.) The sets may happen to be singleton sets, but that is not an architecturally-privileged state of affairs. This way of thinking of spellout makes a number of desirable predictions; for example:

  • There should be instances where the applicable PF-spellout rule and the applicable LF-spellout rule stand in a relation of partial overlap.
  • There should be terminals for which there happens to be no singleton PF-spellout rule, or for which there happens to be no singleton LF-spellout rule; such terminals will have no elsewhere form, or no context-free interpretation, respectively.

As I’ve noted elsewhere, these things are well-attested.

The picture that emerges is in some ways similar to Nanosyntax, where the candidates for spellout are also sets of nodes (that are contiguous in a specific technical sense). It departs from Nanosyntax in entirely divorcing spellout to PF from spellout to LF (see above), as well as in defining contiguity differently for PF-spellout rules and LF-spellout rules.

Such a system, in which the input to PF- and LF-spellout rules is a set of terminals rather than an individual terminal, seems to fly in the face of the claim made in Embick & Marantz’s 2008 paper, Architecture and Blocking (henceforth EM08). This paper argues that architecturally speaking, “words” do not enter into competition and blocking with phrases, nor do they even enter into competition and blocking with other “words”. The only blocking that is part of the grammatical architecture occurs at the morpheme level, where one exponent competes with another for insertion (see also Embick 2017). Crucially, this requires a framework in which lexical insertion specifically targets individual syntactic terminals. [Footnote 1: Confusingly enough, the DM literature refers to syntactic terminals as ‘morphemes’. I have translated this back to more sane terminology, here.] It is this claim, that vocabulary insertion (what I termed PF-spellout, above) is restricted to individual syntactic terminals, that I am interested in here – since it appears at first glance to stand in direct conflict with the system I am envisioning. I am going to argue instead that, depending on how one chooses to understand EM08, this claim about vocabulary insertion is either vacuous or wrong. That is not to say that EM08’s other claims, e.g. about the empirical picture concerning “word” and phrase blocking effects, are incorrect. I am only taking aim at the particular claim about the locus/granularity of vocabulary insertion.

Let’s consider the alternations in (1) as our test case:

(1) a. dog (dɔg) – dogs (dɔgz)
    b. goose (gu:s) – geese (gi:s)

There are a few clarifications to make about (1) before we get started. First, the plural morpheme is not some Agr node “added at PF” (though you will find claims like this in the DM literature). Number in the noun phrase comes from a dedicated functional projection, NumP. The degrees of freedom are whether the overt phonological material associated with the plural is the spellout of the head Num⁰, or the spellout of a feature (call it [plural]) whose occurrence on the noun (or more accurately, on n⁰) is conditioned by the presence of the right variant of Num⁰, perhaps via syntactic Agree. But neither of these things (the plural variant of Num⁰ or the feature [plural] on n⁰) is “added at PF.” [Footnote 2: One might even say that the “added at PF” gambit suggests that when syntacticians talk, DMers nap. (This is a callback to Marantz 1997, and a loving one at that!)] Second, I am going to treat geese as suppletive, relative to goose, even though some treatments of alternations like these would characterize this as a “readjustment rule.” What is clear here is that there is no reasonable phonological rule of English (please note: “rule” implies productive knowledge) that would trigger this alternation. If your phonological theory is sufficiently sophisticated to treat geese as the result of affixing some autosegmental/suprasegmental material to goose, more power to you, but then you’ll have to change this example, in your mind’s eye, to one in which such a move is not possible (person-people or whatever you prefer).

Okay, with that out of the way, let’s talk about (1) from the perspective of EM08. If the English plural /‑z/ is the (elsewhere) spellout of plural Num⁰, then there is actually no way for geese to block gooses at the level of the individual syntactic terminal. An EM08-adherent would therefore be forced into one of two positions. The first is what I’ll call featuralization, and the second is mutually-conditioned allomorphy.


Featuralization

Suppose that instead of assuming that /‑z/ is the (elsewhere) spellout of Num⁰, the spellout of English plural Num⁰ was always null, and /‑z/ was the spellout of the feature [plural] on n⁰, a feature whose occurrence is conditioned by the presence of plural Num⁰ (e.g. via syntactic Agree). Now geese could block goose, but we would have to assume that, in those cases where [plural] surfaces as its own exponent (e.g. /‑z/), Fission has applied, to enable [plural] to be targeted for vocabulary insertion separately from the rest of the content of the noun.

Crucially, we can generalize what we just did with Num⁰ and [plural] into a recipe for turning any blocking effect that seems to involve multiple successive heads into a version that is EM08-compliant. Suppose we are looking at the following state of affairs:

(2) a. Let <X₁⁰, …, Xₙ⁰> be a series of successive heads, where for every 1≤i≤n-1: XᵢP is the complement of Xᵢ₊₁.
    b. Let none of <X₁⁰, …, Xₙ₋₁⁰> have overt specifiers.
    (I will refer to this as a PF-contiguous span of heads.)

And suppose that we see what seems to be a blocking effect involving <X₁⁰, …, Xₙ⁰>; for example: the spellout of {√GOOSE, n⁰, Num⁰[pl]} as geese competing with, and blocking, its spellout as gooses. We can bring this into compliance with EM08 as follows:

(3) a. Assume the spellout of <X₂⁰, …, Xₙ⁰> is invariably null.
    b. Define a set of features <F₂, …, Fₙ>, where:
        i. For every 2≤i≤n: Xᵢ⁰ is base-generated with a valued [Fᵢ] feature.
        ii. X₁⁰ is base-generated with unvalued Fᵢ features for all values of i (2≤i≤n).
        iii. For every 2≤i≤n: X₁⁰ enters into Agree in Fᵢ with Xᵢ⁰ (thus acquiring valued [Fᵢ]).

We can now recast any blocking involving <X₁⁰, …, Xₙ⁰> as blocking that is occurring exclusively at X₁⁰. Anything that looked previously like vocabulary insertion at Xᵢ⁰ (2≤i≤n) can now be handled as Fission of Fᵢ from X₁⁰. We thus have a recipe for bringing any multi-terminal blocking that meets the criteria in (2) into compliance with EM08, indicating that the restriction of formal competition (and thus, blocking) to individual syntactic terminals is not doing the work that EM08 suggested it was doing.

An EM08-adherent might find solace in the fact that, since (3b.iii) involves Agree, there is a built-in restriction (e.g. phases) on how far away from X₁⁰ a head can be while still contributing an Fᵢ feature that will participate in blocking at X₁⁰. That is all well and good, but it is still equivalent to just saying that there is blocking among sets of terminals, so long as the members of those sets all occur inside a single phase (which seems trivial on any approach where spellout is cyclic).
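
For readers who like to see such recipes run, here is a minimal Python sketch of the featuralization recipe in (2)-(3). Everything in it is an illustrative assumption rather than part of EM08 or of DM proper: the function names, the flattening of head labels and features into plain strings, the √DOG item, and in particular the choice that the vocabulary item realizing the most features wins. Fission is modeled crudely: whatever features the winning item did not realize get their own round of insertion.

```python
# Toy model of featuralization: all insertion happens at the lowest head X1.
# Assumption: head labels and features are flattened into plain strings.

# Vocabulary items for the single terminal X1: (features realized, exponent).
VOCAB = [
    (frozenset({"√GOOSE", "n", "pl"}), "geese"),
    (frozenset({"√GOOSE", "n"}), "goose"),
    (frozenset({"√DOG", "n"}), "dog"),
    (frozenset({"pl"}), "-z"),
]


def agree_onto_lowest(span):
    """(3b): X1 acquires a valued copy of the feature of every higher head."""
    return frozenset(span)


def insert_with_fission(bundle, vocab=VOCAB):
    """Insert the most specific matching item at X1, then fission off the
    features it did not realize and repeat until nothing matches."""
    remaining = set(bundle)
    exponents = []
    while remaining:
        matches = [(feats, exp) for feats, exp in vocab if feats <= remaining]
        if not matches:
            break  # leftover features receive no exponent
        feats, exp = max(matches, key=lambda item: len(item[0]))
        exponents.append(exp)
        remaining -= feats
    return "".join(exponents)


def spell_out_at_x1(span):
    """span lists the heads bottom-up, e.g. ["√GOOSE", "n", "pl"]."""
    return insert_with_fission(agree_onto_lowest(span))


print(spell_out_at_x1(["√GOOSE", "n", "pl"]))  # geese
print(spell_out_at_x1(["√DOG", "n", "pl"]))    # dog-z
```

Note that the competition between geese and goose is resolved entirely inside one feature bundle, which is the sense in which the recipe is EM08-compliant while still encoding a multi-head interaction.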

I am about to turn the discussion to mutually-conditioned allomorphy, but before I do so, I think it’s worth pointing out that it’s not at all clear how one could possibly rule out what is described in (2‑3) (at least as long as one assumes there is such a thing as Fission). This is important because it means that even if one prefers the mutually-conditioned allomorphy treatment of goose-geese (or person-people, etc.), the loophole described here still exists. Thus, whether we like it or not, the restriction of competition to insertion of exponents at individual syntactic terminals is unable to do any work that is not already done by restricting competition to PF-contiguous spans that are contained in a single phase.

Mutually-conditioned allomorphy

As an alternative to featuralization, suppose we instead attempt to rescue the EM08-compliant treatment of (1) in a different way:

(4) a. GOOSE → geese / [plural]
    b. Num[pl] → ∅ / GOOSE

On this view, geese (or people, etc.) arises as something of a conspiracy, wherein the elsewhere form of the plural Num⁰ (/‑z/) is overridden in the presence of the root GOOSE, while the elsewhere form of this root (goose) is overridden in the presence of the plural Num⁰.

At this juncture, it is useful to take note of a particular property of the goose-geese example, which I have been neglecting so far, and which demonstrates that (4) is in any case too simplistic of a treatment. Consider (5):

(5) The corrupt accountant gooses(/*goose/*geese/*geeses) the earning reports every quarter.

I point out (5) because it shows that the occurrence of the form goose isn’t dependent on nominal number even being present in the structure. That is, goose is not the counterpart of geese in the presence of Num[sg]; it really is the elsewhere form, and presence of Num[pl] triggers a contextual allomorph of that form. For concreteness, let us assume that goose is the spellout of {√GOOSE, n}, and that the verbal use in (5) involves the (common in English) null v denominal verbalizer. In other words, the verb stem in (5) is the spellout of {√GOOSE, n, vdenom}, or more accurately, the spellout of {√GOOSE, n} (which is goose) plus the spellout of {vdenom} (which is null).

(6) a. {√GOOSE, n, Num[pl]} → geese
    b. {√GOOSE, n} → goose
    c. {vdenom} → ∅

This is not the only way to capture what is going on with goose-geese (incl. the verbal paradigm), but this way of characterizing the data explicitly sets up a scenario where the spellout of one span (6a) competes with, and preempts, the spellout of a smaller span contained therein (6b) – precisely the sort of thing that EM08 wants to architecturally rule out. So if we can show a recipe that translates (6) into an EM08-compatible characterization involving mutually-conditioned allomorphy, but where vocabulary insertion is restricted to terminals, we will again have shown that the architectural restriction in EM08 is not doing the work it is purported to do.
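
The span-based rules in (6) can be sketched in a few lines of Python. This is a toy under stated assumptions, not a worked-out theory: contiguity is approximated as adjacency in a bottom-up list of heads, preemption is implemented as "the largest matching span wins," and the names `RULES` and `spell_out` are mine.

```python
# Toy span-based PF-spellout. Heads are listed bottom-up.
# Assumptions: contiguity = adjacency in the list; largest matching span wins.

RULES = {
    frozenset({"√GOOSE", "n", "Num[pl]"}): "geese",  # (6a)
    frozenset({"√GOOSE", "n"}): "goose",             # (6b)
    frozenset({"v_denom"}): "",                      # (6c), null exponent
}


def spell_out(heads, rules=RULES):
    """Scan bottom-up; at each position, apply the rule that matches the
    largest contiguous stretch of heads starting there."""
    exponents = []
    i = 0
    while i < len(heads):
        for j in range(len(heads), i, -1):  # try the largest span first
            span = frozenset(heads[i:j])
            if span in rules:
                exponents.append(rules[span])
                i = j
                break
        else:
            raise ValueError(f"no spellout rule covers {heads[i]!r}")
    return exponents


print(spell_out(["√GOOSE", "n", "Num[pl]"]))  # ['geese']
print(spell_out(["√GOOSE", "n", "v_denom"]))  # ['goose', '']
```

In the first call, (6a) preempts (6b) simply because it covers more of the structure; in the second, no rule covers all three heads, so (6b) and (6c) apply in turn, giving the verb stem in (5).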

As before, I will start by translating this particular example into an EM08-compatible implementation, and then generalize the mechanism of inter-translation. Let us begin with the following ‘elsewhere’ rules for the exponents of √GOOSE, n, and Num[pl]:

(7) a. √GOOSE → goose
    b. n → ∅
    c. Num[pl] → /‑z/

One thing we could do at this juncture is observe that (7b) is a null exponent, and that (7a) and (7c) would therefore be adjacent as far as the overt structure is concerned. This is the approach taken by Embick (2010) (though it’s worth noting that it is explicitly rejected by Bobaljik 2012, for example, in his treatment of comparative & superlative morphology). But since this would reduce the span in question to a rather trivial one – involving only 2 nodes – let us make things harder on ourselves, and assume that we cannot ignore (7b): it still intercedes between √GOOSE and Num[pl], disrupting the kind of adjacency required for contextual allomorphy in an EM08-style system. (Everything I’m about to say will of course also work if n⁰ is “pruned” à la Embick 2010, but as I said, I’m intentionally choosing the route that will make an EM08-style treatment harder to construct.)

Nevertheless, we can take a page from the featuralization approach, above, and assume that n⁰ acquires the [plural] feature from Num⁰ derivationally (e.g. via Agree). In a move that may seem more controversial – but I will argue, shortly, is not – I will assume that n⁰ can also enter into a syntactic relation with √GOOSE resulting in the identity of the root being reflected in the syntactic content of n⁰ itself.

The reason this last move may seem controversial is that a tenet of conventional DM dogma holds that roots are not individuated in the syntax. In essence, the thinking goes, there is only one root object in the narrow-syntactic lexicon (the list of available syntactic atoms). Because roots are assumed to be featureless, syntax wouldn’t have any way of telling multiple root objects apart anyway. If this were true, the last step, above – where n⁰ derivationally acquires featural content reflecting the identity of its root complement – would be impossible.

However, there are both empirical and conceptual reasons to reject the conventional DM premise regarding roots being “featureless” and, therefore, unindividuated in the syntax. Empirically, Harley (2014) has shown that roots cannot be individuated semantically or phonologically, leaving syntactic individuation as the only option still standing. (Importantly, this conclusion holds both on the strong version of her 2014 claim, whereby all arguments are selected by roots, and also on the weaker version of the claim, which Harley settles on in the reply to the commentaries on her target article, whereby some arguments are selected by roots and some are selected higher up, by syntactic categorizers. The latter view, as far as I can tell, is also compatible with Merchant’s 2019 observation that categorizers often do affect the selectional properties of the roots they attach to.) But I think the conceptual argument, in this case, is even stronger: as I have discussed elsewhere, any version of DM in which the identity of roots is negotiated post-syntactically is an equivocation of modularity, anyway. Ignore what DM declares itself to be doing: any line of communication between PF and LF is syntax, and so there is no version of DM in which roots are not individuated in the syntax. And what does it mean to be individuated in the syntax? It means individual roots (like √GOOSE) have properties legible to the syntax. Let me repeat that: √GOOSE has properties legible to the syntax that distinguish it from √DUCK. I am not proposing this, so much as I am pointing out that it follows from any reasonable definition of how the grammar is modularized.

Given this, there is also no obstacle to assuming that n⁰ acquires, in the course of the derivation, syntactic properties reflecting that its root complement was √GOOSE (and not √DUCK, or √ESSAY, or …). This is possible because the difference between √GOOSE and other roots is, by definition, legible to the syntax.

After these feature transmissions occur in syntax, the structural representation handed over to PF will be as follows:

(8) Num⁰[pl] » n⁰[pl, GOOSE] » √GOOSE       (where ‘»’ indicates immediate c-command)

We can now recast (6) as in (9):

(9) a. √GOOSE → geese / ___ n[pl]
    b. Num[pl] → ∅ / ___ n[GOOSE]

At this juncture, one might object on the grounds that there are reams of work in morphology indicating that allomorphy is a highly local business, and the kind of non-local interactions just sketched are unavailable, empirically speaking. (Another way of putting this: Embick and others had good empirical reason for proposing their stringent conditions on allomorphy.) My response to that is that those empirical generalizations apparently still await explanation, because the mechanisms put forth to account for them (i.e., to rule out non-local interactions of the kind seen here) are technically unable to do so. In other words: I don’t deny the empirical basis DMers had for proposing these restrictions; I deny that the restrictions proposed get the job done.
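
To see that (7)-(9) really do restrict insertion to individual terminals while deriving the same forms, here is a small Python sketch. It is a toy under assumptions of my own: the names `RULES`, `n_features`, `insert_terminal` and `spell_out` are hypothetical, the feature transmission to n is hard-coded rather than derived by a general Agree operation, and rule ordering (contextual rules before elsewhere rules) stands in for whatever specificity principle one prefers.

```python
# Toy terminal-by-terminal insertion with mutually-conditioned allomorphy.
# Each rule: (terminal, feature required on n, exponent); None = elsewhere.
# Contextual rules are listed before elsewhere rules, so they win.

RULES = [
    ("√GOOSE", "pl", "geese"),    # (9a)
    ("Num[pl]", "√GOOSE", ""),    # (9b)
    ("√GOOSE", None, "goose"),    # (7a)
    ("√DOG", None, "dog"),        # elsewhere form for a regular root, cf. (1a)
    ("n", None, ""),              # (7b)
    ("Num[pl]", None, "-z"),      # (7c)
]


def n_features(root, num):
    """Feature content of n after feature transmission, as in (8):
    n reflects the identity of its root and, if present, [pl] from Num."""
    features = {root}
    if num == "Num[pl]":
        features.add("pl")
    return features


def insert_terminal(terminal, context):
    """Return the exponent of one terminal, given the features on n."""
    for target, required, exponent in RULES:
        if target == terminal and (required is None or required in context):
            return exponent
    raise KeyError(terminal)


def spell_out(root, num=None):
    context = n_features(root, num)
    terminals = [root, "n"] + ([num] if num else [])
    return "".join(insert_terminal(t, context) for t in terminals)


print(spell_out("√GOOSE", "Num[pl]"))  # geese
print(spell_out("√GOOSE"))             # goose
print(spell_out("√DOG", "Num[pl]"))    # dog-z
```

Every call to `insert_terminal` looks at exactly one terminal plus the features on n, yet the outputs match what the span-based rules in (6) deliver.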

It is now time to generalize this treatment, i.e., to show that any account like the one in (6), above – where an exponent associated with one PF-contiguous span competes with and blocks the insertion of an exponent associated with a smaller PF-contiguous span – can be restated in terms of mutually-conditioned allomorphy, with lexical insertion restricted to individual terminals. To the extent that we are able to provide a general recipe of this sort, we will have shown once again that the restriction of insertion to individual terminals does no empirical work.

Let us start again with the state of affairs in (10) (repeated from (2)):

(10) a. Let <X₁⁰, …, Xₙ⁰> be a series of successive heads, where for every 1≤i≤n-1: XᵢP is the complement of Xᵢ₊₁.
     b. Let none of <X₁⁰, …, Xₙ₋₁⁰> have overt specifiers.
     (A PF-contiguous span of heads.)

We can then recast any interaction among multiple heads inside this span in terms of mutually-conditioned allomorphy of individual terminals, as follows:

(11) Define a set of features <F₁, …, Fₙ>, where:
     a. For every 1≤i≤n: Xᵢ⁰ is base-generated with a valued [Fᵢ] feature.
     b. For every 1≤i≤n, 1≤j≤n, j≠i: Xᵢ⁰ is base-generated with an unvalued [Fⱼ] feature.
     c. For every 1≤i≤n, 1≤j≤n, j≠i: Xᵢ⁰ enters into Agree in Fⱼ with Xⱼ⁰ (thus acquiring valued [Fⱼ]).

We can now implement mutually-conditioned allomorphy of any head in the span in (10) based on the features of any other head in the same span, up to restrictions on the locality of feature transmission in (11c) (e.g. up to the phase boundaries restricting Agree). As was the case with featuralization, above, it seems natural enough that competition for span-based PF insertion would have to occur within the bounds of a single phase, anyway, so there seems to be no meaningful distinction here, either.

While (11b) requires n features on each head in the span, and (11c) requires a number of Agree relations that is on the order of n², in practice many of these will do no work in the translation of span-based competition to competition based on individual terminals. For example, in translating the example in (6) along the lines in (11), any n⁰-based features copied to Num⁰ will play no actual role in conditioning any allomorphy, and so in practice they need not exist, and any Agree relations they are involved in need not occur. This will be the case for many of the feature relations generated in principle by (11c). None of this is relevant, however, to our main point, which is that nothing beyond Agree is necessary to recast span-based competition in terms of competition at individual terminals.
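
The bookkeeping in (11) is easy to check mechanically. The following Python sketch (function name and representation are my own, purely for illustration) gives every head in a span its own valued feature, lets every head Agree with every other head, and counts the relations involved. Phase boundaries and other locality restrictions on Agree are deliberately left out.

```python
from itertools import permutations


def agree_everywhere(heads):
    """Apply (11): each head starts with its own valued feature, and every
    ordered pair (probe, goal) of distinct heads is one Agree relation."""
    features = {head: {head} for head in heads}
    relations = 0
    for probe, goal in permutations(heads, 2):
        features[probe].add(goal)  # probe acquires valued [F_goal]
        relations += 1
    return features, relations


features, relations = agree_everywhere(["√GOOSE", "n", "Num[pl]"])
print(relations)  # 6, i.e. n * (n - 1) for n = 3
```

After this step, each head carries the features of the entire span, so any contextual rule stated over the span can be restated as a rule conditioned on the features of a single terminal. As the text notes, most of these n(n-1) relations are idle in any given case and could simply be omitted.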

It is important to note that there is no sui generis mechanism of mutually-conditioned allomorphy at play here, only Agree and garden variety feature-based contextual allomorphy. Thus, unlike the conclusions in the Featuralization section, this result obtains independently of one’s position on the existence of particular operations like Fission.


We have seen that imposing a restriction on competition and blocking, so that they only take place among different exponents vying for insertion at a single syntactic terminal, does not achieve anything that is not already achieved by restricting competition to PF-contiguous spans within a single phase.

I presented two different recipes for recasting competition and blocking among PF-contiguous spans in terms of competition and blocking at individual terminals only. Eliminating the operation of Fission from the grammar would rule out one of the two recipes, namely, the featuralization one; but it is much less clear how one would rule out the other recipe – which, as noted, does not appeal to any sui generis mechanisms beyond Agree and feature-based contextual allomorphy. One could imagine adding some sort of meta-principle that rules out what we have descriptively characterized as mutually-conditioned allomorphy. But this, as far as I am able to tell, would render the system incapable of capturing alternations like goose-geese (or person-people, or …). It is for this reason that I stated at the beginning of this post that the restriction in question on competition and blocking is either vacuous (on the assumption that Fission and/or mutually-conditioned allomorphy exist) or wrong (if they don’t).

As a side note to all this, banning both Fission and mutually-conditioned allomorphy may not even be sufficient to tear down the equivalence between insertion at terminals only and insertion into spans. As Pavel Caha points out in his thesis (pp. 57‑60), and again here (pp. 7‑9), any system with Fusion and insertion into terminals is also equivalent to a system with insertion into spans.

I see all this as very good news, since I think the view whereby the locus of insertion is a span of contiguous heads has a lot going for it (more on that some other time), and so I’m happy to discover that adopting such a view does not cede any meaningful ground to the EM08 alternative.

Thanks to Pavel Caha, Neil Myler, and Asia Pietraszko for helpful discussion. They are not responsible for the contents of this post.

Posted on 04/26/2020

For the last ten years or so, Chomsky has been claiming increasingly often that the discrete bifurcation of expressions into “grammatical” and “ungrammatical” is incorrect. I think he is wrong, or at least that these claims are without any current basis. But before explaining why, it’s important to set some parameters of the discussion.

First, we have to fix what we mean by expressions. If we mean strings, or even a given token of phonation, then I have no quarrel with this. Taking strings as the object of study in linguistics is, to quote Indiana Jones, “digging in the wrong place.” A grammatical constraint, like the Complex NP Constraint (CNPC) for example, can reduce the number of meanings associated with a given string from one to zero, as in (1):

(1) * Which dish do you know the guy who brought?

But the same constraint can reduce the number of meanings associated with a different string from two to one, as in (2) (which can be interpreted as a question about reasons for knowing, but not as a question about reasons for bringing):

(2) Why do you know the guy who brought this dish?

A preoccupation with strings as the object of study necessarily misses the point of grammatical constraints, since it artificially affords (1) (which has zero remaining meanings) different status than (2) (which has a non-zero number of remaining meanings).
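
The point can be put in toy computational terms. In the Python sketch below, each string is paired with its candidate structures, and the CNPC is a filter on structures, not on strings. The candidate parses and the `gap_site` labels are simplifications assumed for illustration only; they are not a real analysis of these sentences.

```python
# Candidate structures per string; "gap_site" records where the wh-phrase
# originates. These toy parses are assumptions for illustration only.
CANDIDATES = {
    "Which dish do you know the guy who brought?": [
        {"reading": "dish that was brought", "gap_site": "relative clause"},
    ],
    "Why do you know the guy who brought this dish?": [
        {"reading": "reason for knowing", "gap_site": "matrix clause"},
        {"reading": "reason for bringing", "gap_site": "relative clause"},
    ],
}


def obeys_cnpc(structure):
    """A structure is ruled out if the gap sits inside the complex NP."""
    return structure["gap_site"] != "relative clause"


def surviving_readings(string):
    return [s["reading"] for s in CANDIDATES[string] if obeys_cnpc(s)]


for string in CANDIDATES:
    print(len(CANDIDATES[string]), "->", len(surviving_readings(string)))
```

The filter does the same thing in both cases: it removes one structure. Only a string-based view makes the first case (one reading down to zero) look different in kind from the second (two readings down to one).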

So, if Chomsky’s point is, “You can’t bifurcate the set of strings into grammatical and ungrammatical,” then there is no disagreement here. But that doesn’t have anything to do with the notion of grammaticality as such. It has to do with misapprehending the object of study. Language does not generate “strings”; it generates structures. Or, if you prefer (though I do not): form-meaning pairings.

But this is not Chomsky’s point, I don’t think. He seems to be saying that structures (or form-meaning pairings) cannot be bifurcated into grammatical and ungrammatical, either. As best I can tell, he has offered two arguments for this view over the years – and both of them are unsound.

The first putative argument is that utterances simply cannot be bifurcated into acceptable and unacceptable; there is, instead, a range of degrees of acceptability. Therefore, the argument goes, a theory of grammar that delivers a binary verdict (either “grammatical” or “ungrammatical”) is inadequate. The flaw here is in reasoning from the gradience of a behavioral measure – in this case, acceptability – to the gradience of a computational predicate (grammaticality). As Armstrong, Gleitman & Gleitman (1983) showed, you can get gradient responses from people to prompts like “How even/odd is this number?” [Footnote 1: Armstrong, Sharon Lee, Lila R. Gleitman & Henry Gleitman. 1983. What some concepts might not be. Cognition 13:263–308, DOI: 10.1016/0010-0277(83)90012-4.] That doesn’t mean that even or odd are gradient predicates. The gradience of acceptability (as a behavioral measure) doesn’t mean that grammaticality (as a computational predicate) is gradient, either. [Footnote 2: I must say, I find it quite baffling that this basic point is lost on so many in cognitive science, even among certain self-identifying linguists. I remember attending a keynote by Tom Wasow at the 2015 DGfS – so, 32 years after Armstrong, Gleitman & Gleitman’s paper was published – in which more or less the entire talk rested on this error of inferring gradience in the grammar from gradience in the behavioral measure of acceptability.]

The second putative argument is related to the first, but it is logically separable. The argument is that language users can assign an interpretation even to purportedly “ungrammatical” expressions. The examples that Chomsky tends to give here, quite tellingly I think, involve s-selectional violations that can be used as idioms, metaphors, and/or conventionalized sayings, like Misery loves company. But nobody in their right mind thinks s-selection is a syntactic phenomenon.[3] There’s every reason to believe that the grammar generates the structure which pairs the string Misery loves company with the meaning whereby there is an individual denoted by the DP misery that stands in the love relation to an individual denoted by the DP company. That this literal meaning is not (typically) the communicative intent we ascribe to a speaker who has uttered this expression is interesting but, in the grand scheme of things, entirely unremarkable. We also don’t usually interpret Can you pass me the salt? as a polar question seeking information about the addressee’s capabilities. Language use can override the literal meaning associated with an expression; this is not news.

[3] Chomsky has a long history of conflating s-selection with syntax. You can see the seeds of this in his work in the ’50s and ’60s, where purely semantic features like [±abstract] (distinguishing abstract vs. concrete nouns) were projected from terminals. This was an understandable move at the outset of modern syntactic theory. But it is extremely strange to still be clinging to it now.

Chomsky might be well served to re-read Syntactic Structures (Chomsky 1957), where (3) was contrasted with (4) (asterisk is mine; see below for discussion):

(3) Colorless green ideas sleep furiously.

(4) * Furiously sleep ideas green colorless.

Given the discussion of s-selection, above, we would want to say that (3) is generated by the grammar; it just happens to have an odd literal meaning, one that (at least at the time that this sentence was first brought under discussion) was not associated with any conventionalized meaning. But would we want to say that about (4)?

I’m not suggesting that it is an impossible cognitive task to assign, if forced, an interpretation to (4). But it seems plausible to me that the latter requires considerable cognitive control. This would mean that it is manifestly not the result of the automatic / barely volitional computations that are the object of study in linguistic theory. One can, by exerting conscious effort, assign interpretations to a whole range of things – programming languages, animal noises, etc. etc. That hardly means that the latter interpretations are “generated by the mental grammar” in any relevant sense.

And so, I conclude that Chomsky has yet to present any valid argument against the bifurcation of structures into grammatical (= well-formed) and ungrammatical (= ill-formed). That does not mean that grammatical structures cannot be experienced as quite weird (e.g. (5)). Nor does it mean that, exerting conscious effort, it is impossible to assign some meaning or other to word salad like (4). Nor does it mean that acceptability, as a behavioral measure, will not be gradient.

(5) The square root of Milly’s desk drinks humanity.       (Chierchia & McConnell-Ginet 2000:46)

Consequently, the theory that takes examples like (6) to be categorically ill-formed (i.e., not generated by the grammar of my idiolect of English) is very much still in business. Yes, it’s possible to assign some meaning to (6) if forced (e.g. the speaker secretly believes that the child is composed of a tiny British committee wrapped in a trenchcoat). But there is no reason to believe the latter process is carried out by the mental grammar (cf. “meaning” in programming languages).

(6) * The child are here.

This leaves open, of course, the possibility that some things traditionally thought of as “ill-formed” turn out to be better characterized as “well-formed but deviant.” In fact, this is precisely what happened in the history of the treatment of examples like (3) and (5): in Aspects (Chomsky 1965), features like [±abstract] (distinguishing abstract nouns from concrete ones) were projected in the syntactic phrase marker, meaning the reduced acceptability of (3) and (5) was a matter handled by the syntax. (See also fn. 3, above.) Later, it was recognized that this treatment just recapitulates something that the semantic component has to do anyway, and so there is no point in duplicating the same mechanism in the syntax. That is progress and it is good; but it bears not one bit on the question of grammaticality as a binary notion.

This post was prompted by a Facebook conversation with Halldór Ármann Sigurðsson. The views expressed here are my own.

Apr 18, 2020

Matushansky (2006) proposes to replace the head-adjunction mechanism of Government & Binding theory (call this “Theory1”) with a version that involves movement to a specifier position followed by “m-merger” of the specifier with the adjacent head (call this “Theory2”). One of the major selling points is supposed to be that Theory2 is more compatible with the tenets of minimalist syntax. I will argue that this is exactly backwards, and that Theory1 was far more minimalism-friendly.

First, let’s discuss that which is purportedly problematic about Theory1, when viewed through a minimalist lens. The central problem concerns the Extension Condition: the constraint that forces all Merge operations to have the current root of the tree as one of their (two) operands. In Theory1, the lower head (call it X) moves into the higher head position (call it Y), replacing the original Y with a composite structure [X Y], whose label is Y. This indeed constitutes an instance of Merge that does not target the root of the tree.

Is this really a problem? It is a problem on a literal reading of the Extension Condition. But we know independently of head movement that the Extension Condition is too strong: the “tucking-in” effects shown by Richards (1997, 1998, 2001) for Bulgarian and Romanian multiple-wh movement show this. Instead of the (demonstrably incorrect) Extension Condition, Richards proposes to capture the relevant cyclicity effects in terms of a constraint he calls Featural Cyclicity – basically: “at any given point in the derivation, you’re only allowed to tend to features of the currently-projecting head.” In cases of phrasal movement where multiple specifiers aren’t involved, this delivers the same predictions as the Extension Condition. And where multiple specifiers are involved, it allows tucking-in (though, interestingly, does not require it, leaving room for variation on this front; this might be desirable, especially when it comes to the distinction between base-generated and movement-derived specifiers).

Now back to head movement: if head movement is a response to the features of the attracting head, then good ol’ GB-style head movement (a.k.a. Theory1) obeys Featural Cyclicity: it is a response to features of the head that is currently projecting. That it doesn’t result in Merge at the root of the tree is immaterial; neither does tucking-in.
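To make the comparison between the two constraints explicit, here is a minimal sketch in Python. The `Step` record and both checker functions are hypothetical encodings of my own: they reduce each derivational step to just the two properties the constraints care about, and they are not meant as a formalization of Richards's actual proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A toy record of one Merge step (hypothetical encoding)."""
    name: str
    targets_root: bool      # is the current root one of Merge's operands?
    triggering_head: str    # head whose features drive the operation
    projecting_head: str    # head that is projecting at this point


def obeys_extension_condition(step: Step) -> bool:
    # Literal Extension Condition: Merge must target the root.
    return step.targets_root


def obeys_featural_cyclicity(step: Step) -> bool:
    # Featural Cyclicity: only features of the projecting head may be tended to.
    return step.triggering_head == step.projecting_head


steps = [
    Step("phrasal movement to the sole specifier of YP", True, "Y", "Y"),
    Step("tucking-in below an existing specifier of YP", False, "Y", "Y"),
    Step("head adjunction of X to Y (Theory1)", False, "Y", "Y"),
    Step("operation driven by an already-embedded head X", False, "X", "Y"),
]

for s in steps:
    print(s.name, obeys_extension_condition(s), obeys_featural_cyclicity(s))
```

Under these assumptions, tucking-in and Theory1-style head adjunction pattern together: both fail the literal Extension Condition and both satisfy Featural Cyclicity. Only the last step, which is driven by a head that is no longer projecting, is excluded by both.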

Let’s turn to Theory2. As a reminder, here X does not head-adjoin to Y; instead, X moves to SpecYP, followed by the instance of X in SpecYP undergoing “m-merger” with the Y head. “M-merger” is supposed to be a morphological operation. But here’s the thing: its output cannot be the business of morphology alone. That’s because the complex head formed when X moves to Y behaves, for the purposes of the remainder of the syntactic derivation, as a constituent. In other words, Theory2 assumes that a non-constituent – consisting of the Y head plus the material in SpecYP but excluding the material in the complement of Y – undergoes a “morphological” operation that renders it a constituent for the purposes of subsequent syntactic computation.
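The non-constituency at issue can be checked mechanically on toy trees. In the sketch below, trees are nested tuples, `ZP` is an unanalyzed stand-in for the complement of Y, and the helper names (`leaves`, `subtrees`, `is_constituent`) are my own. Traces/copies and labels are left out, which is a simplifying assumption.

```python
def leaves(tree):
    """Return the terminals of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return (tree,)
    result = ()
    for child in tree:
        result += leaves(child)
    return result


def subtrees(tree):
    """Yield the tree itself and every subtree it contains."""
    yield tree
    if not isinstance(tree, str):
        for child in tree:
            yield from subtrees(child)


def is_constituent(tree, terminals):
    """True if some subtree dominates exactly these terminals."""
    wanted = sorted(terminals)
    return any(sorted(leaves(t)) == wanted for t in subtrees(tree))


# Theory1: X adjoins to Y, so [X Y] is a node in the tree.
theory1 = (("X", "Y"), "ZP")

# Theory2, before m-merger: X sits in SpecYP, Y is sister to its complement.
theory2 = ("X", ("Y", "ZP"))

print(is_constituent(theory1, ["X", "Y"]))
print(is_constituent(theory2, ["X", "Y"]))
```

In the Theory1 tree there is a node dominating exactly X and Y. In the Theory2 tree there is none: the smallest node containing both also contains the complement. Whatever turns X and Y into a unit in Theory2 has to change constituency, which is the job m-merger is being asked to do.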

How, exactly, is this supposed to work? Isn’t morphology supposed to be post-syntactic?

At this juncture, defenders of Theory2 sometimes appeal to the idea that the inverted-Y/inverted-T/whatever model applies per-cycle (or per-phase, if you prefer; or per-spellout-domain, if you re-prefer). This is certainly true: even on Chomsky’s version of the inverted-Y model, where all covert operations had to follow all overt operations, this ordering only held within a single cycle. For example, QR is clause-bounded (at least when it comes to finite clauses), yet if we wait until all the overt structure-building is done in (1) before doing QR of every building across a guard, this QR operation will be in clear violation of the cycle. (All the terms of the operation are contained in an already-completed cycle.) Yet an inverse-scope reading of the most-embedded clause in (1) is in fact possible (I think):

(1) Morty said Rick doubts that a guard stands in front of every building.

On the overt-before-covert (i.e., Chomskyan) version of the inverted-Y model, this arises because the model resets at every cycle (say, every CP boundary). So it only forces overt operations to precede covert ones within one and the same clause.

But is this “restarting” of the derivational model enough for Theory2 to work? It is not. Restarting the model every cycle still does not (or at least, should not) give subsequent cycles of the syntactic computation access to anything but the syntactic information from previous cycles. Here’s why: suppose that we allowed cycle n of the syntactic component access to all the information (syntactic or otherwise) that resulted from the computation of cycle n−1. One consequence would be that, even on a model where phonological content was late-inserted, cycle n would have access to, e.g., whether the phonological forms of items in cycle n−1 included a bilabial consonant. So the venerable “front every constituent that includes a bilabial” rule, which modularity is supposed to rule out, could easily be constructed – with the caveat that the fronting rule could only apply if the constituent containing a bilabial was base-generated in a previous cycle, not if it was base-generated in the same cycle as the attracting probe.

To state this more generally: allowing the syntax in cycle n to make reference to the outputs of the post-syntactic computation in cycles n−1 (and earlier) undoes the effects of modularity, at least as it concerns the computation of any long-distance dependencies (i.e., those spanning multiple cycles, incl. long-distance A-bar movement).
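The modularity worry can be sketched as a question about what one cycle hands to the next. Everything in the snippet below is a hypothetical toy: `CycleOutput`, the two "view" functions, and the use of spelling as a crude stand-in for phonological form are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

BILABIALS = set("pbm")  # crude orthographic stand-in for bilabial consonants


@dataclass
class CycleOutput:
    """What cycle n-1 has computed (toy encoding)."""
    constituents: List[str]      # syntactic information
    phonology: Dict[str, str]    # post-syntactic information (late-inserted forms)


def modular_view(prev: CycleOutput) -> dict:
    # Cycle n sees only the syntactic information from cycle n-1.
    return {"constituents": prev.constituents}


def leaky_view(prev: CycleOutput) -> dict:
    # Cycle n sees everything, including post-syntactic output.
    return {"constituents": prev.constituents, "phonology": prev.phonology}


def front_bilabial_constituents(view: dict) -> List[str]:
    """The 'front every constituent that includes a bilabial' rule."""
    phonology: Optional[Dict[str, str]] = view.get("phonology")
    if phonology is None:
        return []  # the rule cannot even be stated without phonological access
    return [c for c in view["constituents"] if BILABIALS & set(phonology[c])]


prev = CycleOutput(
    constituents=["DP_guard", "DP_building"],
    phonology={"DP_guard": "a guard", "DP_building": "every building"},
)

print(front_bilabial_constituents(modular_view(prev)))
print(front_bilabial_constituents(leaky_view(prev)))
```

With the modular view the rule has nothing to apply to. With the leaky view it picks out `DP_building`, which is exactly the kind of cross-cycle, phonology-sensitive dependency that modularity is supposed to exclude.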

Theory2 requires just this kind of setup: for this theory to work, “m-merger” (a post-syntactic operation) in cycle n−1 must inform narrow syntax (viz. constituency) in cycle n.

What I have constructed here is a slippery-slope argument against Theory2, and such arguments are generally only as good as the slope is slippery. So, for example, one could stipulate that the amount of post-syntactic computation in a given cycle whose output is accessible to the subsequent cycle is limited: that it includes “m-merger” but not, e.g., Vocabulary Insertion.

That’s a logically-coherent solution; but ask yourself this: what is the content, now, of the claim that “m-merger” is post-syntactic? Surely, it cannot be merely the fact that “m-merger” follows other syntactic operations; A-bar movement generally follows A-movement, yet nobody takes this as an indication that A-bar movement is post-syntactic. No, “m-merger” so construed crucially differs from other post-syntactic operations in that its output is legible to syntax. There is a name for the module in which we place operations whose output is legible to syntax. You guessed it: syntax.

Now let us take stock. Theory1 is incompatible with the Extension Condition, but the latter is independently problematic. The best proposal out there for replacing the Extension Condition, namely Featural Cyclicity, rules in a Theory1-treatment of head movement.

Theory2, on the other hand, requires an operation that smushes a non-constituent into a constituent. We may try to make ourselves feel better by purportedly placing this operation in the “post-syntax”, but this is in fact a distinction without a difference. The operation in question (“m-merger”) informs the syntactic computation, and is therefore a part of syntax. Now ask yourself: what is less minimalist, a syntactic operation that smushes together non-constituents into constituents? Or Featural Cyclicity?