
Jul 25, 2020
 

This is not so much a blog post as it is a collection of things that I think deserve your attention. As you will see, it is quite a self-serving list, in that several of these works provide evidence in favor of claims that I have also been arguing for. But hey, it’s my blog, right? 😊

  1. Pavel Rudnev has a paper set to appear in Glossa arguing against approaches to anaphoric binding in terms of phi-Agree, and in favor of an encapsulation-based account of the Anaphor Agreement Effect, of the kind I have argued for as well. (More converging evidence, with a twist, comes from the work of Rafael Abramovitz on the AAE in Koryak.)
  2. Recent work by Susi Wurmbrand & Magdalena Lohninger on clausal complementation, showing (among other things) that the semantics of clausal complements cannot be read directly off the syntax. Instead, the syntax of a given language will determine which complementation options a given verb in that language will have (subject to an implicational hierarchy that Wurmbrand & Lohninger uncover, but, importantly, underdetermined by the semantics). The semantics then has to map the possible readings of a given complement onto what these syntactically-prescribed structural possibilities happen to be. As readers of this blog know, this is entirely in line with what we find in other empirical domains. My slogan for this has been: “Meaning contrasts are not generated by syntax, they are parasitic on the contrasts syntax happens to make available.” (Not so pithy, I know. But still, this flies in the face of standard wisdom in the Montagovian tradition, so I think it’s worth hammering this point home.)
  3. Pavel Rudnev again! This time, in a paper that’s already available for “early view” in Linguistic Inquiry. The paper provides an argument based on agreement in Avar in favor of restricting phi-agreement to Downward Agree (a.k.a. Upward Valuation; Diercks, van Koppen & Putnam 2019, as well as various papers of mine, some of them co-authored with Maria Polinsky).

V-NYI 2020

Jul 15, 2020
 

The 2020 edition of the NY ‑ St. Petersburg Institute of Linguistics, Cognition, and Culture (NYI) will take place virtually, during the last two weeks of July!

Together with Asia Pietraszko (University of Rochester), I’ll be teaching a course called Words and other things: what do you need to list in your head?

For more information, including the course description, please see my Teaching & Advising page.

· · · · · · · · · · · · · · · · · · · · 

The course is now complete. Asia and I are happy to share the course materials with any interested individuals. If you are interested, drop me a line.

Suyoung Bae defends!

Jul 3, 2020
 

I am proud to announce that Suyoung Bae, my second-ever PhD advisee (co-advised with Howard Lasnik), has defended her thesis!

The thesis is an investigation of Korean amwu-. Suyoung shows that this negation-dependent expression is neither a Negative-Polarity Item (NPI) nor a Negative-Concord Item (NCI), but instead a third type of negation-dependent item, whose distribution is governed by purely syntactic factors: constituency, the restrictions on A-movement, and the restrictions on long head movement. Along the way, she makes novel observations about “radical reconstruction” in Korean (spoiler: it’s not always radical!) and, consequently, about long-distance scrambling in Korean (it can’t be “PF movement”); about how Cyclic Linearization constrains subextraction from complex noun phrases; and more.

I hope Suyoung makes the thesis available on lingbuzz once she has filed it – in the meantime, if you are interested, please get in touch with her!

Congrats, Suyoung!

Published in Glossa: “Functional structure in the noun phrase: revisiting Hebrew nominals”

Jul 2, 2020
 

My paper, “Functional structure in the noun phrase: revisiting Hebrew nominals,” has been published in Glossa. This paper revisits Ritter’s (1991) findings about nominals in Hebrew in light of recent proposals that nominal phrases lack outer functional structure and are instead projections of the noun (Bruening 2010, 2020 and Bruening, Dinh & Kim 2018). Ritter’s findings, though originally put forth to argue for an additional functional projection (NumP) between DP and NP, can be seen to provide even stronger evidence that the nominal phrase is not a projection of the noun itself. You can read more about this on my research page.

The published version is freely available for download here. (Yay for Open Access!)

If you prefer a pre-print, that is still available here.

bibtex
@article{Preminger:2020,
	Author = {Preminger, Omer},
	Doi = {10.5334/gjgl.1244},
	Journal = {Glossa},
	Number = {1},
	Pages = {68},
	Title = {Functional structure in the noun phrase: revisiting {H}ebrew nominals},
	Volume = {5},
	Year = {2020}}
Jun 18, 2020
 

For a while now, I have been pondering the prospects of a realizational/interpretive theory in which spellout to PF and spellout to LF involve separate collections of rules, and where individual spellout rules crucially map from sets of syntactic terminals to exponents or to meaning primitives. (For a given spellout rule to be applicable, the set of nodes in its input specification must appear contiguously in the input structure. In this post, I’m not going to concentrate on how contiguity is defined at each of the interfaces. Email me if you would like some details on my in-progress thinking on these matters.) The sets may happen to be singleton sets, but that is not an architecturally-privileged state of affairs. This way of thinking of spellout makes a number of desirable predictions; for example:

  • There should be instances where the applicable PF-spellout rule and the applicable LF-spellout rule stand in a relation of partial overlap.
  • There should be terminals for which there happens to be no singleton PF-spellout rule, or for which there happens to be no singleton LF-spellout rule; such terminals will have no elsewhere form, or no context-free interpretation, respectively.

As I’ve noted elsewhere, these things are well-attested.
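
To make this concrete, here is a toy sketch (in Python; the rule inventories, the greedy matching procedure, and the flattening of structure into a tuple are all my own simplifications for illustration – in particular, contiguity is trivialized to adjacency, sidestepping the interface-specific definitions alluded to above):

python
# Toy span-based spellout: separate PF and LF rule inventories, each rule
# mapping a *set* of terminals to an exponent or to a meaning primitive.

PF_RULES = {
    frozenset({"GOOSE", "n", "Num[pl]"}): "geese",  # multi-terminal span rule
    frozenset({"GOOSE", "n"}): "goose",
    frozenset({"Num[pl]"}): "-z",                   # elsewhere plural
}

LF_RULES = {
    frozenset({"GOOSE", "n"}): "goose'",            # meaning primitive
    frozenset({"Num[pl]"}): "plural'",
}

def spellout(terminals, rules):
    """Spell out a tuple of terminals, letting larger contiguous spans
    preempt (block) the smaller spans contained within them."""
    output, i = [], 0
    while i < len(terminals):
        for j in range(len(terminals), i, -1):      # try longest span first
            span = frozenset(terminals[i:j])
            if span in rules:
                output.append(rules[span])
                i = j
                break
        else:
            # a terminal with no applicable singleton rule: no elsewhere
            # form (at PF) / no context-free interpretation (at LF)
            raise ValueError(f"no applicable rule for {terminals[i]!r}")
    return output

print(spellout(("GOOSE", "n", "Num[pl]"), PF_RULES))  # ['geese'] -- *gooses blocked
print(spellout(("GOOSE", "n", "Num[pl]"), LF_RULES))  # ["goose'", "plural'"]

Note that the applicable PF rule and the applicable LF rules in the last two lines stand in exactly the kind of partial-overlap relation mentioned in the first bullet point above.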

The picture that emerges is in some ways similar to Nanosyntax, where the candidates for spellout are also sets of nodes (that are contiguous in a specific technical sense). It departs from Nanosyntax in entirely divorcing spellout to PF from spellout to LF (see above), as well as in defining contiguity differently for PF-spellout rules and LF-spellout rules.

Such a system, in which the input to PF- and LF-spellout rules is a set of terminals rather than an individual terminal, seems to fly in the face of the claim made in Embick & Marantz’s 2008 paper, Architecture and Blocking (henceforth EM08). This paper argues that architecturally speaking, “words” do not enter into competition and blocking with phrases, nor do they even enter into competition and blocking with other “words”. The only blocking that is part of the grammatical architecture occurs at the morpheme level, where one exponent competes with another for insertion (see also Embick 2017). Crucially, this requires a framework in which lexical insertion specifically targets individual syntactic terminals. [fn 1: Confusingly enough, the DM literature refers to syntactic terminals as ‘morphemes’. I have translated this back to more sane terminology here.] It is this claim, that vocabulary insertion (what I termed PF-spellout, above) is restricted to individual syntactic terminals, that I am interested in here – since it appears at first glance to stand in direct conflict with the system I am envisioning. I am going to argue instead that, depending on how one chooses to understand EM08, this claim about vocabulary insertion is either vacuous or wrong. That is not to say that EM08’s other claims, e.g. about the empirical picture concerning “word” and phrase blocking effects, are incorrect. I am only taking aim at the particular claim about the locus/granularity of vocabulary insertion.

Let’s consider the alternations in (1) as our test case:

(1)
a. dog (dɔg) – dogs (dɔgz)
b. goose (gu:s) – geese (gi:s)

There are a few clarifications to make about (1) before we get started. First, the plural morpheme is not some Agr node “added at PF” (though you will find claims like this in the DM literature). Number in the noun phrase comes from a dedicated functional projection, NumP. The degrees of freedom are whether the overt phonological material associated with the plural is the spellout of the head Num⁰, or the spellout of a feature (call it [plural]) whose occurrence on the noun (or more accurately, on n⁰) is conditioned by the presence of the right variant of Num⁰, perhaps via syntactic Agree. But neither of these things (the plural variant of Num⁰ or the feature [plural] on n⁰) is “added at PF.” [fn 2: One might even say that the “added at PF” gambit suggests that when syntacticians talk, DMers nap. (This is a callback to Marantz 1997, and a loving one at that!)] Second, I am going to treat geese as suppletive, relative to goose, even though some treatments of alternations like these would characterize this as a “readjustment rule.” What is clear here is that there is no reasonable phonological rule of English (please note: “rule” implies productive knowledge) that would trigger this alternation. If your phonological theory is sufficiently sophisticated to treat geese as the result of affixing some autosegmental/suprasegmental material to goose, more power to you, but then you’ll have to change this example, in your mind’s eye, to one in which such a move is not possible (person-people or whatever you prefer).

Okay, with that out of the way, let’s talk about (1) from the perspective of EM08. If the English plural /‑z/ is the (elsewhere) spellout of plural Num⁰, then there is actually no way for geese to block gooses at the level of the individual syntactic terminal. An EM08-adherent would therefore be forced into one of two positions. The first is what I’ll call featuralization, and the second is mutually-conditioned allomorphy.

Featuralization

Suppose that instead of assuming that /‑z/ is the (elsewhere) spellout of Num⁰, the spellout of English plural Num⁰ were always null, and /‑z/ were the spellout of the feature [plural] on n⁰, a feature whose occurrence is conditioned by the presence of plural Num⁰ (e.g. via syntactic Agree). Now geese could block goose, but we would have to assume that, in those cases where [plural] surfaces as its own exponent (e.g. /‑z/), Fission has applied, to enable [plural] to be targeted for vocabulary insertion separately from the rest of the content of the noun.

Crucially, we can generalize what we just did with Num⁰ and [plural] into a recipe for turning any blocking effect that seems to involve multiple successive heads into a version that is EM08-compliant. Suppose we are looking at the following state of affairs:

(2)
a. Let ⟨X₁⁰, …, Xₙ⁰⟩ be a series of successive heads, where for every 1 ≤ i ≤ n−1: XᵢP is the complement of Xᵢ₊₁⁰.
b. Let none of ⟨X₁⁰, …, Xₙ₋₁⁰⟩ have overt specifiers.
(I will refer to this as a PF-contiguous span of heads.)

And suppose that we see what seems to be a blocking effect involving ⟨X₁⁰, …, Xₙ⁰⟩; for example: the spellout of {√GOOSE, n⁰, Num⁰[pl]} as geese competing with, and blocking, its spellout as gooses. We can bring this into compliance with EM08 as follows:

(3)
a. Assume the spellout of ⟨X₂⁰, …, Xₙ⁰⟩ is invariably null.
b. Define a set of features ⟨F₂, …, Fₙ⟩, where:
    i. For every 2 ≤ i ≤ n: Xᵢ⁰ is base-generated with a valued [Fᵢ] feature.
    ii. X₁⁰ is base-generated with unvalued [Fᵢ] features for all values of i (2 ≤ i ≤ n).
    iii. For every 2 ≤ i ≤ n: X₁⁰ enters into Agree in Fᵢ with Xᵢ⁰ (thus acquiring valued [Fᵢ]).

We can now recast any blocking involving ⟨X₁⁰, …, Xₙ⁰⟩ as blocking that is occurring exclusively at X₁⁰. Anything that previously looked like vocabulary insertion at Xᵢ⁰ (2 ≤ i ≤ n) can now be handled as Fission of Fᵢ from X₁⁰. We thus have a recipe for bringing any multi-terminal blocking that meets the criteria in (2) into compliance with EM08, indicating that the restriction of formal competition (and thus, blocking) to individual syntactic terminals is not doing the work that EM08 suggested it was doing.
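
For concreteness, here is the recipe in (3) rendered as toy code (Python; agree_up, insert_at_x1, and the dictionary-based representation of features are hypothetical labels of mine, not anyone’s official mechanism):

python
# Featuralization: copy every higher head's valued feature onto X1 via
# "Agree", spell the higher heads out as null, and let all competition
# (plus Fission) happen at the single terminal X1.

def agree_up(span):
    """(3b.iii): X1 acquires a valued [Fi] from each higher head Xi."""
    x1 = dict(span[0])
    for head in span[1:]:
        x1.update(head)        # (3a): the higher heads themselves spell out null
    return x1

VOCAB_AT_X1 = [                # most specific entries first
    ("geese", {"root": "GOOSE", "pl": True}),   # the erstwhile span exponent
    ("goose", {"root": "GOOSE"}),
    ("duck",  {"root": "DUCK"}),
    ("-z",    {"pl": True}),   # realized via Fission of [pl] off of X1
]

def insert_at_x1(x1):
    """Terminal-level competition: an entry wins the features it realizes;
    leftover features fission off and are realized separately."""
    out, feats = [], dict(x1)
    for exponent, realizes in VOCAB_AT_X1:
        if realizes.items() <= feats.items():
            out.append(exponent)
            for f in realizes:
                feats.pop(f)
    return out

print(insert_at_x1(agree_up([{"root": "GOOSE"}, {}, {"pl": True}])))  # ['geese']
print(insert_at_x1(agree_up([{"root": "DUCK"},  {}, {"pl": True}])))  # ['duck', '-z']

The point of the toy is just that nothing here ever inserts an exponent at more than one terminal, and yet geese still blocks gooses.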

An EM08-adherent might find solace in the fact that, since (3b.iii) involves Agree, there is a built-in restriction (e.g. phases) on how far away from X₁⁰ a head can be while still contributing an Fᵢ feature that will participate in blocking at X₁⁰. That is all well and good, but it is still equivalent to just saying that there is blocking among sets of terminals, so long as the members of those sets all occur inside a single phase (which seems trivial on any approach where spellout is cyclic).

I am about to turn the discussion to mutually-conditioned allomorphy, but before I do so, I think it’s worth pointing out that it’s not at all clear how one could possibly rule out what is described in (2‑3) (at least as long as one assumes there is such a thing as Fission). This is important because it means that even if one prefers the mutually-conditioned allomorphy treatment of goose-geese (or person-people, etc.), the loophole described here still exists. Thus, whether we like it or not, the restriction of competition to insertion of exponents at individual syntactic terminals is unable to do any work that is not already done by restricting competition to PF-contiguous spans that are contained in a single phase.

Mutually-conditioned allomorphy

As an alternative to featuralization, suppose we instead attempt to rescue the EM08-compliant treatment of (1) in a different way:

(4)
a. √GOOSE → geese / __ [plural]
b. Num[pl] → ∅ / √GOOSE __

On this view, geese (or people, etc.) arises as something of a conspiracy, wherein the elsewhere form of the plural Num0 (/‑z/) is overridden in the presence of the root GOOSE, while the elsewhere form of this root (goose) is overridden in the presence of the plural Num0.

At this juncture, it is useful to take note of a particular property of the goose-geese example, which I have been neglecting so far, and which demonstrates that (4) is in any case too simplistic of a treatment. Consider (5):

(5) The corrupt accountant gooses(/*goose/*geese/*geeses) the earnings reports every quarter.

I point out (5) because it shows that the occurrence of the form goose isn’t dependent on nominal number even being present in the structure. That is, goose is not the counterpart of geese in the presence of Num[sg]; it really is the elsewhere form, and the presence of Num[pl] triggers a contextual allomorph of that form. For concreteness, let us assume that goose is the spellout of {√GOOSE, n}, and that the verbal use in (5) involves the (common in English) null denominal verbalizer v. In other words, the verb stem in (5) is the spellout of {√GOOSE, n, vdenom}, or more accurately, the spellout of {√GOOSE, n} (which is goose) plus the spellout of {vdenom} (which is null).

(6)
a. {√GOOSE, n, Num[pl]} → geese
b. {√GOOSE, n} → goose
c. {vdenom} → ∅

This is not the only way to capture what is going on with goose-geese (incl. the verbal paradigm), but this way of characterizing the data explicitly sets up a scenario where the spellout of one span (6a) competes with, and preempts, the spellout of a smaller span contained therein (6b) – precisely the sort of thing that EM08 wants to architecturally rule out. So if we can show a recipe that translates (6) into an EM08-compatible characterization involving mutually-conditioned allomorphy, but where vocabulary insertion is restricted to terminals, we will again have shown that the architectural restriction in EM08 is not doing the work it is purported to do.

As before, I will start by translating this particular example into an EM08-compatible implementation, and then generalize the mechanism of inter-translation. Let us begin with the following ‘elsewhere’ rules for the exponents of √GOOSE, n, and Num[pl]:

(7)
a. √GOOSE → goose
b. n → ∅
c. Num[pl] → /‑z/

One thing we could do at this juncture is observe that (7b) is a null exponent, and that (7a) and (7c) would therefore be adjacent as far as the overt structure is concerned. This is the approach taken by Embick (2010) (though it’s worth noting that it is explicitly rejected by Bobaljik 2012, for example, in his treatment of comparative & superlative morphology). But since this would reduce the span in question to a rather trivial one – involving only 2 nodes – let us make things harder on ourselves, and assume that we cannot ignore (7b): it still intercedes between √GOOSE and Num[pl], disrupting the kind of adjacency required for contextual allomorphy in an EM08-style system. (Everything I’m about to say will of course also work if n⁰ is “pruned” à la Embick 2010, but as I said, I’m intentionally choosing the route that will make an EM08-style treatment harder to construct.)

Nevertheless, we can take a page from the featuralization approach, above, and assume that n⁰ acquires the [plural] feature from Num⁰ derivationally (e.g. via Agree). In a move that may seem more controversial – but, I will argue shortly, is not – I will assume that n⁰ can also enter into a syntactic relation with √GOOSE resulting in the identity of the root being reflected in the syntactic content of n⁰ itself.

The reason this last move may seem controversial is that a tenet of conventional DM dogma holds that roots are not individuated in the syntax. In essence, the thinking goes, there is only one root object in the narrow-syntactic lexicon (the list of available syntactic atoms). Because roots are assumed to be featureless, syntax wouldn’t have any way of telling multiple root objects apart anyway. If this were true, the last step, above – where n⁰ derivationally acquires featural content reflecting the identity of its root complement – would be impossible.

However, there are both empirical and conceptual reasons to reject the conventional DM premise regarding roots being “featureless” and, therefore, unindividuated in the syntax. Empirically, Harley (2014) has shown that roots cannot be individuated semantically or phonologically, leaving syntactic individuation as the only option still standing. (Importantly, this conclusion holds both on the strong version of her 2014 claim, whereby all arguments are selected by roots, and also on the weaker version of the claim, which Harley settles on in the reply to the commentaries on her target article, whereby some arguments are selected by roots and some are selected higher up, by syntactic categorizers. The latter view, as far as I can tell, is also compatible with Merchant’s 2019 observation that categorizers often do affect the selectional properties of the roots they attach to.) But I think the conceptual argument, in this case, is even stronger: as I have discussed elsewhere, any version of DM in which the identity of roots is negotiated post-syntactically is an equivocation of modularity, anyway. Ignore what DM declares itself to be doing: any line of communication between PF and LF is syntax, and so there is no version of DM in which roots are not individuated in the syntax. And what does it mean to be individuated in the syntax? It means individual roots (like √GOOSE) have properties legible to the syntax. Let me repeat that: √GOOSE has properties legible to the syntax that distinguish it from √DUCK. I am not proposing this, so much as I am pointing out that it follows from any reasonable definition of how the grammar is modularized.

Given this, there is also no obstacle to assuming that n⁰ acquires, in the course of the derivation, syntactic properties reflecting that its root complement was √GOOSE (and not √DUCK, or √ESSAY, or …). This is possible because the difference between √GOOSE and other roots is, by definition, legible to the syntax.

After these feature transmissions occur in syntax, the structural representation handed over to PF will be as follows:

(8)
Num⁰[pl] » n⁰[pl, GOOSE] » √GOOSE       (where ‘»’ indicates immediate c-command)

We can now recast (6) as in (9):

(9)
a. √GOOSE → geese / __ n[pl]
b. Num[pl] → ∅ / n[GOOSE] __
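
Here, too, a toy rendering may help (Python; the list representation of (8) and the adjacency-based implementation of the conditioning environments are my own simplifications):

python
# (9) as garden-variety, feature-based contextual allomorphy at individual
# terminals, applied to the post-Agree structure in (8).
# Linear order assumed: root - n - Num.

structure = [
    ("root", {"GOOSE"}),
    ("n",    {"pl", "GOOSE"}),   # features acquired derivationally, via Agree
    ("Num",  {"pl"}),
]

def spell_terminal(i, structure):
    label, feats = structure[i]
    neighbors = [structure[j][1] for j in (i - 1, i + 1)
                 if 0 <= j < len(structure)]
    if label == "root" and "GOOSE" in feats:
        if any("pl" in f for f in neighbors):     # (9a): geese / __ n[pl]
            return "geese"
        return "goose"                            # elsewhere form
    if label == "Num" and "pl" in feats:
        if any("GOOSE" in f for f in neighbors):  # (9b): null / n[GOOSE] __
            return ""
        return "-z"                               # elsewhere plural
    return ""                                     # n itself is null throughout

print("".join(spell_terminal(i, structure) for i in range(len(structure))))
# -> 'geese': every insertion targets a single terminal, yet the two rules
#    jointly reproduce the span-like blocking of *gooses.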

At this juncture, one might object on the grounds that there are reams of work in morphology indicating that allomorphy is a highly local business, and the kind of non-local interactions just sketched are unavailable, empirically speaking. (Another way of putting this: Embick and others had good empirical reason for proposing their stringent conditions on allomorphy.) My response to that is that those empirical generalizations apparently still await explanation, because the mechanisms put forth to account for them (i.e., to rule out non-local interactions of the kind seen here) are technically unable to do so. In other words: I don’t deny the empirical basis DMers had for proposing these restrictions; I deny that the restrictions proposed get the job done.

It is now time to generalize this treatment, i.e., to show that any account like the one in (6), above – where an exponent associated with one PF-contiguous span competes with and blocks the insertion of an exponent associated with a smaller PF-contiguous span – can be restated in terms of mutually-conditioned allomorphy, with lexical insertion restricted to individual terminals. To the extent that we are able to provide a general recipe of this sort, we will have shown once again that the restriction of insertion to individual terminals does no empirical work.

Let us start again with the state of affairs in (10) (repeated from (2)):

(10)
a. Let ⟨X₁⁰, …, Xₙ⁰⟩ be a series of successive heads, where for every 1 ≤ i ≤ n−1: XᵢP is the complement of Xᵢ₊₁⁰.
b. Let none of ⟨X₁⁰, …, Xₙ₋₁⁰⟩ have overt specifiers.
(A PF-contiguous span of heads.)

We can then recast any interaction among multiple heads inside this span in terms of mutually-conditioned allomorphy of individual terminals, as follows:

(11)
Define a set of features ⟨F₁, …, Fₙ⟩, where:
a. For every 1 ≤ i ≤ n: Xᵢ⁰ is base-generated with a valued [Fᵢ] feature.
b. For every 1 ≤ i ≤ n and every 1 ≤ j ≤ n such that j ≠ i: Xᵢ⁰ is base-generated with an unvalued [Fⱼ] feature.
c. For every 1 ≤ i ≤ n and every 1 ≤ j ≤ n such that j ≠ i: Xᵢ⁰ enters into Agree in Fⱼ with Xⱼ⁰ (thus acquiring valued [Fⱼ]).

We can now implement mutually-conditioned allomorphy of any head in the span in (10) based on the features of any other head in the same span, up to restrictions on the locality of feature transmission in (11c) (e.g. up to the phase boundaries restricting Agree). As was the case with featuralization, above, it seems natural enough that competition for span-based PF insertion would have to occur within the bounds of a single phase, anyway, so there seems to be no meaningful distinction here, either.

While (11a–b) require n features on each head in the span, and (11c) requires a number of Agree relations that is on the order of n², in practice many of these will do no work in the translation of span-based competition to competition based on individual terminals. For example, in translating the example in (6) along the lines in (11), any n⁰-based features copied to Num⁰ will play no actual role in conditioning any allomorphy, and so in practice they need not exist, and any Agree relations they are involved in need not occur. This will be the case for many of the feature relations generated in principle by (11c). None of this is relevant, however, to our main point, which is that nothing beyond Agree is necessary to recast span-based competition in terms of competition at individual terminals.
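
To put some numbers on this (a toy count, assuming the literal recipe in (11); the pruning parameter is my own addition, mimicking the “need not exist / need not occur” point just made):

python
# Bookkeeping for (11) over a span of n heads: each head carries n-1
# unvalued features (11b), yielding n*(n-1) Agree relations (11c) -- order n^2.

def recipe_11(n, needed=None):
    """Return the (i, j) pairs such that Xi acquires [Fj] from Xj under
    (11c); `needed` optionally prunes to the pairs that actually condition
    allomorphy."""
    relations = [(i, j) for i in range(1, n + 1)
                        for j in range(1, n + 1) if i != j]
    return relations if needed is None else [r for r in relations if r in needed]

print(len(recipe_11(3)))                      # 6 relations for n = 3
print(recipe_11(3, needed=[(2, 1), (2, 3)]))  # goose-geese: only n's two
                                              # acquisitions do any real work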

It is important to note that there is no sui generis mechanism of mutually-conditioned allomorphy at play here, only Agree and garden variety feature-based contextual allomorphy. Thus, unlike the conclusions in the Featuralization section, this result obtains independently of one’s position on the existence of particular operations like Fission.


Epilogue

We have seen that imposing a restriction on competition and blocking, so that they only take place among different exponents vying for insertion at a single syntactic terminal, does not achieve anything that is not already achieved by restricting competition to PF-contiguous spans within a single phase.

I presented two different recipes for recasting competition and blocking among PF-contiguous spans in terms of competition and blocking at individual terminals only. Eliminating the operation of Fission from the grammar would rule out one of the two recipes, namely, the featuralization one; but it is much less clear how one would rule out the other recipe – which, as noted, does not appeal to any sui generis mechanisms beyond Agree and feature-based contextual allomorphy. One could imagine adding some sort of meta-principle that rules out what we have descriptively characterized as mutually-conditioned allomorphy. But this, as far as I am able to tell, would render the system incapable of capturing alternations like goose-geese (or person-people, or …). It is for this reason that I stated at the beginning of this post that the restriction in question on competition and blocking is either vacuous (on the assumption that Fission and/or mutually-conditioned allomorphy exist) or wrong (if they don’t).

As a side note to all this, banning both Fission and mutually-conditioned allomorphy may not even be sufficient to tear down the equivalence between insertion at terminals only and insertion into spans. As Pavel Caha points out in his thesis (pp. 57‑60), and again here (pp. 7‑9), any system with Fusion and insertion into terminals is also equivalent to a system with insertion into spans.

I see all this as very good news, since I think the view whereby the locus of insertion is a span of contiguous heads has a lot going for it (more on that some other time), and so I’m happy to discover that adopting such a view does not cede any meaningful ground to the EM08 alternative.


Thanks to Pavel Caha, Neil Myler, and Asia Pietraszko for helpful discussion. They are not responsible for the contents of this post.

Apr 26, 2020
 

For the last ten years or so, Chomsky has been claiming increasingly often that the discrete bifurcation of expressions into “grammatical” and “ungrammatical” is incorrect. I think he is wrong, or at least that these claims are without any current basis. But before explaining why, it’s important to set some parameters of the discussion.

First, we have to fix what we mean by expressions. If we mean strings, or even a given token of phonation, then I have no quarrel with this. Taking strings as the object of study in linguistics is, to quote Indiana Jones, “digging in the wrong place.” A grammatical constraint, like the Complex NP Constraint (CNPC) for example, can reduce the number of meanings associated with a given string from one to zero, as in (1):

(1) * Which dish do you know the guy who brought?

But the same constraint can reduce the number of meanings associated with a different string from two to one, as in (2) (which can be interpreted as a question about reasons for knowing, but not as a question about reasons for bringing):

(2) Why do you know the guy who brought this dish?

A preoccupation with strings as the object of study necessarily misses the point of grammatical constraints, since it artificially affords (1) (which has zero remaining meanings) different status than (2) (which has a non-zero number of remaining meanings).

So, if Chomsky’s point is, “You can’t bifurcate the set of strings into grammatical and ungrammatical,” then there is no disagreement here. But that doesn’t have anything to do with the notion of grammaticality as such. It has to do with misapprehending the object of study. Language does not generate “strings”; it generates structures. Or, if you prefer (though I do not): form-meaning pairings.

But this is not Chomsky’s point, I don’t think. He seems to be saying that structures (or form-meaning pairings) cannot be bifurcated into grammatical and ungrammatical, either. As best I can tell, he has offered two arguments for this view over the years – and both of them are unsound.

The first putative argument is that utterances simply cannot be bifurcated into acceptable and unacceptable; there is, instead, a range of degrees of acceptability. Therefore, the argument goes, a theory of grammar that delivers a binary verdict (either “grammatical” or “ungrammatical”) is inadequate. The flaw here is in reasoning from the gradience of a behavioral measure – in this case, acceptability – to the gradience of a computational predicate (grammaticality). As Armstrong, Gleitman & Gleitman (1983) showed, [fn 1: Armstrong, Sharon Lee, Lila R. Gleitman & Henry Gleitman. 1983. What some concepts might not be. Cognition 13:263–308, DOI: 10.1016/0010-0277(83)90012-4.] you can get gradient responses from people to prompts like “How even/odd is this number?” That doesn’t mean that even or odd are gradient predicates. The gradience of acceptability (as a behavioral measure) doesn’t mean that grammaticality (as a computational predicate) is gradient, either. [fn 2: I must say, I find it quite baffling that this basic point is lost on so many in cognitive science, even among certain self-identifying linguists. I remember attending a keynote by Tom Wasow at the 2015 DGfS – so, 32 years after Armstrong, Gleitman & Gleitman’s paper was published – in which more or less the entire talk rested on this error of inferring gradience in the grammar from gradience in the behavioral measure of acceptability.]

The second putative argument is related to the first, but it is logically separable. The argument is that language users can assign an interpretation even to purportedly “ungrammatical” expressions. The examples that Chomsky gives here, quite tellingly I think, tend to involve s-selectional violations that can be used as idioms, metaphors, and/or conventionalized sayings, like Misery loves company. But nobody in their right mind thinks s-selection is a syntactic phenomenon. [fn 3: Chomsky has a long history of conflating s-selection with syntax. You can see the seeds of this in his work in the ’50s and ’60s, where purely semantic features like [±abstract] (distinguishing abstract vs. concrete nouns) were projected from terminals. This was an understandable move at the outset of modern syntactic theory. But it is extremely strange to still be clinging to it now.] There’s every reason to believe that the grammar generates the structure which pairs the string Misery loves company with the meaning whereby there is an individual denoted by the DP misery that stands in the love relation to an individual denoted by the DP company. That this literal meaning is not (typically) the communicative intent we ascribe to a speaker who has uttered this expression is interesting but, in the grand scheme of things, entirely unremarkable. We also don’t usually interpret Can you pass me the salt? as a polar question seeking information about the addressee’s capabilities. Language use can override the literal meaning associated with an expression; this is not news.

Chomsky might be well served to re-read Syntactic Structures (Chomsky 1957), where (3) was contrasted with (4) (asterisk is mine; see below for discussion):

(3) Colorless green ideas sleep furiously.

(4) * Furiously sleep ideas green colorless.

Given the discussion of s-selection, above, we would want to say that (3) is generated by the grammar; it just happens to have an odd literal meaning, one that (at least at the time this sentence was first brought under discussion) was not associated with any conventionalized meaning. But would we want to say that about (4)?

I’m not suggesting that it is an impossible cognitive task to assign, if forced, an interpretation to (4). But it seems plausible to me that the latter requires considerable cognitive control. This would mean that it is manifestly not the result of the automatic / barely volitional computations that are the object of study in linguistic theory. One can, by exerting conscious effort, assign interpretations to a whole range of things – programming languages, animal noises, etc. etc. That hardly means that the latter interpretations are “generated by the mental grammar” in any relevant sense.

And so, I conclude that Chomsky has yet to present any valid argument against the bifurcation of structures into grammatical(=well-formed) and ungrammatical(=ill-formed). That does not mean that grammatical structures cannot be experienced as quite weird (e.g. (5)). Nor does it mean that, exerting conscious effort, it is impossible to assign some meaning or other to word salad like (4). Nor does it mean that acceptability, as a behavioral measure, will not be gradient.

(5) The square root of Milly’s desk drinks humanity.       (Chierchia & McConnell-Ginet 2000:46)

Consequently, the theory that takes examples like (6) to be categorically ill-formed (i.e., not generated by the grammar of my idiolect of English) is very much still in business. Yes, it’s possible to assign some meaning to (6) if forced (e.g. the speaker secretly believes that the child is composed of a tiny British committee wrapped in a trenchcoat). But there is no reason to believe the latter process is carried out by the mental grammar (cf. “meaning” in programming languages).

(6) * The child are here.

This leaves open, of course, the possibility that some things traditionally thought of as “ill-formed” turn out to be better characterized as “well-formed but deviant.” In fact, this is precisely what happened in the history of the treatment of examples like (3) and (5): in Aspects (Chomsky 1965), features like [±abstract] (distinguishing abstract nouns from concrete ones) were projected in the syntactic phrase marker, meaning the reduced acceptability of (3) and (5) was a matter handled by the syntax. (See also fn. 3, above.) Later, it was recognized that this treatment just recapitulates something that the semantic component has to do anyway, and so there is no point in duplicating the same mechanism in the syntax. That is progress and it is good; but it bears not one bit on the question of grammaticality as a binary notion.


This post was prompted by a facebook conversation with Halldór Ármann Sigurðsson. The views expressed here are my own.

Apr 18, 2020
 

Matushansky (2006) proposes to replace the head-adjunction mechanism of Government & Binding theory (call it “Theory1”) with a version that involves movement to a specifier position followed by “m-merger” of the specifier with the adjacent head (call this “Theory2”). One of the major selling points is supposed to be that Theory2 is more compatible with the tenets of minimalist syntax. I will argue that this is exactly backwards, and that Theory1 was far more minimalism-friendly.

First, let’s discuss that which is purportedly problematic about Theory1, when viewed through a minimalist lens. The central problem concerns the Extension Condition: the constraint that forces all Merge operations to have the current root of the tree as one of their (two) operands. In Theory1, the lower head (call it X) moves into the higher head position (call it Y), replacing the original Y with a composite structure [X Y], whose label is Y. This indeed constitutes an instance of Merge that does not target the root of the tree.

Is this really a problem? It is a problem on a literal reading of the Extension Condition. But we know, independently of head movement, that the Extension Condition is too strong: the “tucking-in” effects shown by Richards (1997, 1998, 2001) for Bulgarian and Romanian multiple-wh movement show this. Instead of the (demonstrably incorrect) Extension Condition, Richards proposes to capture the relevant cyclicity effects in terms of a constraint he calls Featural Cyclicity – basically: “at any given point in the derivation, you’re only allowed to tend to features of the currently-projecting head.” In cases of phrasal movement where multiple specifiers aren’t involved, this delivers the same predictions as the Extension Condition. And where multiple specifiers are involved, it allows tucking-in (though, interestingly, it does not require it, leaving room for variation on this front; this might be desirable, especially when it comes to the distinction between base-generated and movement-derived specifiers).

Now back to head movement: if head movement is a response to the features of the attracting head, then good ol’ GB-style head movement (a.k.a. Theory1) obeys Featural Cyclicity: it is a response to features of the head that is currently projecting. That it doesn’t result in Merge at the root of the tree is immaterial; neither does tucking-in.

Let’s turn to Theory2. As a reminder, here X does not head-adjoin to Y; instead, X moves to SpecYP, followed by the instance of X in SpecYP undergoing “m-merger” with the Y head. “M-merger” is supposed to be a morphological operation. But here’s the thing: its output cannot be the business of morphology alone. That’s because the complex head formed when X moves to Y behaves, for the purposes of the remainder of the syntactic derivation, as a constituent. In other words, Theory2 assumes that a non-constituent – consisting of the Y head plus the material in SpecYP but excluding the material in the complement of Y – undergoes a “morphological” operation that renders it a constituent for the purposes of subsequent syntactic computation.

How, exactly, is this supposed to work? Isn’t morphology supposed to be post-syntactic?

At this juncture, defenders of Theory2 sometimes appeal to the idea that the inverted-Y/inverted-T/whatever model applies per-cycle (or per-phase, if you prefer; or per-spellout-domain, if you re-prefer). This is certainly true: even on Chomsky’s version of the inverted-Y model, where all covert operations had to follow all overt operations, this ordering only held within a single cycle. For example, QR is clause bounded (at least when it comes to finite clauses), yet if we wait until all the overt structure-building is done in (1) before doing QR of every building across a guard, this QR operation will be in clear violation of the cycle. (All the terms of the operation are contained in an already-completed cycle.) Yet an inverse-scope reading of the most-embedded clause in (1) is in fact possible (I think):

(1) Morty said Rick doubts that a guard stands in front of every building.

On the overt-before-covert (i.e., Chomskyan) version of the inverted-Y model, this arises because the model resets at every cycle (say, every CP boundary). So it only forces overt operations to precede covert ones within one and the same clause.

But is this “restarting” of the derivational model enough for Theory2 to work? It is not. Restarting the model every cycle still does not (or at least, should not) give subsequent cycles of the syntactic computation access to anything but the syntactic information from previous cycles. Here’s why: suppose that we allowed cycle n of the syntactic component access to all the information (syntactic or otherwise) that resulted from the computation of cycle n−1. One consequence would be that, even on a model where phonological content was late-inserted, cycle n would have access to, e.g., whether the phonological forms of items in cycle n−1 included a bilabial consonant. So the venerable “front every constituent that includes a bilabial” rule, which modularity is supposed to rule out, could easily be constructed – with the caveat that the fronting rule could only apply if the constituent containing a bilabial was base-generated in a previous cycle, not if it was base-generated in the same cycle as the attracting probe.

To state this more generally: allowing the syntax in cycle n to make reference to the outputs of the post-syntactic computation in cycles n−1 (and earlier) undoes the effects of modularity, at least as it concerns the computation of any long-distance dependencies (i.e., those spanning multiple cycles, incl. long-distance A-bar movement).

Theory2 requires just this kind of setup: for this theory to work, “m-merger” (a post-syntactic operation) in cycle n−1 must inform narrow syntax (viz. constituency) in cycle n.


What I have constructed here is a slippery-slope argument against Theory2, and such arguments are generally only as good as the slope is slippery. So, for example, one could stipulate that the amount of post-syntactic computation in a given cycle whose output is accessible to the subsequent cycle is limited: that it includes “m-merger” but not, e.g., Vocabulary Insertion.

That’s a logically-coherent solution; but ask yourself this: what is the content, now, of the claim that “m-merger” is post-syntactic? Surely, it cannot be merely the fact that “m-merger” follows other syntactic operations; A-bar movement generally follows A-movement, yet nobody takes this as an indication that A-bar movement is post-syntactic. No, “m-merger” so construed crucially differs from other post-syntactic operations in that its output is legible to syntax. There is a name for the module in which we place operations whose output is legible to syntax. You guessed it: syntax.


Now let us take stock. Theory1 is incompatible with the Extension Condition, but the latter is independently problematic. The best proposal out there for replacing the Extension Condition, namely Featural Cyclicity, rules in a Theory1-treatment of head movement.

Theory2, on the other hand, requires an operation that smushes a non-constituent into a constituent. Despite trying to make ourselves feel better by purportedly placing this operation in the “post-syntax”, this is in fact a distinction without a difference. The operation in question (“m-merger”) informs the syntactic computation, and is therefore a part of syntax. Now ask yourself: what is less minimalist, a syntactic operation that smushes together non-constituents into constituents? Or Featural Cyclicity?

blogpost: Post-minimalism?

Feb 16, 2020
 

I was recently invited to contribute a short piece about Agree to the This Or That Publisher’s Handbook of Minimalism, and it made me wonder to what extent I, or really most other generative syntacticians who got their PhDs after about 2005 or so, can be considered to still be doing “minimalism.”

In one sense: this doesn’t really matter, who cares about labels, we all contain multitudes, yada yada yada.

But in this case, I actually think there is something substantive to this beyond the label. In particular, I think we are no longer really in the “minimalist” era anymore. It’s just that the demarcation of generative-syntax epochs has traditionally been led by Chomsky himself. And Chomsky himself is off doing… well, more on that below.

Let’s start with some clarifications. There are at least four ways the terms “minimalism” or “minimalist” get thrown around as it pertains to syntactic theory. One, the original one really, which traces its origin to Chomsky’s 1993 paper / 1995 book, is a substantive hypothesis about natural language. Roughly: that natural language is nothing but Merge, plus whatever the non-language-specific influences are of general cognition, sensory-motor systems, conceptual-intentional systems, and even laws of physics (a.k.a. the Strong Minimalist Thesis). In the service of trying to make this more than a completely-empty speculation, Chomsky then laid out (in chapter 4 of the 1995 book, but more than that, in the two subsequent papers Minimalist Inquiries, 2000, and Derivation by Phase, 2001), the general contours of what a theory that adhered to this hypothesis might look like. It was only a sketch, of course, buried in the usual menagerie of plausible-deniability hedges (“It’s not a theory, it’s a program!”). In going from speculation to a sketch of substantive theory, Chomsky was forced to make certain choices/commitments. Some turned out really well (e.g. probe-goal). Others, not so much (e.g. uninterpretable features). Regardless of how well they turned out, though, this collection of substantive choices leads to the second way the term “minimalist” is now used, namely to refer to syntactic theories that adopt most or all of these substantive choices. So, for example, anyone doing some version of a probe-goal syntactic theory nowadays (myself included) will usually get lumped in as “minimalist,” though as I will discuss shortly, it is pretty straightforward to imagine a probe-goal approach to syntax that eschews all but the most trivial minimalist commitments.

The third way the term “minimalist” gets bandied about is as a very particular computational-linguistic construct, namely the Minimalist Grammars formalism developed by Ed Stabler and his collaborators and students. This is a welcome effort to computationally formalize the collection of substantive theoretical choices that goes under “minimalism (sense 2)”; but for those (many) of us who think the object so formalized contains as many bad choices (e.g. a checking-based feature calculus, rather than valuation-based one) as good ones, this computational object is of limited interest to the Practicing Theoretical Linguist, Non-Computational Division.

The fourth (and as far as I can tell, final) way the terms “minimalism” or “minimalist” get used is the most general and, consequently, the most trivial. It essentially boils down to the methodological heuristic of “less is better,” a.k.a. Occam’s razor. Chomsky’s Minimalist Program has put a new spin on this age-old methodological heuristic, in the form of a concern for evolutionary plausibility (viz. the less we put into UG, the easier it is to envision it evolving in the species and being phenotypically more or less uniform; a.k.a. Darwin’s Problem). But in practice, I see no effective difference between this new variant and Occam’s more venerable version, at least for how I go about doing theoretical linguistics. More is worse, less is better. This is old news.

Let’s summarize so far. The third, Stablerian sense of “minimalist” (as in Minimalist Grammars) is of limited relevance beyond comp-ling. The fourth sense is so bland that it amounts to little more than “Do good science.” This leaves us with sense 1 (as in the Strong Minimalist Thesis), and sense 2 (as in, a shorthand for adopting a significant subset of the substantive choices made by Chomsky in his first few minimalist papers).

So, let’s see: is someone like me a “minimalist” in either of these two remaining senses? (Of course, if this were really about me, it would be of extremely limited interest. As I’ll discuss shortly, I don’t really think this is unique to me at all. But I am the one whose views I know with the greatest certainty, so I’ll start there.) I certainly don’t think the Strong Minimalist Thesis is right, and am on record with an argument to that effect. So that’s out. And I’m far from unique on that front: I was speaking recently with a very prominent morphosyntactician about that linked-to paper (someone of my generation, trained in one of the very top departments for generative syntax), and their response was something like, “Yeah, I’ve never really cared what Chomsky has to say about uninterpretability and ‘crashes’. I’ve basically always assumed things work like [what this paper says], so I really don’t see what the fuss is about.”

And what of the substantive choices that go under “minimalism (sense 2)”? Let me take inventory of my own views. The success of the probe-goal model, in my opinion, has been spectacular. As for Phase Theory, it is mostly Subjacency/Barriers leftovers, thrown in the microwave and reheated. And the evidence for it has been seriously overstated. (That’s not to say that there doesn’t remain some serious evidence for phases, as the comments to the linked-to post make clear.) Uninterpretability, as anyone vaguely familiar with my work knows, was a nice try but ultimately failed. It was influenced, I think, by ideas like the Case Filter, which themselves have turned out to be wrong. Feature-inheritance? Again a nice try, again doesn’t work (see Haegeman & van Koppen 2012, among many others). And as for the recent “labeling” stuff (Chomsky’s Problems of Projection and related papers), I’ll quote Jason Merchant’s comment on FoL:

[…] But no C-I requirement can plausibly tell us that “angry at”, “proud of”, “interested in”, etc pair the way they do. We need l-selection [category-dependent lexical-selection; O.P.], which means we need labels that are at least as fine-tuned as distinguishing “at” from “of”, “in” etc requires. These relations are fundamentally *syntactic*, and so any theory of syntax that claims that these relations can be captured by or at the C-I interface has an uphill battle for the hearts and minds. […]

Jason is being kind/gentle here. Pesetsky and others established way back in the 80s (and beyond any reasonable doubt, I think) that c-selection cannot be reduced to semantics, and so an approach to labeling that situates it at the “C(onceptual)-I(ntentional) interface” is nothing short of laughable, imo.

So the question is: does one really count as “minimalist” because one uses probe-goal and, for lack of a better available proposal, some version of Phase Theory, but eschews all other minimalist technology and doesn’t really believe in the Strong Minimalist Thesis?

And here’s the thing: I think something roughly similar can be said of the overwhelming majority of allegedly “minimalist” syntacticians of the current generation (which I have arbitrarily and capriciously demarcated as getting your PhD post-2005). There are exceptions: Dan Milway and Marc Richards spring to mind, among others. But really, I think I can count the people that are exceptions on one hand. (Marc’s PhD is actually from 2004, if I remember correctly.)

If we track the development of this particular branch of generative syntax from the various “((Revised) Extended) Standard Theory” periods, through the “Government & Binding” period and the “Minimalist” period, isn’t it fairly clear that we are now in a distinct period? This isn’t an issue of labels per se, but of a difference in the substantive set of assumptions that currently lie at the core of our endeavor. (To be clear, I think that “post-minimalism” is a rather shitty label; I consider it more of a placeholder than a real candidate for what this should be called.)


Postscript

When I say things like “{most generative syntacticians, the overwhelming majority of allegedly ‘minimalist’ syntacticians} of the current generation,” I’m really talking about people doing morphosyntax. This is partly a matter of familiarity, as this is the literature that I read the most. I can’t say with confidence that the same things hold of what’s going on on the LF side. But this may not be just a matter of familiarity. Chomsky used to pay close attention to morphosyntactic matters (look no further than the treatment of the English auxiliary system in LSLT / Syntactic Structures, but also, e.g., Remarks on Nominalization). But for my money, his work starting from the Minimalist Program (if not earlier) is characterized by a cavalier, almost dismissive attitude towards even the most basic issues of morphosyntax, and a nearly monomaniacal tendency to equate syntax with LF. This is coupled with dismissive remarks about “externalization” as a cover-all for morphosyntax (and/or an attempt to sweep under the rug the severe morphosyntactic shortcomings of his proposals). It’s probably no coincidence, then, that the late 90s / early 2000s were, at least in some circles, the heyday of LF syntax.

(Disclaimer: if you thought the discussion above was parochial, I’m about to delve into some extremely localist opinionizing. Equip yourself with a gigantic grain of salt, please.) From my vantage point as a late-aughts graduate student in the MIT syntax program, for example, the generation before us was dominated by people doing LF syntax (Fox, Hackl, Nissenbaum). There were exceptions, as there are bound to be (Wurmbrand, Legate), but one can still identify something of a sea change, I think, occurring right after that time. I was strictly in the riding-the-coattails category, mind you. To the extent that there were specific individuals whose influence on the syntax grad-student body led to this sea change, I would single out Jessica Coon and Claire Halpert. So, in addition to Coon and Halpert (and myself), you suddenly have Bjorkman, Levin, van Urk, Yuan, and many other morphosyntax theses. (This impression may fall apart under careful quantitative scrutiny; I’m expressing what was my subjective impression living through this period.)

I’ve certainly brought with me to UMD my belief that morphosyntax is the “main event” when it comes to syntactic theory, and I feel like my colleague Maria Polinsky shares this view (though please don’t hold her responsible for what I write here). At UMD, this is well-balanced by the views of faculty members like Alexander Williams, Valentine Hacquard, Howard Lasnik, and Norbert Hornstein, who can all be described as LF-centric to varying degrees. But regardless, I think it’s noteworthy that as “core minimalism” drifts ever more towards what is essentially a theory of meaning, with fewer and fewer commitments to form (“externalization”), its ability to contend with even the most basic facts about morphosyntax has degraded to the point that I, for one, find it hard to still consider myself a “minimalist” at all.

blogpost: blogroll? blogroll!

Jan 1, 2020
 

At the polite urging of the particle linguist, I have been thinking about adding a blogroll to my site. (As keen observers will discern, this urging is about a year old now… Better late than never!) As the particle linguist points out, this is standard practice in other blogospheres, and it strikes me as just plain good online citizenship.

One problem is that my blog is essentially a “wholly-owned subsidiary” of my personal academic homepage – which is not, in its entirety, a blog – and so a blogroll is not appropriate for every single part of my website. So for now I’ve settled on making the blogroll appear on the top of the sidebar, but only on the blogging-related pages of my site.

To avoid any confusion: this list of blogs is not intended as an evaluative statement of any kind (à la “these are the best blogs” or whatever), nor is it even an endorsement of everything that these blogs have to say. It’s just the list of linguistics blogs that I read regularly or semi-regularly.

Dec 25, 2019
 

Ever since Phase Theory was first put forth by Chomsky, it has been taken for granted that phases include at least CP and transitive vP (or whatever you think the highest projection in a transitive verb phrase is). More recently, Keine (2017) has presented a very nice argument that vPs, even transitive ones, cannot be phases, at least not in Hindi.

I find Keine’s idea to be very promising, for a couple of reasons. First, there is the issue of the Phase Impenetrability Condition (PIC). As many of you probably know, Chomsky has two versions of the PIC. The first (“PIC1”) is straightforward: once the phasal XP is complete, nothing inside its complement domain (Compl,X) is accessible. This allows things to escape via Spec,XP, a pattern linguists have been finding evidence for since the ’70s. The second (“PIC2”) is a “staggered” version of PIC1: the complement domain of a phasal XP remains accessible until the next phase head Y is merged. Now, if I were a big believer in conceptual arguments, this is where I would tell you that PIC2 makes no sense, and therefore must be wrong. But I’m not, so I won’t. What I do want to point out is that, tracing the history of PIC2, I’m pretty sure the only reason it was put forth (in Chomsky’s 2001 Derivation by Phase; see p. 14) was to account for agreement between T and a nominative object in Icelandic (specifically, in those cases where the object has not undergone object shift). If there is a phase boundary at the verb-phrase level, then PIC1 would rule out such an agreement relation, whereas PIC2 would rule it in. Do you see where this is going? This constitutes an argument that we need PIC2 rather than PIC1 only on the assumption that vP (or something thereabout) is a phase. So the first reason that Keine’s result is appealing is because it allows us to do away with PIC2 (which, whether it makes conceptual sense or not, is clearly more complicated) and return to the simpler PIC1.
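
If it helps to see the difference side by side, here is a toy accessibility check (Python; the Phase representation and the schematic Icelandic-style configuration are assumptions of mine, purely for illustration):

python
# PIC1 vs. PIC2 as toy accessibility predicates over a derivation.
from dataclasses import dataclass

@dataclass
class Phase:
    head: str
    complement: frozenset   # positions inside the phase head's complement domain

def accessible_pic1(pos, completed_phases):
    """PIC1: once a phasal XP is complete, Compl,X is frozen for good."""
    return all(pos not in ph.complement for ph in completed_phases)

def accessible_pic2(pos, completed_phases, next_phase_head_merged):
    """PIC2: the most recent phase's complement stays accessible until the
    next phase head is merged."""
    if completed_phases and pos in completed_phases[-1].complement:
        return not next_phase_head_merged
    return all(pos not in ph.complement for ph in completed_phases[:-1])

# Icelandic-style configuration: a nominative object inside Compl,v,
# with T probing before C (the next phase head) is merged.
vP = Phase("v", frozenset({"OBJ"}))
print(accessible_pic1("OBJ", [vP]))          # False: T-object agreement blocked
print(accessible_pic2("OBJ", [vP], False))   # True: T-object agreement possible

And of course, if vP is not a phase to begin with, PIC1 suffices and the contrast never arises.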

The second reason I find Keine’s position appealing is that the vast majority of the evidence we have for the successive-cyclicity of long-distance movement concerns the CP layer. Den Dikken has a project in which he attempts to argue that vP (or thereabouts) is a phase but CP isn’t. One of the striking things about this project is just how hard Den Dikken has to work: the vast majority of evidence for intermediate stopping, by far, is evidence involving the CP layer. In fact, I would venture to say that the only evidence I know of that seems to solidly point to a vP-level phasal category comes from van Urk & Richards’ (2015:127) work on Dinka.

At this juncture, you might be asking yourself something like, “Wait a minute, what about all the other evidence we have for vPs being phases?”

As you might suspect, I don’t think much of that evidence stands up to scrutiny. Here’s why. [fn 1: This is – and not for the first time on this blog – an elaboration of a comment I left on Norbert’s blog once upon a time.] Phase theory is supposed to deliver predictions on where things must stop, not where things can stop. Or, if you prefer: it’s the Phase Impenetrability Condition, not the Phase Optional Permeability Condition. Crucially, many of the purported arguments for the phasehood of vP amount to arguments that a Spec,vP stopping-off point is possible, and provide no evidence that such a stopping-off point is obligatory.

Now, one could entertain a theory where the only possible stopping-off points were phase edges. While that is a logically coherent theory, I think we can safely say that it can be discounted on empirical grounds. The following is a digression to demonstrate this point.

· · · · · · · · · · · · · · · · · · · · 

(1) The children seem to her to have liked Mary.
(2) The woman showed the boy to himself in the mirror.
(3) The woman showed the boy to herself in the mirror.
(4) *The children seem to me to have appeared to myself to be clever.
(5) The children seem to me to have appeared to themselves to be clever.

From the disjoint-reference effect in (1) between her and Mary, we can conclude that the to-experiencer of a verb like seem is able to bind into the verb’s infinitival clausal argument. From the success of reflexive-binding in (2) we know that English self-anaphors are not subject oriented. And from the success of reflexive-binding in (3) we know that binding of English reflexives is not subject to minimality (i.e., there is no condition demanding that the antecedent of a reflexive be in the closest position that could possibly bind that reflexive).

Taken together, this means that the failure of reflexive binding in (4) cannot be because (to) me is not capable of binding into the infinitival clause; nor can it be because (to) me is not a subject; nor can it be (merely) because the infinitival contains a closer binder (say, the trace/copy of the children in the subject position of appear). Instead, the reason seems to be that the reflexive myself in (4) is contained in a binding domain (say, the infinitival clause whose main verb is appear) that necessarily excludes (to) me. But if the relevant binding domain necessarily excludes (to) me, then it also necessarily excludes the matrix position of the children (on the benign assumption that binding domains are structurally contiguous).

If all this is so, then how is reflexive binding of themselves, in (5), successful? The only answer, it seems to me, is that the binding domain of myself/themselves in (4)/(5) includes a trace/copy of the children in a position c-commanding the reflexive. And since the relevant position of this trace/copy is not its theta position (which is lower down, in the copular infinitival), the conclusion is that there is a possible stopping-off point for movement of the children in the intermediate infinitival clause, likely in Spec,TP of the TP complement of seem.

Now, importantly, the TP complement of seem has no business whatsoever being phasal. [fn 2: Even if you think that some TPs “inherit” certain phasal properties from C(P), the particular TP in question is not in a structural position to do so, being nowhere near any C(P).] So there you have it: a possible stopping-off point for movement is not an argument for phasehood.

· · · · · · · · · · · · · · · · · · · · 

Taking this into account, it seems like many of the purported arguments in favor of the phasehood of vP have the wrong quantificational force, if you will: they are arguments that a certain stopping-off point is possible, not that it is obligatory. Take the famous “scope-trapping” diagnostics used by Fox (2000:164) and Legate (2003:507) to argue for an intermediate landing site in Spec,vP: these diagnostics merely show that a stopping-off point in Spec,vP is possible, not that it is obligatory. Sure, under certain circumstances (where all other possible stopping-off points would yield binding-theoretic ill-formedness), one can force movement to stop off at this position. But, ipso facto, what is then responsible for the necessity (as opposed to the possibility) of stopping off in Spec,vP is binding, not phasehood.

A possible objection, raised by Peter Svenonius, goes as follows. We know that not every intermediate position is a possible stopping off point. If something (e.g. Spec,vP) behaves like it is consistently a possible stopping off point, phasehood provides a very plausible explanation for why this position and not others behaves that way. The problem with this logic is that it applies with just the same force to the Spec,TP scenario, above. And since we know that it delivers the wrong conclusion there (namely, that raising TPs are phases), I conclude that there must be something wrong with the argument itself.

It is illuminating, I think, to compare this state of affairs with some of the evidence we have for CP-level successive-cyclicity. For example, the famous findings by McCloskey (1979:150–156) about the Irish complementizer system show that, in cases of overt wh-movement, every complementizer on the movement path must undergo the go → aL change. Must, not may. The landscape is not quite as clean as one might hope since, as I mentioned above, there is at least one good argument that I am aware of that Spec,vP is an obligatory stopping point (see van Urk & Richards 2015:127). If Keine’s position is to be fully vindicated, one must find an alternative account of such facts.

The important thing is, however, that Keine’s position absolutely does not conflict with data like that of Fox or Legate, or anyone else who adduces evidence that Spec,vP is a possible stopping-off point for movement. Phase Theory is a theory of obligatory stopping-off points, not of optional ones. Or, more accurately: Phase Theory is a theory that is supposed to deliver predictions about obligatory stopping-off points, not about optional ones.