Matushansky (2006) proposes to replace the head-adjunction mechanism of Government & Binding theory (call this “Theory1”) with a version that involves movement to a specifier position followed by “m-merger” of the specifier with the adjacent head (call this “Theory2”). One of the major selling points is supposed to be that Theory2 is more compatible with the tenets of minimalist syntax. I will argue that this is exactly backwards, and that Theory1 was far more minimalism-friendly.
First, let’s discuss what is purportedly problematic about Theory1 when viewed through a minimalist lens. The central problem concerns the Extension Condition: the constraint that forces all Merge operations to have the current root of the tree as one of their (two) operands. In Theory1, the lower head (call it X) moves into the higher head position (call it Y), replacing the original Y with a composite structure [X Y], whose label is Y. This indeed constitutes an instance of Merge that does not target the root of the tree.
Is this really a problem? It is, on a literal reading of the Extension Condition. But we know independently of head movement that the Extension Condition is too strong: the “tucking-in” effects demonstrated by Richards (1997, 1998, 2001) for Bulgarian and Romanian multiple-wh movement show as much. Instead of the (demonstrably incorrect) Extension Condition, Richards proposes to capture the relevant cyclicity effects in terms of a constraint he calls Featural Cyclicity – basically: “at any given point in the derivation, you’re only allowed to tend to features of the currently-projecting head.” In cases of phrasal movement where multiple specifiers aren’t involved, this delivers the same predictions as the Extension Condition. And where multiple specifiers are involved, it allows tucking-in (though, interestingly, it does not require it, leaving room for variation on this front; this might be desirable, especially when it comes to the distinction between base-generated and movement-derived specifiers).
Now back to head movement: if head movement is a response to the features of the attracting head, then good ol’ GB-style head movement (a.k.a. Theory1) obeys Featural Cyclicity: it is a response to features of the head that is currently projecting. That it doesn’t result in Merge at the root of the tree is immaterial; neither does tucking-in.
Let’s turn to Theory2. As a reminder, here X does not head-adjoin to Y; instead, X moves to SpecYP, followed by the instance of X in SpecYP undergoing “m-merger” with the Y head. “M-merger” is supposed to be a morphological operation. But here’s the thing: its output cannot be the business of morphology alone. That’s because the complex head formed when X moves to Y behaves, for the purposes of the remainder of the syntactic derivation, as a constituent. In other words, Theory2 assumes that a non-constituent – consisting of the Y head plus the material in SpecYP but excluding the material in the complement of Y – undergoes a “morphological” operation that renders it a constituent for the purposes of subsequent syntactic computation.
How, exactly, is this supposed to work? Isn’t morphology supposed to be post-syntactic?
At this juncture, defenders of Theory2 sometimes appeal to the idea that the inverted-Y/inverted-T/whatever model applies per-cycle (or per-phase, if you prefer; or per-spellout-domain, if you re-prefer). This is certainly true: even on Chomsky’s version of the inverted-Y model, where all covert operations had to follow all overt operations, this ordering only held within a single cycle. For example, QR is clause bounded (at least when it comes to finite clauses), yet if we wait until all the overt structure-building is done in (1) before doing QR of every building across a guard, this QR operation will be in clear violation of the cycle. (All the terms of the operation are contained in an already-completed cycle.) Yet an inverse-scope reading of the most-embedded clause in (1) is in fact possible (I think):
(1) Morty said Rick doubts that a guard stands in front of every building.
On the overt-before-covert (i.e., Chomskyan) version of the inverted-Y model, this arises because the model resets at every cycle (say, every CP boundary). So it only forces overt operations to precede covert ones within one and the same clause.
But is this “restarting” of the derivational model enough for Theory2 to work? It is not. Restarting the model every cycle still does not (or at least, should not) give subsequent cycles of the syntactic computation access to anything but the syntactic information from previous cycles. Here’s why: suppose that we allowed cycle n of the syntactic component access to all the information (syntactic or otherwise) that resulted from the computation of cycle n−1. One consequence would be that, even on a model where phonological content was late-inserted, cycle n would have access to, e.g., whether the phonological forms of items in cycle n−1 included a bilabial consonant. So the venerable “front every constituent that includes a bilabial” rule, which modularity is supposed to rule out, could easily be constructed – with the caveat that the fronting rule could only apply if the constituent containing a bilabial was base-generated in a previous cycle, not if it was base-generated in the same cycle as the attracting probe.
To state this more generally: allowing the syntax in cycle n to make reference to the outputs of the post-syntactic computation in cycles n−1 (and earlier) undoes the effects of modularity, at least as it concerns the computation of any long-distance dependencies (i.e., those spanning multiple cycles, incl. long-distance A-bar movement).
Theory2 requires just this kind of setup: for this theory to work, “m-merger” (a post-syntactic operation) in cycle n−1 must inform narrow syntax (viz. constituency) in cycle n.
What I have constructed here is a slippery-slope argument against Theory2, and such arguments are generally only as good as the slope is slippery. So, for example, one could stipulate that the amount of post-syntactic computation in a given cycle whose output is accessible to the subsequent cycle is limited: that it includes “m-merger” but not, e.g., Vocabulary Insertion.
That’s a logically coherent solution; but ask yourself this: what is the content, now, of the claim that “m-merger” is post-syntactic? Surely, it cannot be merely the fact that “m-merger” follows other syntactic operations; A-bar movement generally follows A-movement, yet nobody takes this as an indication that A-bar movement is post-syntactic. No, “m-merger” so construed crucially differs from other post-syntactic operations in that its output is legible to syntax. There is a name for the module in which we place operations whose output is legible to syntax. You guessed it: syntax.
Now let us take stock. Theory1 is incompatible with the Extension Condition, but the latter is independently problematic. The best proposal out there for replacing the Extension Condition, namely Featural Cyclicity, rules in a Theory1-treatment of head movement.
Theory2, on the other hand, requires an operation that smushes a non-constituent into a constituent. We may try to make ourselves feel better by placing this operation in the “post-syntax”, but this is in fact a distinction without a difference. The operation in question (“m-merger”) informs the syntactic computation, and is therefore a part of syntax. Now ask yourself: which is less minimalist – a syntactic operation that smushes non-constituents into constituents, or Featural Cyclicity?