At the time of writing this sentence, I have submitted more than three dozen abstracts to conferences, some of which were accepted, and some of which were not. Some of the abstracts got uniformly favorable reviews, some got uniformly unfavorable ones, and some got contradictory ones. I think some of the contradictions are rather instructive, and some are funny – so they are here for everyone to see.
Chicago Linguistics Society 58
- Review 1 (-2 – reject): This is a conceptual yet superficial proposal. The three ingredients of the authors' TMA are phrased very generally without specific reference to issues in multilingual acquisition. These are briefly touched on at the very end, but only negatively, so that it is not clear why existing approaches should fall short of the authors' stipulated requirements.
- Review 3 (2 – accept): This paper provides three requirements for a generative theory of multilingual acquisition. All are explained clearly with sound reasoning. A comprehensive overview of existing theories in regards to the requirements laid out in the paper sounds particularly interesting and beneficial to the field.
LSRL 52
- Review 1 (I assume accept): The proposal is well grounded, and the hypothesis clearly stated. The study is well described, providing information about the participants, the task, and how the data analysis has been conducted. The outline of the results (including points for further work) is clearly conveyed. Although this is not my area of expertise, I consider it a very interesting and very well formulated proposal.
- Review 2 (I assume reject): While the topic discussed in this paper is important for theories of L3 and L4 acquisition, it is not clear to this reviewer that the paper would be appropriate at a conference like LSRL. The connection to Romance grammar is tenuous at best - the details of how negative disjunctions are expressed and interpreted in French are barely mentioned in the abstract (there is no actual French data). The sample test sentence that was supposedly evaluated in French is given in English (not sure why). I think readers of the abstract could also benefit from a clearer contextual explanation of what the participants are evaluating as true or false. I'm not sure why this information is not included in the abstract when it is essential for understanding what the participants are evaluating. I cannot understand what true/false for (1e) actually means without a context. Moreover, a clearer explanation of what is expected if transfer from L1/L2 vs. transfer from L3 happens would be welcomed. None of this is included in the description of the experiment. Finally, I am skeptical about how much the L4 French learners actually understand how to use negative disjunction in French. It is a difficult judgment to get in one's native language, let alone an L4; it is something that is probably not taught explicitly and is relatively rare in the input that an L4 learner would have, and scores on a proficiency test might not be the best indicators of this. One of the results is that the L4 learners behave in a less target-like way as they become more advanced. I wonder if there are other studies on these kinds of semantic judgments and how their difficulty might play a role in accounting for these unexpected results. For these reasons it would probably be better received at a general language acquisition conference. I think the results of the study are interesting, and I urge the author(s) to submit this to a language acquisition conference that is not language-specific and to be clearer in the abstract about the contextual information that participants had in assessing truth values.
Penn Linguistics Society
- Review 1 (3 – strong accept): I find this topic fascinating and the abstract compelling. The submitter nicely lays out the case of three enticing situations in which people beyond the critical age came to acquire a non-native language to a considerable extent. As the submitter intimates, the information gleaned from cases such as these should not be jettisoned, but should be incorporated into the strengthening of theories.
- Review 2 (-2 – reject): The author begins with the claim that “anecdotal data and case studies have been repeatedly shunned from language acquisition inquiry.” This is simply not true. A cursory Google Scholar search on the terms [“second language acquisition” “case study”] provides more than 50,000 results. […] I have grave concerns, however, about the quality of the observational data that this author brings to bear, and hence of the validity of the evidence in the theoretical claim. […] Finally, it is unclear to me whether the central theses of this piece are intended to be around an epistemology of science (regarding the use of case study evidence) or intended to be around particular SLA hypotheses (the critical period, the fundamental difference hypothesis). With regard to the latter, I’ll also add that I don’t believe that SLA researchers intend that these hypotheses apply without exception to all learners – rather that they are explanations of the linguistic patterns of most typical learners.
- Review 3 (2 – accept): This paper presents a new set of anecdotal evidence on second language acquisition not previously discussed in the literature. Though projecting an argument based on a small number of anecdotal cases might seem deviant from modern research practices that adopt systematic experiments or large amounts of corpus data, the author’s argument is based on well-informed theories, so it is worth being heard.
One particular case
Most reviewers do not spend a whole lot of time writing their reviews and explaining their point of view, but there was one particular case when I did submit a rather lousy paper to the Society for Computation in Linguistics – and I knew it was lousy. It was rejected, but one of the reviews, given below, is the most thorough review I have ever gotten or, I think, will ever get, and I am incredibly grateful to the reviewer for it. (I should point out that what was submitted was a full paper, not a two-page abstract.)
The review
Score: -2 (reject)
I think it’s a good goal to provide a concrete metric in the 3LA domain (i.e., the metric of algorithmic efficiency), but I found the implementation confusing in several places (also, where was algorithmic efficiency actually defined?), so it’s hard for me to tell how much to believe the conclusions. To be fair, the paper itself notes in the concluding remarks that this is a preliminary proposal meant to spur discussion, but I worry that it’s still too preliminary (perhaps due to length considerations as well).
Specific comments:
- (1) Section 1: So it sounds like wholesale transfer assumes one big batch copy operation? And that partial transfer could still transfer the whole grammar, but would just do it gradually rather than in a batch? This is something that would be helpful to specify more explicitly, since I thought partial would mean only part of the grammar is copied, period, rather than the gradual vs. batch distinction that seems to be being made.
- (2) Section 1, footnote 5: It sounds like you’re excluding domain-specific, generative models (among others). What grammar parameters are you referring to then? Would these be domain-general parameters? Could you give an example of this kind of parameter? Wouldn’t pro-drop be a domain-specific, generative parameter? And why wouldn’t generative and minimalist grammars be related to formal approaches to linguistics? In section 6.1, it seems like you’re referring to minimalist approaches to grammar for inspiration by referencing merge, so it does seem like you’re considering these types of grammars after all.
- (3) Section 2: I’m not sure I understand why the Slabakova (2017) description of interconnectivity would necessarily equate to cognitive economy. I could imagine a world where interconnectivity is just the empirical fact, and it actually takes more cognitive “energy” to navigate through things that are interconnected, rather than being able to zoom in on a more isolated part directly.
- (4) Section 2: I definitely appreciate providing a concrete definition of algorithmic efficiency as a useful metric in this domain, but it’s not clear to me what cognitive economy actually is, so it seems possible that algorithmic efficiency could be a precise implementation of one form of cognitive economy. Why isn’t this so?
- (5) Section 3: If the specifics of the pro-drop parameter don’t matter, why mention it at all? Especially since you note that algorithmic efficiency can be applied to any parameter hierarchy you like. Is it just that you want to link your case study to a concrete example? (Which I think is always a good idea.) However, if that’s the case, I think you do want to give your readers at least a cursory overview of the relevant parts of the pro-drop parameter, especially as to how the tree structures in examples (2) and (3) emerge from them (I see this comes from (1), but you don’t talk about (1) before presenting (2) and (3), and (1) doesn’t show up until the next page). Maybe this means moving example (1) up so it occurs before you discuss the pro-drop parameter.
- (6) Section 5, on partial transfer of unchanged parameter settings: How does the learner know these are the unchanged ones? Is the learner biased to look at these first, traversing the parameter tree from the root down? Why isn’t the number of nodes to be changed the same n we saw in the wholesale copy definition? What does it mean to copy remaining fragments after “changing values up the tree”?
- (7) Section 6.1: I’m not sure I understand why parameter-settings necessarily have to be inaccessible, in the way that constituents are after merge. I think this is a proposal that would increase algorithmic efficiency, but doesn’t obviously have to be true.
- (8) Section 6.1: I understand the point about how a wholesale copy would need to have the tree be malleable. But again, why is that so strange to think about? If a fresh copy is being made, then this is a newer knowledge representation in the process of being formed. Why shouldn’t it be more accessible/malleable than the L1 representation? I think you’re proposing that it’s not, which is fine. But that’s an assumption, not something that must be true.
- (9) Section 6.2, the approach you mention about partial transfer re-building the entire tree, as with L1: Is this traversing the tree from the root down? Something else? And if it is like traversing the tree root down, why wouldn’t we expect interlanguage/developmental errors as the tree is traversed? Why couldn’t there be traversal both up and down the tree, revising as new information comes in?
- (10) Section 6.3: Are there citations for an Icelandic speaker learning German who never goes through an English-language stage? This seems to be the empirical fact that the description in this section assumes.
- (11) Section 6.4: The issue of code-switching seems to be a separate one from the different transfer options (wholesale vs. partial) that are being considered. That instead seems to be more about what kind of new grammar (2 separate grammars vs. merged grammar) is being created. So, it wasn’t clear to me how relevant this section was to your main question. Seeing the intro of section 7, maybe it would help to remind your readers that you think both these transfer options are problematic because they involve making a new copy of a tree, period, however that process occurs. Your proposal does away with copying altogether.
- (12) Section 6.5: This section seems to focus on when a copy should be initiated, whether it’s wholesale or partial transfer. This also seems orthogonal to the question of which kind of copying is more efficient. Also, the same comment as above about reminding readers of the relevance before you get to section 7.
- (13) Section 7.1: It seems like the proposal to just have multiple options in the original tree shifts the complexity from the copy operation to the original tree construction. Sure, it may take less energy to order options from a single tree rather than copy that tree to a new one, but doesn’t it take more energy to construct a tree with multiple ordered options at each node in the first place?
- (14) Section 7.2.3, the proposal to add a setting to the most proximate level which accommodates successful parsing: This seems similar to previous work by William Sakas and Janet Fodor that uses this same approach for L1 learning. It might be helpful to reference that work to show that this idea has been used fruitfully in L1 acquisition already.
Also, why should the learner only be able to adjust one level up in the tree, if they have access to the whole tree and can parse with whatever parts of the tree seem to work? Is this meant as just one illustrative example of the kind of adjustment that can be made by using a parsing-to-learn strategy?