Open Access
Peer-reviewed
Research Article
The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis
Affiliation Department of Multilingualism, University of Fribourg, Fribourg, Switzerland
- Jan Vanhove
- Published: July 25, 2013
- https://doi.org/10.1371/journal.pone.0069172
17 Jul 2014: The PLOS ONE Staff (2014) Correction: The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis. PLOS ONE 9(7): e102922. https://doi.org/10.1371/journal.pone.0102922
In second language acquisition research, the critical period hypothesis (cph) holds that the function between learners' age and their susceptibility to second language input is non-linear. This paper revisits the indistinctness found in the literature with regard to this hypothesis's scope and predictions. Even when its scope is clearly delineated and its predictions are spelt out, however, empirical studies – with few exceptions – use analytical (statistical) tools that are irrelevant with respect to the predictions made. This paper discusses statistical fallacies common in cph research and illustrates an alternative analytical method (piecewise regression) by means of a reanalysis of two datasets from a 2010 paper purporting to have found cross-linguistic evidence in favour of the cph. This reanalysis reveals that the specific age patterns predicted by the cph are not cross-linguistically robust. Applying the principle of parsimony, it is concluded that age patterns in second language acquisition are not governed by a critical period. To conclude, this paper highlights the role of confirmation bias in the scientific enterprise and appeals to second language acquisition researchers to reanalyse their old datasets using the methods discussed in this paper. The data and R commands that were used for the reanalysis are provided as supplementary materials.
Citation: Vanhove J (2013) The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis. PLoS ONE 8(7): e69172. https://doi.org/10.1371/journal.pone.0069172
Editor: Stephanie Ann White, UCLA, United States of America
Received: May 7, 2013; Accepted: June 7, 2013; Published: July 25, 2013
Copyright: © 2013 Jan Vanhove. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: No current external funding sources for this study.
Competing interests: The author has declared that no competing interests exist.
Introduction
In the long term and in immersion contexts, second-language (L2) learners starting acquisition early in life – and staying exposed to input and thus learning over several years or decades – undisputedly tend to outperform later learners. Apart from being misinterpreted as an argument in favour of early foreign language instruction, which takes place in wholly different circumstances, this general age effect is also sometimes taken as evidence for a so-called ‘critical period’ (cp) for second-language acquisition (sla). Derived from biology, the cp concept was famously introduced into the field of language acquisition by Penfield and Roberts in 1959 [1] and was refined by Lenneberg eight years later [2]. Lenneberg argued that language acquisition needed to take place between age two and puberty – a period which he believed to coincide with the lateralisation process of the brain. (More recent neurological research suggests that different time frames exist for the lateralisation process of different language functions. Most, however, close before puberty [3].) However, Lenneberg mostly drew on findings pertaining to first language development in deaf children, feral children or children with serious cognitive impairments in order to back up his claims. For him, the critical period concept was concerned with the implicit “automatic acquisition” [2, p. 176] in immersion contexts and did not preclude the possibility of learning a foreign language after puberty, albeit with much conscious effort and typically less success.
sla research adopted the critical period hypothesis ( cph ) and applied it to second and foreign language learning, resulting in a host of studies. In its most general version, the cph for sla states that the ‘susceptibility’ or ‘sensitivity’ to language input varies as a function of age, with adult L2 learners being less susceptible to input than child L2 learners. Importantly, the age–susceptibility function is hypothesised to be non-linear. Moving beyond this general version, we find that the cph is conceptualised in a multitude of ways [4] . This state of affairs requires scholars to make explicit their theoretical stance and assumptions [5] , but has the obvious downside that critical findings risk being mitigated as posing a problem to only one aspect of one particular conceptualisation of the cph , whereas other conceptualisations remain unscathed. This overall vagueness concerns two areas in particular, viz. the delineation of the cph 's scope and the formulation of testable predictions. Delineating the scope and formulating falsifiable predictions are, needless to say, fundamental stages in the scientific evaluation of any hypothesis or theory, but the lack of scholarly consensus on these points seems to be particularly pronounced in the case of the cph . This article therefore first presents a brief overview of differing views on these two stages. Then, once the scope of their cph version has been duly identified and empirical data have been collected using solid methods, it is essential that researchers analyse the data patterns soundly in order to assess the predictions made and that they draw justifiable conclusions from the results. As I will argue in great detail, however, the statistical analysis of data patterns as well as their interpretation in cph research – and this includes both critical and supportive studies and overviews – leaves a great deal to be desired. 
Reanalysing data from a recent cph -supportive study, I illustrate some common statistical fallacies in cph research and demonstrate how one particular cph prediction can be evaluated.
Delineating the scope of the critical period hypothesis
First, the age span for a putative critical period for language acquisition has been delimited in different ways in the literature [4] . Lenneberg's critical period stretched from two years of age to puberty (which he posits at about 14 years of age) [2] , whereas other scholars have drawn the cutoff point at 12, 15, 16 or 18 years of age [6] . Unlike Lenneberg, most researchers today do not define a starting age for the critical period for language learning. Some, however, consider the possibility of the critical period (or a critical period for a specific language area, e.g. phonology) ending much earlier than puberty (e.g. age 9 years [1] , or as early as 12 months in the case of phonology [7] ).
Second, some vagueness remains as to the setting that is relevant to the cph . Does the critical period constrain implicit learning processes only, i.e. only the untutored language acquisition in immersion contexts or does it also apply to (at least partly) instructed learning? Most researchers agree on the former [8] , but much research has included subjects who have had at least some instruction in the L2.
Third, there is no consensus on the scope of the cp as far as the areas of language are concerned. Most researchers agree that a cp is most likely to constrain the acquisition of pronunciation and grammar and, consequently, these are the areas primarily looked into in studies on the cph [9]. Some researchers have also tried to define distinguishable cps for the different language areas of phonetics, morphology and syntax and even for lexis (see [10] for an overview).
Fourth and last, research into the cph has focused on ‘ultimate attainment’ (ua) or the ‘final’ state of L2 proficiency rather than on the rate of learning. From research into the rate of acquisition (e.g. [11]–[13]), it has become clear that the cph cannot hold for the rate variable. In fact, it has been observed that adult learners proceed faster than child learners at the beginning stages of L2 acquisition. Though theoretical reasons for excluding the rate can be posited (the initial faster rate of learning in adults may be the result of more conscious cognitive strategies rather than of less conscious implicit learning, for instance), rate of learning might from a different perspective also be considered an indicator of ‘susceptibility’ or ‘sensitivity’ to language input. Nevertheless, contemporary sla scholars generally seem to concur that ua and not rate of learning is the dependent variable of primary interest in cph research. These and further scope delineation problems relevant to cph research are discussed in more detail by, among others, Birdsong [9], DeKeyser and Larson-Hall [14], Long [10] and Muñoz and Singleton [6].
Formulating testable hypotheses
Once the relevant cph 's scope has satisfactorily been identified, clear and testable predictions need to be drawn from it. At this stage, the lack of consensus on what the consequences or the actual observable outcome of a cp would have to look like becomes evident. As touched upon earlier, cph research is interested in the end state or ‘ultimate attainment’ ( ua ) in L2 acquisition because this “determines the upper limits of L2 attainment” [9, p. 10]. The range of possible ultimate attainment states thus helps researchers to explore the potential maximum outcome of L2 proficiency before and after the putative critical period.
One strong prediction made by some cph exponents holds that post- cp learners cannot reach native-like L2 competences. Identifying a single native-like post- cp L2 learner would then suffice to falsify all cph s making this prediction. Assessing this prediction is difficult, however, since it is not clear what exactly constitutes sufficient nativelikeness, as illustrated by the discussion on the actual nativelikeness of highly accomplished L2 speakers [15] , [16] . Indeed, there exists a real danger that, in a quest to vindicate the cph , scholars set the bar for L2 learners to match monolinguals increasingly higher – up to Swiftian extremes. Furthermore, the usefulness of comparing the linguistic performance in mono- and bilinguals has been called into question [6] , [17] , [18] . Put simply, the linguistic repertoires of mono- and bilinguals differ by definition and differences in the behavioural outcome will necessarily be found, if only one digs deep enough.
A second strong prediction made by cph proponents is that the function linking age of acquisition and ultimate attainment will not be linear throughout the whole lifespan. Before discussing what this function would have to look like in order for it to constitute cph-consistent evidence, I point out that the ultimate attainment variable can essentially be considered a cumulative measure dependent on the actual variable of interest in cph research, i.e. susceptibility to language input, as well as on such other factors as duration and intensity of learning (within and outside a putative cp) and possibly a number of other influencing factors. To elaborate, the behavioural outcome, i.e. ultimate attainment, can be assumed to be integrative to the susceptibility function, as Newport [19] correctly points out. Other things being equal, ultimate attainment will therefore decrease as susceptibility decreases. However, decreasing ultimate attainment levels in and by themselves represent no compelling evidence in favour of a cph. The form of the integrative curve must therefore be predicted clearly from the susceptibility function. Additionally, the age of acquisition–ultimate attainment function can take just about any form when other things are not equal, e.g. duration of learning (Does learning last up until time of testing or only for a more or less constant number of years or is it dependent on age itself?) or intensity of learning (Do learners always learn at their maximum susceptibility level or does this intensity vary as a function of age, duration, present attainment and motivation?). The integral of the susceptibility function could therefore be of virtually unlimited complexity and its parameters could be adjusted to fit any age of acquisition–ultimate attainment pattern. It seems therefore astonishing that the distinction between level of sensitivity to language input and level of ultimate attainment is rarely made in the literature.
Implicitly or explicitly [20] , the two are more or less equated and the same mathematical functions are expected to describe the two variables if observed across a range of starting ages of acquisition.
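The difference between a susceptibility function and its integral can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration only: the shape of the hypothetical `susceptibility` function, the cut-off at age 15, the floor value of 0.2 and the fixed learning duration of 10 years are not taken from [19], [20] or any dataset.

```python
def susceptibility(age, cp_end=15.0, floor=0.2):
    """Hypothetical susceptibility: falls linearly from 1 at birth
    to `floor` at `cp_end` and stays flat afterwards."""
    if age >= cp_end:
        return floor
    return 1.0 - (1.0 - floor) * age / cp_end


def ultimate_attainment(aoa, duration=10.0, step=0.01):
    """Approximate the integral of susceptibility over
    [aoa, aoa + duration] with the midpoint rule."""
    n_steps = int(round(duration / step))
    total = 0.0
    for i in range(n_steps):
        midpoint = aoa + (i + 0.5) * step
        total += susceptibility(midpoint) * step
    return total


# Attainment for a few starting ages, all with the same learning duration.
for aoa in (0, 5, 10, 15, 20, 30):
    print(aoa, round(ultimate_attainment(aoa), 2))
```

Under these assumptions the attainment curve does not simply mirror the susceptibility curve: it is linear only for starting ages whose whole learning window falls before the cut-off, bends once the window straddles the cut-off, and is flat thereafter. Changing `duration` or letting it depend on `aoa` changes the shape again, which is the point made above about other things not being equal.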
But even when the susceptibility and ultimate attainment variables are equated, there remains controversy as to what function linking age of onset of acquisition and ultimate attainment would actually constitute evidence for a critical period. Most scholars agree that not any kind of age effect constitutes such evidence. More specifically, the age of acquisition–ultimate attainment function would need to be different before and after the end of the cp [9] . According to Birdsong [9] , three basic possible patterns proposed in the literature meet this condition. These patterns are presented in Figure 1 . The first pattern describes a steep decline of the age of onset of acquisition ( aoa )–ultimate attainment ( ua ) function up to the end of the cp and a practically non-existent age effect thereafter. Pattern 2 is an “unconventional, although often implicitly invoked” [9, p. 17] notion of the cp function which contains a period of peak attainment (or performance at ceiling), i.e. performance does not vary as a function of age, which is often referred to as a ‘window of opportunity’. This time span is followed by an unbounded decline in ua depending on aoa . Pattern 3 includes characteristics of patterns 1 and 2. At the beginning of the aoa range, performance is at ceiling. The next segment is a downward slope in the age function which ends when performance reaches its floor. Birdsong points out that all of these patterns have been reported in the literature. On closer inspection, however, he concludes that the most convincing function describing these age effects is a simple linear one. Hakuta et al. [21] sketch further theoretically possible predictions of the cph in which the mean performance drops drastically and/or the slope of the aoa – ua proficiency function changes at a certain point.
The graphs are based on Figure 2 in [9].
https://doi.org/10.1371/journal.pone.0069172.g001
Although several patterns have been proposed in the literature, it bears pointing out that the most common explicit prediction corresponds to Birdsong's first pattern, as exemplified by the following crystal-clear statement by DeKeyser, one of the foremost cph proponents:
[A] strong negative correlation between age of acquisition and ultimate attainment throughout the lifespan (or even from birth through middle age), the only age effect documented in many earlier studies, is not evidence for a critical period…[T]he critical period concept implies a break in the AoA–proficiency function, i.e., an age (somewhat variable from individual to individual, of course, and therefore an age range in the aggregate) after which the decline of success rate in one or more areas of language is much less pronounced and/or clearly due to different reasons. [22, p. 445].
DeKeyser and before him among others Johnson and Newport [23] thus conceptualise only one possible pattern which would speak in favour of a critical period: a clear negative age effect before the end of the critical period and a much weaker (if any) negative correlation between age and ultimate attainment after it. This ‘flattened slope’ prediction has the virtue of being much more tangible than the ‘potential nativelikeness’ prediction: Testing it does not necessarily require comparing the L2-learners to a native control group and thus effectively comparing apples and oranges. Rather, L2-learners with different aoa s can be compared amongst themselves without the need to categorise them by means of a native-speaker yardstick, the validity of which is inevitably going to be controversial [15] . In what follows, I will concern myself solely with the ‘flattened slope’ prediction, arguing that, despite its clarity of formulation, cph research has generally used analytical methods that are irrelevant for the purposes of actually testing it.
Inferring non-linearities in critical period research: An overview
Group mean or proportion comparisons.
[T]he main differences can be found between the native group and all other groups – including the earliest learner group – and between the adolescence group and all other groups. However, neither the difference between the two childhood groups nor the one between the two adulthood groups reached significance, which indicates that the major changes in eventual perceived nativelikeness of L2 learners can be associated with adolescence. [15, p. 270].
Similar group comparisons aimed at investigating the effect of aoa on ua have been carried out by both cph advocates and sceptics (among whom Bialystok and Miller [25, pp. 136–139], Birdsong and Molis [26, p. 240], Flege [27, pp. 120–121], Flege et al. [28, pp. 85–86], Johnson [29, p. 229], Johnson and Newport [23, p. 78], McDonald [30, pp. 408–410] and Patkowski [31, pp. 456–458]). To be clear, not all of these authors drew direct conclusions about the aoa–ua function on the basis of these group comparisons, but their group comparisons have been cited as indicative of a cph-consistent non-continuous age effect, as exemplified by the following quote by DeKeyser [22]:
Where group comparisons are made, younger learners always do significantly better than the older learners. The behavioral evidence, then, suggests a non-continuous age effect with a “bend” in the AoA–proficiency function somewhere between ages 12 and 16. [22, p. 448].
The first problem with group comparisons like these and drawing inferences on the basis thereof is that they require that a continuous variable, aoa , be split up into discrete bins. More often than not, the boundaries between these bins are drawn in an arbitrary fashion, but what is more troublesome is the loss of information and statistical power that such discretisation entails (see [32] for the extreme case of dichotomisation). If we want to find out more about the relationship between aoa and ua , why throw away most of the aoa information and effectively reduce the ua data to group means and the variance in those groups?
Comparison of correlation coefficients.
Correlation-based inferences about slope discontinuities have similarly explicitly been made by cph advocates and sceptics alike, e.g. Bialystok and Miller [25, pp. 136 and 140], DeKeyser and colleagues [22], [44] and Flege et al. [45, pp. 166 and 169]. Others did not explicitly infer the presence or absence of slope differences from the subset correlations they computed (among others Birdsong and Molis [26], DeKeyser [8], Flege et al. [28] and Johnson [29]), but their studies nevertheless featured in overviews discussing discontinuities [14], [22]. Indeed, the most recent overview draws a strong conclusion about the validity of the cph's ‘flattened slope’ prediction on the basis of these subset correlations:
In those studies where the two groups are described separately, the correlation is much higher for the younger than for the older group, except in Birdsong and Molis (2001) [ = [26] , JV], where there was a ceiling effect for the younger group. This global picture from more than a dozen studies provides support for the non-continuity of the decline in the AoA–proficiency function, which all researchers agree is a hallmark of a critical period phenomenon. [22, p. 448].
In Johnson and Newport's specific case [23] , their correlation-based inference that ua levels off after puberty happened to be largely correct: the gjt scores are more or less randomly distributed around a near-horizontal trend line [26] . Ultimately, however, it rests on the fallacy of confusing correlation coefficients with slopes, which seriously calls into question conclusions such as DeKeyser's (cf. the quote above).
https://doi.org/10.1371/journal.pone.0069172.g002
Lower correlation coefficients in older aoa groups may therefore be largely due to differences in ua variance, which have been reported in several studies [23] , [26] , [28] , [29] (see [46] for additional references). Greater variability in ua with increasing age is likely due to factors other than age proper [47] , such as the concomitant greater variability in exposure to literacy, degree of education, motivation and opportunity for language use, and by itself represents evidence neither in favour of nor against the cph .
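The difference between a correlation coefficient and a slope can be shown with a toy example. The sketch below uses made-up numbers, not data from any of the studies cited: two sets of scores are built with exactly the same slope (a loss of 2 points per year of aoa) but with different amounts of scatter around the line. The residual pattern is chosen so that it sums to zero and is uncorrelated with aoa, which keeps the least-squares slope identical in both sets.

```python
from math import sqrt


def slope_and_correlation(xs, ys):
    """Return the least-squares slope and Pearson's r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)


aoa = list(range(8))                     # hypothetical ages of onset
pattern = [1, -1, -1, 1, 1, -1, -1, 1]   # sums to zero, uncorrelated with aoa


def toy_scores(noise_scale):
    """Scores with a true slope of -2 and residuals scaled by noise_scale."""
    return [100 - 2 * x + noise_scale * e for x, e in zip(aoa, pattern)]


slope_tight, r_tight = slope_and_correlation(aoa, toy_scores(1))
slope_noisy, r_noisy = slope_and_correlation(aoa, toy_scores(6))
print(round(slope_tight, 3), round(r_tight, 3))
print(round(slope_noisy, 3), round(r_noisy, 3))
```

Both calls return the same slope, while the correlation is markedly weaker for the noisier set. A weaker correlation in one aoa subgroup therefore says nothing by itself about whether the slope in that subgroup is flatter.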
Regression approaches.
Having demonstrated that neither group mean or proportion comparisons nor correlation coefficient comparisons can directly address the ‘flattened slope’ prediction, I now turn to the studies in which regression models were computed with aoa as a predictor variable and ua as the outcome variable. Once again, this category of studies is not mutually exclusive with the two categories discussed above.
In a large-scale study using self-reports and approximate aoa s derived from a sample of the 1990 U.S. Census, Stevens found that the probability with which immigrants from various countries stated that they spoke English ‘very well’ decreased curvilinearly as a function of aoa [48] . She noted that this development is similar to the pattern found by Johnson and Newport [23] but that it contains no indication of an “abruptly defined ‘critical’ or sensitive period in L2 learning” [48, p. 569]. However, she modelled the self-ratings using an ordinal logistic regression model in which the aoa variable was logarithmically transformed. Technically, this is perfectly fine, but one should be careful not to read too much into the non-linear curves found. In logistic models, the outcome variable itself is modelled linearly as a function of the predictor variables and is expressed in log-odds. In order to compute the corresponding probabilities, these log-odds are transformed using the logistic function. Consequently, even if the model is specified linearly, the predicted probabilities will not lie on a perfectly straight line when plotted as a function of any one continuous predictor variable. Similarly, when the predictor variable is first logarithmically transformed and then used to linearly predict an outcome variable, the function linking the predicted outcome variables and the untransformed predictor variable is necessarily non-linear. Thus, non-linearities follow naturally from Stevens's model specifications. Moreover, cph -consistent discontinuities in the aoa – ua function cannot be found using her model specifications as they did not contain any parameters allowing for this.
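Why such curves arise from the model specification alone can be illustrated with a simplified sketch. It uses a binary rather than an ordinal logistic model, and the intercept and slope are hypothetical values chosen for illustration; they are not estimates from [48].

```python
from math import exp, log


def logistic(log_odds):
    """Transform log-odds into a probability."""
    return 1.0 / (1.0 + exp(-log_odds))


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = 3.0
SLOPE = -0.1


def log_odds_linear(aoa):
    """Linear predictor: a straight line on the log-odds scale."""
    return INTERCEPT + SLOPE * aoa


ages = [0, 10, 20, 30, 40]
log_odds = [log_odds_linear(a) for a in ages]
probabilities = [logistic(v) for v in log_odds]

# Equal steps in aoa give equal steps in log-odds
# but unequal steps in predicted probability.
log_odds_steps = [b - a for a, b in zip(log_odds, log_odds[1:])]
probability_steps = [b - a for a, b in zip(probabilities, probabilities[1:])]
print(log_odds_steps)
print([round(s, 3) for s in probability_steps])

# A log-transformed predictor has a similar effect on an untransformed outcome:
# y = 10 - 2 * log(aoa) is linear in log(aoa) but curved in aoa itself.
outcome = [10 - 2 * log(a) for a in (1, 11, 21, 31)]
outcome_steps = [b - a for a, b in zip(outcome, outcome[1:])]
print([round(s, 3) for s in outcome_steps])
```

In both cases the curvature is produced by the transformation and not by any parameter that could pick up a change in the age effect at a particular aoa.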
Using data similar to Stevens's, Bialystok and Hakuta found that the link between the self-rated English competences of Chinese- and Spanish-speaking immigrants and their aoa could be described by a straight line [49] . In contrast to Stevens, Bialystok and Hakuta used a regression-based method allowing for changes in the function's slope, viz. locally weighted scatterplot smoothing ( lowess ). Informally, lowess is a non-parametrical method that relies on an algorithm that fits the dependent variable for small parts of the range of the independent variable whilst guaranteeing that the overall curve does not contain sudden jumps (for technical details, see [50] ). Hakuta et al. used an even larger sample from the same 1990 U.S. Census data on Chinese- and Spanish-speaking immigrants (2.3 million observations) [21] . Fitting lowess curves, no discontinuities in the aoa – ua slope could be detected. Moreover, the authors found that piecewise linear regression models, i.e. regression models containing a parameter that allows a sudden drop in the curve or a change of its slope, did not provide a better fit to the data than did an ordinary regression model without such a parameter.
To sum up, I have argued at length that regression approaches are superior to group mean and correlation coefficient comparisons for the purposes of testing the ‘flattened slope’ prediction. Acknowledging the reservations vis-à-vis self-estimated ua s, we still find that while the relationship between aoa and ua is not necessarily perfectly linear in the studies discussed, the data do not lend unequivocal support to this prediction. In the following section, I will reanalyse data from a recent empirical paper on the cph by DeKeyser et al. [44] . The first goal of this reanalysis is to further illustrate some of the statistical fallacies encountered in cph studies. Second, by making the computer code available I hope to demonstrate how the relevant regression models, viz. piecewise regression models, can be fitted and how the aoa representing the optimal breakpoint can be identified. Lastly, the findings of this reanalysis will contribute to our understanding of how aoa affects ua as measured using a gjt .
Summary of DeKeyser et al. (2010)
I chose to reanalyse a recent empirical paper on the cph by DeKeyser et al. [44] (henceforth DK et al.). This paper lends itself well to a reanalysis since it exhibits two highly commendable qualities: the authors spell out their hypotheses lucidly and provide detailed numerical and graphical data descriptions. Moreover, the paper's lead author is very clear on what constitutes a necessary condition for accepting the cph: a non-linearity in the age of onset of acquisition (aoa)–ultimate attainment (ua) function, with ua declining less strongly as a function of aoa in older, post-cp arrivals compared to younger arrivals [14], [22]. Lastly, it claims to have found cross-linguistic evidence from two parallel studies backing the cph and should therefore be a source above suspicion for cph proponents.
The authors set out to test the following hypotheses:
- Hypothesis 1: For both the L2 English and the L2 Hebrew group, the slope of the age of arrival–ultimate attainment function will not be linear throughout the lifespan, but will instead show a marked flattening between adolescence and adulthood.
- Hypothesis 2: The relationship between aptitude and ultimate attainment will differ markedly for the young and older arrivals, with significance only for the latter. (DK et al., p. 417)
Both hypotheses were purportedly confirmed, which in the authors' view provides evidence in favour of the cph. The problem with this conclusion, however, is that it is based on a comparison of correlation coefficients. As I have argued above, correlation coefficients are not to be confused with regression coefficients and cannot be used to directly address research hypotheses concerning slopes, such as Hypothesis 1. In what follows, I will reanalyse the relationship between DK et al.'s aoa and gjt data in order to address Hypothesis 1. Additionally, I will lay bare a problem with the way in which Hypothesis 2 was addressed. The extracted data and the computer code used for the reanalysis are provided as supplementary materials, allowing anyone interested to scrutinise and easily reproduce my whole analysis and carry out their own computations (see ‘supporting information’).
Data extraction
In order to verify whether we did in fact extract the data points to a satisfactory degree of accuracy, I computed summary statistics for the extracted aoa and gjt data and checked these against the descriptive statistics provided by DK et al. (pp. 421 and 427). These summary statistics for the extracted data are presented in Table 1 . In addition, I computed the correlation coefficients for the aoa – gjt relationship for the whole aoa range and for aoa -defined subgroups and checked these coefficients against those reported by DK et al. (pp. 423 and 428). The correlation coefficients computed using the extracted data are presented in Table 2 . Both checks strongly suggest the extracted data to be virtually identical to the original data, and Dr DeKeyser confirmed this to be the case in response to an earlier draft of the present paper (personal communication, 6 May 2013).
https://doi.org/10.1371/journal.pone.0069172.t001
https://doi.org/10.1371/journal.pone.0069172.t002
Results and Discussion
Modelling the link between age of onset of acquisition and ultimate attainment.
I first replotted the aoa and gjt data we extracted from DK et al.'s scatterplots and added non-parametric scatterplot smoothers in order to investigate whether any changes in slope in the aoa – gjt function could be revealed, as per Hypothesis 1. Figures 3 and 4 show this not to be the case. Indeed, simple linear regression models that model gjt as a function of aoa provide decent fits for both the North America and the Israel data, explaining 65% and 63% of the variance in gjt scores, respectively. The parameters of these models are given in Table 3 .
The trend line is a non-parametric scatterplot smoother. The scatterplot itself is a near-perfect replication of DK et al.'s Fig. 1.
https://doi.org/10.1371/journal.pone.0069172.g003
The trend line is a non-parametric scatterplot smoother. The scatterplot itself is a near-perfect replication of DK et al.'s Fig. 5.
https://doi.org/10.1371/journal.pone.0069172.g004
https://doi.org/10.1371/journal.pone.0069172.t003
To ensure that both segments are joined at the breakpoint, the predictor variable is first centred at the breakpoint value, i.e. the breakpoint value is subtracted from the original predictor variable values. For a blow-by-blow account of how such models can be fitted in r , I refer to an example analysis by Baayen [55, pp. 214–222].
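The centring step can be sketched in code. The following is a minimal illustration in Python rather than R, and it is not the code from the supplementary materials. The helper names (`fit_piecewise`, `best_breakpoint`), the noiseless toy data and the grid of candidate breakpoints are all assumptions made for the example. Because the predictor is centred at the breakpoint, both segments share the same intercept and therefore meet at the breakpoint.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(rows, ys):
    """Ordinary least squares via the normal equations.
    Returns (coefficients, residual sum of squares)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    return beta, rss


def fit_piecewise(xs, ys, bp):
    """Fit y = b0 + b1*(x - bp) for x <= bp and y = b0 + b2*(x - bp) for x > bp."""
    rows = []
    for x in xs:
        shifted = x - bp  # centre the predictor at the breakpoint
        rows.append([1.0,
                     shifted if shifted <= 0 else 0.0,
                     shifted if shifted > 0 else 0.0])
    return least_squares(rows, ys)


def best_breakpoint(xs, ys, candidates):
    """Return the candidate breakpoint with the smallest residual sum of squares."""
    return min(candidates, key=lambda bp: fit_piecewise(xs, ys, bp)[1])


# Toy data (made up): slope -3 up to age 16, slope -0.5 afterwards, no noise.
ages = list(range(1, 41))
scores = [200 - 3 * (a - 16) if a <= 16 else 200 - 0.5 * (a - 16) for a in ages]

bp = best_breakpoint(ages, scores, range(5, 36))
(intercept, slope_before, slope_after), rss = fit_piecewise(ages, scores, bp)
print(bp, round(intercept, 2), round(slope_before, 2), round(slope_after, 2))
```

With real, noisy data the residual sum of squares will rarely single out one breakpoint as sharply as it does here. Because the breakpoint is itself chosen to optimise the fit, a subsequent significance test of the change in slope is likely to be anti-conservative.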
Solid: regression with breakpoint at aoa 18 (dashed lines represent its 95% confidence interval); dot-dash: regression without breakpoint.
https://doi.org/10.1371/journal.pone.0069172.g005
Solid: regression with breakpoint at aoa 18 (dashed lines represent its 95% confidence interval); dot-dash (hardly visible due to near-complete overlap): regression without breakpoint.
https://doi.org/10.1371/journal.pone.0069172.g006
https://doi.org/10.1371/journal.pone.0069172.t004
https://doi.org/10.1371/journal.pone.0069172.g007
Solid: regression with breakpoint at aoa 16 (dashed lines represent its 95% confidence interval); dot-dash: regression without breakpoint.
https://doi.org/10.1371/journal.pone.0069172.g008
Solid: regression with breakpoint at aoa 6 (dashed lines represent its 95% confidence interval); dot-dash (hardly visible due to near-complete overlap): regression without breakpoint.
https://doi.org/10.1371/journal.pone.0069172.g009
https://doi.org/10.1371/journal.pone.0069172.t005
https://doi.org/10.1371/journal.pone.0069172.t006
https://doi.org/10.1371/journal.pone.0069172.t007
https://doi.org/10.1371/journal.pone.0069172.t008
In sum, a regression model that allows for changes in the slope of the aoa–gjt function to account for putative critical period effects provides a somewhat better fit to the North American data than does an everyday simple regression model. The improvement in model fit is marginal, however, and including a breakpoint does not result in any detectable improvement of model fit to the Israel data whatsoever. Breakpoint models therefore fail to provide solid cross-linguistic support in favour of critical period effects: across both data sets, gjt can satisfactorily be modelled as a linear function of aoa.
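One common way of quantifying whether a breakpoint model fits better than a simple regression is an F statistic for nested least-squares models. The sketch below shows the arithmetic only. It is not necessarily the comparison used in this reanalysis, the function name is hypothetical, and the residual sums of squares and sample size are made-up numbers, not values from the North America or Israel data.

```python
def nested_f_statistic(rss_simple, rss_complex, n_obs,
                       n_params_simple, n_params_complex):
    """F statistic for comparing two nested least-squares models.

    rss_simple / rss_complex: residual sums of squares of the two models.
    n_params_*: number of estimated coefficients in each model.
    Returns (F, numerator df, denominator df).
    """
    extra_params = n_params_complex - n_params_simple
    df_residual = n_obs - n_params_complex
    improvement = (rss_simple - rss_complex) / extra_params
    residual_variance = rss_complex / df_residual
    return improvement / residual_variance, extra_params, df_residual


# Illustrative numbers only: a straight line (2 coefficients) versus a
# piecewise model with a fixed breakpoint (3 coefficients), 53 observations.
f_value, df_num, df_den = nested_f_statistic(120.0, 100.0, 53, 2, 3)
print(f_value, df_num, df_den)
```

If the breakpoint was itself selected from the data, it arguably counts as a further estimated parameter, and a p-value derived from this statistic will tend to be too optimistic.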
On partialling out ‘age at testing’
As I have argued above, correlation coefficients cannot be used to test hypotheses about slopes. When the correct procedure is carried out on DK et al.'s data, no cross-linguistically robust evidence for changes in the aoa – gjt function is found. In addition to comparing the zero-order correlations between aoa and gjt, however, DK et al. computed partial correlations in which the variance in aoa associated with the participants' age at testing (aat; a potentially confounding variable) was filtered out. They found that these partial correlations between aoa and gjt, which are given in Table 9, differed between age groups in that they were stronger for younger than for older participants. This, DK et al. argue, constitutes additional evidence in favour of the cph. At this point, I can no longer provide my own analysis of DK et al.'s data seeing as the pertinent data points were not plotted. Nevertheless, the detailed descriptions by DK et al. strongly suggest that the use of these partial correlations is highly problematic. Most importantly, and to reiterate, correlations (whether zero-order or partial ones) are actually of no use when testing hypotheses concerning slopes. Still, one may wonder why the partial correlations differ across age groups. My surmise is that these differences are at least partly the by-product of an imbalance in the sampling procedure.
https://doi.org/10.1371/journal.pone.0069172.t009
The upshot of this brief discussion is that the partial correlation differences reported by DK et al. are at least partly the result of an imbalance in the sampling procedure: aoa and aat were simply less intimately tied for the young arrivals in the North America study than for the older arrivals with L2 English or for all of the L2 Hebrew participants. In an ideal world, we would like to fix aat or ascertain that it at most only weakly correlates with aoa . This, however, would result in a strong correlation between aoa and another potential confound variable, length of residence in the L2 environment, bringing us back to square one. Allowing for only moderate correlations between aoa and aat might improve our predicament somewhat, but even in that case, we should tread lightly when making inferences on the basis of statistical control procedures [61] .
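How strongly a control variable is tied to the predictor can by itself change a partial correlation. The sketch below uses the standard first-order partial correlation formula; the function name and all correlation values are hypothetical and are not taken from DK et al., and the formula is not necessarily the exact computation those authors carried out.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical zero-order correlations (NOT DK et al.'s values), with
# x = aoa, y = gjt and z = aat. The aoa-gjt correlation is the same in both
# scenarios; what differs is how closely aoa and aat are tied.
weakly_tied = partial_correlation(r_xy=-0.60, r_xz=0.20, r_yz=-0.30)
closely_tied = partial_correlation(r_xy=-0.60, r_xz=0.90, r_yz=-0.60)
print(weakly_tied, closely_tied)
```

Under these assumed values, the partial correlation stays close to the zero-order one when aoa and aat are only weakly related, and shrinks markedly when they are closely related, even though the zero-order aoa – gjt correlation is identical.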
On estimating the role of aptitude
Having shown that Hypothesis 1 could not be confirmed, I now turn to Hypothesis 2, which predicts a differential role of aptitude for ua in sla in different aoa groups. More specifically, it states that the correlation between aptitude and gjt performance will be significant only for older arrivals. The correlation coefficients of the relationship between aptitude and gjt are presented in Table 10 .
https://doi.org/10.1371/journal.pone.0069172.t010
The problem with both the wording of Hypothesis 2 and the way in which it is addressed is the following: it is assumed that a variable has a reliably different effect in different groups when the effect reaches significance in one group but not in the other. This logic is fairly widespread within several scientific disciplines (see e.g. [62] for a discussion). Nonetheless, it is demonstrably fallacious [63] . Here we will illustrate the fallacy for the specific case of comparing two correlation coefficients.
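A minimal sketch of the point, using made-up numbers rather than the coefficients in Table 10: with 30 participants per group, a correlation of .40 is conventionally significant at the .05 level whereas one of .30 is not, yet a direct test of the difference between the two coefficients gives no grounds for calling them different. The function name is hypothetical, and the test shown (Fisher's r-to-z transformation for two independent samples) is one common choice, not necessarily the one used in the original paper.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent Pearson correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher r-to-z transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value, standard normal
    return z, p


# Hypothetical coefficients and sample sizes (NOT the values in Table 10)
z, p = compare_correlations(r1=0.40, n1=30, r2=0.30, n2=30)
print(z, p)
```

The relevant question is whether `p` for the difference is small, not whether each coefficient separately crosses the significance threshold.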
Apart from not being replicated in the North America study, does this difference actually show anything? I contend that it does not: what is of interest are not so much the correlation coefficients, but rather the interactions between aoa and aptitude in models predicting gjt . These interactions could be investigated by fitting a multiple regression model in which the postulated cp breakpoint governs the slope of both aoa and aptitude. If such a model provided a substantially better fit to the data than a model without a breakpoint for the aptitude slope and if the aptitude slope changes in the expected direction (i.e. a steeper slope for post- cp than for younger arrivals) for different L1–L2 pairings, only then would this particular prediction of the cph be borne out.
Using data extracted from a paper reporting on two recent studies that purport to provide evidence in favour of the cph and that, according to its authors, represent a major improvement over earlier studies (DK et al., p. 417), it was found that neither of its two hypotheses was actually confirmed when using the proper statistical tools. As a matter of fact, the gjt scores continue to decline at essentially the same rate even beyond the end of the putative critical period. According to the paper's lead author, such a finding represents a serious problem for his conceptualisation of the cph [14]. Moreover, although modelling a breakpoint representing the end of a cp at aoa 16 may improve the statistical model slightly in the study on learners of English in North America, the study on learners of Hebrew in Israel fails to confirm this finding. In fact, even if we were to accept the optimal breakpoint computed for the Israel study, it lies at aoa 6 and is associated with a different geometrical pattern.
Diverging age trends in parallel studies with participants with different L2s have similarly been reported by Birdsong and Molis [26] and are at odds with an L2-independent cph . One parsimonious explanation of such conflicting age trends may be that the overall, cross-linguistic age trend is in fact linear, but that fluctuations in the data (due to factors unaccounted for or randomness) may sometimes give rise to a ‘stretched L’-shaped pattern ( Figure 1, left panel ) and sometimes to a ‘stretched 7’-shaped pattern ( Figure 1 , middle panel; see also [66] for a similar comment).
Importantly, the criticism that DeKeyser and Larson-Hall level against two studies reporting findings similar to the present ones [48], [49], viz. that the data consisted of self-ratings of questionable validity [14], does not apply to the present data set. In addition, DK et al. did not exclude any outliers from their analyses, so I assume that DeKeyser and Larson-Hall's criticism [14] of Birdsong and Molis's study [26], i.e. that the findings were due to the influence of outliers, is not applicable to the present data either. For good measure, however, I refitted the regression models with and without breakpoints after excluding one potentially problematic data point per model. The following data points had absolute standardised residuals larger than 2.5 in the original models without breakpoints as well as in those with breakpoints: the participant with aoa 17 and a gjt score of 125 in the North America study and the participant with aoa 12 and a gjt score of 117 in the Israel study. The resultant models were virtually identical to the original models (see Script S1). Furthermore, the aoa variable was sufficiently fine-grained and the aoa – gjt curve was not ‘presmoothed’ by the prior aggregation of gjt across parts of the aoa range (see [51] for such a criticism of another study). Lastly, seven of the nine “problems with supposed counter-evidence” to the cph discussed by Long [5] do not apply either, viz. (1) “[c]onfusion of rate and ultimate attainment”, (2) “[i]nappropriate choice of subjects”, (3) “[m]easurement of AO”, (4) “[l]eading instructions to raters”, (6) “[u]se of markedly non-native samples making near-native samples more likely to sound native to raters”, (7) “[u]nreliable or invalid measures”, and (8) “[i]nappropriate L1–L2 pairings”. Problem No. 5 (“Assessments based on limited samples and/or ‘language-like’ behavior”) may be apropos given that only gjt data were used, leaving open the theoretical possibility that other measures might have yielded a different outcome. Finally, problem No. 9 (“Faulty interpretation of statistical patterns”) is, of course, precisely what I have turned the spotlights on.
Conclusions
The critical period hypothesis remains a hotly contested issue in the psycholinguistics of second-language acquisition. Discussions about the impact of empirical findings on the tenability of the cph generally revolve around the reliability of the data gathered (e.g. [5], [14], [22], [52], [67], [68]) and such methodological critiques are of course highly desirable. Furthermore, the debate often centres on the question of exactly what version of the cph is being vindicated or debunked. These versions differ mainly in terms of their scope, specifically with regard to the relevant age span, setting and language area, and the testable predictions they make. But even when the cph's scope is clearly demarcated and its main prediction is spelt out lucidly, the issue remains to what extent the empirical findings can actually be marshalled in support of the relevant cph version. As I have shown in this paper, empirical data have often been taken to support cph versions predicting that the relationship between age of acquisition and ultimate attainment is not strictly linear, even though the statistical tools most commonly used (notably group mean and correlation coefficient comparisons) were, crudely put, irrelevant to this prediction. Methods that are arguably valid, e.g. piecewise regression and scatterplot smoothing, have been used in some studies [21], [26], [49], but these studies have been criticised on other grounds. To my knowledge, such methods have never been used by scholars who explicitly subscribe to the cph.
I suspect that what may be going on is a form of ‘confirmation bias’ [69] , a cognitive bias at play in diverse branches of human knowledge seeking: Findings judged to be consistent with one's own hypothesis are hardly questioned, whereas findings inconsistent with one's own hypothesis are scrutinised much more strongly and criticised on all sorts of points [70] – [73] . My reanalysis of DK et al.'s recent paper may be a case in point. cph exponents used correlation coefficients to address their prediction about the slope of a function, as had been done in a host of earlier studies. Finding a result that squared with their expectations, they did not question the technical validity of their results, or at least they did not report this. (In fact, my reanalysis is actually a case in point in two respects: for an earlier draft of this paper, I had computed the optimal position of the breakpoints incorrectly, resulting in an insignificant improvement of model fit for the North American data rather than a borderline significant one. Finding a result that squared with my expectations, I did not question the technical validity of my results – until this error was kindly pointed out to me by Martijn Wieling (University of Tübingen).) That said, I am keen to point out that the statistical analyses in this particular paper, though suboptimal, are, as far as I could gather, reported correctly, i.e. the confirmation bias does not seem to have resulted in the blatant misreportings found elsewhere (see [74] for empirical evidence and discussion). An additional point to these authors' credit is that, apart from explicitly identifying their cph version's scope and making crystal-clear predictions, they present data descriptions that actually permit quantitative reassessments and have a history of doing so (e.g. the appendix in [8] ). 
This leads me to believe that they analysed their data all in good conscience and to hope that they, too, will conclude that their own data do not, in fact, support their hypothesis.
I end this paper on an upbeat note. Even though I have argued that the analytical tools employed in cph research generally leave much to be desired, the original data are, so I hope, still available. This provides researchers, cph supporters and sceptics alike, with an exciting opportunity to reanalyse their data sets using the tools outlined in the present paper and publish their findings at minimal cost of time and resources (for instance, as a comment to this paper). I would therefore encourage scholars to engage their old data sets and to communicate their analyses openly, e.g. by voluntarily publishing their data and computer code alongside their articles or comments. Ideally, cph supporters and sceptics would join forces to agree on a protocol for a high-powered study in order to provide a truly convincing answer to a core issue in sla .
Supporting Information
Dataset S1.
aoa and gjt data extracted from DeKeyser et al.'s North America study.
https://doi.org/10.1371/journal.pone.0069172.s001
Dataset S2.
aoa and gjt data extracted from DeKeyser et al.'s Israel study.
https://doi.org/10.1371/journal.pone.0069172.s002
Script S1.
Script with annotated R code used for the reanalysis. All add-on packages used can be installed from within R.
https://doi.org/10.1371/journal.pone.0069172.s003
Acknowledgments
I would like to thank Irmtraud Kaiser (University of Fribourg) for helping me to get an overview of the literature on the critical period hypothesis in second language acquisition. Thanks are also due to Martijn Wieling (currently University of Tübingen) for pointing out an error in the R code accompanying an earlier draft of this paper.
Author Contributions
Analyzed the data: JV. Wrote the paper: JV.
- 1. Penfield W, Roberts L (1959) Speech and brain mechanisms. Princeton: Princeton University Press.
- 2. Lenneberg EH (1967) Biological foundations of language. New York: Wiley.
- 10. Long MH (2007) Problems in SLA. Mahwah, NJ: Lawrence Erlbaum.
- 14. DeKeyser R, Larson-Hall J (2005) What does the critical period really mean? In: Kroll and De Groot [75], 88–108.
- 19. Newport EL (1991) Contrasting conceptions of the critical period for language. In: Carey S, Gelman R, editors, The epigenesis of mind: Essays on biology and cognition, Hillsdale, NJ: Lawrence Erlbaum. 111–130.
- 20. Birdsong D (2005) Interpreting age effects in second language acquisition. In: Kroll and De Groot [75], 109–127.
- 22. DeKeyser R (2012) Age effects in second language learning. In: Gass SM, Mackey A, editors, The Routledge handbook of second language acquisition, London: Routledge. 442–460.
- 24. Weisstein EW. Discontinuity. From MathWorld –A Wolfram Web Resource. Available: http://mathworld.wolfram.com/Discontinuity.html . Accessed 2012 March 2.
- 27. Flege JE (1999) Age of learning and second language speech. In: Birdsong [76], 101–132.
- 36. Champely S (2009) pwr: Basic functions for power analysis. Available: http://cran.r-project.org/package=pwr . R package, version 1.1.1.
- 37. R Core Team (2013) R: A language and environment for statistical computing. Available: http://www.r-project.org/ . Software, version 2.15.3.
- 47. Hyltenstam K, Abrahamsson N (2003) Maturational constraints in sla . In: Doughty CJ, Long MH, editors, The handbook of second language acquisition, Malden, MA: Blackwell. 539–588.
- 49. Bialystok E, Hakuta K (1999) Confounded age: Linguistic and cognitive factors in age differences for second language acquisition. In: Birdsong [76], 161–181.
- 52. DeKeyser R (2006) A critique of recent arguments against the critical period hypothesis. In: Abello-Contesse C, Chacón-Beltrán R, López-Jiménez MD, Torreblanca-López MM, editors, Age in L2 acquisition and teaching, Bern: Peter Lang. 49–58.
- 55. Baayen RH (2008) Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
- 56. Fox J (2002) Robust regression. Appendix to An R and S-Plus Companion to Applied Regression. Available: http://cran.r-project.org/doc/contrib/Fox-Companion/appendix.html .
- 57. Ripley B, Hornik K, Gebhardt A, Firth D (2012) MASS: Support functions and datasets for Venables and Ripley's MASS. Available: http://cran.r-project.org/package=MASS . R package, version 7.3–17.
- 58. Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (2009) Mixed effects models and extensions in ecology with R. New York: Springer.
- 59. Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team (2013) nlme: Linear and nonlinear mixed effects models. Available: http://cran.r-project.org/package=nlme . R package, version 3.1–108.
- 65. Field A (2009) Discovering statistics using SPSS. 3rd edition. London: SAGE.
- 66. Birdsong D (2009) Age and the end state of second language acquisition. In: Ritchie WC, Bhatia TK, editors, The new handbook of second language acquisition, Bingley: Emerald. 401–424.
- 75. Kroll JF, De Groot AMB, editors (2005) Handbook of bilingualism: Psycholinguistic approaches. New York: Oxford University Press.
- 76. Birdsong D, editor (1999) Second language acquisition and the critical period hypothesis. Mahwah, NJ: Lawrence Erlbaum.
Critical periods for language acquisition: new insights with particular reference to bilingualism research.
Published online by Cambridge University Press: 23 October 2018
One of the best-known claims from language acquisition research is that the capacity to learn languages is constrained by maturational changes, with particular time windows (aka ‘critical’ or ‘sensitive’ periods) better suited for language learning than others. Evidence for the critical period hypothesis (CPH) comes from a number of sources demonstrating that age is a crucial predictor for language attainment and that the capacity to learn language diminishes with age. To take just one example, a recent study by Hartshorne, Tenenbaum and Pinker (2018) identified a ‘sharply-defined critical period’ for grammar learning, and a steady decline thereafter, based on a very large dataset (of 2/3 million English speakers) that allowed them to disentangle critical-period effects from non-age factors (e.g., amount of experience) affecting grammatical performance. Other evidence for the CPH comes from research with individuals who were deprived of linguistic input during the critical period (Curtiss, 1977) and were consequently unable to acquire language properly. Moreover, neurobiological research has shown that critical periods affect the neurological substrate for language processing, specifically for grammar (Wartenburger, Heekeren, Abutalebi, Cappa, Villringer & Perani, 2003).
In bilingualism research, the CPH has received a somewhat mixed response, with some researchers plainly denying that critical periods constrain language acquisition (e.g., Bialystok & Kroll, 2018) and others having ‘little doubt’ that language acquisition is subject to critical period effects (Meisel, 2013: 71). It is true that early onsets of bilingual first language acquisition (during childhood) do indeed typically yield better linguistic skills than later ones, in line with the CPH. On the other hand, individuals with early onsets of acquisition of a particular language are typically also younger when they learn that language and have a longer time of exposure than individuals with a later onset of acquisition. Given these potentially confounding factors, supposed critical period effects might be open to alternative interpretations.
Our keynote article (Mayberry & Kluender, 2018a) offers a new challenging perspective on the CPH by relying mainly on studies of the acquisition of sign languages, the specific learning circumstances of which offer a unique opportunity to disentangle genuine critical-period effects from non-age factors affecting linguistic performance. Mayberry and Kluender specifically compare linguistic outcomes of the acquisition of sign languages in post-childhood L2 learners with that of post-childhood L1 learners. Their most striking finding is that late L1 learners perform significantly worse in morphology, syntax and phonology than late L2 learners. This contrast appears to be unrelated to non-linguistic cognitive or motivational factors but is attributed instead to very late L1 learners having developed an incomplete brain/language system during childhood brain maturation. L2 learners, on the other hand, have already established a fully-fledged brain/language system during this period. Mayberry and Kluender conclude from the more substantial age-of-acquisition effect in adult L1 than in adult L2 learners that there is a critical period for the acquisition of a first language only, whereas L2 development is affected by other factors.
Fifteen commentaries, most of which were specifically selected to represent different views on the CPH from the perspective of bilingualism research, accompany the keynote article. Many commentators praise the keynote article for drawing attention to the acquisition of sign languages, which through comparisons of late L1 and L2 learners contributes important insights for our understanding of a critical or sensitive period for the acquisition of language. Woll (2018) reports an additional case of late L1 acquisition of (British) Sign Language, a deaf person with very late exposure to L1, who exhibits severe difficulties with syntax and phonology despite intact cognitive skills, in line with the findings reported in the keynote article. On the other hand, Mayberry and Kluender's (2018a) claim that maturational factors (viz. critical or sensitive periods) do not affect L2 acquisition has received a less positive response from many commentators. Several commentators point to evidence indicating age-of-acquisition effects on L2 speakers’ linguistic skill and to models of L2 acquisition that account for the role of maturational constraints implicated by the CPH (Abrahamsson, 2018; DeKeyser, 2018; Hyltenstam, 2018; Long & Granena, 2018; Newport, 2018; Reh, Arredondo & Werker, 2018; Veríssimo, 2018). As opposed to these researchers, some commentators question the role of critical or sensitive periods for language not only for L2 but also for L1 acquisition (Bialystok & Kroll, 2018; Flege, 2018). Other commentators highlight specific limitations of the proposed account and of the data presented in its support. Birdsong and Quinto-Pozos (2018) note that what is missing from Mayberry and Kluender's comparison of late L1 vs. L2 signers is a role for bilingualism, arguing that comparing bilinguals with monolinguals will always reveal differences regardless of the age of L2 acquisition. Emmorey (2018) questions the keynote article's claim that if L2 outcomes were fully under the control of a critical period, they should not be as variable as they are and affected by cognitive or motivational factors, by pointing out that this variability does indeed extend to L1 learners. Lillo-Martin (2018) points out that there may be domain-specific splits with respect to critical periods, with different age cutoffs for different linguistic phenomena, a possibility that is not considered in any detail in the keynote article (see also Veríssimo, 2018). Finally, Bley-Vroman (2018) and White (2018) use the evidence presented in our keynote article to address the question of whether or not domain-specific learning mechanisms are available to adult language learners; see also Clahsen & Muysken (1986; 1989).
In their response, Mayberry and Kluender (2018b) highlight points of agreement, clear up misunderstandings, admit current limitations of their proposal, and welcome suggestions for future research. Most importantly, however, in the face of the commentaries Mayberry and Kluender (2018b) modify their original claim of a critical period for L1 acquisition only. They now sympathize with the idea that there are critical periods for both L1 and L2 acquisition, but with less severe AoA effects on late L2 acquisition than on delayed L1 acquisition, due to L2 speakers having learnt another language early in life; see Hyltenstam (2018) and Newport (2018).
We hope our readers will enjoy the keynote article together with the commentaries and the authors’ response as well as the interesting regular research articles and research notes presented in the current issue.
- Volume 21, Issue 5
- JUBIN ABUTALEBI (a1) and HARALD CLAHSEN (a2)
- DOI: https://doi.org/10.1017/S1366728918001025
ORIGINAL RESEARCH article
Critical period in second language acquisition: the age-attainment geometry.
- Teachers College, Columbia University, New York City, NY, United States
One of the most fascinating, consequential, and far-reaching debates that have occurred in second language acquisition research concerns the Critical Period Hypothesis [ 1 ]. Although the hypothesis is generally accepted for first language acquisition, it has been hotly debated on theoretical, methodological, and practical grounds for second language acquisition, fueling studies reporting contradictory findings and setting off competing explanations. The central questions are: Are the observed age effects in ultimate attainment confined to a bounded period, and if they are, are they biologically determined or maturationally constrained? In this article, we take a sui generis , interdisciplinary approach that leverages our understanding of second language acquisition and of physics laws of energy conservation and angular momentum conservation, mathematically deriving the age-attainment geometry. The theoretical lens, termed Energy Conservation Theory for Second Language Acquisition, provides a macroscopic perspective on the second language learning trajectory across the human lifespan.
Introduction
The Critical Period Hypothesis (CPH), as proposed by [ 1 ], holds that nativelike proficiency is only attainable within a finite period, extending from early infancy to puberty. It has generally been accepted in language development research, but more so for first language acquisition (L1A) than for second language acquisition (L2A).
In the context of L2A, there are two parallel facts that appear to compound the difficulty of establishing the validity of CPH. One is that there is a stark difference in the level of ultimate attainment between child and adult learners. “Children eventually reach a more native-like level of proficiency than learners who start learning a second language as adults” ([ 2 ], p. 360). But this fact exists alongside another fact, namely, that there are vast differences in ultimate attainment among older learners. [ 3 ] observed:
Although few adults, if any, are completely successful, and many fail miserably, there are many who achieve very high levels of proficiency, given enough time, input, and effort, and given the right attitude, motivation, and learning environment. (p. 13).
The dual facets of inter-learner differential success are at the nexus of second language acquisition research. As [ 4 ] once noted:
One of the enduring and fascinating problems confronting researchers of second language acquisition is whether adults can ever acquire native-like competence in a second language, or whether this is an accomplishment reserved for children who start learning at a relatively early age. As a secondary issue, there is the question of whether those rare cases of native-like success reported amongst adult learners are indeed what they seem, and if they are, how it is that such people can be successful when the vast majority are palpably not. (p. 219).
The primary question Kellerman raised here is, in essence, a critical period (CP) question, concerning differential attainment between child and adult learners, and his secondary question relates to differential attainment among adult learners.
As of this writing, neither question has been settled. Instead, the two phenomena are often conflated in debates, with evidence for one taken as counter-evidence for the other (see, e.g., [ 5 ]). By and large, it would seem that the debate has come down to a matter of interpretation; the same facts are interpreted differently as evidence for or against CPH (see, e.g., [ 5 – 7 ]). This state of affairs, tinted with ideological differences over the role of nature and/or nurture in language development, continues to put a tangible understanding of either phenomenon out of reach, let alone a coherent understanding of both phenomena. In order to break out of the rut of “he said, she said,” we need to engage in systems thinking.
Our research sought to juxtapose child and adult learners, as some researchers have, conceptually, attempted (see, e.g., [ 8 – 11 ]). Specifically, we built on and extended an interdisciplinary model of L2A, Energy Conservation Theory for L2A (ECT-L2A) [ 12 , 13 ], originally developed to account for differential attainment among adult learners, to child learners. In so doing, we sought to gain a coherent understanding of the dual facets of inter-learner differential success in L2A, in addition to mathematically obtaining the geometry of the age-attainment function, a core concern of the CPH/L2A debate.
In what follows, we first provide a quick overview of the CPH research in L2A. We then introduce ECT-L2A. Next, we extend ECT-L2A to the age issue, mathematically deriving the age-attainment function. After that, we discuss the resultant geometry and the fundamental nature of CPH/L2A, and, more broadly, L2 attainment across the human lifespan. We conclude by suggesting a number of avenues for furthering the research on CPH within the framework of ECT-L2A.
However, before we proceed, it is necessary to note two “boundary conditions” we have set for our work. First, the linguistic domain in which we theorize inter-learner differential attainment concerns only the grammatical/computational aspects of language, or what [ 14 , 15 ] calls basic language cognition, which concerns aspects of language where native speakers show little variance. As [ 2 ] has aptly pointed out, much of the confusion in the CPH-L2A debate is attributable to a lack of agreement on the scope of linguistic areas affected by CP. Second, we are only concerned with naturalistic acquisition (i.e., acquisition happens in an input-rich or immersion environment), not instructed learning (i.e., an input-poor environment). These two assumptions are often absent in CPH/L2A research, leading to the different circumstances under which researchers interpret the CP notion and empirical results (for discussion, see [ 7 ]).
The critical period hypothesis in L2A
To date, two questions have dominated the research and debate on CPH/L2A: What counts as evidence of a critical period? What accounts for the age-attainment difference between younger learners and older learners? More than four decades of research on CPH/L2A, from [ 16 ] to [ 17 ] to [ 18 ], have, in the main, found an inverse correlation between the age of acquisition (AoA) and the level of grammatical attainment (see also [ 19 ], for a meta-analysis); “the age of acquisition is strongly negatively correlated with ultimate second language proficiency for grammar as well as for pronunciation” ([ 20 ], p. 88).
However, views are almost orthogonal over whether the observed inverse correlation can count as evidence of CPH or the observed difference is attributable to brain maturation (see, e.g., [ 5 , 7 , 21 – 35 ]).
For some researchers, true evidence or falsification of CPH for L2A must be tied to whether or not late learners can attain a native-like level of proficiency (e.g., [ 36 ]). Others contend that the nativelikeness threshold, in spite of it being “the most central aspect of the CPH” ([ 2 ], p . 362), is problematic, arguing that monolingual-like native attainment is simply impossible for L2 learners [ 37 , 38 ]. Echoing this view, [ 39 ] offered:
[Sequential] bilinguals are not “two monolinguals in one” in any social, psycholinguistic, or cognitive neurofunctional sense. From this perspective, it is of questionable methodological value to quantify bilinguals’ linguistic attainment as a proportion of monolinguals’ attainment, with those bilinguals reaching 100% levels of attainment considered nativelike. ( p . 121).
In the meantime, empirical research into adult learners has consistently produced evidence of selective nativelike attainment, that is, nativelikeness is attained vis-à-vis some aspects of the target language but not others. These studies employed a variety of methodologies, including cross-sectional studies and longitudinal case studies (see, e.g., [ 40 – 56 ]). Some researchers (e.g., [ 55 , 57 ]) take the selective nativelikeness as falsifying evidence of CPH/L2A; other researchers disagree (see, e.g., [ 36 ]).
Leaving aside the vexed issue of nativelikeness, 1 Birdsong [ 58 ], among others, postulated that CPH/L2A must ultimately pass geometric tests: if studies comparing younger learners and older learners yield the geometry of a “stretched Z” for the age-attainment function, that would prove the validity of CPH/L2A, or falsify it, if otherwise. The stretched Z or inverted S [ 20 ] references a bounded period in which the organism exhibits heightened neural plasticity and sensitivity to linguistic stimuli from the environment. This period has certain temporal and geometric features. Temporally, it extends from early infancy to puberty, coinciding with the time during which the brain undergoes maturation [ 1 , 36 , 59 – 62 ]. Geometrically, this period should exhibit two points of inflection or discontinuities, viz, “an abrupt onset or increase of sensitivity, a plateau of peak sensitivity, followed by a gradual offset or decline, with subsequent flattening of the degree of sensitivity” ([ 58 ], p . 111).
By the temporal and geometric hallmarks, few studies seem to have confirmed CPH/L2A, not even those that have allegedly found stark evidence. A case in point is the [ 17 ] study, which reported what appears to be clear-cut evidence of CPH/L2A: r = −.87, p <.01 for the early age of arrival (AoA) group and r = −.16, p >.05 for the late AoA group. As Johnson and Newport described it, “test performance was linearly related to [AoA] up to puberty; after puberty, performance was low but highly variable and unrelated to [AoA],” which supports “the conclusion that a critical period for language acquisition extends its effects to second language acquisition” ( p . 60). However, this claim has been contested.
Focusing on the geometry of the results, [ 58 ] pointed out that the random distribution of test scores within the late AoA group “does not license the conclusion that “through adulthood the function is low and flat” or the corresponding interpretation that “the shape of the function thus supports the claim that the effects of age of acquisition are effects of the maturational state of the learner” ([ 17 ], p . 79)” ( p . 117). Birdsong argued that if CPH holds for L2A, the performance scores of the late AoA group should be distributed horizontally in addition to showing marginal correlation with age. Accordingly, the random distribution of scores could only be taken as indicative of “a lack of systematic relationship between the performance and the AoA and not of a “levelling off of ultimate performance among those exposed to the language after puberty” ([ 17 ], p . 79)” ([ 58 ], p . 118).
Interpreting the same study, other researchers such as [ 20 ] did not set their sights as much on the random distribution of the performance scores among the late learners as on the discontinuity between the early AoA and late AoA groups, arguing that the qualitative difference is sufficient evidence of CPH/L2A.
If geometric satisfaction is one flash point in CPH/L2A research, explaining the random distribution of performance scores or, essentially, differential attainment among late learners counts as another. Analyses of late learners’ ultimate attainment (e.g., [ 10 , 22 , 26 , 43 , 63 – 67 ]) have yielded a host of cognitive, socio-psychological, or experiential factors that can be associated with inter-learner differential attainment among late learners. The question, then, is whether or not these non-age factors confound, or even interact with, the age or maturational effect (see discussion in [ 2 , 68 – 72 ]). As Newport [ 7 ] aptly asked, “why cannot other variables interact with age effects?” (p. 929).
These are undoubtedly complex questions for which sophisticated solutions are needed—beyond the methodological repairs many have thought are solely needed in advancing CPH/L2A research (see, e.g., [ 19 , 67 ]). In the remainder of this article, we take a different tack to the age issue, adopting a theoretical, hybrid approach, ECT-L2A [ 12 , 13 ], to mathematically derive the age-attainment function.
Energy-Conservation Theory for L2A
ECT-L2A is a theoretical model originally developed to account for the divergent states of ultimate attainment in adult L2A [ 12 , 13 ]. Drawing on the physics laws of energy conservation and angular momentum conservation, it theorizes the dynamic transformation and conservation of internal energies (i.e., from the learner) and external energies (i.e., from the environment) in rendering the learner’s ultimate attainment. This model, thus, takes into account nature and nurture factors. Specifically, it uses five parameters and their interaction to account for levels of L2 ultimate attainment: the linguistic environment or input, learner motivation, learner aptitude, the distance between the L1 and the target language (TL), and the developing learner.
ECT-L2A draws a number of parallels between mechanical energies and human learning energies: kinetic energy for motivation and aptitude energy, potential energy for environmental energy, 2 and centrifugal energy for L1-TL deviation energy (for discussion, see [ 12 ]). These energies each perform a unique yet dynamic role. As the learner progresses in the developmental process, the energies shift in their dominance, while the total energy remains constant.
Mathematically, ECT-L2A reads as follows:
ϵ = ζ(r) + Λ + η²/r² − ρ/r    (1)
where ζ(r) denotes the learner’s motivational energy, r the learner’s position in the learning process relative to the TL, η the distance between L1 and TL, and ρ the input of TL. According to Eq. 1, the total learning energy, ϵ, comes from the sum of motivation energy ζ(r), aptitude (a constant) Λ, deviation energy η²/r², and environmental energy −ρ/r.
The energy types included in Eq. 1 are embodiments of nature and nurture contributions. The potential energy or TL traction, −ρ/r, represents the external or environmental energy, while the kinetic or motivational energy, ζ(r), along with aptitude, Λ, and the centrifugal or deviation energy, η²/r², represent the internal energies.
Under the overarching condition of the total energy being the same or conserved throughout the learning process, ϵ = constant , each type of energy performs a different role, with one converting to another over time as the position of the learner changes in the developmental process.
For mathematical and conceptual convenience, Eq. 1 is rewritten as Eq. 2, which contains the effective potential energy, U_eff(r):
ϵ = ζ(r) + Λ + U_eff(r)    (2)
where U_eff(r) = η²/r² − ρ/r. In other words, the effective potential energy is the sum of the deviation energy and the potential energy (see further breakdown in the next section).
The L2A energy system as depicted here is true of every learner, meaning that the total energy is constant for a single learner. But the total energy varies from learner to learner. Accordingly, different learners may reach different levels of ultimate attainment, r₀ (i.e., closer to or more distant from the TL). This is illustrated in Figure 1, where r₀ and r₀′ represent the ultimate attainments for learners with different amounts of total energy, ϵ > 0 or ϵ < 0.
FIGURE 1. Inter-learner differential ultimate attainment as a function of different amounts of total energy: ϵ > 0; ϵ < 0; ϵ = ϵ_min [ 12 , 13 ].
Key to understanding Figure 1 is that it is the individual’s total energy that determines their level of attainment. Of the three scenarios on display here, ECT-L2A is only concerned with the case of ϵ ≥ 0, which represents the unbound process over (r₀, ∞); it ignores the bounded processes of ϵ < 0 and ϵ = ϵ_min.
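The three scenarios in Figure 1 can be made concrete with a short numerical sketch. It assumes the effective potential U_eff(r) = η²/r² − ρ/r defined for Eq. 2; the function names, the parameter values, and the treatment of aptitude Λ as already subtracted from the total energy are illustrative assumptions rather than part of ECT-L2A as published.

```python
import math

def u_eff(r: float, eta: float, rho: float) -> float:
    """Effective potential U_eff(r) = eta^2 / r^2 - rho / r (requires r > 0)."""
    return eta**2 / r**2 - rho / r

def u_eff_minimum(eta: float, rho: float) -> tuple[float, float]:
    """Location and depth of the minimum of U_eff, from dU_eff/dr = 0."""
    r_min = 2 * eta**2 / rho
    return r_min, -rho**2 / (4 * eta**2)

def classify(energy: float, eta: float, rho: float) -> str:
    """Label a total energy (aptitude already subtracted; an assumption here)."""
    _, e_min = u_eff_minimum(eta, rho)
    if energy >= 0:
        return "unbound"       # learner approaches from r = infinity
    if math.isclose(energy, e_min):
        return "minimum"       # energy sits exactly at the bottom of the well
    if energy > e_min:
        return "bound"         # confined between two turning points
    return "not allowed"       # below the minimum of U_eff

# Hypothetical values: eta = 1, rho = 2 puts the minimum at r = 1, U_eff = -1.
for e in (0.5, -0.5, -1.0):
    print(e, classify(e, eta=1.0, rho=2.0))
```

Only the "unbound" branch corresponds to the case ECT-L2A is concerned with; the other labels are included to mirror the remaining curves in Figure 1.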
The central thesis of ECT-L2A, as expressed in Eq. 1 , is that the moment a learner begins to receive substantive exposure to the TL, s/he enters a ‘gravitational’ field or a developmental ecosystem in which s/he is initially driven by kinetic or motivational energy, increasingly subject to the traction of the potential or environmental energy, but eventually stonewalled by the deviation energy or centrifugal barrier, resulting in an asymptotic endstate. This trajectory is further elaborated below.
The developmental trajectory depicted and forecast by ECT-L2A
The L2A trajectory begins with the learner at the outset of the learning process or at infinity (r = ∞). Initially, their progression toward the central source, i.e., the TL, is driven almost entirely by their motivation energy and aptitude, as expressed in Eq. 3 .
As learning proceeds, but with r still large (i.e., the learner still distant from the target) and the deviation energy much weaker than the environmental energy, η²/r² ≪ ρ/r (due to the second power of r), the motivation energy rises as a result of its “interaction” with the environmental energy, −ρ/r, in which case the environmental energy transfers to the motivation energy. Mathematically, this is expressed in Eq. 4.
As learning further progresses, the environmental energy −ρ/r becomes dominant before yielding to the deviation energy η²/r². Eventually, the deviation energy overrides the environmental energy, as expressed in Eq. 1, repeated below as Eq. 5 for ease of reference.
ϵ = ζ(r) + Λ + η²/r² − ρ/r    (5)
The deviation energy is so powerful that it draws the learner away from the target and their learning reaches an asymptote, where their motivation energy becomes minimal, ζ(r₀) = 0, as expressed in Eq. 6.
At this point, all other energies submit to the deviation energy, including the initial motivation energy ζ(∞) and some of the potential or environmental energy. Consequently, further exposure to TL input would not be of substantive help, meaning that it would not move the learner markedly closer to the target.
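The stages just described can be read as limiting cases of Eq. 1. The following is a sketch reconstructed from the prose, so its layout may differ from the displayed Eqs. 3, 4, and 6; it assumes only the four energy terms defined for Eq. 1.

```latex
\begin{align*}
\text{outset } (r \to \infty):\quad
  & \epsilon = \zeta(\infty) + \Lambda \\
\text{large } r,\ \tfrac{\eta^2}{r^2} \ll \tfrac{\rho}{r}:\quad
  & \epsilon \approx \zeta(r) + \Lambda - \frac{\rho}{r}
    \;\Longrightarrow\; \zeta(r) \approx \zeta(\infty) + \frac{\rho}{r} \\
\text{endstate } (\zeta(r_0) = 0):\quad
  & \epsilon = \Lambda + \frac{\eta^2}{r_0^2} - \frac{\rho}{r_0}
\end{align*}
```

The middle line shows why motivation energy rises early on: as r shrinks, ρ/r grows and, with ϵ held fixed, that energy reappears as ζ(r).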
Figure 2 gives a geometric expression of the L1-TL deviation η, which is akin to the angular momentum of an object moving in a central force field [ 73 – 75 ]. The deviation from the TL, signifying the distance between the L1 and the TL, varies with different L1-TL pairings. For example, the distance index, according to the Automated Similarity Judgment Program Database [ 76 ], is 90.25 for Italian and English but 100.33 for Italian and Chinese.
FIGURE 2 . Geometric description of the deviation parameter η .
Figure 3 illustrates differential ultimate attainment (indicated by r₀) as a function of the deviation parameter η. As η increases, the level of attainment is lower, that is, the attainment is further away from the target (r = 0).
FIGURE 3. Effective potentials U_eff with different values of η [ 12 , 13 ].
For adult L2A, ECT-L2A predicts, inter alia , that high attainment is possible but full attainment is not. In other words, near-nativelike attainment is possible, but complete-nativelike attainment is not. ECT-L2A also predicts that while motivation and aptitude are part and parcel of the total energy of a given L2 learner, their role is largely confined to the earlier stage of development. Most of all, ECT-L2A predicts that the L1-L2 deviation is what keeps L2 attainment at asymptote.
For L2 younger learners, ECT-L2A also makes a number of predictions to which we now turn.
ECT-L2A vis-à-vis younger learners
As highlighted above, the deviation energy is what leads L2 attainment to an asymptote. It follows that as long as η (i.e., the L1-TL distance) is non-zero, the learner’s ultimate attainment, r₀, will always eventuate in an asymptote. As shown in Figure 3, the larger the deviation η, the more distant the ultimate attainment r₀ is from the TL. Put differently, a larger η portends that learning would reach an asymptote earlier or that the ultimate attainment would be less native-like. But how does that work for child L2A?
On the ECT-L2A account, it is the low η value that determines child learners’ superior attainment. In child L2 learners, the deviation is low, because of the incipient or underdeveloped L1. However, as the L1 develops, the η value grows until it becomes a constant, presumably happening around puberty, 3 hence coinciding with the offset of the critical period [ 1 ]. As shown in Figures 1 and 3, the smaller the deviation, η, the closer r₀ (i.e., the ultimate level of attainment) is to the TL or the higher the ultimate attainment.
From Eq. 6 the ultimate attainment of any L2 learner, irrespective of age, can be mathematically derived:
where ε = ϵ − Λ (i.e., total energy minus aptitude). r₀ here again denotes ultimate attainment. The upper panel in Figure 4 displays the geometry of ultimate attainment as a function of deviation, η.
FIGURE 4. Double non-linearity of r₀(η) [(A): first non-linearity] and η(t) [(B): second non-linearity] at early AoA.
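One way to obtain a closed form for r₀ is to set ζ(r₀) = 0 in Eq. 1 and solve the resulting quadratic. The derivation below is a sketch under that assumption, with ε = ϵ − Λ ≥ 0; the published Eq. 7 may be arranged differently.

```latex
\begin{align*}
\varepsilon &= \frac{\eta^2}{r_0^2} - \frac{\rho}{r_0}
  && \text{(turning point, } \zeta(r_0) = 0\text{)} \\
0 &= \varepsilon r_0^2 + \rho r_0 - \eta^2
  && \text{(multiply by } r_0^2\text{)} \\
r_0 &= \frac{-\rho + \sqrt{\rho^2 + 4\varepsilon\eta^2}}{2\varepsilon}
     = \frac{2\eta^2}{\rho + \sqrt{\rho^2 + 4\varepsilon\eta^2}}
  && \text{(positive root)}
\end{align*}
```

The first form requires ε > 0, whereas the second also covers ε = 0. The second form makes two properties visible: r₀ = 0 when η = 0, and r₀ grows monotonically and non-linearly with η, in line with panel (A) of Figure 4.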
For a given child learner, η is a constant, but different child learners can have a different η value, depending on their AoA. Herein lies a crucial difference from adult learning where η is a constant for all learners because of their uniform late AoA or age of acquisition and because their L1 has solidified. Adult learning starts at a time when the deviation between their L1 and the TL has become fixed, so to speak, as a result of having mastered their L1 (see the lower panel of Figure 4).
Further, for child L2 learners, η is simultaneously a function of their AoA, a proxy for time (t), and can therefore be expressed as η(t). This deviation function of time varies in the range of 0 ≤ η(t) ≤ η_max. Accordingly, Eq. 7 can be mathematically rewritten into (8):
Assuming that as t grows or as AoA increases, η increases slowly and smoothly from 0 to η max until it solidifies into a constant, which marks the onset of adult learning, η( t ) can mathematically be expressed as (9).
where a is a constant. The geometry of the deviation function of time is illustrated in the lower panel of Figure 4 .
Figure 4 displays a double non-linearity characterizing L2 acquisition by young learners, with (A) showing the first order of non-linearity, r₀(η), that is, ultimate attainment as a function of deviation or the L1-TL distance (computed via Eq. 7), and with (B) displaying the second order of non-linearity, η(t), that is, η changing with t, age of acquisition (computed through Eq. 9).
Figure 5 illustrates ultimate attainment as a function of AoA, r₀(t), and its derivative with respect to t, dr₀/dt, which naturally yields three distinct periods: a critical period, t_critical; a post-critical period, t_p-critical; and an adult learning period, t_adult. Within the critical period, t_critical, r₀ ≅ 0, meaning there is no real difference in attainment as age of acquisition increases. But within the post-critical period, t_p-critical, r₀ changes dramatically, with dr₀/dt peaking and waning until it drops to the level approximating that of the adult period. Within the adult period, t_adult, r₀ remains a constant, as attainment levels off.
FIGURE 5 . Ultimate attainment (the blue line) as a function of age of acquisition ( t ) and its derivatives giving three distinct periods (the orange line).
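To make the chain from AoA to ultimate attainment tangible, the sketch below composes the two non-linear steps numerically. It assumes the turning-point condition ζ(r₀) = 0 from Eq. 6, solved for r₀ as the positive root of a quadratic, and it uses a hypothetical saturating curve, η(t) = η_max(1 − exp(−a·t²)), as a stand-in for Eq. 9. The function names and all parameter values are arbitrary illustrative choices, so the output shows the qualitative shape only.

```python
import math

def r0(eta: float, rho: float, eps: float) -> float:
    """Turning point where motivation energy vanishes: eps = eta^2/r0^2 - rho/r0.

    eps is total energy minus aptitude (eps >= 0). This is the positive root of
    eps*r0^2 + rho*r0 - eta^2 = 0, written so that it stays finite at eps = 0.
    """
    return 2 * eta**2 / (rho + math.sqrt(rho**2 + 4 * eps * eta**2))

def eta_of_t(t: float, eta_max: float, a: float) -> float:
    """Hypothetical stand-in for Eq. 9: rises smoothly from 0 and saturates."""
    return eta_max * (1 - math.exp(-a * t**2))

def attainment_curve(ages, rho=1.0, eps=0.5, eta_max=1.0, a=0.02):
    """Return r0 at each age and the finite-difference slope between ages."""
    r_vals = [r0(eta_of_t(t, eta_max, a), rho, eps) for t in ages]
    slopes = [(r_vals[i + 1] - r_vals[i]) / (ages[i + 1] - ages[i])
              for i in range(len(ages) - 1)]
    return r_vals, slopes

ages = [float(t) for t in range(0, 41)]   # AoA grid in arbitrary time units
r_vals, slopes = attainment_curve(ages)
peak_age = ages[slopes.index(max(slopes))]
print(f"r0 at first age: {r_vals[0]:.3f}, at last age: {r_vals[-1]:.3f}")
print(f"slope peaks near t = {peak_age}")
```

Under these assumptions the slope is close to zero at both ends of the age range and peaks in between, which is the three-period pattern the text attributes to Figure 5. The location of the peak depends entirely on the assumed constant a and should not be read as an age estimate.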
ECT-L2A, therefore, identifies three learning periods. First, there is a critical period, t_critical, within which attainment is nativelike, r₀ ≅ 0. Notice that the blue line in Figure 5 is the lowest during the critical period, signifying that the attainment converges on the target, but it is the highest during the adult period, meaning that the attainment diverges greatly from the target. The offset of the critical period is smooth rather than abrupt, with the impact of deviation, η, slowly emerging at its offset. During this period, the L1 is surfacing, yet with negligible deviation from the TL and weak in strength.
Key to understanding this account of the critical period is the double non-linearity: first, ultimate attainment as a function of L1-TL deviation (r₀(η); see (A) in Figure 4); and second, L1-TL deviation as a function of AoA (η(t); see (B) in Figure 4). Crucially, this double non-linearity extends a critical “point” into a critical “period”.
Second, there is a post-critical period, t_p-critical, 0 < r₀ ≤ r₀(η_max), within which, with advancing AoA, the L1-L2 deviation grows larger and stronger, resulting in ultimate attainment that is increasingly lower (i.e., increasingly non-nativelike). The change rate of r₀, its first derivative with respect to time, dr₀/dt, is dramatic, waxing and waning. As such, the post-critical period is more complex and nuanced than the critical period. During the post-critical period, as the learner’s L1 becomes increasingly robust and developed, the deviation becomes larger, resulting in a level of attainment increasingly away from the target (i.e., increasingly non-nativelike).
Third, there is an adult learning period, t_adult, η = η_max ≅ constant, where, despite the continuously advancing AoA, the deviation reaches its maximum and remains a constant, as benchmarked in indexes of crosslinguistic distance (see, e.g., the Automated Similarity Judgment Program Database [ 76 ]). As a result, L2 ultimate attainment turns asymptotic (for discussion, see [ 12 , 13 ]).
The three periods mathematically produced by ECT-L2A coincide with the stretched “Z” slope that some researchers have argued (e.g., [ 17 , 58 , 59 ]) constitutes the most unambiguous evidence for CPH/L2A, and by extension, for a maturationally-based account of the generic success or lack thereof (i.e., nativelike or non-nativelike L2 proficiency) in early versus late starters. For better illustration of the stretched “Z,” we can convert Figure 5 into Figure 6 , using Eq. 10 .
where att stands for level of attainment. According to Eq. 10, the smaller r₀ is, the higher the attainment is.
FIGURE 6 . Level of attainment as a function of AoA.
In sum, ECT-L2A mathematically establishes the critical period geometry. That said, the geometry, as seen in Figure 6 , exhibits anything but abrupt inflections; the phase transitions are gradual and smooth. The adult period, for example, does not exhibit a complete “flattening” but markedly lower attainment with continuous decline (cf. [ 7 , 23 , 28 ]). 4
Explaining CPH/L2A
As is clear from the above, on the ECT-L2A account of the critical period, η (i.e., L1-TL deviation) is considered an inter-learner variable and, at once, a proxy for age of acquisition, t. More profoundly, however, ECT-L2A associates η with neural plasticity or sensitivity (cf. [ 77 ]). The relationship between plasticity, p(t), and deviation function, η(t), is expressed as (11):
Thus, the relationship between plasticity and the deviation function is one of inverse correlation. During the critical period, η = η_min (i.e., minimal L1-TL deviation) and p = p_max (i.e., maximal plasticity); conversely, during the adult learning period, η = η_max (i.e., maximal L1-TL deviation) and p = p_min (i.e., minimal plasticity). In short, an increased deviation, η(t), corresponds to a decrease of plasticity, p(t), and vice versa, as illustrated in Figure 7.
FIGURE 7 . Plasticity as a function of age of acquisition.
Illustrated in Figure 7 is that neural plasticity, first proposed by [ 78 ] as the underlying cause of CP, is at its highest during the critical period and, as [ 79 ] put it, it “endures within the confines of its onset and offset” ( p . 182). But it begins to decline and drops to a low level during the post-critical period, and remains low through the adult learning period. 5 It would, therefore, seem reasonable to call the first period “critical” and the second period “sensitive.” It is worth mentioning in passing that the post-critical or sensitive period has thus far received scant empirical attention in CPH/L2A research.
Temporally, following the [ 59 ] conjecture, the critical period should last through early childhood from birth to age six, and the sensitive period should offset around puberty (see also [ 2 , 20 , 36 , 67 , 71 ]). Crucially, both periods are circumscribed, exhibiting discontinuities, with the critical period exhibiting maximal sensitivity and the sensitive period exhibiting declining sensitivity that is, for the most part, still far greater than that of the adult learning period. This view of a changing underlying mechanism across the three periods of AoA and attainment resonates with the Language as a Complex Adaptive System perspective (see, e.g., [ 80 ]). [ 81 ], for example, noted that “the processing mechanisms that underlie [language development] … are fundamentally non-linear. This means that development itself will frequently have phase-like characteristics, that there may be periods of extreme sensitivity to input (‘critical periods’)” (p. 431).
ECT-L2A as a unifying model
ECT-L2A, by virtue of identifying the L1-TL deviation, η, as a lynchpin for age effects, provides an explanation for the differential ultimate attainment of early versus late starters. Essentially, in early AoA, η is a temporal and neuro-functional proxy tied respectively to a developing L1 and to a changing age and changing neuroplasticity. In contrast, in late AoA, η is a constant, due to the L1 being fully developed and the brain fully mature. This takes care of the first facet of inter-learner differential attainment. What about the second facet, viz., the inter-learner differential attainment among late learners?
ECT-L2A (as expressed in Eq. 1) is a model of an ecosystem where there is an interplay between learner-internal and environmental energies. In line with the general finding from L2 research that individual difference variables are largely responsible for inter-learner differential attainment of nativelike proficiency in adult learners (see, e.g., [ 27 , 35 , 77 , 82 , 83 ]), ECT-L2A specifically ties motivation and aptitude to kinetic energy, and in doing so provides a more nuanced picture of the changing magnitude of individual difference variables.
Figure 8 illustrates the twin facets of inter-learner differential attainment. First, attainment varies as a function of AoA. Second, attainment varies within and across the three learning periods as a function of individual learners with different amounts of total energy, ϵ₁ < ϵ₂ < ϵ₃. As shown, individual differences play out the least among learners whose AoA falls within the critical period but the most within the adult learning period, consistent with the general findings from L2 research (see, e.g., [ 2 , 3 , 43 , 63 , 65 , 67 , 84 , 85 ]). During the post-critical or sensitive period, individual differences are initially non-apparent but become more pronounced with increasing AoA. 6
FIGURE 8. Level of attainment for total energies ϵ₁ < ϵ₂ < ϵ₃.
ECT-L2A thus offers a coherent explanation for variable attainment in late learners. First and foremost, it posits that individual learners’ total energy or “carrying capacity” [ 86 ] is different, which leads to different levels of attainment. Second, although the internal (motivation and aptitude) and external (environment) energies interact over time, ultimately it is the deviation energy η²/r² that dominates and stalls the learner at asymptote (see Eq. 6). This account provides a much more nuanced perspective on the role of individual differences than has been given in the current L2A literature.
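The pattern in Figure 8 can be illustrated numerically under the turning-point condition ζ(r₀) = 0 from Eq. 6. In this sketch, r₀ is taken as the positive root of the quadratic that condition implies, with ε standing for total energy minus aptitude; the three energy values, ρ = 1, and the function names are arbitrary illustrative assumptions. Small η stands for early AoA and large η for adult learners.

```python
import math

def r0(eta: float, rho: float, eps: float) -> float:
    """Positive root of eps*r0^2 + rho*r0 - eta^2 = 0 (eps >= 0, rho > 0)."""
    return 2 * eta**2 / (rho + math.sqrt(rho**2 + 4 * eps * eta**2))

def spread(eta: float, energies, rho: float = 1.0) -> float:
    """Gap in r0 between the lowest- and highest-energy learner at a given eta."""
    values = [r0(eta, rho, e) for e in energies]
    return max(values) - min(values)

energies = [0.1, 0.5, 1.0]        # eps_1 < eps_2 < eps_3 (arbitrary values)
for eta in (0.0, 0.5, 1.0):       # small eta ~ early AoA, large eta ~ adult
    row = [round(r0(eta, 1.0, e), 3) for e in energies]
    print(f"eta={eta}: r0 per energy = {row}, "
          f"spread = {spread(eta, energies):.3f}")
```

With these assumptions, a larger energy gives a smaller r₀ at any non-zero η, and the gap between learners widens as η grows, which matches the description of individual differences being least visible in the critical period and most visible in the adult period.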
Extant empirical studies investigating individual difference variables through correlation analysis have mostly projected a static view of the role (some of) the variables play in L2A. In contrast, ECT-L2A gives a dynamic view and, more importantly, an interactive view. In the end, the individual difference variables are part of a larger ecosystem within which they do not act alone, but rather interact with other energies (i.e., potential energy and deviation energy), waxing and waning as a result of energy conservation.
In this article, we engaged with a central concern in the ongoing heated debate on CPH/L2A, that is, the geometry of age differences. Within the framework of ECT-L2A, an interdisciplinary model of L2 attainment, we mathematically derived the age-attainment function and established the presence of a critical period in L2A. Importantly, this period is part of a developmental trajectory that comprises three learning periods: a critical period, a post-critical or sensitive period, and an adult period.
ECT-L2A has thus far demonstrated a stunning internal consistency in that it mathematically identifies younger learners’ superior performance to adult learners’ as well as the differential attainment among adult learners.
ECT-L2A, while in broad agreement with an entrenchment-transfer account from L2A research that essentializes the role of the L1 in L2 attainment (see, e.g., [ 5 , 11 , 87 – 90 ]), provides a dynamic account of that role and its varying contributions to the different age-related learning periods. Furthermore, ECT-L2A offers an interactive account whereby the L1, as part of the deviation energy, interacts with other types of learner-internal and learner-external energies. Above all, ECT-L2A, by virtue of summoning internal and external energies, gives a coherent explanation for the twin facets of inter-learner differential success—as respectively manifested between younger and older learners and among older learners.
Validation of ECT-L2A is, however, required. Many questions warrant investigation. On this note, Johnson and Newport’s view [ 17 ], in particular, that the goal of any L2A theory should be to account for three sets of facts, namely, (a) the gradual decline of performance, (b) the age at which a decline in performance is detected, and (c) the nature of adult performance, resonates with us. Although ECT-L2A shines a light on all three, further work is clearly needed. More specific to the focus of the present article, three sets of questions can be asked in relation to the three learning periods ECT-L2A has identified.
In the spirit of promoting collective intelligence, we present a subset of these questions below in the hope that they will spark interest among researchers across disciplines and inspire close-up investigations leveraging a variety of methodologies.
First, for the critical period:
1. When does the decline of learning begin?
2. How does it relate to the status of L1?
3. What is plasticity like in this period?
4. What does plasticity entail?
5. How is it related to a developing L1 and a developing L2?
Answers to these questions can, at least in part, be found in the various literatures across disciplines. But approaching these questions in relation to one another—as opposed to discretely—would likely yield a more systematic, holistic and coherent understanding. Or perhaps, in search of answers to any of these questions, one may realize that the existing understanding is way too shallow or inadequate. For instance, [ 18 ] cited “a lack of interference from a well-learned first language” as one of the possible causes of the age-attainment function in younger versus older learners. But what has not yet been established is the nature of the younger learners’ L1. What does “well-learned” mean? Is it established or is it still developing? At minimum, it cannot be a unitary phenomenon, given the age span of young learners.
Second, for the post-critical or sensitive period, ECT-L2A mathematically identifies two sub-periods. Thus, questions such as the following should be examined:
6. What prompts the initial dramatic decline of attainment?
7. How does each of the sub-periods relate to the status of L1?
8. How does the decline relate to changing plasticity?
9. How does it relate to grammatical performance?
Third, for the adult learning period, questions such as the following warrant close engagement:
10. How do learners with the same L1 background differ from each other in their L2 ultimate attainment?
11. How do learners with different L1 backgrounds differ from one another in their L2 ultimate attainment?
12. How is the trajectory of each type of energy, endogenous or exogenous, related to the level of attainment?
Investigating these questions, among others, will lead us to a better understanding not only of the critical period but also of L2 learning over the arc of human life.
The theoretical and practical importance of gaining a robust and comprehensive understanding of how age affects the L2 learning outcome calls for systematic investigations. To that end, ECT-L2A has offered a systems thinking perspective and framework.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Author contributions
All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.
Acknowledgments
We greatly appreciate the insightful and perceptive comments made by the reviewers on an earlier version of this article, and take sole responsibility for any error or omission.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
1 Despite the centrality of “nativelikeness” to the Critical Period Hypothesis [ 1 ], studies in L2A have increasingly moved away from the use of the term in favor of “the level of ultimate attainment” [ 2 ].
2 The potential energy in ECT-L2A is akin to gravitational potential energy. As such it defines the central source field, serving as the primary energy that dynamically converts to other types of energy: kinetic energy and centrifugal energy. Similarly, the potential energy of L2A defines the field of learning. It stands for TL environment or input, serving as primary energy, dynamically converting to motivational and L1-TL deviation energies. An essential premise of ECT-L2A is the existence of potential energy. This premise is consistent with that underpinning L2A studies on CPH and ultimate attainment.
3 That is when the L1 becomes entrenched.
4 Looking back on the [ 17 ] study, [ 7 ], taking account of developments in the intervening 3 decades in understanding changes in the brain during adulthood, updated the earlier assertion about the stability of age effects in adulthood, noting that “it is more accurate to hypothesize that L2 proficiency SHOULD continue to decline during adulthood” and that “a critical or sensitive period for language acquisition is not absolute or sudden” (p. 929, emphasis in original). She further argued that “[t]he lack of flattening of age function at adulthood in many studies does not mean that learning is not constrained by biologically based maturational changes” (ibid).
5 The plasticity never completely disappears, but rather becomes asymptotic.
6 The age-attainment function appears to follow a power law in that age effects are greatest during the critical period, less so during the post-critical or sensitive period, and weakest during the adult learning period (see Figure 8 ). Similarly, Figure 7 exhibits a power law relationship between age and plasticity: Plasticity is at its peak during the critical period, declines during the post-critical or sensitive period, and plateaus in the adult learning period.
1. Lenneberg E. Biological foundations of language . Wiley (1967).
2. Hyltenstam K. Critical period. In: C Chappell, editor. The concise encyclopedia of applied linguistics . John Wiley and Sons (2020). p. 360–4.
3. Bley-Vroman R. The logical problem of second language learning. Linguistic Anal (1990) 20(1-2):3–49.
4. Kellerman E. Age before beauty: Johnson and Newport revisited. In: L Eubank, L Selinker, and M Sharwood Smith, editors. The current state of interlanguage: Studies in honor of William Rutherford . John Benjamins (1995). p. 219–31.
5. Mayberry RI, Kluender R. Rethinking the critical period for language: New insights into an old question from American Sign Language. Bilingualism: Lang Cogn (2018) 21(5):938–44. doi:10.1017/s1366728918000585
6. Abrahamsson N. But first, let’s think again. Bilingualism: Lang Cogn (2018) 21(5):906–7. doi:10.1017/s1366728918000251
7. Newport E. Is there a critical period for L1 but not L2? Bilingualism: Lang Cogn (2018) 21(5):928–9. doi:10.1017/s1366728918000305
8. Foster-Cohen S. First language acquisition … second language acquisition: What's Hecuba to him or he to Hecuba? Second Lang Res (2001) 17:329–44. doi:10.1191/026765801681495859
9. Herschensohn J. Language development and age . Cambridge University Press (2007).
10. Meisel J. Sensitive phases in successive language acquisition: The critical period hypothesis revisited. In: C Boeckx, and K Grohmann, editors. The Cambridge handbook of biolinguistics . Cambridge University Press (2013). p. 69–85.
11. MacWhinney B. A unified model. In: P Robinson, and N Ellis, editors. Handbook of cognitive linguistics and second language acquisition . Cambridge University Press (2008). p. 229–52.
12. Han Z-H, Bao G, Wiita P. Energy conservation: A theory of L2 ultimate attainment. Int Rev Appl Linguistics (2017) 50(2):133–64. doi:10.1515/iral-2016-0034
13. Han Z-H, Bao G, Wiita P. Energy conservation in SLA: The simplicity of a complex adaptive system. In: L Ortega, and Z-H Han, editors. Complexity theory and language development. In celebration of Diane Larsen-Freeman . John Benjamins (2017). p. 210–31.
14. Hulstijn JH. Language proficiency in native and non-native speakers: Theory and research . John Benjamins (2015).
15. Hulstijn JH. An individual-differences framework for comparing nonnative with native speakers: Perspectives from BLC Theory. Lang Learn (2019) 69:157–83. doi:10.1111/lang.12317
16. Oyama S. A sensitive period for the acquisition of a non-native phonological system. J Psycholinguistic Res (1976) 5:261–83. doi:10.1007/bf01067377
17. Johnson JS, Newport EL. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cogn Psychol (1989) 21(1):60–99. doi:10.1016/0010-0285(89)90003-0
18. Hartshorne JK, Tenenbaum JB, Pinker S. A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition (2018) 177:263–77. doi:10.1016/j.cognition.2018.04.007
19. Qureshi M. A meta-analysis: Age and second language grammar acquisition. System (2016) 60:147–60. doi:10.1016/j.system.2016.06.001
20. DeKeyser R, Larson-Hall J. What does the critical period really mean? In: J Kroll, and AMB de Groot, editors. Handbook of bilingualism: Psycholinguistic approaches . Oxford University Press (2005). p. 88–108.
21. Abutalebi J, Clahsen H. Critical periods for language acquisition: New insights with particular reference to bilingualism research. Bilingualism: Lang Cogn (2018) 21(5):883–5. doi:10.1017/s1366728918001025
22. Bialystok E, Hakuta K. Confounded age: Linguistic and cognitive factors in age differences for second language acquisition. In: D Birdsong, editor. Second language acquisition and the Critical Period hypothesis . Lawrence Erlbaum (1999). p. 161–82.
23. Bialystok E, Miller B. The problem of age in second language acquisition: Influences from language, structure, and task. Bilingualism: Lang Cogn (1999) 2:127–45. doi:10.1017/s1366728999000231
24. DeKeyser R. Age effects in second language learning. In: S Gass, and A Mackey, editors. The Routledge handbook of second language acquisition . Routledge (2012). p. 422–60.
25. Hakuta K, Bialystok E, Wiley E. Critical evidence: A test of the critical-period hypothesis for second-language acquisition. Psychol Sci (2003) 14(1):31–8. doi:10.1111/1467-9280.01415
26. Birdsong D. Age and second language acquisition and processing: A selective overview. Lang Learn (2006) 56:9–49. doi:10.1111/j.1467-9922.2006.00353.x
27. Birdsong D. Plasticity, variability, and age in second language acquisition and bilingualism. Front Psychology (2018) 9(81):81–17. doi:10.3389/fpsyg.2018.00081
28. Birdsong D, Molis M. On the evidence for maturational constraints in second language acquisition. J Mem Lang (2001) 44:235–49. doi:10.1006/jmla.2000.2750
29. Long M. Problems with supposed counter-evidence to the critical period hypothesis. Int Rev Appl Linguistics Lang Teach (2005) 43:287–317. doi:10.1515/iral.2005.43.4.287
30. Long MH. Problems in SLA . Lawrence Erlbaum Associates (2007).
31. Long MH. Maturational constraints on child and adult SLA. In: G Granena, and MH Long, editors. Sensitive periods, language aptitude, and ultimate L2 attainment . John Benjamins (2013). p. 3–41.
32. Mayberry RI, Lock E. Age constraints on first versus second language acquisition: Evidence for linguistic plasticity and epigenesis. Brain Lang (2003) 87(3):369–84. doi:10.1016/s0093-934x(03)00137-8
33. Marinova-Todd S, Marshall DB, Snow CE. Three misconceptions about age and L2 learning. TESOL Q (2000) 34:9–34. doi:10.2307/3588095
34. Muñoz C, Singleton D. A critical review of age-related research on L2 ultimate attainment. Lang Teach (2010) 44(1):1–35. doi:10.1017/s0261444810000327
35. Singleton D. Age and second language acquisition. Annu Rev Appl Linguistics (2001) 21:77–89. doi:10.1017/s0267190501000058
36. Long M. Maturational constraints on language development. Stud Second Lang Acquisition (1990) 12:251–85. doi:10.1017/s0272263100009165
37. Cook V. Evidence for multi-competence. Lang Learn (1992) 42:557–91. doi:10.1111/j.1467-1770.1992.tb01044.x
38. Grosjean F. Studying bilinguals: Methodological and conceptual issues. Bilingualism: Lang Cogn (1998) 1:131–49. doi:10.1017/s136672899800025x
39. Birdsong D. Nativelikeness and non-nativelikeness in L2A research. Int Rev Appl Linguistics (2005) 43(4):319–28. doi:10.1515/iral.2005.43.4.319
40. Birdsong D. Ultimate attainment in second language acquisition. Language (1992) 68(4):706–55. doi:10.2307/416851
41. Coppieters R. Competence differences between native and near-native speakers. Language (1987) 63:544–73. doi:10.2307/415005
42. Donaldson B. Left-dislocation in near-native French. Stud Second Lang Acquisition (2011) 33:399–432. doi:10.1017/s0272263111000039
43. Donaldson B. Syntax and discourse in near-native French: Clefts and focus. Lang Learn (2012) 62(3):902–30. doi:10.1111/j.1467-9922.2012.00701.x
44. Franceschina F. Fossilized second language grammars: The acquisition of grammatical gender (Vol. 38) . John Benjamins (2005).
45. Han Z-H. Fossilization: Can grammaticality judgment be a reliable source of evidence? In: Z-H Han, and T Odlin, editors. Studies of fossilization in second language acquisition . Multilingual Matters (2006). p. 56–82.
46. Han Z-H. Grammatical morpheme inadequacy as a function of linguistic relativity: A longitudinal study. In: ZH Han, and T Cadierno, editors. Linguistic relativity in SLA: Thinking for speaking . Clevedon: Multilingual Matters (2010). p. 154–82.
47. Hopp H. Ultimate attainment in L2 inflection: Performance similarities between non-native and native speakers. Lingua (2010) 120:901–31. doi:10.1016/j.lingua.2009.06.004
48. Hopp H. Grammatical gender in adult L2 acquisition: Relations between lexical and syntactic variability. Second Lang Res (2013) 29(1):33–56. doi:10.1177/0267658312461803
49. Ioup G, Boutstagui E, El Tigi M, Moselle M. Re-examining the critical period hypothesis: A case study of successful adult SLA in a naturalistic environment. Stud Second Lang Acquisition (1994) 16:73–98. doi:10.1017/s0272263100012596
50. Lardiere D. Ultimate attainment in second language acquisition . Lawrence Erlbaum (2007).
51. Saito K. Experience effects on the development of late second language learners' oral proficiency. Lang Learn (2015) 65(3):563–95. doi:10.1111/lang.12120
52. Sorace A. Unaccusativity and auxiliary choice in non-native grammars of Italian and French: Asymmetries and predictable indeterminacy. J French Lang Stud (1993) 3:71–93. doi:10.1017/s0959269500000351
53. Stam G. Changes in thinking for speaking: A longitudinal case study. In: E Bylund, and P Athanasopolous, Guest editors. The language and thought of motion in second language speakers . The modern language journal , 99 (2015). p. 83–99.
54. van Boxtel S. Can the late bird catch the worm? Ultimate attainment in L2 syntax . Landelijke Onderzoekschool Taalwetenschap (2005).
55. White L, Genesee F. How native is near-native? The issue of ultimate acquisition. Second Lang Res (1996) 12(3):233–65. doi:10.1177/026765839601200301
56. Yuan B, Dugarova E. Wh-topicalization at the syntax-discourse interface in English speakers' L2 Chinese grammars. Stud Second Lang Acquisition (2012) 34:533–60. doi:10.1017/s0272263112000332
57. Birdsong D. Introduction: Why and why nots of the critical period hypothesis. In: D Birdsong, editor. Second language acquisition and the critical period hypothesis . Erlbaum (1999). p. 1–22.
58. Birdsong D. Interpreting age effects in second language acquisition. In: J Kroll, and A De Groot, editors. Handbook of bilingualism: Psycholinguistic approaches . Oxford University Press (2005). p. 109–27.
59. Pinker S. The language instinct . W. Morrow (1994).
60. Pulvermüller F, Schumann J. Neurobiological mechanisms of language acquisition. Lang Learn (1994) 44:681–734. doi:10.1111/j.1467-1770.1994.tb00635.x
61. Scovel T. A time to speak: A psycholinguistic inquiry into the critical period for human speech . Newbury House (1988).
62. Scovel T. A critical review of the critical period research. Annu Rev Appl Linguistics (2000) 20:213–23. doi:10.1017/s0267190500200135
63. Abrahamsson N, Hyltenstam K. The robustness of aptitude effects in near-native second language acquisition. Stud Second Lang Acquisition (2008) 30:481–509. doi:10.1017/s027226310808073x
64. Bialystok E, Kroll J. Can the critical period be saved? A bilingual perspective. Bilingualism: Lang Cogn (2018) 21(5):908–10. doi:10.1017/s1366728918000202
65. DeKeyser R. The robustness of critical period effects in second language acquisition. Stud Second Lang Acquisition (2000) 22(4):499–533. doi:10.1017/s0272263100004022
66. Flege J, Yeni-Komshian G, Liu S. Age constraints on second-language acquisition. J Mem Lang (1999) 41:78–104. doi:10.1006/jmla.1999.2638
67. Granena G, Long MH. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Lang Res (2013) 29(3):311–43. doi:10.1177/0267658312461497
68. Birdsong D, Vanhove J. Age of second-language acquisition: Critical periods and social concerns. In: E Nicoladis, and S Montanari, editors. Bilingualism across the lifespan: Factors moderating language proficiency . Washington, DC: American Psychological Association (2016). p. 162–81.
69. Flege JE. It’s input that matters most, not age. Bilingualism: Lang Cogn (2018) 21:919–20. doi:10.1017/s136672891800010x
70. Hyltenstam K. Second language ultimate attainment: Effects of maturation, exercise, and social/psychological factors. Bilingualism: Lang Cogn (2018) 21(5):921–3. doi:10.1017/s1366728918000172
71. Hyltenstam K, Abrahamsson N. Maturational constraints in SLA. In: C Doughty, and M Long, editors. The handbook of second language acquisition . Blackwell (2003). p. 539–88.
72. Long M, Granena G. Sensitive periods and language aptitude in second language acquisition. Bilingualism: Lang Cogn (2018) 21(5):926–7.
73. Bao G, Hadrava P, Ostgaard E. Multiple images and light curves of an emitting source on a relativistic eccentric orbit around a black hole. Astrophysics J (1994) 425:63–71. doi:10.1086/173963
74. Bao G, Hadrava P, Ostgaard E. Emission line profiles from a relativistic accretion disk and the role of its multi-images. Astrophysics J (1994) 435:55–65.
75. Bao G, Wiita P, Hadrava P. Energy-dependent polarization variability as a black hole signature. Phys Rev Lett (1996) 77:12–5. doi:10.1103/PhysRevLett.77.12
76. Wichmann S, Holman E, Brown C. The ASJP database (version 17) (2016). Available at http://asjp.clld.org (Accessed November 1, 2020).
77. MacWhinney B. Emergent fossilization. In: Z-H Han, and T Odlin, editors. Studies of fossilization in second language acquisition . Multilingual Matters (2006). p. 134–56.
78. Penfield W, Roberts L. Speech and brain mechanisms . Atheneum (1959).
79. Bornstein MH. Sensitive periods in development: Structural characteristics and causal interpretations. Psychol Bull (1989) 105:179–97. doi:10.1037/0033-2909.105.2.179
80. Larsen-Freeman L. Complex dynamic systems theory. In: B VanPatten, G Keating, and S Wulff, editors. Theories in second language acquisition . Routledge (2020). p. 248–70.
81. Elman J. Development: It’s about time. Dev Sci (2003) 6:430–3. doi:10.1111/1467-7687.00297
82. Dörnyei Z, Skehan P. Individual differences in second language learning. In: C Doughty, and M Long, editors. The handbook of second language acquisition . Blackwell (2003). p. 589–630.
83. Hyltenstam K, Abrahamsson N. Who can become native-like in a second language? All, some, or none? Studia Linguistica (2000) 54(2):150–66. doi:10.1111/1467-9582.00056
84. Reber A. Implicit learning and tacit knowledge: An essay on the cognitive unconscious . Clarendon Press (1993).
85. Hoyer W, Lincourt A. Ageing and the development of learning. In: M Stadler, and P Frensch, editors. Handbook of implicit learning . Sage (1998). p. 445–70.
86. van Geert P. A dynamic systems model of cognitive and language growth. Psychol Rev (1991) 98:3–53. doi:10.1037/0033-295x.98.1.3
87. Elman J, Bates E, Johnson M, Karmiloff-Smith A, Parisi D, Plunkett K. Rethinking innateness: A connectionist perspective on development . Cambridge, MA: MIT Press (1996).
88. Flege J. Age of learning and second language speech. In: D Birdsong, editor. Second language acquisition and the Critical Period hypothesis . Lawrence Erlbaum (1999).
89. Schepens J, Roeland W, van Hout F, van der Slik F. Linguistic dissimilarity increases age-related decline in adult language learning. Stud Second Lang Acquisition (2022) 1–22. doi:10.1017/S0272263122000067
90. Ventureyra V, Pallier C, Yoo H. The loss of first language phonetic perception in adopted Koreans. J Neurolinguist (2004) 17:79–91. doi:10.1016/s0911-6044(03)00053-8
Keywords: ultimate attainment, critical period, second language acquisition, physics laws, energy conservation, angular momentum conservation, inter-learner differential attainment
Citation: Han Z and Bao G (2023) Critical period in second language acquisition: The age-attainment geometry. Front. Phys. 11:1142584. doi: 10.3389/fphy.2023.1142584
Received: 11 January 2023; Accepted: 02 March 2023; Published: 20 March 2023.
Copyright © 2023 Han and Bao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: ZhaoHong Han, [email protected]
A critical period for second language acquisition: Evidence from 2/3 million English speakers
Joshua K Hartshorne, Joshua B Tenenbaum, Steven Pinker
Corresponding author at: Department of Psychology, Boston College, McGuinn Hall 527, Chestnut Hill, MA 02467, United States. [email protected] (J.K. Hartshorne)
Issue date 2018 Aug.
Children learn language more easily than adults, though when and why this ability declines have been obscure for both empirical reasons (underpowered studies) and conceptual reasons (measuring the ultimate attainment of learners who started at different ages cannot by itself reveal changes in underlying learning ability). We address both limitations with a dataset of unprecedented size (669,498 native and non-native English speakers) and a computational model that estimates the trajectory of underlying learning ability by disentangling current age, age at first exposure, and years of experience. This allows us to provide the first direct estimate of how grammar-learning ability changes with age, finding that it is preserved almost to the crux of adulthood (17.4 years old) and then declines steadily. This finding held not only for “difficult” syntactic phenomena but also for “easy” syntactic phenomena that are normally mastered early in acquisition. The results support the existence of a sharply-defined critical period for language acquisition, but the age of offset is much later than previously speculated. The size of the dataset also provides novel insight into several other outstanding questions in language acquisition.
Keywords: Language acquisition, Critical period, L2 acquisition
1. Introduction
People who learned a second language in childhood are difficult to distinguish from native speakers, whereas those who began in adulthood are often saddled with an accent and conspicuous grammatical errors. This fact has influenced many areas of science, including theories about the plasticity of the young brain, the role of neural maturation in learning, and the modularity of linguistic abilities ( Johnson & Newport, 1989 ; Lenneberg, 1967 ; Morgan-Short & Ullman, 2012 ; Newport, 1988 ; Pinker, 1994 ). It has also affected policy, driving debates about early childhood stimulation, bilingual education, and foreign language instruction ( Bruer, 1999 ).
However, neither the nature nor the causes of this “critical period” for second language acquisition are well understood. (Here, we use the term “critical period” as a theory-neutral descriptor of diminished achievement by adult learners, whatever its cause.) There is little consensus as to whether children’s advantage comes from superior neural plasticity, an earlier start that gives them additional years of learning, limitations in cognitive processing that prevent them from being distracted by irrelevant information, a lack of interference from a well-learned first language, a greater willingness to experiment and make errors, a greater desire to conform to their peers, or a greater likelihood of learning through immersion in a community of native speakers ( Birdsong, 2017 ; Birdsong & Molis, 2001 ; Hakuta, Bialystok, & Wiley, 2003 ; Hernandez, Li, & MacWhinney, 2005 ; Johnson & Newport, 1989 ; Newport, 1990 ; Pinker, 1994 ). We do not even know how long the critical period lasts, whether learning ability declines gradually or precipitously once it is over, or whether the ability continues to decline throughout adulthood or instead reaches a floor ( Birdsong & Molis, 2001 ; Guion, Flege, Liu, & Yeni-Komshian, 2000 ; Hakuta et al., 2003 ; Jia, Aaronson, & Wu, 2002 ; Johnson & Newport, 1989 ; McDonald, 2000 ; Sebastián-Gallés, Echeverría, & Bosch, 2005 ; Vanhove, 2013 ).
1.1. Learning ability vs. ultimate attainment
As noted by Patkowski (1980) , researchers interested in critical periods focus on two interrelated yet distinct questions:
How does learning ability change with age?
How proficient can someone be if they began learning at a particular age?
The questions are different because language acquisition is not instantaneous. For example, an older learner who (hypothetically) acquired language at a slower rate could, in theory, still attain perfect proficiency if he or she persisted at the learning long enough.
The question of ultimate attainment (2) captures the most public attention because it directly applies to people’s lives, but the question of learning ability (1) is more theoretically central. Does learning ability decline gradually from birth ( Guion et al., 2000 ; Hernandez et al., 2005 ), whether from neural maturation, interference from the first language, or other causes ( Fig. 1A )? Alternatively, is there an initial period of high ability, followed by a continuous decline ( Fig. 1B ), or a decline that reaches a floor ( Fig. 1C ) ( Johnson & Newport, 1989 )? Or does ability remain relatively constant ( Fig. 1D ), with adults failing to learn for some other reason such as less time and interest ( Hakuta et al., 2003 ; Hernandez et al., 2005 )?
Fig. 1. (A–D) Schematic depictions of four theories of how language learning ability might change with age. (E–H) Schematic depictions of four theories of how ultimate attainment might vary with age of first exposure to the language. Note: While the curves hypothesized for learning ability and ultimate attainment resemble one another, there is little systematic relationship between the two; see the main text.
Unfortunately, learning ability is a hidden variable that is difficult to measure directly. Studies that compare children and adults exposed to comparable material in the lab or during the initial months of an immersion program show that adults perform better, not worse, than children ( Huang, 2015 ; Krashen, Long, & Scarcella, 1979 ; Snow & Hoefnagel-Höhle, 1978 ), perhaps because they deploy conscious strategies and transfer what they know about their first language. Thus, studies that are confined to the initial stages of learning cannot easily measure whatever it is that gives children their long-term advantage. (Note that strictly speaking, these studies measure learning rate , not learning ability . While these are conceptually distinct, in practice they are difficult to disentangle, and the distinction has played little role in the literature. In the present paper, we will use the terms interchangeably.)
Thus, although the question of learning ability (1) is more theoretically central, empirical studies have largely probed the more tractable question of how ultimate attainment changes as a function of age of first exposure (2). Here, too, there are a number of theoretically interesting possibilities ( Fig. 1E–H ). The hope has been that identifying the shape of the ultimate attainment curve might tell us something about the shape of the learning ability curve (cf. Birdsong, 2006 ; Hakuta et al., 2003 ; Johnson & Newport, 1989 ). Unfortunately, this turns out not to be the case. Despite the similarities between the two sets of hypothesized curves (e.g., compare Fig. 1A and E ), they bear little relationship to one another: The same ultimate attainment curve (e.g., Fig. 1E ) is consistent with many different learning ability curves ( Fig. 1A–D ).
Here is why learning ability curves ( Fig. 1A–D ) and ultimate attainment curves ( Fig. 1E–H ) should not be conflated: If, hypothetically, learning ability plummeted at age 15 but it took 10 years of experience to master a language completely, then ultimate attainment would decline starting at an age of exposure of 5 (since someone who began at 6 years old would learn at peak capacity for only 9 of the 10 years required, someone who began at 7 years old would learn for only 8 of those years, and so on). It would be erroneous, in that case, to conclude that a decline in ultimate attainment starting at age 5 implied that children’s learning ability declines starting at age 5. Conversely, showing that people who began learning at a certain age reached native-like proficiency merely indicates that they learned fast enough, not that they learned as fast as a native speaker, just as the fact that two runners both finished a race indicates only that they both started early enough and ran fast enough, not that they ran at the exact same speed.
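The arithmetic of this hypothetical can be sketched in a few lines of Python. This is an illustrative toy only, not the ELSD model used in the paper: the function name, the linear accumulation of proficiency, and the assumption that no learning at all takes place after the drop are simplifications introduced here.

```python
def ultimate_attainment(age_of_exposure, drop_age=15, years_to_master=10):
    """Fraction of full mastery reached under a toy model.

    Assumptions (hypothetical, for illustration only):
    - learning proceeds at a constant peak rate until drop_age,
    - learning stops entirely after drop_age,
    - mastery requires years_to_master years at the peak rate.
    """
    # Years of peak-rate learning available before ability drops.
    years_at_peak = max(0, drop_age - age_of_exposure)
    # Attainment cannot exceed full mastery.
    return min(years_at_peak, years_to_master) / years_to_master


if __name__ == "__main__":
    for age in range(0, 16):
        print(f"age of exposure {age:2d}: attainment {ultimate_attainment(age):.1f}")
```

Under these assumptions, attainment stays at full mastery for ages of exposure up to 5 and then falls by one tenth per year, even though learning ability in the toy model is flat until age 15.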
As a result, it is impossible to directly infer developmental changes in underlying ability (the theoretical construct of interest) from age-related changes in ultimate attainment (the empirically available measurements). Fig. 2 shows that two very distinct ability curves, one with a steady decline from infancy (2A), the other with a sudden drop in late adolescence (2B), can give rise to indistinguishable ultimate attainment curves. (The curves are generated by our ELSD model, described below, but the point is model-independent.) Conversely, a rapid drop in ultimate attainment beginning at age 10 could be explained by a continuous decline in learning ability beginning in infancy ( Fig. 2C ) or by a discontinuous drop in learning rate at 15 years old ( Fig. 2D ). Moreover, quantitative differences in the magnitude of a hypothetical decline in underlying learning ability (which are not specified in existing theories) can give rise to qualitative differences in the empirically measured ultimate attainment curves, such as a gentle decline versus a sudden drop-off: compare Fig. 2A with 2C , and Fig. 2B with 2D .
Fig. 2. Simulation results showing how the mapping between hypothetical changes in underlying learning rate (the left graph in each pair) and empirically measured changes in ultimate attainment is many-to-many. These quantitative predictions were derived from the ELSD model, described below, but the basic point is model-independent.
1.2. The present study
As we have seen, to understand how language-learning ability changes with age, we must disentangle it from age of exposure, years of experience, and age at testing. Unfortunately, this challenge is insuperable with any study that fails to use sufficiently large samples and ranges, because any imprecision in measuring the effects of amount of exposure on attainment, the effects of age of first exposure on attainment, or both, will render the results ambiguous or even uninterpretable.
Moreover, an underlying ability curve can be ascertained only if the measure of language attainment is sufficiently sensitive: If learners hit an artificial ceiling, any gains from an earlier age of exposure or a greater amount of exposure will be concealed. Indeed, the concept of native proficiency entails extreme levels of accuracy. An error rate that would be considered excellent in other academic or psychological settings, such as 0.75%, represents a conspicuous immaturity in the context of language. For example, over-regularizations of irregular verbs, such as runned and breaked , are among the most frequently noted errors in preschoolers’ speech ( Pinker, 1999 ), despite occurring in only 0.75% of utterances (and on 2.5% of past-marked irregular verbs; Marcus et al., 1992 ).
These facts raise a significant practical problem. Detecting an error that occurs as little as 0.75% of the time requires a lot of data: a preschooler has to produce about 92 utterances to have an even chance of producing an over-regularization. Thus, to detect even “conspicuous” errors, such as childhood over-regularization, we need to test many subjects on many items.
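The figure of 92 utterances can be checked with a few lines of arithmetic. The sketch below assumes that each utterance is an independent trial with a 0.75% chance of containing an over-regularization; that independence assumption and the function names are ours, for illustration only.

```python
import math

def p_at_least_one(error_rate: float, n: int) -> float:
    """Probability of at least one error in n independent utterances."""
    return 1.0 - (1.0 - error_rate) ** n

def utterances_for_even_chance(error_rate: float) -> int:
    """Smallest n for which P(at least one error) strictly exceeds 0.5."""
    # (1 - p)^n < 0.5  <=>  n > log(0.5) / log(1 - p)
    crossover = math.log(0.5) / math.log(1.0 - error_rate)
    return math.floor(crossover) + 1

crossover_n = utterances_for_even_chance(0.0075)
print(crossover_n, p_at_least_one(0.0075, 92), p_at_least_one(0.0075, crossover_n))
```

Under this simplification the crossover lies just above 92 utterances, so 92 utterances bring the chance to almost exactly even.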
Below, we describe a study of syntax that attempts to meet these challenges using novel experimental and analytical techniques. To foreshadow, the age at which syntax-learning ability begins to decline is much later than usually suspected, and it takes both native and non-native speakers longer to reach their ultimate level of attainment than has been previously assumed. While both findings are unexpected, we show that the apparent inconsistencies with prior findings can be explained by the much higher precision afforded by our methods. Indeed, the findings below should not be surprising in retrospect. More importantly, these findings appear robust and emerge in a variety of different analyses.
2.1. Overview
Initial power calculations suggested that several hundred thousand subjects of diverse ages and linguistic backgrounds would be required to disentangle age of first exposure, age at testing, and years of exposure (we return to issues of power in the discussion, below). The standard undergraduate subject pools are not nearly large or diverse enough to achieve this, nor are crowdsourcing platforms like Amazon Mechanical Turk ( Stewart et al., 2015 ). Inspired partly by Josh Katz’s Dialect Quiz for the New York Times , we developed an Internet quiz we hoped would be sufficiently appealing as to attract large numbers of participants. In order to go viral, the quiz needed to be entertaining and intrinsically motivating while also quick to complete, since Internet volunteers rarely spend more than 10 min on a quiz. At the same time, to yield useful data the quiz had to include a robust, comprehensive measure of syntactic knowledge without an artificial ceiling, as well as elicit demographic data about age and linguistic background. Below, we describe how we addressed these desiderata. Procedures were approved by the Committee on the Use of Humans as Experimental Subjects at Massachusetts Institute of Technology.
2.2. Procedure
Potential subjects were invited to take a grammar quiz ( www.gameswithwords.org/WhichEnglish ), the results of which would allow a computer algorithm to guess their native language and their dialect of English. After providing informed consent, subjects provided basic demographic details (age, gender, education, learning disability) and indicated whether they had taken the quiz before. They then completed the quiz and were presented with the algorithm’s top three guesses of their native language and their dialect, which were based on the Euclidean distance between the vector of the subject’s responses and the vector of mean responses for each language and dialect. Participants found this aspect of the quiz highly engaging, and the quiz was widely shared on social media. For instance, it was shared more than 300,000 times on Facebook.
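As a rough illustration of the guessing step, the following sketch ranks groups by the Euclidean distance between a subject's response vector and each group's mean response vector. The response coding, the group means, and the function names are invented for the example; the quiz's actual implementation is not described in more detail here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length response vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_guesses(responses, group_means, k=3):
    """Return the k group labels whose mean response vectors are closest."""
    ranked = sorted(group_means,
                    key=lambda label: euclidean(responses, group_means[label]))
    return ranked[:k]

# Toy data: four items coded 0/1; the group means are invented for illustration.
group_means = {
    "Finnish": [0.9, 0.2, 0.8, 0.1],
    "Turkish": [0.3, 0.7, 0.6, 0.4],
    "German": [0.8, 0.3, 0.2, 0.9],
    "Russian": [0.1, 0.9, 0.1, 0.8],
}
subject = [1, 0, 1, 0]
guesses = top_guesses(subject, group_means)
print(guesses)
```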
After seeing the guesses, subjects were invited to help us improve the algorithm by filling out a demographic questionnaire. (Although early answers were used to tune the algorithm, the algorithm’s accuracy quickly plateaued and was not tuned further.) This included all the countries they had lived in for at least 6 months, and all the languages they spoke from birth. 1 Participants who listed multiple countries were asked to indicate their current country. For some countries (such as the USA), additional localizing information was collected. Participants who did not report speaking English from birth were asked at what age they began learning English, how many years they had lived in an English-speaking country, and whether any immediate family members were native speakers of English. Approximately 80% of subjects who completed the syntax questions also completed this demographic questionnaire. The data reported here come from those subjects.
2.3. Participants
All participants gave informed consent. A total of 680,333 participants completed the experiment, excluding repeats. We further excluded participants who gave inconsistent or implausible responses to the demographic questions (listing a current age less than the age of first exposure to English; listing a current age that is less than the number of years spent in an English-speaking country; reporting college attendance and a current age of less than 16, or reporting graduate school attendance and a current age of less than 19), resulting in 669,800 participants. Finally, based on the histogram of ages, we excluded participants younger than 7 and older than 89 as implausible. Note: a number of participants aged 7–10 reported in the comments that their parents helped by reading the quiz to them, adding credibility to those data. The resulting number of participants for the analyses was 669,498.
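The exclusion rules above amount to a handful of consistency checks per participant. A minimal sketch follows; the record keys and the education labels are hypothetical and do not reflect the column names of the released dataset.

```python
def is_plausible(p: dict) -> bool:
    """Apply the consistency checks described above to one participant record.

    The keys 'age', 'age_first_exposure', 'years_in_english_country', and
    'education' are hypothetical field names chosen for this example.
    """
    age = p["age"]
    if age < p["age_first_exposure"]:
        return False  # started learning English after current age
    if age < p["years_in_english_country"]:
        return False  # more years abroad than years alive
    if p["education"] == "college" and age < 16:
        return False
    if p["education"] == "graduate" and age < 19:
        return False
    # Final step: keep only ages 7 through 89.
    return 7 <= age <= 89

sample = [
    {"age": 30, "age_first_exposure": 10, "years_in_english_country": 5, "education": "college"},
    {"age": 8, "age_first_exposure": 10, "years_in_english_country": 0, "education": "none"},
    {"age": 15, "age_first_exposure": 5, "years_in_english_country": 0, "education": "college"},
]
kept = [p for p in sample if is_plausible(p)]
print(len(kept))
```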
The sample was demographically diverse ( Fig. 3 ). Thirty-eight languages were represented by at least 1000 native speakers, not counting individuals who had multiple native languages. The most common native languages other than English were Finnish (N = 39,962), Turkish (N = 36,239), German (N = 24,995), Russian (N = 22,834), and Hungarian (N = 22,108).
(A) Current country of residence of participants (excluding participants with multiple residences). (B) Histogram of participants by age of first exposure to English. (C) Native languages of the bilinguals (excluding English). (D) Histogram of participants by current age.
Analyses focused on three subject groups. Monolinguals (N = 246,497) grew up speaking English only; their age of first exposure was coded as 0. Immersion learners (N = 45,067) were either simultaneous bilinguals who grew up learning English simultaneously with another language (age of first exposure = 0), or later learners who learned English primarily in an English-speaking setting (defined as spending at least 90% of their life since age of first exposure in an English-speaking country). Non-immersion learners (N = 266,701) had spent at most 10% of post-exposure life in an English-speaking country and no more than 1 year in total. 2 Subjects with intermediate amounts of immersion (N = 122,068) were not analyzed further.
2.4. Materials
We took a shotgun approach to assessing syntax, using as diverse a set of items as we could fit into a short quiz, addressing such phenomena as passivization, clefting, agreement, relative clauses, preposition use, verb syntactic subcategorization, pronoun gender and case, modals, determiners, subject-dropping, aspect, sequence of tenses, and wh- movement. This broad approach has two advantages. First, it provides a more comprehensive assessment of syntactic phenomena than many prior studies, which focused on a smaller number of phenomena ( Flege, Yeni-Komshian, & Liu, 1999 ; Johnson & Newport, 1989 ; Mayberry & Lock, 2003 ). Second, this diversity provides some robustness to transfer from the first language. That is, while native speakers of some languages may find certain phenomena easier to master than others (e.g., Spanish-speakers may find tense reasonably natural while Mandarin-speakers may find word-order restrictions intuitive), the diversity of items should help wash out these differences (see also discussion below).
2.4.1. Item selection
Items were subjected to several rounds of pilot testing to select a sufficient number of critical items that were diagnostic of proficiency (neither too easy nor too hard) and that represented a wide range of grammatical phenomena, while requiring less than 10 min to complete. These included phenomena known to present difficulties for children, such as passives and clefts, and for non-native speakers, such as tenses and articles. We focused particularly on items known to be difficult for speakers of a variety of first languages: in particular, Arabic, French, German, Hindi, Japanese, Korean, Mandarin, Russian, Spanish, or Vietnamese. Based on previous experiments on gameswithwords.org, we expected these to be among the most common native languages.
In addition to the critical items, we included items designed to distinguish among English dialects drawn from websites describing “Irishisms,” “Canadianisms”, and so on. These items were not used for assessing language proficiency and were not used in the data analyses below, but were important for recruiting subjects (see above). Several rounds of pilot-testing reduced this set to the smallest number of items that could reliably distinguish major English dialects.
As in most previous studies, we solicited grammaticality judgments (e.g., “Is the following grammatical: Who whom kissed ?”). In order to shorten the test and improve the subject experience, where possible we grouped multiple grammaticality judgments into a single multiple-choice question. Because the grammaticality judgment task is time-consuming and unsuitable for probing certain grammatical phenomena, we also included items that required matching a sentence to a picture (e.g., to probe topicalization and the application of linking rules). Several rounds of piloting were used to construct a test that involved items of a range of difficulty.
The final set of 132 items is provided in the Supplementary Materials . Of these, 95 were critical items, defined as items for which the same response was selected by at least 70% of the native English speaking adults 18–70 years old in our full dataset in each of thirteen broadly-defined English dialects (Standard American, African American Vernacular English, Canadian, English, Scottish, Irish, North Irish, Welsh, South African, Australian, New Zealand, Indian, and Singaporean). (For obvious reasons, the exact number of critical items was not known until after the data was collected.) All analyses below are restricted to this set.
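The criterion for a critical item can be sketched as follows, assuming that for each item we hold the responses of native English-speaking adults grouped by dialect. The data layout and the function name are hypothetical.

```python
from collections import Counter

def is_critical(responses_by_dialect, threshold=0.70):
    """True if the same response reaches the threshold in every dialect.

    responses_by_dialect maps a dialect label to a non-empty list of
    responses to a single item (hypothetical layout).
    """
    shared_answer = None
    for responses in responses_by_dialect.values():
        answer, count = Counter(responses).most_common(1)[0]
        if count / len(responses) < threshold:
            return False  # no response is dominant enough in this dialect
        if shared_answer is None:
            shared_answer = answer
        elif answer != shared_answer:
            return False  # dialects agree internally but not with each other
    return shared_answer is not None

# Toy responses, invented for illustration.
item_a = {"Canadian": ["ok"] * 8 + ["bad"] * 2, "Irish": ["ok"] * 7 + ["bad"] * 3}
item_b = {"Canadian": ["ok"] * 9 + ["bad"], "Irish": ["bad"] * 9 + ["ok"]}
item_c = {"Canadian": ["ok"] * 6 + ["bad"] * 4, "Irish": ["ok"] * 10}
print(is_critical(item_a), is_critical(item_b), is_critical(item_c))
```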
Many prior studies classify items according to the syntactic phenomenon they test. While this is straightforward for certain types of tests, such as our sentence-picture matching items, the accuracy of these categorizations for grammaticality judgments is unclear. For instance, in judging a sentence to be grammatical, subjects can hardly be expected to know which syntactic rule the experimenter deliberately did not violate. Likewise, ungrammatical sentences may implicate different rules depending on what the intended message was: I eats dinner could involve an agreement error on the verb or a failure of pronoun selection. Thus, the syntactic violation that catches the subject’s eye may not be the one the experimenter had in mind. Because our goal was merely to have a diverse set of items, an exact count of syntactic phenomena is less important than demonstrating diversity. Thus, we have bypassed these theoretically thorny issues by avoiding categorization and simply providing the entire stimulus set in the Supplementary Materials . As a result, readers can judge for themselves whether the items are sufficiently diverse.
2.4.2. Test reliability
Reliability for the critical items was high across the entire dataset (Cronbach’s alpha = 0.86). Because monolingual subjects were close to ceiling, reliability is expected to be lower for that subset. Reliability is a measure of covariation, and the monolinguals exhibited very little variation (the majority missed fewer than 3 items), exactly as one would expect for a valid test. However, reliability for monolinguals was still well above chance (0.66), indicating that what few errors they made were not randomly distributed (as would be expected from mere sloppiness) nor concentrated on a few “bad” items (in which case, there would be little variance). Thus, our test was sensitive to differences in grammatical knowledge even for monolinguals who were close to ceiling. It is difficult to compare these numbers to prior studies, since most did not report reliability (but see DeKeyser, 2000 ; DeKeyser, Alfi-Shabtay, & Ravid, 2010 ; Granena & Long, 2013 ).
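For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below uses the standard formula on an invented toy matrix; it is not the analysis script used for the values reported above.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-subject lists of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data: four subjects, three items scored 0/1 (invented for illustration).
toy_scores = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha = cronbach_alpha(toy_scores)
print(alpha)
```

Because the total-score variance sits in the denominator, a group that rarely errs (such as monolinguals near ceiling) has little variance to share across items, which is why alpha tends to be lower for that subset.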
2.4.3. Data
The resulting dataset is available at http://osf.io/pyb8s .
3.1. Learning rate
We focus first on the difficult but theoretically important question of the underlying learning rate. We defer the traditional question of level of ultimate attainment to a later section. Note that all analyses are conducted in terms of log-odds (the log-transformed odds of a correct answer, using the empirical logit method to avoid division by zero) rather than percent correct. Although prior work on critical periods has tended to use percent correct, this is problematic. Specifically, percentage points are not all of equal value, being more meaningful closer to 0% or 100% than when near 50% ( Jaeger, 2008 ). That is, the difference between 95% and 96% is “larger” than the difference between 55% and 56%. Thus, the use of percentages artificially imposes ceiling effects, inflating both Type I and Type II error rates, particularly for interactions. Similarly, graphing results in terms of percentage correct distorts the results (particularly the shapes of curves), and so we have graphed in terms of log odds. For reference, we have included percent correct on the right-hand side of many of the graphs.
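A short sketch of the transformation may help. It uses a common form of the empirical logit, which adds 0.5 to the counts of correct and incorrect answers so that perfect scores remain finite; the exact constant used in the analyses is not stated here, so treat it as an assumption.

```python
import math

def log_odds(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def empirical_logit(n_correct: int, n_items: int) -> float:
    """Empirical logit: the 0.5 adjustment avoids division by zero at 0% or 100%."""
    return math.log((n_correct + 0.5) / (n_items - n_correct + 0.5))

# The same one-point gain is larger in log-odds near ceiling than near 50%.
gap_near_ceiling = log_odds(0.96) - log_odds(0.95)
gap_near_middle = log_odds(0.56) - log_odds(0.55)
perfect_score = empirical_logit(95, 95)  # finite despite 100% accuracy
print(gap_near_ceiling, gap_near_middle, perfect_score)
```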
Fig. 4 plots the level of performance against current age in separate curves for participants with different ranges of age of first exposure. It simultaneously reveals the effects of age of first exposure (the differences among the curves) and total years of exposure (the left-to-right position along each curve). Immersion learners—who were less numerous than the other groups—were aggregated into three-year bins for age of exposure, except for the simultaneous bilinguals (age of exposure = 0), who constituted their own bin. Curves were smoothed with a five-year floating window (analyses on non-smoothed data are discussed in the next subsection), and each of the estimated performance curves (described below) was restricted to consecutive ages for which there were at least ten participants in the five-year window, leaving 244,840 monolinguals, 44,412 immersion learners, and 257,998 non-immersion learners.
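One way to picture the smoothing step is a centered five-year window that pools all participants whose current age falls within two years of the target age, and that is dropped when fewer than ten participants are available. The centering and pooling choices below are assumptions, and the sketch does not enforce the additional restriction to consecutive ages.

```python
def smooth_curve(scores_by_age, half_width=2, min_n=10):
    """Floating-window mean of scores, keyed by age.

    scores_by_age maps an integer age to a list of per-participant scores.
    Ages whose window holds fewer than min_n participants are omitted.
    """
    smoothed = {}
    for age in sorted(scores_by_age):
        window = [score
                  for a in range(age - half_width, age + half_width + 1)
                  for score in scores_by_age.get(a, [])]
        if len(window) >= min_n:
            smoothed[age] = sum(window) / len(window)
    return smoothed

# Toy scores in log-odds, invented for illustration.
toy = {20: [1.0] * 4, 21: [2.0] * 4, 22: [3.0] * 4, 40: [5.0] * 3}
smoothed = smooth_curve(toy)
print(smoothed)
```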
(A and B) Performance curves for monolinguals and immersion learners (A) and non-immersion learners (B) under 70 years old, smoothed with five-year floating windows. (C and D) Corresponding curves for the best-fitting model. (E) Learning rate for the best-fitting model (black), with examples of the many hypotheses for how learning rate changes with age that were considered in model fitting (grey). For additional detail, see Fig. 7 , S3, and S6 .
In order to estimate how underlying learning ability changes with age, we used a novel computational model to disentangle current age, age of first exposure, and amount of experience. Specifically, we modeled syntax acquisition as a simple exponential learning process:
(1) $$g(t) = \begin{cases} 0, & t \le t_e \\ 1 - e^{-\int_{t_e}^{t} E \, r(\tau) \, d\tau}, & t > t_e \end{cases}$$
where $g$ is grammatical proficiency, $t$ is current age, $t_e$ is age of first exposure, $r$ is the learning rate, and $E$ is an experience discount factor, modeled separately for simultaneous bilinguals, immigrants, and non-immersion learners, reflecting the fact that they may receive less English input than monolinguals. We modeled a possible developmental change in the learning rate $r$ as a piecewise function in which $r$ is constant (at a value $r_0$) from birth to age $t_c$, whereupon it declines according to a sigmoid with shape parameters $\alpha$ and $\delta$ ($\alpha$ controls the steepness of the sigmoid, and $\delta$ moves its center left or right):
(2) $$r(t) = \begin{cases} r_0, & t \le t_c \\ r_0 \left( 1 - \dfrac{1}{1 + e^{-\alpha (t - t_c - \delta)}} \right), & t > t_c \end{cases}$$
The piecewise structure of this Exponential Learning with Sigmoidal Decay (ELSD) model, and the fact that sigmoid functions can accommodate both flat and steep declines, allows it to capture a very wide range of developmental trajectories, including all of those discussed in the literature. Learning rate may be initially high or low, begin declining at any point in the lifespan (or not at all), decline rapidly or gradually, decline continuously or discontinuously, etc. Examples of the many possibilities encompassed by the model include the different curves shown in Figs. 2 and S2 , as well as the gray lines in Fig. 4E .
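To make the shape of such a model concrete, the sketch below implements one reading of it. It assumes that proficiency is 1 − exp(−∫ E·r dτ), integrated from the age of first exposure to the current age, and that after age t_c the rate is multiplied by a falling sigmoid with steepness alpha and shift delta. All parameter values are invented for illustration and are not the fitted values.

```python
import math

def learning_rate(t, r0, t_c, alpha, delta):
    """Constant rate r0 up to age t_c, then a sigmoidal decline."""
    if t <= t_c:
        return r0
    return r0 * (1.0 - 1.0 / (1.0 + math.exp(-alpha * (t - t_c - delta))))

def proficiency(t, t_e, E, r0, t_c, alpha, delta, step=0.01):
    """g(t) = 1 - exp(-integral of E * r from t_e to t); 0 before exposure."""
    if t <= t_e:
        return 0.0
    n = max(1, int(round((t - t_e) / step)))
    h = (t - t_e) / n
    # Midpoint rule for the accumulated, discounted learning.
    area = h * sum(E * learning_rate(t_e + (i + 0.5) * h, r0, t_c, alpha, delta)
                   for i in range(n))
    return 1.0 - math.exp(-area)

# Invented parameters: rate 0.1 per year, decline starting at 17.4 years.
params = dict(r0=0.1, t_c=17.4, alpha=0.5, delta=5.0)
early = proficiency(10, 0, 1.0, **params)   # 10 years of experience from birth
late = proficiency(40, 30, 1.0, **params)   # 10 years of experience from age 30
print(early, late)
```

With these toy values, ten years of experience starting at birth yield far higher proficiency than ten years starting at age 30, which is the qualitative pattern the model is designed to capture.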
The model was fitted simultaneously to the performance curves for monolinguals, immersion learners, and non-immersion learners (cf. Fig. 4A and B ). Parameters were fit with Differential Evolution ( Mullen, Ardia, Gil, Windover, & Cline, 2011 ) and compared using Monte Carlo split-half cross-validated R², which avoids over-fitting. The best-fitting model (R² = 0.89) involved a rate change beginning at 17.4 years ( Fig. 4E ). The fit was significantly better than the best fit for alternative models in which learning rate did not change (R² = 0.66) or changed according to a step function with no further decline in the learning rate after the initial drop (R² = 0.70). Details on these and related models can be found in the supplementary materials .
3.2. Interim discussion
Though the ELSD model is necessarily simplified, the good fit between model and data, and the poorer fit by reasonable alternatives, offer good support for the existence of a critical period for language acquisition, and suggest that our estimate of when the learning rate declines (17.4 years old) is likely to be reasonably accurate.
This age is much later than what is usually found for the offset of the critical period for native-like ultimate attainment of syntax. However, as discussed in the Introduction, because language acquisition takes time, there is no reason to suppose that the last age at which native-like ultimate attainment can be achieved is the same as the age at which underlying ability declines (see also Patkowski, 1980 ). Instead, the relationship between ultimate attainment and critical periods is complex, depending also on how long it takes to learn a language. The ELSD model disentangles these factors. In order to better understand the results of the above analyses, we look at these issues in turn.
3.3. The duration of learning
Little is known about how long it takes learners to reach asymptotic performance. On the one hand, developmentalists have observed that by 3–5 years of age, most children show above-chance sensitivity to many syntactic phenomena ( Crain & Thornton, 2011 ; Pinker, 1994 ). Indeed, our youngest native speakers (~7 years old) were already scoring very well on our quiz ( Fig. 5B ).
(A) Histogram of cutoffs used for minimum years of experience to asymptotic learning in previous studies of syntax ( Abrahamsson, 2012 ; Birdsong & Molis, 2001 ; DeKeyser, 2000 ; DeKeyser et al., 2010 ; Flege et al., 1999 ; Granena & Long, 2013 ; Jia et al., 2002 ; Johnson & Newport, 1989 , 1991 ; Mayberry & Lock, 2003 ; Mayberry, Lock, & Kazmi, 2002 ; McDonald, 2000 ; Weber-Fox & Neville, 1996 ). Papers with multiple studies are included only once, except for McDonald (2000) , which used different cutoffs in two different studies. (B) Accuracy for monolinguals (N = 246,497) and simultaneous bilinguals (N = 30,397). Shadowed area represents ± 1 SE. This highlights information also available in Fig. 4A .
While certainly an important fact about acquisition, this is the wrong standard for research into critical periods. The question has never been “why do non-native speakers not match the competency level of preschoolers?” Many of them do. In fact, in our dataset, even non-native immersion learners who began learning in their late 20s eventually surpassed the youngest native speakers in our dataset ( Fig. 4A ).
Instead, the puzzle driving this entire research domain is why later learners do not reach the same proficiency level as mature native speakers. That is a much higher standard. Many other aspects of syntax continue to develop in the school-age years ( Berman, 2004 , 2007 ; Nippold, 2007 ), and prior studies have not been able to determine the age at which syntactic development concludes. Even for those aspects of syntax that preschoolers are sensitive to, they are rarely at ceiling, and they typically do worse than college-age adults, whether assessed through comprehension, elicited production, or spontaneous production (e.g., Kidd & Bavin, 2002 ; Kidd & Lum, 2008 ; Marcus et al., 1992 ; Messenger, Branigan, McLean, & Sorace, 2012 ; Rowland & Pine, 2000 ). However, while we know that performance continues to improve into the school ages, the literature has little to say about when children attain adult levels of accuracy. Moreover, the common practice of comparing children to college-aged adults necessarily renders undetectable any post-college development.
Even less is known about how long non-native speakers continue to improve on the target language. While a few studies found limited continued improvement for immersion learners after the first five years ( Johnson & Newport, 1989 ; Patkowski, 1980 ), these studies had minimal power to detect continued improvement (see below). Specifically, looking at samples of non-native learners who were selected to have at least three years ( Johnson & Newport, 1989 ) or five years ( Patkowski, 1980 ) of experience, these authors found that while age of first exposure predicted performance, length of experience did not. In contrast, analysis of US Census data suggests that learning continues for decades ( Stevens, 1999 ), though the validity of this self-report data is uncertain. Analysis of foreign language education suggests learning in that context may continue for a couple of decades, though this may merely reflect the slower pace of non-immersion learning ( Huang, 2015 ).
This empirical uncertainty is reflected directly in the ultimate attainment literature. Ultimate attainment analyses require restricting analysis to those subjects who have been learning the target language long enough to have reached asymptote (e.g., Johnson & Newport, 1989 ). In the absence of any clear evidence, researchers have chosen a diverse set of cut-offs, ranging anywhere from three ( Birdsong & Molis, 2001 ; McDonald, 2000 ) to fifteen years ( Abrahamsson, 2012 ) ( Fig. 5A ).
Inspection of Fig. 5B suggests that native speakers did not reach asymptote until around 30 years old, though most of the learning takes place in the first 10–20 years. The results for later learners shown in Fig. 4 similarly suggest a protracted period of learning (for detailed results, see Figs. S21 and S22 in the Supplementary Materials , and surrounding discussion). Note that the increases in performance after the first 15–20 years are modest, which accords with the fact that they are not routinely noticed.
While this prolonged learning trajectory was not anticipated in the language learning literature, it joins mounting evidence that many cognitive abilities continue to develop through adolescence and even adulthood, including working memory, face recognition, magnitude estimation, and various measures of crystalized intelligence ( Germine, Duchaine, & Nakayama, 2011 ; Halberda, Ly, Wilmer, Naiman, & Germine, 2012 ; Hartshorne & Germine, 2015 ).
Thus, even native speakers—who are able to make full use of the critical period—take a very long time to reach mature, native-like proficiency. By implication, someone who started relatively late in the critical period—that is, someone who had limited time to learn at the high rate the critical period provides—would simply run out of time. In order to follow up on this issue and test this implication, we turn to analysis of ultimate attainment.
3.4. Ultimate attainment
Based on the results above, we expect that the last age of first exposure at which native-like attainment is still within reach is likely well prior to 17. Below, we first estimate this age from our own data and then compare that against previous estimates.
Following the usual practice, we first restrict the analysis to those subjects who have been learning English long enough to have reached asymptote (e.g., Johnson & Newport, 1989 ). As described in the previous section, there is no consensus as to how long “long enough” is (see Fig. 5A ). This stems from the fact that, prior to our own study, there was little data to constrain hypotheses (see previous section). Inspection of Figs. 4 and 5 suggests 30 years of experience as a reasonable cutoff.
Thus, to estimate the age at which mastery of a second language is no longer attainable, we analyzed ultimate attainment curves by focusing on the 11,371 immersion learners and 29,708 non-immersion learners who had at least 30 years of experience (ensuring asymptotic learning) and who were at most 70 years old (avoiding age-related decline) ( Fig. 6 ). We fitted these curves using multivariate adaptive regression splines ( Friedman, 1991 ; Milborrow, 2014 ). Immersion learners showed only a minimal decline in ultimate attainment until an age of first exposure of 12 years ( B = −0.009; 0.01 SDs/year), after which the decline became significantly steeper ( B = −0.06; 0.07 SDs/year). Non-immersion learners showed similar results: From 4 years to 9 years, proficiency showed no decline (in fact it increased slightly; B = 0.01; 0.01 SDs/year), followed by a steep decline ( B = −0.06; 0.07 SDs/year). Two other methods of estimating changes in slope provided similar results (see Supplementary Materials ).
Ultimate attainment for monolinguals, immersion learners, and non-immersion learners, smoothed with a three-year floating window. Shadowed areas represent ± 1 SE. Attainment for monolinguals was significantly higher than that of simultaneous bilinguals (immersion learners with exposure age = 0) ( p < .01).
While these analyses employ the standard method of analyzing subjects who have (presumably) already reached ultimate attainment, the density of our data allows a more direct analysis. Fig. 7 re-plots the data in Fig. 4 against years of experience, aligning the curves for the learners who began at different ages at the onset of learning. Inspection reveals that the learning trajectories for immersion learners who began in the first decade of life (the orange curves) are almost indistinguishable ( Fig. 7A ). We see a similar trend for the non-immersion learners ( Fig. 7B ).
Accuracy as a function of years of experience, by age of first exposure for immersion learners (A) and non-immersion learners (B). Color scheme is same as in Fig. 4 . Red: monolinguals. Orange: AoFE < 11. Green: 10 < AoFE < 21. Blue: AoFE > 20. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
We confirmed these observations with permutation analysis. Specifically, we calculated the average difference between each performance curve and the performance curve for the youngest learners of that type (the simultaneous bilinguals for immersion learners, the learners with an age of first exposure of 4 years for the non-immersion learners). A positive score indicated that the performance curve was, on average, below the curve for the earliest learners. We then constructed an empirical distribution by randomly permuting the age of exposure across participants at a given number of years of experience. The curves were again smoothed with five-year floating windows and the difference scores were again calculated. This was repeated 1000 times. The percentage of cases in this distribution in which the difference score for a given performance curve is larger than the actual difference score for that performance curve serves as a one-tailed p -value (all comparisons reported as significant are also significant as two-tailed tests). These analyses revealed that the performance curves for immersion learners with average exposure ages of 2, 5, and 8 years were not significantly different from those of simultaneous bilinguals (exposure age = 0; p s > 0.31), while the curves for later learners were significantly lower ( p s < 0.01). Similarly, non-immersion learners with ages of exposure of 5–11 years were indistinguishable from our earliest non-immersion learners (4 years; ps > 0.31), whereas later learners learned significantly more slowly ( p s < 0.01).
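The logic of the permutation analysis can be illustrated with a simplified sketch. It compares a reference group (the earliest learners) with one later group, shuffles group labels only among participants with the same number of years of experience, and reports the share of permutations whose gap exceeds the observed gap. The smoothing step is omitted, and the data are synthetic and invented for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def curve_gap(records):
    """Average over experience levels of (reference mean - comparison mean).

    records: list of (years_of_experience, is_reference, score) tuples.
    """
    by_years = {}
    for years, is_reference, score in records:
        ref, other = by_years.setdefault(years, ([], []))
        (ref if is_reference else other).append(score)
    return mean([mean(ref) - mean(other)
                 for ref, other in by_years.values() if ref and other])

def permutation_p(records, n_perm=1000, seed=0):
    """One-tailed p-value from shuffling labels within each experience level."""
    rng = random.Random(seed)
    observed = curve_gap(records)
    strata = {}
    for i, (years, _, _) in enumerate(records):
        strata.setdefault(years, []).append(i)
    larger = 0
    for _ in range(n_perm):
        labels = [r[1] for r in records]
        for idx in strata.values():
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)
            for i, label in zip(idx, shuffled):
                labels[i] = label
        permuted = [(r[0], label, r[2]) for r, label in zip(records, labels)]
        if curve_gap(permuted) > observed:
            larger += 1
    return observed, larger / n_perm

# Synthetic scores: the comparison group is 0.5 log-odds lower on average.
data_rng = random.Random(1)
records = []
for years in range(1, 6):
    for _ in range(20):
        records.append((years, True, data_rng.gauss(2.0 + 0.1 * years, 0.3)))
        records.append((years, False, data_rng.gauss(1.5 + 0.1 * years, 0.3)))
observed_gap, p_value = permutation_p(records)
print(observed_gap, p_value)
```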
3.4.1. Comparison with previous ultimate attainment results
Both traditional ultimate attainment analyses and permutation analyses indicated that learners must start by 10–12 years of age to reach native-level proficiency. Those who begin later literally run out of time before the sharp drop in learning rate at around 17–18 years of age. For non-immersion learners, the ceiling was lower but the overall story was the same: little difference between learners who start within the first decade of life, with a ceiling that noticeably drops for later learners. These findings are consistent with the protracted trajectory of learning that we observe in our data (see previous section).
However, our results for immersion learners diverge from those of some previous studies (there are no similar studies of non-immersion learners). For instance, Johnson and Newport’s (1989) study of immersion learners found no correlation between ultimate attainment and age of first exposure after an onset age of 16, whereas we see a strong relationship (for review, see Qureshi, 2016 ). In principle, this could be due to differences in subject population or the types of grammar rules tested. Indeed, researchers frequently argue that such differences have large effects on ultimate attainment, based on the fact that studies of different populations or stimuli have produced different results ( Abrahamsson, 2012 ; Birdsong & Molis, 2001 ; DeKeyser, 2000 ; DeKeyser et al., 2010 ; Flege et al., 1999 ; Granena & Long, 2013 ; Hakuta et al., 2003 ; Jia et al., 2002 ; Johnson & Newport, 1989 ; Vanhove, 2013 ; Weber-Fox & Neville, 1996 ).
However, a recent analysis by Vanhove (2013) raised questions about whether these differences are statistically meaningful. Whereas most prior studies had between 50 and 250 subjects, Vanhove demonstrates that precisely measuring how ultimate attainment changes as a function of age of first exposure requires thousands. Only one previous dataset, based on US Census data, reaches sufficient sample size ( Hakuta et al., 2003 ; Stevens, 1999 ). However, this study was based on a self-report of proficiency on a four-point scale, which is unlikely to have much precision. Thus, differences across findings in the literature could reflect nothing more than random noise.
Thus, in order to better understand whether the differences in our findings and those of prior studies are meaningful, we need to consider the precision of these findings. We estimated precision using bootstrapping, simulating running many different studies by resampling with replacement from our own data ( Efron & Tibshirani, 1993 ). The results of each simulation will be slightly different, and so the range of results across simulations simulates the variability we would expect from statistical noise alone. Crucially, we can simulate running studies with different sample sizes. Thus, we can ask whether Johnson and Newport’s (1989) findings are within what we might have found had we used our own methods but tested the same number of subjects (N = 69).
For our simulations, we considered two different sample sizes: N = 69, the size of the classic Johnson and Newport (1989) study, and N = 275, larger than the largest prior study, with the exception of the aforementioned Census studies. For comparison, we also simulated studies with N = 11,371, the number of subjects in our own ultimate attainment results described in the previous section.
We focused on three different analyses that have been reported in a number of prior studies ( Bialystok & Miller, 1999 ; Birdsong & Molis, 2001 ; DeKeyser, 2000 ; DeKeyser et al., 2010 ; Flege et al., 1999 ; Johnson & Newport, 1989 ; Weber-Fox & Neville, 1996 ). First, we considered Johnson and Newport’s finding that the correlation between age of first exposure and ultimate attainment is much stronger before an exposure age of 16 ( r = −0.87) than after ( r = −0.16). This finding has proved controversial, with subsequent studies finding much weaker effects or no effect at all ( Bialystok & Miller, 1999 ; Birdsong & Molis, 2001 ; DeKeyser, 2000 ; Johnson & Newport, 1989 ). All these prior findings are well within what one would expect for N = 69 ( Fig. 8 , upper left). As power increased, the variability in the estimates dropped dramatically, with more highly-powered studies being increasingly unlikely to find any substantial difference in the correlations before and after 16 years old.
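The resampling logic behind this comparison can be sketched in a few lines. The example draws simulated studies of a given size from a pool, computes the correlation before the cutoff minus the correlation after it, and reports the central 95% range of that difference. The pool here is synthetic, with a single constant slope and invented noise, and assigning an exposure age of exactly 16 to the later group is an assumption.

```python
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 in the degenerate zero-variance case."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def correlation_gap(sample, cutoff=16):
    """r(age of exposure, score) before the cutoff minus r from the cutoff on."""
    early = [(a, s) for a, s in sample if a < cutoff]
    late = [(a, s) for a, s in sample if a >= cutoff]
    if len(early) < 3 or len(late) < 3:
        return None  # too few points to estimate a correlation
    return pearson(*zip(*early)) - pearson(*zip(*late))

def bootstrap_interval(pool, sample_size, n_sims=500, seed=0):
    """Central 95% range of the correlation gap across simulated studies."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        sample = [rng.choice(pool) for _ in range(sample_size)]
        gap = correlation_gap(sample)
        if gap is not None:
            estimates.append(gap)
    estimates.sort()
    return (estimates[int(0.025 * len(estimates))],
            estimates[int(0.975 * len(estimates)) - 1])

# Synthetic pool: (age of first exposure, score), invented for illustration.
pool_rng = random.Random(42)
pool = []
for _ in range(5000):
    age = pool_rng.randint(0, 40)
    pool.append((age, 3.0 - 0.05 * age + pool_rng.gauss(0.0, 0.5)))

low_69, high_69 = bootstrap_interval(pool, 69)
low_275, high_275 = bootstrap_interval(pool, 275)
print(high_69 - low_69, high_275 - low_275)
```

Even though the synthetic pool contains no change in slope at 16, small simulated studies produce a wide spread of correlation differences, and the spread narrows as the sample size grows.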
Fig. 8. We conducted 2500 simulated experiments of monolingual and immersion learners with each of three sample sizes: N = 69 (equivalent to Johnson & Newport, 1989 ), N = 275 (larger than the largest prior lab-based study), and N = 11,371 (equivalent to the present study). Three analyses were considered. Left: Correlation between age of first exposure and ultimate attainment prior to 16 years old minus after 16 years old. Middle: First subgroup of subjects to be significantly worse than monolinguals in a t -test (note: the top graph uses the same age bins as Johnson & Newport, 1989 ). Right: age of first exposure at which performance begins to decline more rapidly, if any. Blue: estimates from Bialystok and Miller (1999) , Birdsong and Molis (2001) , DeKeyser (2000) , DeKeyser et al. (2010) , Flege et al. (1999) , Johnson and Newport (1989) , and Weber-Fox and Neville (1996) . While many other papers addressed similar issues, these papers provide the closest analog to Johnson & Newport in that they used a broad-spectrum test of syntax, defined the onset of learning as the age at immigration, and (crucially) report comparable statistics. Red: estimates from current study. Full details available in Supplementary Materials . (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Second, Johnson and Newport also reported that individuals who began learning English at 8–10 years old failed to reach monolingual-like ultimate attainment, whereas individuals who began earlier did, suggesting that the “optimal period” for language-learning is 0–7 years old. Once again, there has been considerable variability in subsequent studies, and our own study finds that even simultaneous bilinguals do not quite reach monolingual levels. Vanhove (2013) suggested, based on power calculations, that accurately estimating the end of the optimal period requires thousands of subjects. Although a small study can detect very large effects, the differences between learners who began just within the optimal period and those who began just after are relatively small ( Fig. 6 ) and thus undetectable with a low-power study. Our simulations confirm this analysis ( Fig. 8 , middle column): in our simulation of Johnson & Newport ( Fig. 8 , middle column, top), the 95% confidence interval contained almost the entire range. Even with 275 subjects, a wide range of findings would be expected. However, simulations based on our full sample show no variability at all, with learners who began at 1 year of age performing reliably worse than monolinguals ( Fig. 8 , middle column, bottom).
Third, whereas the previous analysis of the optimal period followed Johnson and Newport’s method of using t -tests to compare native speakers to groups of later learners, subsequent researchers have instead used curve estimation (typically segmented regression with breakpoint estimation), which is argued to be more precise and less prone to false positives ( Birdsong & Molis, 2001 ; Vanhove, 2013 ; but see DeKeyser et al., 2010 ). If there is an optimal period, the slope of the ultimate attainment curve should initially be close to 0, followed by a point where it becomes significantly more negative. By this standard of evidence, most studies have failed to find any evidence of an optimal period ( Birdsong & Molis, 2001 ; Flege et al., 1999 ; Vanhove, 2013 ). Our simulations suggest these prior findings were false negatives due to low power: Like the majority of prior studies, low-power simulations elicited largely null results, whereas high-power simulations suggested an optimal period ending in early or middle childhood ( Fig. 8 , right).
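The breakpoint idea can be illustrated with a deliberately simplified sketch. It assumes a plateau-then-decline ("hinge") curve whose initial slope is fixed at exactly 0, and it picks the breakpoint by grid search over candidate ages. This is a special case of segmented regression, not the procedure used in the studies cited above; it also omits the significance test on the change in slope. All names are hypothetical.

```python
def fit_hinge(ages, scores, breakpoint):
    """Least-squares fit of score = a + b * max(0, age - breakpoint).

    Returns (a, b, sse). The slope before the breakpoint is fixed at 0,
    which is a simplifying assumption of this sketch.
    """
    h = [max(0.0, x - breakpoint) for x in ages]
    n = len(h)
    mh, my = sum(h) / n, sum(scores) / n
    shh = sum((v - mh) ** 2 for v in h)
    if shh == 0:
        b = 0.0  # breakpoint beyond all ages: the model is a flat line
    else:
        b = sum((v - mh) * (y - my) for v, y in zip(h, scores)) / shh
    a = my - b * mh
    sse = sum((y - (a + b * v)) ** 2 for v, y in zip(h, scores))
    return a, b, sse


def estimate_breakpoint(ages, scores, candidates):
    """Grid search: return the candidate breakpoint with the smallest SSE,
    together with the fitted plateau level, post-breakpoint slope, and SSE."""
    best = min(candidates, key=lambda c: fit_hinge(ages, scores, c)[2])
    a, b, sse = fit_hinge(ages, scores, best)
    return best, a, b, sse


if __name__ == "__main__":
    # Hypothetical noise-free curve: flat until age 12, then declining.
    ages = list(range(0, 41))
    scores = [0.95 - 0.01 * max(0, x - 12) for x in ages]
    print(estimate_breakpoint(ages, scores, range(1, 40)))
```

With noisy data and few subjects, the SSE profile across candidate breakpoints tends to be flat, so the selected breakpoint is unstable. That is one way to see why low-power studies can fail to detect an optimal period even when one exists.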
3.4.2. Interim discussion
Two sets of analyses of our data suggest that learners who begin as late as 10–12 years old reach similar levels of ultimate attainment as native bilinguals. After that age, we find a continuous decline in attainment as a function of age of first exposure, with no evidence that this relationship ceases after a particular age (cf. Johnson & Newport, 1989 ; Pulvermüller & Schumann, 1994 ). These findings are consistent with our results for learning rate. Interestingly, these findings held not only for immersion but also non-immersion learners, a population that has not been much studied in this regard.
Our findings do contrast with the conclusions of some prior studies of ultimate attainment in immersion learners. However, as our simulations show, these conclusions were probably overfit to point estimates. That is, conclusions depended on the most probable estimate (the optimal period ends at 8 years of age), ignoring the error bars, which in some cases were likely so large as to encompass the entire possible range ( Fig. 8 ). In contrast, our larger sample size allows for fairly precise estimates ( Fig. 8 ). These simulations support Vanhove’s (2013) contention that thousands of subjects are required to provide reliable conclusions about ultimate attainment. Note that we cannot conclude that differences in stimuli or population do not matter for ultimate attainment, only that studying such effects requires very large datasets. We return to this issue in the General Discussion.
4. General discussion
Taken together, the analyses above all point to a grammar-learning ability that is preserved throughout childhood and declines rapidly in late adolescence. This model provided a better fit to the data than did a wide range of alternatives, including models with declines that were earlier or later, faster or slower, sharper or smoother.
In addition to providing the first empirical estimate of how language-learning ability changes with age, we addressed two related issues. First, we found that native and non-native learners both require around 30 years to reach asymptotic performance, at least in immersion settings. While this question has not been previously addressed, these findings are compatible with what is known about the initial period of learning.
Second, we found that ultimate attainment (that is, the level of asymptotic performance) is fairly consistent for learners who begin prior to 10–12 years of age. We found no evidence that the ultimate attainment curve reaches a floor at around puberty, as has been previously proposed ( Johnson & Newport, 1989 ). While these results differed from the conclusions of some prior studies, our simulations showed that the prior findings were in fact too noisy to provide precise estimates. 3 To provide reliable results about ultimate attainment, a study should have in excess of 10,000 subjects (see also Vanhove, 2013 ). This suggests that the results of those prior studies, all but one of which have fewer than 250 subjects, largely reflect statistical noise. The remaining study had many subjects but uncertain validity (see discussion above).
This set of results is internally consistent, adding credibility to the whole. However, our conclusions—like any conclusions—are only as good as the data supporting them. Below, we address a number of possible concerns. These include both methodological concerns about the data and how they were collected but also more theoretical concerns, like the possibility that results differ across subsets of subjects or items. We then conclude by discussing the implications of our results, should they prove valid and robust.
4.1. Potential concerns and complications
4.1.1. Familiarity with the testing procedure
One possible concern is that differences across subjects were due to age-related differences in familiarity with the Internet. Prior comparisons of Internet-based and offline datasets have found little support for this concern ( Hartshorne & Germine, 2015 ). Similarly, some of the differences between children and adults could conceivably be due to general test-taking ability. In order to better understand interactions between subject age and test method, if any, it would be ideal to gather data from a variety of tests in a variety of modalities.
Crucially, however, most of our analyses did not depend on the current age of the subject but on their age at first exposure, which should weaken any effects of current age. Moreover, we can compare the learning trajectories of learners who started at different ages (see Figs. 4 and 7 but especially Figs. S21–S22 in the Supplementary Materials ). If older subjects are substantially better at taking our test, this should appear as more rapid early learning. As inspection of the figures indicates, any such effect is inconsistent and small.
4.1.2. Test modality
Our use of a written comprehension test was dictated by our methodology. Comprehension studies can be scored automatically (which is crucial when there are over half a million subjects), and written tests do not require high-quality audio equipment or sound booths. Nonetheless, one might ask how these choices affected our results.
Certainly, differences between production and comprehension and between written and oral modalities can affect comparisons between native and non-native speakers ( Bialystok & Miller, 1999 ). Listening places high demands on speed and memory (one can re-read but not rehear), and the speech must be analyzed by non-native acoustic phonetics and phonology, which we do not test here. Written tests require literacy. Production allows one to strategically avoid difficult and imperfectly learned words and constructions.
Whether any of these factors affect estimates of a critical period depends on whether they interact with the variables that define critical period effects, namely age at first exposure, current age, and years of experience. While the necessary studies are not currently feasible, this is likely to change as technology improves. (For instance, we are exploring the use of machine learning to characterize the nativeness of a written text.)
Importantly, none of these considerations would make the study of critical periods in written comprehension uninteresting or uninformative, merely complex. Results from any modality must reflect underlying grammatical ability at least to some degree, and reading comprehension is important in its own right, given the importance of reading in many modern societies. (In fact, for many non-native speakers, this may be their primary use for the non-native language.)
4.1.3. Item selection and quiz difficulty
Another potential worry is that our results may depend on smallish differences among subjects who are already near the ceiling (for relevant discussion, see: Abrahamsson & Hyltenstam, 2009 ; Birdsong, 2006 ). Mitigating this concern is that, as we argued in the Introduction, the ceiling is where all the action is. What is remarkable about language is that we are (nearly) all extremely good at it, including adult learners. For reference, we noted that over-regularizations of irregular verbs, which are among the most salient errors in the speech of preschoolers, occur in only 0.75% of their utterances. On a continuum of linguistic ability that includes apes and machines at one end, preschoolers and reasonably diligent late learners are clustered at the other end, near native-speaking adults. Indeed, the question in the critical period literature has never been why adults are incapable of learning a new language (obviously they are capable) but why adult learners so rarely (if ever) achieve native-like mastery. Likewise, asking whether adult learners can master basic syntax may be theoretically interesting but distracts from the original motivation for this literature: adult learners rarely, if ever, achieve the same level of mastery as those who started in childhood. In order to study that phenomenon, the relevant yardstick is the asymptotic performance of native speakers.
Still, we can ask whether our results hold for both items mastered early in typical development and for items mastered only in adolescence or adulthood. We found no evidence of such a difference: In the best-fitting models of learning, the learning rate began to slow at approximately the same time for the 47 items that are mastered by the youngest monolingual English-speakers in the sample (ages 7–8) as for the 48 items that are mastered only by the older ones: 17.3 years old and 18.2 years old, respectively. Moreover, if there were substantial interactions between item and age of first exposure, we would expect to see substantial differences in terms of which items were more or less difficult for early and late learners. However, item difficulty was strongly correlated across learners regardless of age of first exposure (for details of these analyses, see Supplementary Materials , “Item Effects”).
We might similarly ask whether results vary based on the type of syntactic construction tested. Prior analyses of ultimate attainment have provided conflicting results, likely due to the power issues discussed above ( Coppieters, 1987 ; Flege et al., 1999 ; Johnson & Newport, 1989 , 1991 ; McDonald, 2000 ; Weber-Fox & Neville, 1996 ) and the theoretical issues raised below. Our just-discussed analyses of item difficulty provide some initial evidence against substantial differences across syntactic phenomena. More precise analyses would involve the direct comparison of different types of constructions. Unfortunately, our quiz was designed to cover a wide range of phenomena, and thus we have few items of any given type, making it difficult to distinguish differences between items and differences between item types . In any case, such analyses raise thorny theoretical questions: different theories of syntactic processing categorize phenomena differently, and any given sentence involves many different phenomena. Thus, classifying items by syntactic phenomena is far from trivial and may not even be the right approach. Progress on this question will require a significant amount of further research. 4 If it turns out that different aspects of syntax do indeed have different critical periods, the conclusions presented here would need to be revised. Design of follow-up studies may be informed by comparing items in our dataset, which is available at http://osf.io/pyb8s .
4.1.4. The effect of the first language
Our results are unlikely to be specific to any one language or language family: Participants listed more than 6000 native languages or combinations of them. The best-represented language families among immersion and non-immersion learners were Uralic (N = 54,664), Slavic (N = 41,640), West Germanic (N = 38,385), Romance (N = 40,476), Turkic (N = 29,816), and Chinese (N = 15,161). The remaining 29% of participants either had multiple native languages or had native languages belonging to a different family. Thus, no language contributed more than a small fraction of the immersion or non-immersion learners ( Fig. 3C ). However, this leaves the possibility that our results reflect an epiphenomenal average of very different trajectories for very different types of learners ( Bialystok & Miller, 1999 ; McDonald, 2000 ).
It is uncontroversial that speakers of different native languages make characteristic mistakes when speaking English ( Schachter, 1990 , among others); indeed, the algorithm we used as part of our recruitment strategy depended on this fact (see Section 2.2). However, that is logically distinct from the question as to whether critical periods differ across native languages. Ideally, we would compare the results of our model for speakers of different native languages. However, our samples of individual languages are too small. Specifically, because our data are unevenly distributed across ages and learner conditions, we risk over-fitting certain conditions (such as monolinguals) at the expense of others. As described in the Method, we circumvented this issue by averaging across subjects in each bin prior to running the model. This is not applied easily to subsets of the data: too many bins have few or no subjects. In any case, we lack a computationally tractable method for comparing model fits for different datasets. Thus, we must leave this for future research.
We can, however, address a related question. It could be that speakers of different native languages learn English more or less quickly and to a greater or lesser degree. At best, this would add noise to our analyses. At worst, to the extent that native language is confounded with other variables of interest in our sample (e.g., age of first exposure), it could have distorted our results. Anecdotally, many people perceive that speakers of certain languages are better or worse at English, though it is hard to know how much this is confounded with accent (which likely has a critical period distinct from that of syntax), cultural variation in age at first exposure, and differences in the types of exposure (e.g., songs, movies, tourism, coursework) and instructional methods. For instance, in our dataset, speakers of Chinese and West Germanic languages tended to start learning English in immersion settings earlier than speakers of Turkic or Uralic languages (5.2 and 5.9 years old vs. 13.4 and 14.8 years old, respectively). More systematically, some studies have suggested different patterns of ultimate attainment for speakers of different native languages ( Bialystok & Miller, 1999 ), though caution is warranted given the extremely low power for such studies (see Fig. 8 and surrounding discussion).
We considered the effect of native language on three different metrics of learning success: the level of ultimate attainment (how well the most advanced learners do), the age at the end of the optimal period (the last age to start learning in order to reach native-like performance), and the shape of the learning curve (performance as a function of years of experience). In keeping with our earlier analyses, ultimate attainment was defined as the average performance for subjects no older than 70 years old and with at least 30 years of experience with English. To increase power, we grouped subjects into Uralic, Slavic, West Germanic, Romance, and Chinese language groups (no other language group had nearly as many speakers at similarly wide ranges of years of experience and ages of first exposure). For each measurement, we assessed the level of evidence that speakers of one language group differed from the others using Bayes Factor model comparison with the BIC approximation ( Wagenmakers, 2007 ). Details for all analyses are provided in the Supplementary Materials , under “Item Effects.”
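For readers unfamiliar with the BIC approximation, the sketch below shows how a difference in BIC between two models converts into an approximate Bayes factor, following the standard formula BF01 ≈ exp((BIC1 − BIC0) / 2). The Gaussian least-squares form of the BIC and the function names are assumptions made for illustration; the models actually compared are described in the Supplementary Materials.

```python
import math


def bic_gaussian(sse, n, k):
    """BIC of a least-squares model with Gaussian errors, up to an additive
    constant shared by all models fit to the same n observations.

    sse: residual sum of squares; n: number of observations;
    k: number of free parameters.
    """
    return n * math.log(sse / n) + k * math.log(n)


def bf01_from_bic(bic_null, bic_alt):
    """Approximate Bayes factor in favor of the null model.

    Values above 1 favor the null (e.g., no difference between the target
    language group and the others); values below 1 favor the alternative.
    """
    return math.exp((bic_alt - bic_null) / 2.0)


if __name__ == "__main__":
    # Hypothetical numbers: the alternative model adds one parameter
    # (a language-group effect) but barely reduces the residual error.
    n = 200
    bic0 = bic_gaussian(sse=50.0, n=n, k=2)
    bic1 = bic_gaussian(sse=49.8, n=n, k=3)
    print(bf01_from_bic(bic0, bic1))
```

Because each extra parameter costs ln(n) in BIC, an alternative model that does not improve the fit at all yields BF01 = √n for one added parameter. This is why the approach can provide positive evidence for the null hypothesis and not merely a failure to reject it.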
By looking at ultimate attainment, we can assess whether speakers of different languages have greater or lesser success in learning English, equating for years of experience. In fact, the differences across language groups were small (see Fig. S14 ) and generally not reliable. In most cases, analyses favored the null hypothesis (no difference between the target language and the other languages), and differences across language groups were inconsistent: among learners who began at age 0, the best-performing language group was Romance, for learners beginning at 1–5 years old, it was West Germanic, and for learners who began at 6–10 years old, it was Chinese. Likewise, analysis indicated that the length of the optimal period does not vary across language groups. We found slightly more evidence for differences in learning curves. In particular, simultaneous English-Chinese speakers could be distinguished from the rest, whereas simultaneous bilinguals who spoke Romance or West Germanic languages both matched the overall pattern. However, the actual differences are subtle and seem to reflect slightly faster initial learning by the Chinese speakers ( Fig. S18 ). Most other comparisons were not possible due to insufficiently many subjects (see Supplementary Materials ).
Thus, although speakers of different languages make different mistakes, we find only limited evidence of differences in learning once learning context (immersion vs. non-immersion), years of experience, and age at first exposure are taken into account. That said, power analyses suggest that we only had sufficient subjects to detect relatively large effects, meaning that we cannot rule out more subtle differences (see Supplementary Materials , under “Item Effects”). These power analyses should, however, provide guidance on sample sizes for future research along these lines.
Whatever these analyses say about language-learning in general, they do not provide any evidence that our findings were heavily confounded by differences across the native languages in our sample.
4.2. Implications
The analyses above suggest that our findings are reasonably robust, particularly in comparison to those of previous studies. While this inspires confidence, it should also suggest caution: future work that successfully addresses the limitations of the present study may similarly prompt significant revisions in what we believe to be true. Science is the process of becoming less wrong, and while hopefully the revisions are smaller and smaller after each step, there is no way of knowing that this is the case in advance. Thus, confirmation and extension of the present results is crucial, particularly given the novelty of our questions, methods, models, and results.
Nonetheless, we believe it is useful to consider the implications of the present findings, on the presumption that they prove to be (reasonably) robust:
4.2.1. The nature of the critical period for second language acquisition
On the assumption that the present results apply broadly to syntax acquisition by diverse learners, they have profound theoretical implications. Most importantly, they clarify the shape of the well-attested critical period for second-language acquisition: a plateau followed by a continuous decline. The end of the plateau period must be due to changes in late adolescence rather than childhood, whether they are biological, social, or environmental. Thus the critical period cannot be attributed to neuronal death or syntactic pruning in the first few years of life, nor to hormonal changes surrounding adrenarche or puberty ( Johnson & Newport, 1989 ; Lenneberg, 1967 ; Pinker, 1994 ). Also casting doubt on the effect of hormones is our finding that girls do not show a decline in learning ability before boys do, despite their earlier age of puberty (see Supplementary Materials ). Likewise, the critical period cannot be explained by documented developmental changes in working memory, episodic memory, reasoning ability, processing speed, or social cognition ( Hakuta et al., 2003 ; Hartshorne & Germine, 2015 ; Klindt, Devaine, & Daunizeau, 2017 ; Morgan-Short & Ullman, 2012 ; Newport, 1988 ), to the diminished likelihood that adolescent and adult immigrants will be immersed in an environment of native speakers and identify with the new culture, 5 or to gradually accumulating interference from a first language ( Hernandez et al., 2005 ; Jia et al., 2002 ; Sebastián-Gallés et al., 2005 ).
In short, these data are inconsistent with any hypothesis that places the decline in childhood—which is to say, every prior specific hypothesis that we know of. What, then, could explain the critical period? There are a number of possibilities. For instance, it remains possible that the critical period is an epiphenomenon of culture: the age we identified (17–18 years old) coincides with a number of social changes, any of which could diminish one’s ability, opportunity, or willingness to learn a new language. In many cultures, this age marks the transition to the workforce or to professional education, which may diminish opportunities to learn. Note that causality (if any) could run the other direction: cultures may have chosen this age for certain transitions because of age-dependent changes in neural plasticity. Further traction on these issues could come from cross-cultural comparison, or comparison of individuals within a culture who are on different educational tracks.
Alternatively, the critical period could reflect interference from the first language, so long as this interference is non-linear rather than gradually accumulating. While it has generally been assumed that interference from the first language would be proportional to the amount of first language learned—something inconsistent with our data—we cannot rule out the possibility of non-linear interference. Neural network models, which are capable of showing interference from a first language ( Hernandez et al., 2005 ), can exhibit surprising nonlinearities ( Haykin, 1999 ; Hernandez et al., 2005 ). It remains to be seen whether they can successfully model the nonlinearities we actually observed.
Finally, the end of the critical period might reflect late-emerging neural maturation processes that compromise the circuitry responsible for successful language acquisition (whether specific to language or not). While language acquisition researchers often focus on neural development in the childhood years, the brain undergoes significant changes through adolescence and early adulthood ( Blakemore & Mills, 2014 ; Mills, Lalonde, Clasen, Giedd, & Blakemore, 2014 ; Pinto, Hornby, Jones, & Murphy, 2010 ; Shafee, Buckner, & Fischl, 2015 ; Tamnes et al., 2010 ). While continued development of the prefrontal cortex is perhaps the most familiar, changes occur throughout the brain and along multiple dimensions. Drawing on these and other findings, some researchers have suggested that adolescence may involve a number of different biologically-driven critical periods ( Crews, He, & Hodge, 2007 ; Fuhrmann, Knoll, & Blakemore, 2015 ; see also Ghitza & Gelman, 2014 ).
Little is certain about the relationship between neural maturation and behavioral maturation, other than the likelihood it is complex. Current evidence suggests that critical periods in perception involve a complex interplay of neurochemical and epigenetic promoters and brakes for both synaptic pruning and outgrowth ( Werker & Hensch, 2015 ). Given this complexity, and the relative sparseness of the data on neural maturation, it is hard to say whether any of the identified neural maturation processes might correspond to the changes in syntax acquisition that we observed.
Nor can we do much more than speculate as to whether these maturational processes (if any) are specific to structures subserving language acquisition. It is notable that language-learning ability is, out of every cognitive ability whose developmental trajectory has been characterized behaviorally, the only one that is stable through childhood and declines sharply in late adolescence ( Hartshorne & Germine, 2015 ). This observation is consistent with the possibility of language-specific maturation. However, the developmental trajectories of some cognitive abilities, such as procedural memory, have not been well characterized ( Fuhrmann et al., 2015 ; Hartshorne & Germine, 2015 ). Moreover, cognitive testing has largely focused on simple abilities that can be measured in a single, short session (e.g., working memory). In contrast, syntax acquisition takes place over much longer intervals and involves learning a complex, interlocking system. Thus, progress on this question will require characterization of a broader range of cognitive abilities, as well as acquisition of other complex systems (e.g., music or chess).
In attempting to gain traction on these issues, there are additional complexities, which future studies should seek to clarify. First, the duration of the critical period may differ for other aspects of language, like phonology and vocabulary. Moreover, we cannot be certain that syntax learning ability is a unitary construct rather than the combination of multiple factors potentially operating on distinct timelines and affecting different aspects of syntax differently. Second, the exact timing of the critical period may be obfuscated by older learners deploying conscious learning strategies, absorbing explicit instruction, or transferring knowledge from the first language. Some purchase on these issues may come from additional studies, potentially using different methods (e.g., online processing, production, ERP, or longitudinal studies), should obtaining sufficiently many subjects become feasible. Finally, because our dataset consists of people’s performance in a second language, it does not directly address the question of how age affects the learning of a first language. It is possible that exposure to linguistic input delays the atrophy of language learning circuitry, in which case the decline in learning ability we have documented would represent the prolongation of a critical period that terminates sooner in people who have been deprived of all language input ( Curtiss, 1994 ; de Villiers, 2007 ; Mayberry, 1993 ; Newport, 1990 ). Because delayed first-language acquisition is fortunately rare, it would be impossible to achieve a sample size similar to the one here, but our results could be used to guide smaller, targeted studies.
Crucially, the investigation of these issues—all of which have long been of interest but difficult to address—can now be guided by the finding that the ability to learn the grammar of a new language, though indeed compromised in adults compared to children, is largely or entirely preserved up to the cusp of adulthood.
4.2.2. Additional implications
The dataset bears on many issues beyond those discussed in detail above. For instance, the data contain a rich source of information about dialect variation and L1 transfer effects. We briefly mention a few other issues. First, prior work has indicated that simultaneous bilinguals do not reach the same level of proficiency in phonology as individuals with a single first language ( Sebastián-Gallés et al., 2005 ). We extend this finding to syntax, where it is apparent throughout the lifespan ( Fig. 5B ). This finding is consistent with some earlier work suggesting that a sufficiently sensitive test can distinguish even highly proficient bilinguals from monolinguals ( Abrahamsson & Hyltenstam, 2008 , 2009 ). 6 Our model captures this difference as one of exposure, estimating that simultaneous bilinguals receive only 63% as much English input as monolinguals (see Fig. S6 ). Though parsimonious, this is not the only possible explanation; alternatives include the effects of suppression of the non-target language and influences of each language on the other ( Birdsong & Gertken, 2013 ).
Similarly, there are a number of interesting demographic effects. We confirm prior findings of a main effect of education on ultimate attainment, with post-secondary education resulting in higher accuracy (see Supplementary Materials , “Education Differences”) ( Birdsong, 2014 ; Hakuta et al., 2003 ). We likewise find a main effect for gender, with higher accuracy by females (see Supplementary Materials , “Gender Differences”). In neither case do these main effects appear to interact with age at first exposure, and so they are unlikely to be relevant for critical periods. However, they likely have implications for other aspects of language learning.
We have made the data available ( http://osf.io/pyb8s ) in the hope that they will prove informative for investigation of these and other questions.
Acknowledgments
We are indebted to David Barner, David Birdsong, Kenji Hakuta, Elissa Newport, Laura-Ann Petitto, and Michael Ullman for comments, to Tanya Ivonchyk and Brandon Benson for help with developing the quiz, and to the hundreds of thousands of volunteers who participated in the study. This research was supported by an NIH NRSA award to JKH (5F32HD072748) and the Center for Minds, Brains, & Machines (NSF STC CCF-1231216).
Appendix A. Supplementary material
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.cognition.2018.04.007 .
The first several thousand participants were asked to list their “native languages.” Based on participant feedback, this was adjusted to “native languages (learned from birth).”
A small proportion of the non-immersion learners (2.7%) reported ages of first exposure between 1 and 3 years. These learners scored quite poorly (the ultimate attainment of those with ages of exposure of 1 year was as poor as that of those with ages of exposure in their 20s) and exhibited noisy performance curves that, unlike those of all other learners, failed to show any improvement with age ( Fig. S1 ). While this might be a genuine and surprising finding, it more likely reflects the idiosyncratic histories or questionnaire responses of these learners. Unlike the later non-immersion learners, many of whom cited school instruction as their initial source of exposure, the early non-immersion learners gave little indication about the nature of their first exposure, and it is possible that they had little formal instruction and had learned primarily through television and movies (frequently cited by non-immersion learners as significant sources of English input). Given this uncertainty, we excluded these participants from the main analyses.
We also noted a number of limitations and confounds in prior studies, such as how ultimate attainment was defined, which would have biased results. However, detailed investigation shows that the resulting biases and imprecisions were likely swamped by the effect of low power (see Supplementary Materials , “Effect of Analysis Decisions”).
We note a further difficulty. All research in this domain has treated items as fixed effects, averaging across them. This simplifies calculation, but at a cost: such statistical analyses do not directly assess the question of whether the results generalize beyond the items used ( Baayen, Davidson, & Bates, 2008 ; Clark, 1973 ). This problem is mitigated somewhat when using a large and representative set of items—as we do—but is particularly problematic when looking at smaller samples of items. The standard solution currently is to use mixed effects modeling ( Baayen et al., 2008 ). However, mixed effects modeling requires significant computational power. We have so far been unable to identify a tractable method of applying mixed effects modeling to a dataset the size of the present one.
Note that while critical period researchers widely assume that there are age-related effects on cultural identification among immigrant groups, this may not in fact be the case ( Chudek, Cheung, & Heine, 2015 ).
This finding also has practical consequences for research. Many researchers have argued that if later learners can reach monolingual levels of performance, that would be evidence against critical periods (and conversely, the failure of later learners to match monolinguals would be evidence for critical periods) (e.g., Abrahamsson & Hyltenstam, 2009 ). This standard, in conjunction with our results, leads to the unlikely conclusion that the critical period for syntax closes prior to birth. For additional discussion, see Birdsong and Gertken (2013) .
Contributions
JKH designed the study, collected the data, and performed the analyses. All three authors contributed to designing the analyses and to writing the paper.
- Abrahamsson N. Age of onset and nativelike L2 ultimate attainment of morphosyntactic and phonetic intuition. Studies in Second Language Acquisition. 2012;34(02):187–214.
- Abrahamsson N, Hyltenstam K. The robustness of aptitude effects in near-native second language acquisition. Studies in Second Language Acquisition. 2008;30(4):481–509.
- Abrahamsson N, Hyltenstam K. Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning. 2009;59(2):249–306.
- Baayen RH, Davidson DJ, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59:390–412.
- Berman RA, editor. Language development across childhood and adolescence. Philadelphia, PA: John Benjamins Publishing Company; 2004.
- Berman RA. Developing linguistic knowledge and language use across adolescence. In: Hoff E, Shatz M, editors. Blackwell handbook of language development. Malden, MA: Blackwell Publishing; 2007. pp. 347–367.
- Bialystok E, Miller B. The problem of age in second-language acquisition: Influences from language, structure, and task. Bilingualism: Language and Cognition. 1999;2(02):127–145.
- Birdsong D. Age and second language acquisition and processing: A selective overview. Language Learning. 2006;56:9–49.
- Birdsong D. The critical period hypothesis for second language acquisition: Tailoring the coat of many colors. In: Pawlak M, Aronin L, editors. Essential topics in applied linguistics and multilingualism. Studies in honor of David Singleton. Berlin and New York: Springer; 2014. pp. 43–50.
- Birdsong D. Critical periods. In: Aronoff M, editor. Oxford bibliographies in linguistics. New York: Oxford University Press; 2017.
- Birdsong D, Gertken LM. In faint praise of folly: A critical review of native/non-native speaker comparisons, with examples from native and bilingual processing of French complex syntax. Language, Interaction and Acquisition. 2013;4(2):107–133.
- Birdsong D, Molis M. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language. 2001;44(2):235–249.
- Blakemore SJ, Mills KL. Is adolescence a sensitive period for sociocultural processing? Annual Review of Psychology. 2014;65:9.1–9.21. doi: 10.1146/annurev-psych-010213-115202.
- Bruer JT. The myth of the first three years. New York: Free Press; 1999.
- Chudek M, Cheung BY, Heine SJ. US immigrants' patterns of acculturation are sensitive to their age, language, and cultural contact but show no evidence of a sensitive window for acculturation. Journal of Cognition and Culture. 2015;15:174–190.
- Clark HH. The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior. 1973;12(4):335–359.
- Coppieters R. Competence differences between native and near-native speakers. Language. 1987:544–573.
- Crain S, Thornton R. Syntax acquisition. WIREs Cognitive Science. 2011;3(2):185–203. doi: 10.1002/wcs.1158.
- Crews F, He J, Hodge C. Adolescent cortical development: A critical period of vulnerability for addiction. Pharmacology, Biochemistry, and Behavior. 2007;86:189–199. doi: 10.1016/j.pbb.2006.12.001.
- Curtiss S. Learning as a cognitive system: Its independence and selective vulnerability. In: Otero CP, editor. Noam Chomsky: Critical assessments. Vol. 1. New York, NY: Routledge; 1994. pp. 227–228.
- de Villiers JG. The interface of language and theory of mind. Lingua. 2007;117:1858–1878. doi: 10.1016/j.lingua.2006.11.006.
- DeKeyser RM. The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition. 2000;22(04):499–533.
- DeKeyser RM, Alfi-Shabtay I, Ravid D. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics. 2010;31(03):413–438.
- Efron B, Tibshirani R. An introduction to the bootstrap. Boca Raton, FL: Chapman & Hall/CRC; 1993.
- Flege JE, Yeni-Komshian GH, Liu S. Age constraints on second-language acquisition. Journal of Memory and Language. 1999;41(1):78–104.
- Friedman JH. Multivariate adaptive regression splines. The Annals of Statistics. 1991;19(1):1–141.
- Fuhrmann D, Knoll LJ, Blakemore SJ. Adolescence as a sensitive period of brain development. Trends in Cognitive Sciences. 2015;19(10):558–566. doi: 10.1016/j.tics.2015.07.008.
- Germine LT, Duchaine B, Nakayama K. Where cognitive development and aging meet: face learning ability peaks after age 30. Cognition. 2011;118(2):201–210. doi: 10.1016/j.cognition.2010.11.002.
- Ghitza Y, Gelman A. The Great Society, Reagan's Revolution, and generations of presidential voting. 2014.
- Granena G, Long MH. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research. 2013;29(3):311–343.
- Guion SG, Flege JE, Liu SH, Yeni-Komshian GH. Age of learning effects on the duration of sentences produced in a second language. Applied Psycholinguistics. 2000;21(02):205–228.
- Hakuta K, Bialystok E, Wiley E. Critical evidence: A test of the critical-period hypothesis for second-language acquisition. Psychological Science. 2003;14(1):31–38. doi: 10.1111/1467-9280.01415.
- Halberda J, Ly R, Wilmer JB, Naiman DQ, Germine L. Number sense across the lifespan as revealed by a massive Internet-based sample. Proceedings of the National Academy of Sciences. 2012;109(28):11116–11120. doi: 10.1073/pnas.1200196109.
- Hartshorne JK, Germine LT. When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychological Science. 2015;26(4):433–443. doi: 10.1177/0956797614567339.
- Haykin S. Neural networks: A comprehensive guide. 2nd ed. Upper Saddle River, NJ: Prentice Hall; 1999.
- Hernandez AE, Li P, MacWhinney B. The emergence of competing modules in bilingualism. Trends in Cognitive Sciences. 2005;9(5):220–225. doi: 10.1016/j.tics.2005.03.003.
- Huang BH. A synthesis of empirical research on the linguistic outcomes of early foreign language instruction. International Journal of Multilingualism. 2015;13(3):257–273. doi: 10.1080/14790718.2015.1066792.
- Jaeger TF. Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language. 2008;59(4):434–446. doi: 10.1016/j.jml.2007.11.007.
- Jia G, Aaronson D, Wu Y. Long-term language attainment of bilingual immigrants: Predictive variables and language group differences. Applied Psycholinguistics. 2002;23(04):599–621.
- Johnson JS, Newport EL. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology. 1989;21(1):60–99. doi: 10.1016/0010-0285(89)90003-0.
- Johnson JS, Newport EL. Critical period effects on universal properties of language: The status of subjacency in the acquisition of a second language. Cognition. 1991;39(3):215–258. doi: 10.1016/0010-0277(91)90054-8.
- Kidd E, Bavin EL. English-speaking children's comprehension of relative clauses: Evidence for general-cognitive and language-specific constraints on development. Journal of Psycholinguistic Research. 2002;31(6):599–617. doi: 10.1023/a:1021265021141.
- Kidd E, Lum JAG. Sex differences in past tense overregularization. Developmental Science. 2008;11(6):882–889. doi: 10.1111/j.1467-7687.2008.00744.x.
- Klindt D, Devaine M, Daunizeau J. Does the way we read others' mind change over the lifespan? Insights from a massive Web poll of cognitive skills from childhood to late adulthood. Cortex. 2017;86:205–215. doi: 10.1016/j.cortex.2016.09.009.
- Krashen SD, Long MA, Scarcella RC. Age, rate, and eventual attainment in second language acquisition. TESOL Quarterly. 1979:573–582.
- Lenneberg E. Biological foundations of language. New York: Wiley; 1967.
- Marcus GF, Pinker S, Ullman MT, Hollander M, Rosen TJ, Xu F. Overregularization in language acquisition. Monographs of the Society for Research in Child Development. 1992;57(4):1–182.
- Mayberry RI. First-language acquisition after childhood differs from second-language acquisition: The case of American Sign Language. Journal of Speech, Language, and Hearing Research. 1993;36(6):1258–1270. doi: 10.1044/jshr.3606.1258.
- Mayberry RI, Lock E. Age constraints on first versus second language acquisition: Evidence for linguistic plasticity and epigenesis. Brain and Language. 2003;87(3):369–384. doi: 10.1016/s0093-934x(03)00137-8.
- Mayberry RI, Lock E, Kazmi H. Development: Linguistic ability and early language exposure. Nature. 2002;417(6884):38. doi: 10.1038/417038a.
- McDonald JL. Grammaticality judgments in a second language: Influences of age of acquisition and native language. Applied Psycholinguistics. 2000;21(03):395–423.
- Messenger K, Branigan HP, McLean JF, Sorace A. Is young children's passive syntax semantically constrained? Evidence from syntactic priming. Journal of Memory and Language. 2012;66:568–587.
- Milborrow S. Earth: Multivariate adaptive regression spline models. R package version 3.2-7. 2014. http://cran.r-project.org/web/packages/earth/index.html
- Mills KL, Lalonde F, Clasen LS, Giedd JN, Blakemore SJ. Developmental changes in the structure of the social brain in late childhood and adolescence. Social Cognitive Affective Neuroscience. 2014;9(1):123–131. doi: 10.1093/scan/nss113.
- Morgan-Short K, Ullman MT. The neurocognition of second language. In: Mackey A, Gass S, editors. Handbook of second language acquisition. Routledge; 2012. pp. 1–18.
- Mullen K, Ardia D, Gil D, Windover D, Cline J. DEoptim: An R package for global optimization by differential evolution. Journal of Statistical Software. 2011;40(6):1–26.
- Newport EL. Constraints on learning and their role in language acquisition: Studies of the acquisition of American Sign Language. Language Sciences. 1988;10(1):147–172.
- Newport EL. Maturational constraints on language learning. Cognitive Science. 1990;14(1):11–28.
- Nippold MA. Later language development: School-age children, adolescents, and young adults. 3rd ed. Austin, TX: Pro-Ed; 2007.
- Patkowski MS. The sensitive period for the acquisition of syntax in a secondary language. Language Learning. 1980;30(2):449–468.
- Pinker S. The language instinct. New York: William Morrow; 1994.
- Pinker S. Words and rules: The ingredients of language. New York, NY: HarperCollins; 1999.
- Pinto JGA, Hornby KR, Jones DG, Murphy KM. Developmental changes in GABAergic mechanisms in human visual cortex across the lifespan. Frontiers in Cellular Neuroscience. 2010;4(16). doi: 10.3389/fncel.2010.00016.
- Pulvermüller F, Schumann JH. Neurobiological mechanisms of language acquisition. Language Learning. 1994;44:681–734. doi: 10.1111/j.1467-1770.1994.tb00635.x.
- Qureshi MA. A meta-analysis: Age and second language grammar acquisition. System. 2016;60:147–160. doi: 10.1016/j.system.2016.06.001.
- Rowland CF, Pine JM. Subject-auxiliary inversion errors and wh-question acquisition: 'what children do know?'. Journal of Child Language. 2000;27(1):157–181. doi: 10.1017/s0305000999004055.
- Schachter J. On the issue of completeness in second language acquisition. Second Language Research. 1990;6(2):93–124. doi: 10.1177/026765839000600201.
- Sebastián-Gallés N, Echeverría S, Bosch L. The influence of initial exposure on lexical representation: Comparing early and simultaneous bilinguals. Journal of Memory and Language. 2005;52(2):240–255.
- Shafee R, Buckner RL, Fischl B. Gray matter myelination of 1555 human brains using partial volume corrected MRI images. NeuroImage. 2015;105:473–485. doi: 10.1016/j.neuroimage.2014.10.054.
- Snow CE, Hoefnagel-Höhle M. The critical period for language acquisition: Evidence from second language learning. Child Development. 1978:1114–1128.
- Stevens G. Age at immigration and second language proficiency among foreign-born adults. Language in Society. 1999;28(04):555–578.
- Stewart N, Ungemach C, Harris AJ, Bartels DM, Newell BR, Paolacci G, Chandler J. The average laboratory samples a population of 7,300 Amazon Mechanical Turk workers. Judgment and Decision Making. 2015;10(5):479–491.
- Tamnes CK, Ostby Y, Fjell AM, Westlye LT, Due-Tonnessen P, Walhovd KB. Brain maturation in adolescence and young adulthood: Regional age-related changes in cortical thickness and white matter volume and microstructure. Cerebral Cortex. 2010;20:534–548. doi: 10.1093/cercor/bhp118.
- Vanhove J. The critical period hypothesis in second language acquisition: A statistical critique and a reanalysis. PLoS ONE. 2013;8(7):e69172. doi: 10.1371/journal.pone.0069172.
- Wagenmakers EJ. A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review. 2007;14(5):779–804. doi: 10.3758/bf03194105.
- Weber-Fox C, Neville H. Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience. 1996;8(3):231–256. doi: 10.1162/jocn.1996.8.3.231.
- Werker JF, Hensch T. Critical periods in speech perception: New directions. Annual Review of Psychology. 2015;66:173–196. doi: 10.1146/annurev-psych-010814-015104.
The Critical Period Hypothesis for Second Language Acquisition: Tailoring the Coat of Many Colors
- First Online: 01 January 2013
- David Birdsong 4
Part of the book series: Second Language Learning and Teaching ((SLLT))
The present contribution represents an extension of David Singleton’s ( 2005 ) IRAL chapter, “The Critical Period Hypothesis: A coat of many colours”. I suggest that the CPH in its application to L2 acquisition could benefit from methodological and theoretical tailoring with respect to: the shape of the function that relates age of acquisition to proficiency, the use of nativelikeness for falsification of the CPH, and the framing of predictors of L2 attainment.
Granena and Long ( 2013 ) applied multiple linear regression analyses to the relationship of Chinese natives’ AoA to their attainment in L2 Spanish morphosyntax, phonology, and lexis and collocation. For each of these three linguistic domains, including breakpoints in the model revealed a small (5%) but statistically significant increase in variance accounted for, as compared to the variance accounted for in a model with no breakpoints. According to the authors, the fact that the improvement was so small “could mean that the less complex (i.e. more parsimonious) model with no breakpoints is already a good enough fit to the data or, alternatively, that a larger sample size is needed to compensate for the loss of degrees of freedom and to minimize the risk of overfitting” (2013: 326–327).
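The kind of model comparison described in this note can be illustrated with a minimal sketch. The Python code below is not Granena and Long's analysis: the data are simulated, the single breakpoint at an AoA of 16 is an assumption fixed in advance rather than estimated, and the helper names (`solve_linear_system`, `r_squared`, `compare_models`) are hypothetical. It fits a straight line and a one-breakpoint piecewise line by ordinary least squares and reports the proportion of variance accounted for by each.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(design, y):
    """Fit ordinary least squares via the normal equations; return R^2."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def compare_models(aoa, score, breakpoint):
    """Return (R^2 without breakpoint, R^2 with one fixed breakpoint)."""
    linear = [[1.0, a] for a in aoa]
    # The hinge term max(0, AoA - breakpoint) lets the slope change
    # at the breakpoint while keeping the fitted line continuous.
    piecewise = [[1.0, a, max(0.0, a - breakpoint)] for a in aoa]
    return r_squared(linear, score), r_squared(piecewise, score)


# Simulated data (an assumption for illustration, not real scores):
# attainment declines with AoA and the decline flattens after 16.
rng = random.Random(1)
aoa = [rng.uniform(1, 40) for _ in range(200)]
score = [90 - 1.0 * a + 0.8 * max(0.0, a - 16) + rng.gauss(0, 5) for a in aoa]

r2_linear, r2_piecewise = compare_models(aoa, score, breakpoint=16.0)
print(f"R^2 without breakpoint: {r2_linear:.3f}")
print(f"R^2 with breakpoint:    {r2_piecewise:.3f}")
print(f"Gain in variance accounted for: {r2_piecewise - r2_linear:.3f}")
```

Because the model without a breakpoint is nested in the model with one, the second R² can never be lower than the first. The substantive question, as the quoted passage indicates, is whether the gain is large enough to justify the added complexity.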
DeKeyser ( 2000 : 515) erroneously reports that the correlation of years of schooling and GJ scores is r = 0.006 ns, for early arrivals, and r = 0.08 ns, for late arrivals. In fact, these reported coefficients reflect correlations of years of schooling with aptitude; see discussion to follow.
Abrahamsson, N. and K. Hyltenstam. 2009. Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning 59: 249–306.
Ayer, A. J. 1959. History of the Logical Positivist movement. In Logical Positivism , ed. A. J. Ayer, 3–28. New York: Free Press.
Birdsong, D. 2005. Interpreting age effects in second language acquisition. In Handbook of bilingualism , eds. J. Kroll and A. DeGroot, 109–127. Oxford: Oxford University Press.
Birdsong, D. and M. Molis. 2001. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language 44: 235–249.
Carroll, J. B. and S. M. Sapon. 1959. Modern Language Aptitude Test: Manual. New York: Psychological Corporation.
Cook, V. 2003. Effects of the second language on the first. Clevedon, UK: Multilingual Matters.
DeKeyser, R. M. 2000. The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition 22: 499–533.
DeKeyser, R., I. Alfi-Shabtay and D. Ravid. 2010. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics 31: 413–438.
Fowler, C. A., V. Sramko, D. J. Ostry, S. A. Rowland and P. Hallé. 2008. Cross language phonetic influences on the speech of French-English bilinguals. Journal of Phonetics 36: 649–663.
Granena, G. and M. H. Long. 2013. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research 29: 311–343.
Hakuta, K., E. Bialystok and E. Wiley. 2003. Critical evidence: A test of the Critical-Period Hypothesis for second-language acquisition. Psychological Science 14: 31–38.
Hyltenstam, K. and N. Abrahamsson. 2003. Maturational constraints in SLA. The handbook of second language acquisition , eds. M. H. Long and C. J. Doughty, 539–588. Malden, MA: Blackwell.
Johnson, J. S. and E. L. Newport. 1989. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology 21: 60–99.
Lenneberg, E. H. 1967. Biological foundations of language. New York: Wiley.
Long, M. H. 1990. Maturational constraints on language development. Studies in Second Language Acquisition 12: 251–285.
Ortega, L. 2009. Understanding second language acquisition. London: Hodder Education.
Penfield, W. and L. Roberts. 1959. Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
Popper, K. 1959. The logic of scientific discovery . New York: Basic Books.
Singleton, D. 2005. The Critical Period Hypothesis: A coat of many colours. International Review of Applied Linguistics in Language Teaching 43: 269–285.
Stevens, G. 2004. Using census data to test the critical-period hypothesis for second-language acquisition. Psychological Science 15: 215–216.
Vanhove, J. 2013. The critical period hypothesis in second language acquisition: A statistical critique and a reanalysis. PLoS ONE 8(7): e69172. doi: 10.1371/journal.pone.0069172
Author information
Authors and affiliations.
University of Texas at Austin, Texas, USA
David Birdsong
Corresponding author
Correspondence to David Birdsong .
Editor information
Editors and affiliations.
Faculty of Pedagogy and Fine Arts Dept. of English Studies, Adam Mickiewicz University, Kalisz, Wielkopolskie, Poland
Mirosław Pawlak
Graduate Studies Faculty, Oranim Academic College of Education, Tivon, Israel
Larissa Aronin
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Birdsong, D. (2014). The Critical Period Hypothesis for Second Language Acquisition: Tailoring the Coat of Many Colors. In: Pawlak, M., Aronin, L. (eds) Essential Topics in Applied Linguistics and Multilingualism. Second Language Learning and Teaching. Springer, Cham. https://doi.org/10.1007/978-3-319-01414-2_3
DOI : https://doi.org/10.1007/978-3-319-01414-2_3
Published : 19 September 2013
Publisher Name : Springer, Cham
Print ISBN : 978-3-319-01413-5
Online ISBN : 978-3-319-01414-2
eBook Packages : Humanities, Social Sciences and Law Education (R0)
Evidence for the critical period hypothesis (CPH) comes from a number of sources demonstrating that age is a crucial predictor for language attainment and that the capacity to learn language diminishes with age.
The present paper aims at highlighting the Critical Period Hypothesis (CPH) in Second Language Acquisition (SLA) which suggests that the individuals’ attempts to learn a second language...
The Critical Period Hypothesis (CPH), as proposed by [1], that nativelike proficiency is only attainable within a finite period, extending from early infancy to puberty, has generally been accepted in language development research, but more so for first language acquisition (L1A) than for second language acquisition (L2A).
On the assumption that the present results apply broadly to syntax acquisition by diverse learners, they have profound theoretical implications. Most importantly, they clarify the shape of the well-attested critical period for second-language acquisition: a plateau followed by a continuous decline.
The critical period hypothesis states that the first few years of life is the crucial time in which an individual can acquire a first language if presented with adequate stimuli, and that first-language acquisition relies on neuroplasticity of the brain.
The debate over the critical period hypothesis embodies some of the most basic questions about second language acquisition, and indeed, language acquisition in general. These questions permeate the foundations of several disciplines, such as linguistics, cognitive psychology, and neurolinguistics.
Johnson and Newport’s study, along with Oyama’s (1978) and Patkowski’s (1980) studies, have provided influential evidence supporting the notion that a critical period influences the acquisition of morphosyntactic structures in a second language.
A study by DeKeyser (2000), entitled “The robustness of critical period effects in second language acquisition”, investigates the roles of factors such as AoA, language learning aptitude, and years of schooling in predicting L2 English grammaticality judgment (GJ) accuracy by 57 Hungarian immigrants to the US. A look
Abstract. Explores reasons why humans might be subject to a critical period for language learning. This book also examines the adequacy of the Critical Period Hypothesis as an explanatory construct, the "fit" of the hypothesis with the facts.