Dev Cogn Neurosci, v.10; 2014 Oct

Development of abstract thinking during childhood and adolescence: The role of rostrolateral prefrontal cortex

Iroise Dumontheil

a Department of Psychological Sciences, Birkbeck, University of London, UK

b Institute of Cognitive Neuroscience, University College London, UK

  • Rostral prefrontal cortex (RPFC) supports self-generated, abstract thought processing.
  • Flexibly attending towards and processing abstract thoughts develop in adolescence.
  • RPFC activation becomes more specific to relational integration during development.
  • Prospective memory development remains to be further studied using neuroimaging.
  • Training of abstract thinking, e.g. reasoning, may have implications for education.

Rostral prefrontal cortex (RPFC) has increased in size and changed in terms of its cellular organisation during primate evolution. In parallel emerged the ability to detach oneself from the immediate environment to process abstract thoughts and solve problems and to understand other individuals’ thoughts and intentions. Rostrolateral prefrontal cortex (RLPFC) is thought to play an important role in supporting the integration of abstract, often self-generated, thoughts. Thoughts can be temporally abstract and relate to long term goals, or past or future events, or relationally abstract and focus on the relationships between representations rather than simple stimulus features. Behavioural studies have provided evidence of a prolonged development of the cognitive functions associated with RLPFC, in particular logical and relational reasoning, but also episodic memory retrieval and prospective memory. Functional and structural neuroimaging studies provide further support for a prolonged development of RLPFC during adolescence, with some evidence of increased specialisation of RLPFC activation for relational integration and aspects of episodic memory retrieval. Topics for future research will be discussed, such as the role of medial RPFC in processing abstract thoughts in the social domain, the possibility of training abstract thinking in the domain of reasoning, and links to education.

1. Introduction

Abstract thoughts can be broadly defined as thoughts that are self-generated and stimulus-independent, in contrast to stimulus-oriented, perceptually-derived, information. Beyond this definition, two particular forms of abstraction can be considered (see Nee et al., 2014 ). Abstraction can be defined temporally: abstract thoughts are those that relate to long-term goals, or past or future events. Alternatively, abstraction can be defined relationally: abstract thoughts are those that focus on the relationships between representations rather than simple stimulus features. A subset of cognitive processes places particularly high demands on the manipulation of abstract thoughts, either within a single temporal or relational domain, or across both. These include the retrieval of past thoughts and memories (e.g. episodic or source memory retrieval), the manipulation of current task-related or task-unrelated self-generated information (e.g. relational reasoning and problem solving, or mind-wandering, respectively) and the processing of thoughts linked to the future (e.g. planning, multitasking, prospective memory). Interestingly, the most anterior part of the lateral prefrontal cortex, the rostrolateral prefrontal cortex (RLPFC), has been found to show increased activations in paradigms testing this whole range of cognitive functions (e.g. see Badre, 2008 , Burgess et al., 2007a , Ramnani and Owen, 2004 for review). The rostral prefrontal cortex (RPFC), like other parts of the frontal cortex and the temporal cortices, shows prolonged structural development during adolescence (e.g. see Dumontheil et al., 2008 for review). The relationship between abstract thoughts and RPFC, in particular the RLPFC, during late childhood and adolescence will be the topic of this review.

Adolescence starts at the onset of puberty and can be broadly defined as between the ages of 10 and 19 ( Sawyer et al., 2012 ). Although brain and behavioural changes during this period are less pronounced than during infancy and childhood, adolescence is nevertheless an important period of development in terms of the acquisition of higher cognitive skills, as well as the onset of mental disorders (see Dumontheil et al. (2008) for a discussion of RPFC and developmental disorders). Adolescence emerges as a critical phase of reorganisation of regulatory systems, and may also be a period of extended brain plasticity and thus a relevant target for interventions ( Steinberg, 2005 ).

The first section of this paper will focus on the association between lateral RPFC and the ability to attend to and manipulate abstract thoughts. I will then discuss the development of this ability during late childhood and adolescence and how structural and functional development of RPFC may underlie the behavioural changes observed during adolescence. I will then briefly relate these findings to studies of the development of medial RPFC function in social cognition tasks. Finally, I will discuss future avenues of research in this field as well as potential implications of these findings for education policy and practice. This review will focus on aspects of both relationally and temporally abstract thoughts ( Nee et al., 2014 ), as identified from the research on RLPFC function in adults. Although an effort was made to gather relevant evidence, this review is unlikely to be exhaustive and is biased towards those fields where more developmental neuroimaging research has currently been published.

Recently, Ferrer et al. (2009) summarised the development of fluid reasoning, which can be considered as a type of abstract thinking. Here the goal is to perform a more extensive review of the development of abstract thinking more generally, including recent studies on the topic. Although some aspects of metacognition are relevant to the domain of abstract thought and reasoning, little cognitive neuroscience research on metacognition has so far been done with a developmental focus (see Fleming and Dolan, 2012 , Fleming et al., 2010 ) and thus metacognition will not be reviewed here (see Schneider, 2008 for a review of the development of meta-cognitive knowledge).

2. Rostral prefrontal cortex function

2.1. Rostral prefrontal cortex: cytoarchitecture and subdivisions

RPFC, which corresponds approximately to Brodmann area 10 (BA10), is a large brain region in humans and is thought to be subdivided into separate subregions distinct in terms of cellular organisation and function ( Christoff and Gabrieli, 2000 , Gilbert et al., 2006a , Gilbert et al., 2006b ). Two quite different types of cognitive ability have been associated with the RPFC. The lateral parts of RPFC (RLPFC) appear to support the ability to detach oneself from the environment and to elaborate, evaluate and maintain abstract rules and information, as it is involved in reasoning, problem solving, and more generally abstract thinking ( Amati and Shallice, 2007 , Christoff and Gabrieli, 2000 , Christoff et al., 2009b , Gilbert et al., 2006b , Koechlin et al., 2003 , Ramnani and Owen, 2004 ) (see below for further details). The medial aspect of RPFC, or medial prefrontal cortex (MPFC), is implicated in social cognition, that is, the understanding of other people's minds ( Amodio and Frith, 2006 , Blakemore, 2008 , Van Overwalle, 2009 ).

In the last decade, large-scale magnetic resonance imaging (MRI) studies have shown that the RPFC is one of the last brain regions to reach maturity in humans (see Dumontheil et al., 2008 for review). This region is also particularly interesting in terms of its cellular organisation and connections with other regions. RPFC is the only prefrontal region that is predominantly interconnected with supramodal cortex in the PFC ( Andersen et al., 1985 , Petrides and Pandya, 1999 ), anterior temporal cortex ( Amaral and Price, 1984 , Moran et al., 1987 ) and cingulate cortex ( Andersen et al., 1985 , Arikuni et al., 1994 , Bachevalier et al., 1997 , Morecraft and Van Hoesen, 1993 ). In addition, its projections to these other regions are broadly reciprocal ( Passingham, 2002 ; see Ramnani and Owen, 2004 for review). RPFC has a low cell density, which may indicate that this region in humans has more space available for connections both within this region and with other brain regions ( Semendeferi et al., 2011 , Semendeferi et al., 2001 ). RPFC also has a particularly high number of dendritic spines per cell, an indicator of the number of synaptic connections, which suggests that the computational properties of RPFC are more likely to involve the integration of inputs than those of comparable areas ( Ramnani and Owen, 2004 ).

In line with these findings, Amati and Shallice (2007) proposed that RPFC may support a novel type of cognitive computational process required for “abstract projectuality”, that may be behind the cognitive capacities specific to modern humans. They propose that this brain operation permits a fluent sequence of non-routine computational operations to occur over a prolonged timecourse. This qualitatively different type of brain operation may have emerged from increasing prefrontal cortical connectivity in the RPFC, induced by gradual (quantitative) genetic changes affecting RPFC structure and organisation over evolution ( Amati and Shallice, 2007 ). This model fits well with current theories of RLPFC function which will be detailed in the next section.

2.2. RLPFC and abstract thinking

A number of theories of the functional organisation of the frontal lobes have been proposed in the last decade based on neuroimaging and lesion data. The broad consensus is that the frontal cortex may possess a rostro-caudal organisation whereby more rostral regions support cognitive control involving progressively more abstract representations ( Azuar et al., 2014 , Badre and D’Esposito, 2007 , Badre and D’Esposito, 2009 , Badre, 2008 , Botvinick, 2008 , Christoff et al., 2009b , Koechlin and Jubault, 2006 , Koechlin and Summerfield, 2007 , Koechlin et al., 2003 , Petrides, 2005 ). In this organisation, posterior PFC supports the control and manipulation of temporally proximate, concrete action representations, while anterior PFC supports the control of temporally extended, abstract representations ( Badre, 2008 ). Fig. 1 , adapted from Badre (2008) , shows a representation of this organisation. Of interest here is the position of the RLPFC, at the top of this frontal lobe hierarchy, and the suggestion that this brain region is recruited when temporally extended, abstract representations are attended to or manipulated.

Fig. 1.

Sub-divisions of the frontal lobes. (a) Schematic representation of the major anatomical sub-divisions of the frontal lobes. Following a caudal to rostral direction, labelled areas include motor cortex, dorsal and ventral premotor cortices, dorsal and ventral aspects of anterior premotor cortex, ventrolateral prefrontal cortex (VLPFC), dorsolateral prefrontal cortex (DLPFC), and lateral frontopolar cortex, also termed rostrolateral prefrontal cortex (RLPFC). Boundaries and Brodmann areas (BA) are approximate. (b) Schematic representation of the rostro-caudal gradient of the organisation of the prefrontal cortex. The consensus among diverse theoretical accounts of the organisation of the PFC is that progressively more anterior PFC regions support cognitive control of progressively more abstract and temporally extended representations (adapted from Badre, 2008 ).

RLPFC indeed shows increased blood oxygen level dependent (BOLD) signal in a number of tasks that require such aspects of cognition, including the retrieval of episodic or source memory (e.g. Dobbins et al., 2004 , Turner et al., 2008 ; see Gilbert et al., 2006b for review and Spaniol et al., 2009 for meta-analysis); prospective memory ( Barban et al., 2013 , Benoit et al., 2011 , Burgess et al., 2007b ); the manipulation of highly abstract information ( Christoff et al., 2009b ); the selection and maintenance of task rules ( Bengtsson et al., 2009 , Braver et al., 2003 , Dumontheil et al., 2011 , Sakai and Passingham, 2003 , Sakai and Passingham, 2006 ); sub-goal processing or branching ( Badre and D’Esposito, 2007 , Braver and Bongiolatti, 2002 , Koechlin et al., 2003 ); integration of information ( Badre and Wagner, 2004 , Wolfensteller and von Cramon, 2011 ); analogical and relational reasoning ( Bunge et al., 2009 , Geake and Hansen, 2005 , Hampshire et al., 2011 , Smith et al., 2007 , Volle et al., 2010 , Wendelken et al., 2008 , Wendelken et al., 2012 , Wright et al., 2008 ) – although note that medial dorsal RPFC has also been implicated in analogical reasoning ( Green et al., 2006 , Krawczyk, 2012 , Volle et al., 2010 ); reality monitoring ( Simons et al., 2008 ); and mind-wandering ( Christoff et al., 2004 , Christoff et al., 2009a , Dumontheil et al., 2010a , Schooler et al., 2011 ).

Lesion studies also provide supporting evidence for a role of RPFC in the control of temporally extended abstract representations, although, by their nature, these studies rarely distinguish between lateral and medial aspects of RPFC, and therefore between the social cognition and cognitive control aspects of RPFC function ( Burgess, 2000 , Burgess et al., 2009 , Gläscher et al., 2010 , Roca et al., 2010 , Shallice and Burgess, 1991 , Volle et al., 2011 ).

3. Behavioural studies of the development of abstract thinking

Abstract thinking encompasses a number of different cognitive processes, but one definition adopted here is that abstract thinking can be considered as the manipulation of self-generated thoughts, or thoughts that are not directly connected to the environment. A distinction is made between relationally and temporally abstract thoughts. As described above, neuroimaging and lesion studies in adults suggest that RLPFC is specifically involved in the elaboration, evaluation and maintenance of abstract rules ( Amati and Shallice, 2007 , Christoff and Gabrieli, 2000 , Christoff et al., 2009b , Koechlin et al., 2003 , Ramnani and Owen, 2004 ), as well as in the ability to flexibly control whether one selectively attends towards self-generated thoughts or the environment ( Burgess et al., 2007a ), and whether this self-generated information is task-relevant or task-irrelevant, i.e. when the mind wanders ( Christoff et al., 2004 , Christoff et al., 2009a , Dumontheil et al., 2010a ). A number of theorists have suggested that adolescents can operate at a new and more abstract level of thought because they can integrate the results of two different sorts of lower-order processing ( Case, 1985 , Fischer, 1980 , Halford, 1982 ). This new intellectual potential emerging in adolescence builds on the idea that children can progressively handle first one new abstract element, then two, and then multiple abstract elements simultaneously (see Marini and Case, 1994 , for review). Below are described behavioural studies investigating the development of the ability to flexibly attend towards self-generated thoughts, the development of the ability to reason logically and integrate relations or representations, and finally the development of the processing of self-generated thoughts that can be considered temporally abstract, and are related to past experiences (episodic memory) or future events (prospective memory).
Although multitasking, or branching, has been a particular focus of neuroimaging and lesion research on RLPFC function in adults ( Badre and D’Esposito, 2007 , Braver and Bongiolatti, 2002 , Burgess, 2000 , Koechlin et al., 2003 ), this topic has not been specifically investigated in developmental psychology research.

3.1. Development of the flexible selection of self-generated thoughts

An important aspect of the manipulation of abstract thought resides in the ability to modulate the balance between cognition that is provoked by perceptual experience (stimulus-oriented, SO) and that which occurs in the absence of sensory input (self-generated, or stimulus-independent, SI) ( Burgess et al., 2007a ). In children, manipulation of SI thoughts has been studied in the context of fluid intelligence and relational reasoning ( Crone, 2009 , Wright et al., 2008 ; see below) and working memory (WM) tasks ( Crone et al., 2006 ), while the ability to resist distracting SO information has been studied in perceptual ( Booth et al., 2003 , Bunge et al., 2002 ) and WM tasks ( Olesen et al., 2007 ). In the latter study, 13-year-old participants showed poorer accuracy than adults in visuospatial WM trials that included distraction relative to trials that did not.

In a recent study ( Dumontheil et al., 2010b ), we tested 179 female participants aged 7–27 years old on a single task (Alphabet task) that could be performed on the basis of either SO or SI information, without high working memory requirements ( Gilbert et al., 2005 , Gilbert et al., 2007 , Gilbert et al., 2008 ). Participants were asked to classify letters of the alphabet according to whether the upper case letter contained a curve or not. In SO blocks consecutive letters of the alphabet were presented on the screen, while in SI blocks either no letter (No-distractor condition) or distracting non-consecutive letters (Distractor condition) were presented on the screen. In SI blocks participants were asked to continue going through the alphabet sequence in their head and continue responding (see Fig. 2a ). Different patterns of development were observed for the different aspects of this task. Resistance to visual distractors exhibited small improvements with age, both in accuracy and speed of responding, while the manipulation of SI thoughts and switching between SI and SO thoughts showed steeper response speed improvements extending into late adolescence (see Fig. 2b ). This development in the speed of manipulating self-generated thoughts and in the speed of switching between perceptually-derived and self-generated thoughts may underlie improvements during adolescence in planning, reasoning and abstract thinking, abilities that rely on the manipulation of thoughts that are not directly derived from the environment ( Anderson et al., 2001 , De Luca et al., 2003 , Huizinga et al., 2006 , Rosso et al., 2004 ). Below is described in more detail the particular case of the development of reasoning.
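The logic of an Alphabet task trial can be sketched in a few lines of code. This is a hypothetical illustration, not the published task implementation: the set of "curved" letters and the block labels are assumptions made for the example.

```python
# Illustrative sketch of the Alphabet task response rule (after Dumontheil
# et al., 2010b). The CURVED set below is an assumed classification of which
# upper-case letters contain a curve, chosen for illustration only.

CURVED = set("BCDGJOPQRSU")

def correct_answer(displayed, internal, block):
    """Return the expected 'curve' judgement for one trial.

    In stimulus-oriented (SO) blocks the participant judges the letter shown
    on screen; in stimulus-independent (SI) blocks they judge the letter they
    are rehearsing internally, ignoring any distractor letter on screen.
    """
    target = displayed if block == "SO" else internal
    return target in CURVED
```

The key manipulation is that the stimulus on screen is identical across conditions in the Distractor case; only the source of the judged letter (screen vs. internally rehearsed sequence) changes, isolating the SO/SI contrast from perceptual demands.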

Fig. 2.

Development of the flexible switching between selecting thoughts derived from the environment and abstract thoughts. (a) Alphabet task. Participants classify letters of the alphabet according to their shape (line or curve). When the letter is red, participants judge the letter presented on the screen (stimulus-oriented (SO) blocks). When the letter is blue (or when there is no letter) participants continue reciting the alphabet in their head and judge the shape of the letter in their head (stimulus-independent (SI) blocks), while ignoring the distracting letter presented on the screen (Distractor condition), or in the absence of a letter on the screen (No-distractor condition). Performance in the two types of blocks (SI vs. SO) and the two conditions (Distractor vs. No-distractor), and performance in switch trials (first trial of a SO or SI block) and subsequent trials (stay trials) were compared. (b) Behavioural results. The speed of responding in SI vs. SO, and in switch vs. stay trials continued to increase during adolescence. The speed of responding in the presence of Distractors also improved but followed a flatter linear developmental function (adapted from Dumontheil et al., 2010b ). (c) Functional MRI results. The main effect of switching between SO and SI conditions vs. a simple change of colour of the stimuli over the whole age range is presented (family-wise error corrected p < .05), highlighting the right superior RLPFC activation (top). RLPFC activity in this contrast is plotted against age (bottom). There was a significant decrease in activity during adolescence, which was not purely a consequence of differences in performance and brain structure between the participants and could reflect the maturation of neurocognitive strategies (see Dumontheil et al., 2010b ). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3.2. Development of logical reasoning

Problem solving by analogy requires the transfer of previously acquired solutions or strategies from one context or situation to another. Preschoolers (e.g. Holyoak et al., 1984 ) and even infants (e.g. Chen et al., 1997 ) exhibit an ability to draw analogies and use a solution learned from one problem to solve another problem. However, older children are better able to detect the underlying similarities between the original problem and the novel problem situation (e.g. Chen and Daehler, 1992 , Daehler and Chen, 1993 , Holyoak et al., 1984 ; see Chen et al., 1997 for review). Experimental paradigms have tended to be action-based, requiring children to perform a particular action to achieve a goal. However, analogical reasoning is also assessed using verbal or pictorial stimuli in propositional analogy tasks ( Ferrer et al., 2009 ), for example asking children to match the sequence “bread: slice of bread:: orange:?” with one of the following options: slice of orange, slice of cake, squeezed oranges, orange balloon, orange basketball. The relational shift hypothesis proposes that young children interpret analogy and metaphor first in terms of object similarity, and then in terms of relational similarity. Support for this hypothesis is given for example by the observation that when relational similarity competes with object similarity, young children make object-similarity responses, while with increasing age/experience responses become in line with relational similarity ( Rattermann and Gentner, 1998 ). This relational shift is thought to be not simply age-determined, but knowledge-related, which means it can occur at different ages in different domains. However, adults continue to use both object commonalities and relational commonalities in processing comparisons (see Rattermann and Gentner (1998) for discussion). In a recent computational study, Morrison et al. (2011) propose that the development of analogical reasoning during childhood is best explained by a combination of improved information processing, in particular working memory (which supports the maintenance of a greater number of relations) and inhibitory control (which supports the resistance to distraction by object commonalities), in combination with knowledge accretion.

Subsequent developmental changes have been observed during adolescence. Marini and Case (1994) show that a capacity for abstract reasoning begins to emerge in both social and non-social domains around the age of 11 or 12 and that further development of this ability is constrained by the number of abstract elements that can be coordinated at one time, independent of the particular content of these abstract elements. The task used required participants to predict the movement of a beam where both the weight and distance from the centre were relevant factors to be combined, or to predict a character's behaviour based on personality traits abstracted from a scenario. Similarly, Hatcher et al. (1990) observed development of abstract thinking across ages 10, 13 and 17, using the balance beam task and a verbal analogical reasoning task. Using conditional reasoning (if… then… statements) tasks, De Neys and Everaerts (2008) showed that improvements in conditional reasoning observed during adolescence were not only related to the start of the formal reasoning stage around age 12, but also depended on the ability to retrieve alternatives from memory and to inhibit these alternatives when necessary. The authors note that according to other studies (see De Neys and Everaerts, 2008 , for review) not all adolescents will show this ability to inhibit alternatives when they are irrelevant, leading to individual differences in conditional reasoning in adulthood.

These studies therefore suggest that logical reasoning depends on the interplay of the ability to maintain and manipulate information in working memory, the inhibition of irrelevant or incorrect alternatives, and domain-specific knowledge, in addition to the requirements of integrating multiple abstract representations.

3.3. Behavioural measures of relational reasoning development during adolescence

Although, as discussed above, relational processing can be recruited for analogical reasoning, a number of studies have focused more specifically on relational reasoning per se. The relational reasoning demands of a problem can be defined in terms of the number of dimensions, or sources of variation, that need to be considered simultaneously to reach a correct solution. Children under 5 years can solve 0- and 1-relational problems, but fail to solve 2-relational problems ( Halford et al., 1998 ). Early improvements in relational reasoning may reflect a shift from a focus on object similarity to relational similarity ( Rattermann and Gentner, 1998 ). Further improvements during childhood and adolescence may relate to increased relational knowledge or increased working memory capacity ( Crone et al., 2009 , Sternberg and Rifkin, 1979 ; see Richland et al., 2006 , for discussion). Indeed, Carpenter et al. (1990) argued that the processes leading to individual differences on relational reasoning tasks such as the Raven's matrices ( Raven, 1998 ) are primarily the ability to extract abstract relations and to dynamically manage a large set of problem-solving goals in working memory. Thus, for relational reasoning as for logical reasoning, working memory is thought to play an important role in supporting the maintenance of multiple abstract thoughts to allow their comparison and integration.

Prolonged developmental changes in relational reasoning into adolescence have been observed in a few behavioural studies (see also the next section on neuroimaging studies). For example, although their age groups were small, Rosso et al. (2004) showed that accuracy in the matrix reasoning section of the WAIS-III increased with age in the range 9–19-year old. We recently employed a relational reasoning task initially developed by Christoff et al. (2003) , to investigate relational reasoning development during adolescence in a large sample of healthy participants ( Dumontheil et al., 2010c , Experiment 1). The Shapes task required participants to assess whether two pairs of items, which could vary in shape and/or texture, differed or changed along the same dimension. The pairs of items could both show texture differences or both show shape differences, in which case participants were asked to respond yes, i.e. the pairs change along the same dimension (match). Alternatively, one pair of items differed in texture while the other pair differed in shape, in which case participants were asked to respond no, i.e. the pairs change along different dimensions (no-match). One hundred and seventy-nine female participants aged 7–27 years old participated in the study (same participants as Dumontheil et al. (2010b) ). When comparing the relational integration (or 2-relational) condition of the task to a condition requiring the processing of only 1-relation (either shape, or texture), the results showed a non-linear pattern of improvement in accuracy across age. After an early improvement in accuracy, with 9–11-year olds performing at adult levels, performance dipped in the 11–14-year olds and gradually improved again to adult levels throughout late adolescence. Further analysis of these data using a combined measure of reaction time over accuracy to take into account a potential speed-accuracy trade-off suggests that in fact 2-relational vs. 1-relational performance in this task improved progressively during late childhood and mid-adolescence, with a significant improvement between the 7–9 and 14–17 years old age groups on this combined measure.
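The match/no-match rule of the Shapes task can be expressed compactly. The following is an illustrative sketch under stated assumptions, not the published task code: the two stimulus dimensions ("shape", "texture") follow the task description, but the item representation as dictionaries is an assumption made for the example.

```python
# Sketch of the 2-relational (relational integration) judgement in the
# Shapes task (Christoff et al., 2003; Dumontheil et al., 2010c).
# Items are represented as dicts with hypothetical 'shape'/'texture' keys.

def changed_dimensions(item_a, item_b):
    """Return the set of dimensions along which a pair of items differs."""
    return {dim for dim in ("shape", "texture") if item_a[dim] != item_b[dim]}

def relational_match(pair1, pair2):
    """2-relational judgement: do both pairs change along the same dimension?

    Answering requires first extracting the relation within each pair
    (a 1-relational judgement), then integrating the two relations.
    """
    return changed_dimensions(*pair1) == changed_dimensions(*pair2)
```

The nesting makes the relational-complexity analysis explicit: each call to `changed_dimensions` corresponds to a 1-relational judgement, and the final comparison is the integration step that the 2-relational condition adds.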

3.4. Development of episodic memory

Episodic memory refers to memories for specific episodes previously experienced. Memories for such events are often accompanied by the phenomenal experience of recollection ( Tulving, 1983 ). Sander and colleagues have proposed that episodic memory relies on the combination of an associative and a strategic processing component ( Sander et al., 2012 ). Raj and Bell (2010) have extensively reviewed the development of episodic memory formation in childhood and similarly contrast binding and source memory with source monitoring. It is generally believed that by the age of 4 years, children have an episodic memory system in place ( Raj and Bell, 2010 ). The associative component, which relies primarily on mediotemporal and posterior brain regions (e.g. Simons and Spiers, 2003 ; see Raj and Bell, 2010 for review) is relatively mature by middle childhood ( Gathercole, 1998 , Rhodes et al., 2011 ). However, some studies still show continuing improvements in episodic memory performance between late childhood and adulthood ( DeMaster and Ghetti, 2013 , Lorsbach and Reimer, 2005 ), in particular in tasks requiring memory for combined features (e.g. objects and locations) ( Lorsbach and Reimer, 2005 ).

In contrast, the strategic component, which refers to top-down control processes involved in the organisation and monitoring of memory representations, mainly relies on prefrontal brain regions ( Miller and Cohen, 2001 ), particularly for tasks requiring binding of feature information and source memory retrieval. This component shows more prolonged development through childhood and adolescence and into young adulthood. For example, in a longitudinal study following children between 4 and 10 years of age, different developmental timecourses were observed for the memory for individual items vs. a combination of source and facts ( Riggins, 2014 ). Overall, younger children perform worse than adolescents on source discrimination tasks, and adolescents in turn perform worse than adults ( De Chastelaine et al., 2007 , DeMaster and Ghetti, 2013 , Ghetti et al., 2010 ). Adults also perform better than children and adolescents on tasks requiring a recollection judgement, i.e. requiring the specific contextual details of a memory episode, but not in tasks requiring a recognition judgement, i.e. knowing that an item has been previously encountered ( Billingsley et al., 2002 , Ofen et al., 2007 ). Sander et al. (2012) showed that, similarly to adults, children and adolescents could benefit from mnemonic instruction and training in an episodic memory task, highlighting the role of strategy implementation in episodic memory performance.

Executive function (EF) abilities have been suggested to play a role in episodic memory performance. Indeed, higher EF scores are associated with better performance on source memory tests, and lower rates of source memory errors, particularly lower false alarm rates. Frontal lobe function may support the integration of item and source information, content and context, during encoding, and may also support contextual memory retrieval by guiding the search and monitoring processes and inhibition of feelings of familiarity (see Raj and Bell, 2010 for review). The specific role of RLPFC in episodic memory may be in supporting the coordination of search and monitoring processes during episodic memory retrieval ( Spaniol et al., 2009 ), with BOLD signal increases in RLPFC possibly specific to intentional rather than incidental retrieval ( Fletcher and Henson, 2001 , Simons and Spiers, 2003 ).

Little research has investigated the role played by EF in episodic memory development. In young children (4 and 6 years old), Rajan et al. (2014) found that language ability and a composite measure of EF (combining inhibitory control, working memory and set shifting) uniquely predicted fact and source memory retrieval; however, when the EF measures were considered individually, the only significant association was that inhibitory control predicted source recall. Rhodes et al. (2011) found that 10 and 11-year old children, but not 8 and 9-year olds, showed a relationship between episodic memory and verbal working memory, which differed from the relationship between episodic memory and spatial working memory observed in adults, suggesting that the relationship between episodic memory and executive (frontal) components of episodic memory retrieval changes over the period of adolescence. Picard et al. (2012) also found that EF contributed to changes in the temporal and spatial context aspects of episodic memory during adolescence. Ruffman et al. (2001) found that in children aged 6, 8 and 10 years old, working memory was related to accuracy in source monitoring judgements, while inhibitory control uniquely predicted false alarm rates.

3.5. Development of prospective memory

Prospective memory (PM) is the ability to “remember to remember”, and is particularly difficult when an individual is simultaneously engaged in other activities. Research suggests that active strategic monitoring is more likely to be required when the PM cues are non-focal and non-distinctive, when the ongoing task is non-demanding and non-absorbing, when high importance is given to the PM task, and when retention intervals are short ( McDaniel and Einstein, 2007 ). Although a number of studies have now investigated the development of PM in childhood, fewer studies have investigated later development during adolescence ( McDaniel and Einstein, 2007 ).

Event-based PM can be observed in preschool-aged children (e.g. Guajardo and Best, 2000 ); however, performance tends to be poor when the ongoing task needs to be interrupted (e.g. Kliegel et al., 2008 ) or when the cue is non-focal, suggesting that children aged 5 or younger have not developed strategic monitoring processes or do not have the attentional resources to deploy them during ongoing task performance (see also McDaniel and Einstein, 2007 for review). Event-based PM continues to develop as children become more able to use external reminders to cue prospective remembering and to interrupt ongoing task performance when necessary ( Kliegel et al., 2008 ). Time-based PM requires greater strategic monitoring than event-based PM. Although time-based PM has also been observed in young children (5–7-year olds, Aberle and Kliegel, 2010 ), it tends overall to be associated with poorer performance than event-based PM (e.g. in 7–12-year olds, Yang et al., 2011 ). Time-based PM has been shown to continue to develop in late childhood and early adolescence ( Yang et al., 2011 ) as children become increasingly proficient at using time-checking strategies ( Kerns, 2000 , Mackinlay et al., 2009 , Voigt et al., 2011 ).

Developmental changes in PM performance are also observed further into adolescence, with more correct event-based PM responses made by adults than adolescents (aged 12 in Zöllig et al. (2007) ; aged 14 in Wang et al. (2006) ; but no difference observed with 13–14-year olds in Zimmermann and Meier (2006) ). In a large online study using a single event-based PM trial, Maylor and Logie (2010) found that performance peaked in late adolescence (16–19-year olds) and that females outperformed males in early adolescence. Ward et al. (2005) showed that adolescents detected more PM cues than children, performing similarly to adults; however, adolescents relied more than adults on a remembering strategy described as “Thought about all the time/looked out for the cues”, while adults more frequently used a strategy described as “Remembered only when saw the cues”. This indicates that to achieve a similar level of performance, adolescents needed to use a more active monitoring strategy than adults. In a realistic time-based PM task requiring participants to remember to take baking cakes out of an oven while playing a video game, 14-year-olds were better than 10-year-olds, even though both age groups were able to deploy strategic clock-monitoring strategies ( Ceci and Bronfenbrenner, 1985 ). Consistent with the greater need for strategic monitoring, the development of PM abilities during adolescence is mainly observed when non-focal cues are used ( Wang et al., 2011 ).

The realisation of delayed intentions is thought to rely on a prospective component, the detection or recognition of prospective cues, but also on a retrospective component, the retrieval of an intention from memory following the recognition of a prospective cue ( Simons et al., 2006 ). The retrospective component is likely to share many of the processes that support episodic memory, in particular the retrieval of contextual information from long-term memory. Zöllig et al. (2007) found that adolescents made more confusion errors than young adults, which the authors argue indicates that the retrospective component of PM is less efficient in adolescents. Similarly, Yang et al. (2011) reported that 7–8-year olds missed PM cues more often than 11–12-year olds, while 9–10-year olds showed a higher frequency of confusions (false-alarm and wrong responses) than 11–12-year olds, suggesting differential developmental patterns of the PM and retrospective memory components. Maylor and Logie (2010) similarly observed earlier development of PM performance compared to retrospective memory performance in a lifespan study.

Successful PM is thought to rely on a range of other executive skills; however, evidence is mixed regarding which aspects of EF are most relevant to PM development. A few studies have investigated this with time-based PM tasks. Aberle and Kliegel (2010) found that PM performance in 5–7-year olds was associated with processing speed and working memory. In older, 7–12-year old children, Mackinlay et al. (2009) found that the majority of the developmental changes in PM performance could be explained by planning and task switching performance measures. Mäntylä et al. (2007) found that children aged 8–12 years old achieved similar accuracy to adults in a time-based PM task by checking the clock more often; in children, inhibition and updating (combined within a single “supervision” factor), but not shifting, predicted clock monitoring frequency, whereas in adults these measures predicted timing error.

To summarise, similarly to the investigations of logical and relational reasoning, these studies highlight the role of working memory in supporting temporally abstract thinking. In addition, good performance on prospective and episodic memory tasks may depend on the use of appropriate strategies, themselves dependent on the ability to extract and evaluate abstract information regarding task rules, goals and performance monitoring. It is this higher level of abstraction, either in the relational or temporal domain, which is thought to be specific to RLPFC ( Badre, 2008 ).

4. Functional neuroimaging studies of abstract thinking development

This section reviews the functional MRI findings on the development of abstract thinking during adolescence. The focus will first be on research on relationally abstract thinking, reviewing studies which have investigated the orientation of attention towards self-generated thoughts and the manipulation and integration of relations. Second, I will discuss findings related to the processing of temporally abstract thoughts, reviewing studies of episodic memory retrieval and prospective memory, although the evidence is more limited for the latter.

4.1. Neuroimaging study of the development of the flexible selection of self-generated thoughts

On the basis of studies in adults, Burgess et al. (2007a) suggested that RPFC supports the flexible orientation of attention towards perceptually-derived information or self-generated thoughts. In a recent study, the Alphabet task described above, which contrasts SI and SO phases with very similar task requirements, was tested using functional MRI (fMRI) in a smaller group of participants aged 11–30 years. Two comparisons were performed using this task ( Dumontheil et al., 2010b ): SI vs. SO thought manipulation, and switches between SO and SI phases versus switches of the colour of the letter stimuli. In this sample of 37 participants, the difference in performance between SI and SO trials did not change with age; however, participants did become faster in the SO/SI switch trials with age. The comparison of SI vs. SO thought manipulation led to increased BOLD signal in a large fronto-parietal network of regions that extended into RLPFC bilaterally. Within this network, only the left anterior insula showed developmental changes, with a decrease in activation with age that was independent of individual differences in performance. The comparison of SO/SI switches versus colour switches revealed a much smaller network of brain regions, including the right superior RLPFC, precuneus and superior temporal gyrus ( Fig. 2c ). In this comparison, only the RLPFC cluster showed a trend for a decrease in activation with age, similarly not accounted for by individual differences in performance ( Fig. 2c ).

4.2. Neuroimaging studies of visuospatial relational reasoning development

Neuroimaging studies in adults have shown that a fronto-parietal network of brain regions is recruited during relational integration, i.e. when solving 2-relational problems, with activation in RLPFC, and in particular left RLPFC, specific to relational integration demands ( Bunge et al., 2009 , Christoff et al., 2003 , Smith et al., 2007 , Wendelken et al., 2012 ). Four recent studies have investigated the development of relational reasoning between late childhood and adolescence or adulthood using fMRI ( Crone et al., 2009 , Dumontheil et al., 2010c , Eslinger et al., 2009 , Wendelken et al., 2011 ). These four studies used paradigms of relational processing in the visuospatial domain. Dumontheil et al. (2010c) and Wendelken et al. (2011) used very similar tasks and compared 2-relational (i.e. relational integration), 1-relational, and fixation conditions. Crone et al. (2009) used problems derived from the Raven's Progressive Matrices ( Raven, 1998 ) and included an additional 0-relational condition and a simple arrow orientation task as baseline. Eslinger et al. (2009) used coloured geometrical shape sequences as stimuli and compared 2-relational and 1-relational conditions.

In terms of behaviour, Crone et al. (2009) found that 8–12-year olds made more errors, but were not slower, than 18–25-year olds in 2-relational relative to 1-relational trials; Dumontheil et al. (2010c, Experiment 2) found that 11–14-year olds responded faster than 14–18-year olds in 2-relational relative to 1-relational trials, but neither group differed from the adult group, and there was no age group difference in accuracy; Wendelken et al. (2011) did not observe age differences in 2-relational vs. 1-relational performance over the age range of 7–18 years using age as a continuous variable; and Eslinger et al. (2009) did not report analyses of performance changes in the 8–19-year age range they studied. Thus the performance findings of these studies are mixed, and performance was typically included as a covariate in the analyses.

Neuroimaging results of the first three studies, with a particular focus on the RLPFC findings, are described in Fig. 3 . Crone et al. (2009) found increased specificity for 2-relational vs. 1-relational problems between childhood and adulthood in the left RLPFC ( Fig. 3a ) in the later part of the trial period, and increased specificity for 2-relational vs. 1-relational problems with age within the child group, aged 8–12 years old. Performance was not included as a covariate in these analyses; however, the authors suggested that the increased BOLD signal observed in children's left RLPFC in 2-relational compared to 1-relational trials in the initial part of the trial may be associated with children's poorer performance in 2-relational trials. Dumontheil et al. (2010c) observed a trend for an increase in activation in the left RLPFC in 2-relational vs. 1-relational trials between early- and mid-adolescence, and a subsequent decrease in activation in this region between mid-adolescence and adulthood ( Fig. 3c ). The early- to mid-adolescence increase did not remain when performance measures were included as covariates, while the mid-adolescence to adulthood decrease was only partially accounted for by accuracy differences. Wendelken et al. (2011) found decreasing activation with age in 1-relational trials in the left RLPFC, which led to increased activation in 2-relational vs. 1-relational trials between the ages of 6 and 18 years ( Fig. 3b ). This developmental effect remained significant when performance was covaried. Finally, Eslinger et al. (2009) reported increases with age between late childhood and adolescence in the parietal cortex bilaterally and decreases with age across large parts of the frontal cortex, but no specific findings in RLPFC. The development of the relational integration of semantic stimuli will be described below, before a possible general pattern of developmental change observed in these studies is discussed.

Fig. 3

Increased specificity of left RLPFC activation for relational integration (2nd order vs. 1st order relational processing) during development. Although the three studies summarised here used slightly different tasks, methods and age groups, the overall pattern shows an increased specificity of left RLPFC activation, in particular between late childhood and mid-adolescence. (a) RLPFC activation observed in adults ( N = 17, age 18–25) and children ( N = 15, age 8–12) performing problems following the general form of the Raven's Progressive Matrices test ( Raven, 1998 ), with a varying number of dimensions to be integrated. On the left are shown activations related to 1st order relational processing (REL-1 > REL-0) and relational integration (REL-2 > REL-1) in adults ( p < .001 uncorrected) and children ( p < .005 uncorrected) in the 8–16 s interval of a timecourse analysis. On the right are plotted the timecourses of activation from left RLPFC regions of interest in adults and children. In the later part of the timecourses, there was a significant interaction between age group and condition (grey highlight), with activations greater in REL-2 than REL-1 in adults, and greater in REL-1 than REL-0 in children (adapted from Crone et al., 2009 ). (b) Left RLPFC activation observed in three groups of children and adolescents (total N = 85) performing a task requiring 1st or 2nd order visuospatial relational processing. Analyses using age as a continuous variable show a significant decrease in left RLPFC activation associated with 1st order relational processing only, resulting in a significant age × condition interaction (adapted from Wendelken et al., 2011 ). (c) Left hemisphere activation observed in a group of adult ( N = 13, age 22–30) and adolescent ( N = 24, age 11–18) participants performing a task similar to (b). In the left RLPFC, Relational > Control activation, i.e. that specific to 2nd vs. 1st order relational processing, increased marginally between early and mid-adolescence (#), while it decreased between mid-adolescence and adulthood (*) (adapted from Dumontheil et al., 2010a , Dumontheil et al., 2010b , Dumontheil et al., 2010c ).

4.3. Development of relational integration of semantic stimuli

Another study also investigated the development of relational integration, using an analogical reasoning paradigm requiring the integration of semantic information ( Wright et al., 2008 ). Stimuli were pictures of objects. In the analogical condition, participants were presented, for example, with a bee, a bee's nest and a spider, and had to pick the correct matching object (a spider's web) among other items. In the control semantic condition, participants had to pick the object most closely related to a presented target object (e.g. a baseball for a baseball bat). A group of 6–13-year old children and a group of 19–26-year old adults participated in this study. The children/young adolescents were overall slower and made more errors than the adults, and also made disproportionately more errors in the analogical problems. In addition, children's RTs were affected to a greater extent than adults' by lures which were semantically vs. perceptually related to one of the stimulus items. Overall, the comparison of analogical and semantic problems did not show increased BOLD signal in RLPFC. However, further analyses showed (1) increasing RLPFC activation with age in children for both semantic and analogical problems, and (2) in adulthood, greater right RLPFC activation associated with greater accuracy in analogical problems. The authors argue this suggests that RLPFC is first increasingly involved in the processing of both 1-relational (semantic) and 2-relational (analogical) problems, while in adulthood its activation becomes more specific to relational integration, i.e. the analogical problems. In addition, Wright et al. (2008) , similarly to Crone et al. (2009) , observed timecourse differences in RLPFC activity between the children and the adults, with later and more prolonged activation observed in children.

The use of a paradigm recruiting the manipulation of semantic relations raises the question of the role of verbal abilities in relational reasoning, including visuospatial reasoning. As discussed below, a recent study investigated the domain specificity of relational integration ( Wendelken et al., 2012 ), comparing visuospatial and semantic variants of the Shapes task described above. The results indicated that both tasks recruited left RLPFC specifically for the relational integration condition vs. the processing of two relations without integration. This left-hemisphere specificity of relational integration activity may be related to verbal recoding during relational reasoning. In terms of development, it has been shown that after age 7 children tend to recode visuospatial or pictorial information in a verbal format in working memory tasks ( Conrad, 1971 , Flavell et al., 1966 ), and that these processes are related to their use of self-regulatory private speech ( Al-Namlah et al., 2006 ). This shift to phonological recoding has been suggested to be part of a general transition towards verbal mediation of cognitive processes ( Ford and Silber, 1994 , Hitch et al., 1991 ). Articulatory suppression has been shown to affect performance on executive function tasks more broadly (e.g. in task switching ( Baddeley et al., 2001 ) or Tower of London tasks ( Wallace et al., 2009 )), and a diminished use of inner speech among individuals with autism spectrum disorders is thought to contribute to the executive dysfunction associated with these disorders ( Wallace et al., 2009 , Whitehouse et al., 2006 ). In addition, a large-scale lesion study in adults showed that performance deficits on the Raven's Colored Progressive Matrices, which is considered a non-verbal test of reasoning, were associated with lesions in temporal regions essential for language processing, as well as in the left inferior parietal lobule ( Baldo et al., 2010 ).

Current results therefore suggest that relational reasoning in adults relies on verbal recoding of the relations and on specific activations in the left RLPFC; however, whether verbal recoding becomes more prevalent with age during relational reasoning, as it does in certain EF tasks, has not yet been investigated, and more research will be necessary to explore these issues further.

4.4. Increasing specificity of RLPFC activation for relational integration during development

A common overall pattern across the studies described above was an increase in activation for 2-relational vs. 1-relational problems between childhood and adolescence, which may be specific to the left RLPFC. However, this pattern of increased specialisation may extend to a broader network of brain regions. Indeed, Crone et al. (2009) found that the left dorsolateral prefrontal cortex (DLPFC) and left parietal cortex showed a similar increased specialisation of activation for 2-relational vs. 1-relational trials when comparing children and adults. Wendelken et al. (2011) also found increased specialisation, although weaker, in the bilateral intraparietal lobules, but not in the DLPFC. Comparing adolescents to adults, Dumontheil et al. (2010c) did not find age effects in either DLPFC or parietal cortex. It may be that only more sensitive analyses, examining BOLD signal timecourses or including large numbers of child and adolescent participants, will be able to detect the specialisation of brain activation in these regions.

It is as yet unclear how much this increased specialisation may relate to changes in accuracy and reaction times in 2-relational trials. However, the pattern suggests specialisation of left RLPFC, and potentially DLPFC and parietal cortex for relational integration compared to relational processing during adolescence. Only one of these studies compared later adolescence to adulthood and the findings showed decreased activation in the 2-relational vs. 1-relational comparison ( Dumontheil et al., 2010c ), which was partly related to accuracy differences between these age groups.

The pattern of increasing specialisation of brain activation for relational integration was driven in some studies by decreasing activation for relational processing, which highlights the complexity of investigating fMRI data developmentally. In particular, it is unclear whether increased activation (e.g. in a working memory task, Klingberg et al., 2002 ) or decreased activation (e.g. in response inhibition tasks, Tamm et al., 2002 ) reflects “more efficient” neural processing. One interpretation is that increased activation reflects greater specialisation of a brain region for a particular cognitive process, while decreased activation may reflect the fact that, with more efficient neural processing in other brain regions or increased connectivity between regions, a particular brain region is no longer necessary for a particular cognitive process (e.g. RLPFC for the processing of single relations). In this context, as is true for fMRI studies in general, the specific contrast investigated is particularly relevant, for example whether one contrasts relational integration (2-Rel) with relational processing (1-Rel) or with a fixation control condition. A recent study in adults by Perfetti et al. (2009) , in which RLPFC did not show increased BOLD signal during a Raven reasoning task at the corrected threshold used, speaks to the fact that lower performance or abilities overall may be associated with less specific brain activations in fronto-parietal regions. Comparing participants of high and low fluid intelligence (gf), Perfetti et al. (2009) found that while the high gf group showed increased fronto-parietal activation in the analytical (more complex) problems compared to the figural problems, the low gf group showed greater activations than the high gf group in the figural condition, and a tendency for activations in the analytical condition to be lower than in the figural condition. In the visual analogy task described above, Wright et al. (2008) found that in adults the specificity of RLPFC activations for relational integration was positively correlated with accuracy on the task. In another study, high gf participants showed greater parietal activations than low gf participants in a relational integration task ( Lee et al., 2006 ). This latter result highlights the importance of processing in brain regions other than RLPFC for the performance of relational integration. The parietal cortex has been suggested to support the identification of the visuo-spatial relations that are the basis of relational integration ( Ferrer et al., 2009 ).

In summary, fMRI studies have demonstrated changes in RLPFC activation during adolescence during the manipulation and integration of self-generated thoughts and their relations. The overall pattern suggests increasing specialisation of activations in the left RLPFC in particular, but also in the DLPFC and parietal cortex, which are thought to support the processing of single relations. More work will be needed to assess how these observed functional changes relate to developmental changes in performance. One factor that has been proposed to play a role is brain structure, which will be discussed in Section 4.7.

4.5. RLPFC and episodic memory retrieval during development

RPFC has been suggested to play a role in the control, and possibly processing, of temporally extended representations ( Badre, 2008 , Fig. 1 ), as suggested by its increased activation during branching or multitasking ( Badre and D’Esposito, 2007 , Braver and Bongiolatti, 2002 , Koechlin et al., 2003 ), prospective memory ( Benoit et al., 2011 , Burgess et al., 2007b ), episodic memory, in particular episodic memory retrieval ( Dobbins et al., 2004 , Spaniol et al., 2009 , Turner et al., 2008 ), and mindwandering ( Christoff et al., 2009a , Christoff et al., 2004 , Dumontheil et al., 2010a , Schooler et al., 2011 ). Studies investigating the development of the neural correlates of episodic memory have tended to focus on the encoding phase rather than on retrieval ( Chiu et al., 2006 , Ghetti et al., 2010 , Ofen et al., 2007 ). However, a few very recent studies have investigated episodic memory retrieval using fMRI and event-related potentials (ERPs).

Findings regarding the development of the neural correlates of episodic memory in the hippocampus have been mixed. In contrast, more consistent findings have been observed in the frontal and parietal cortices thought to support memory retrieval (see DeMaster et al., 2013 for review). Paz-Alonso et al. (2008) focused on the development of true and false recognition and tested children aged 8 and 12 years old, and adults aged 19–23 years old. The results showed region-specific developmental changes in the MTL, bilateral DLPFC, posterior parietal cortex, and right RLPFC. Adults, but not children, exhibited the strongest right RLPFC activation for hits and for trials where a semantically-related lure was correctly rejected, i.e., according to the authors, those conditions in which monitoring was both required (due to the presentation of semantically relevant stimuli) and successful (leading to a correct response) ( Fig. 4a ).

Fig. 4

Developmental changes in RLPFC activation during episodic memory tasks. (a) Neural correlates of episodic memory retrieval. Top left: increased activation with age associated with hit trials compared to trials with correctly rejected semantically unrelated lures; top right: increased activation with age associated with trials where a semantically related (critical) lure vs. an unrelated lure is correctly identified; bottom: region of interest analysis suggesting that in adults right RLPFC is involved in the monitoring of performance during episodic memory retrieval, with greater activation associated with correctly recognised semantically relevant items (hits or critical lures). CR: correct rejections; FA: false alarms; aPFC: anterior prefrontal cortex (adapted from Paz-Alonso et al., 2008 ). (b) Region of interest analysis of left RLPFC activation during source memory retrieval. The condition × age group interaction was significant, revealing increased RLPFC activation for increasing amounts of recollected information (correct border = both the drawing and its border colour were remembered (source memory); incorrect border = the drawing but not its border colour was remembered (item memory); miss = error trial; correct rejection = drawing correctly identified as not presented before) in the adults, but not the children, who showed similar RLPFC recruitment across trial types (adapted from DeMaster and Ghetti, 2013 ). (c) Region of interest analysis of left RLPFC activation during source memory retrieval. The condition × age group interaction was significant, revealing increased RLPFC activation for increasing amounts of recollected information (correct spatial recall = both the drawing and its location were remembered (source memory); incorrect spatial recall = the drawing but not its location was remembered (item memory); miss = error trial; correct rejection = drawing correctly identified as not presented before) in the adults, with a difference between source and item memory in the 10–11-year olds, but activation for item memory only in the 8–9-year olds (adapted from DeMaster et al., 2013 ).

DeMaster and Ghetti (2013) scanned children aged 8–11 years old and adults aged 18–25 years old, who were asked whether a drawing shown on the screen had been presented before or not (item memory) and what colour the border of the drawing was during its first presentation (context or source memory). Activations associated with successful retrieval across age groups were observed in the right MTL, left posterior parietal cortex, left RLPFC and precuneus. In the RLPFC, activation in children was observed across conditions and was not specific to successful retrieval, while in adults activation was greater for trials where the colour-drawing pair was successfully remembered than for trials where the drawing was recognised but the colour not remembered, and these trials in turn showed greater activation than drawings correctly recognised as new ( Fig. 4b ).

In a second study, DeMaster et al. (2013) used a spatial context (drawing presented on the left or right of the screen) rather than a colour border and scanned children aged 8–9 or 10–11 years old and adults. Similarly to their previous study, DeMaster et al. (2013) observed an age × condition interaction in the left RLPFC (with a similar but weaker pattern in the right RLPFC). Adults showed greater activation for correct than incorrect source memory retrieval, and more activation for incorrect source memory retrieval (but correct old item recognition) than for correctly rejected new items ( Fig. 4c ). In 10–11-year olds, only the comparison of correct vs. incorrect source memory retrieval was significant, while in 8–9-year olds activation was greater for correctly recognised items than for items correctly identified as new ( Fig. 4c ). A similar pattern of developmental changes was observed in the left parietal cortex and precuneus, but the pattern differed in the insula and DLPFC. The similar pattern observed in the parietal cortex and RLPFC further reinforces the idea that these two regions interact strongly during abstract thinking, as suggested in the relational abstract thought studies described above and in Section 5 below. Although DeMaster et al. (2013) point out that these two regions have been associated with different cognitive processes in the past, they suggest that further work needs to be done to disentangle their roles during episodic memory retrieval development.

In contrast to the three studies described above ( Fig. 4 ), Güler and Thomas (2013) did not observe developmental changes in RLPFC during episodic memory retrieval. However, this study compared 9–10 and 12–13-year old children and did not include an adult group, which may have limited the size of the developmental effect. In addition, the paradigm used was a paired-associate picture memory task rather than a source memory paradigm. Developmental differences in activation associated with successful recall were instead observed in a more posterior part of the left middle frontal gyrus (area 46/47), the right middle temporal gyrus and cerebellum, the left inferior parietal lobule and the anterior cingulate gyrus ( Güler and Thomas, 2013 ).

To summarise, recent studies investigating episodic memory development using neuroimaging methods show prolonged development of the neural correlates of item and source memory retrieval between late childhood and adulthood, with evidence of increased sensitivity of RLPFC activation to specific components of episodic memory (e.g. source vs. item memory, old vs. new item) in adults compared to children.

4.6. Neuroimaging studies of prospective memory during development

Only two studies have investigated the neural correlates of PM development. Both used event-related PM paradigms and collected ERP data. Mattli et al. (2011) tested children (mean age 10.3 years) and younger adults (mean age 31.4 years) (as well as an older adult group not discussed here). The N300 component reflects greater negativity for PM hits than for PM misses and ongoing activity trials over the occipito–parietal region of the scalp, and is therefore thought to be associated with the detection of an event-based PM cue in the environment. Mattli et al. (2011) observed no difference between the age groups in N300 amplitude for PM hits versus ongoing trials; however, while adults showed greater N300 amplitude for PM hits than for PM misses, children did not. According to the authors, this suggests that in children cue detection was not necessarily associated with realisation of the intention, possibly reflecting a failure of executive processes associated with switching or disengaging from the ongoing activity. Conversely, a parietal positivity discriminated between PM hits and misses in children, but not in adults. No age group difference was found for a frontal positivity that also discriminated between PM hits and PM misses. In a study including adolescent participants, Zöllig et al. (2007) observed larger N300 amplitudes in adolescents than in adults when a PM intention had to be inhibited, and a larger parietal positivity between 600 and 800 ms when a PM intention had to be executed, as compared to ongoing trials. The latter effect is similar to that observed by Mattli et al. (2011) . Source analyses suggested differences in current density between adolescents and adults for PM execution in mostly posterior brain regions, while ongoing trials were associated with greater right middle frontal gyrus activation in adolescents, which may reflect anticipatory processing ( Simons et al., 2006 ).
However, adolescents also showed poorer performance in ongoing trials, limiting the inferences that can be made from these results. To summarise, very little neuroimaging research has been done to investigate the development of PM during late childhood and adolescence. Further work, including fMRI studies, will be necessary to inform our understanding of the role played by RLPFC during PM development.

5. Association between structural changes during development and abstract thinking

RLPFC undergoes substantial structural changes during adolescence (see Dumontheil et al., 2008 for review). Research on developmental changes in brain structure has tended to consist of whole-brain analyses and does not typically report analyses in anatomical subdivisions of the frontal cortex. Overall, the results show increases in white matter volume and decreases in grey matter volume with age in the frontal cortex during adolescence ( Barnea-Goraly et al., 2005 , Giedd et al., 1999 , Shaw et al., 2008 , Sowell et al., 1999 , Sowell et al., 2004 , Tamnes et al., 2010 , Westlye et al., 2010 ). Behavioural and functional changes during development, in particular during late childhood and adolescence, are often interpreted as a consequence of the structural changes that occur during this period ( Crone and Dahl, 2012 , Luna et al., 2010 , Spear, 2000 ). Decreases in functional activation are considered to reflect developmental reductions in grey matter volume, presumably related to synaptic pruning, while increases are thought to relate to improved and more localised task-specific processing, potentially facilitated by faster long-range connections due to increased axonal myelination and size ( Luna et al., 2010 ). Understanding the link between structural and functional changes is critical to understanding the mechanisms of neurocognitive development, yet very few studies have directly compared structural and functional data within the same individuals (e.g. Lu et al., 2009 , Olesen et al., 2003 , Van den Bos et al., 2012 ). The association between structural changes during development and relationally abstract thinking is described below, presenting data from recent studies that attempt to integrate brain and behavioural measures. No studies to date have investigated associations between brain structure and temporally abstract thinking during development.

Cortical thickness of RLPFC, in particular in females (e.g. Narr et al., 2007 ) and during adolescence (e.g. Shaw et al., 2006 ), has been shown to correlate positively with standardised intelligence quotient (IQ). IQ is typically measured using tests such as the Wechsler intelligence scales ( Wechsler, 1997 ), which include a variety of subtests assessing verbal and performance intelligence. Some of these subtests require the manipulation of self-generated and abstract thoughts; however, it is as yet unclear whether this accounts for the observed link between RLPFC structure and IQ ( Narr et al., 2007 , Shaw et al., 2006 ). The finding by Shaw et al. (2006) that IQ was associated with the developmental timecourse of cortical thickness changes, rather than with cortical thickness in early childhood or in adulthood, stresses the importance of studying developmental trajectories. However, very few research groups have the means to do so using large longitudinal samples, and most of the data discussed below are cross-sectional.

Using the datasets described above, collected while participants performed the Alphabet and Shapes tasks ( Dumontheil et al., 2010b , Dumontheil et al., 2010c ), we aimed to test the hypothesis that decreases in functional BOLD signal during adolescence may reflect the concomitant local decreases in grey matter volume. To do so, we extracted local grey and white matter volumes in the brain regions showing functional developmental changes and entered these data into multiple regression analyses. The results revealed that the decrease in superior RLPFC activation during switching between self-generated and perceptually-derived information was not accounted for by local structural changes ( Dumontheil et al., 2010b ). Analyses of the relational integration data from the Shapes task ( Dumontheil et al., 2010c ) provided a different picture, showing that the decreased BOLD signal between mid-adolescents and adults did not remain significant when local structural measures (and performance) were covaried. Further tests were performed to relate structural changes to the connectivity changes observed using dynamic causal modelling (DCM) ( Bazargani et al., 2014 ). Grey matter volume in RLPFC and fixed connectivity (i.e. connectivity in 1-relational trials) between frontal and insular regions were both found to decrease with age. RLPFC grey matter volume was further found to predict short-range fixed connectivity; however, no significant mediation of the effect of age on short-range fixed connectivity by RLPFC grey matter volume was observed ( Bazargani et al., 2014 ). RLPFC grey matter volume additionally predicted 2-relational vs. 1-relational accuracy ( Bazargani et al., 2014 ).
In the other study of relational integration development in children and adolescent participants described above, increased functional selectivity in the left RLPFC was partly accounted for by cortical thinning in the left inferior parietal lobule ( Wendelken et al., 2011 ), with a positive correlation between inferior parietal lobule thickness and activation in the left RLPFC in 1-relational trials.

The first two sets of results, obtained within the same participants, provide evidence of the complex relationships between developmental changes in task-related brain activity, performance, and local changes in brain structure. Overall, the results discussed above suggest that individual differences in grey matter, in RLPFC or the inferior parietal lobule, can play a role in the development of the functional networks supporting relational integration. There is less evidence for specific roles of individual differences or developmental changes in white matter in the development of relational reasoning. A recent study showed that developmental changes in whole-brain measures of white matter volume or fractional anisotropy predicted developmental improvements in visuospatial reasoning ability; however, this effect was mediated by processing speed and was not specific to fronto-parietal white matter tracts ( Ferrer et al., 2013 ). This suggests that, contrary to grey matter volume, the influence of structural developmental changes in white matter on reasoning ability may not be region-specific.

6. Questions for future research

6.1. Influence of puberty vs. chronological age

The role of puberty in the developing adolescent brain ( Blakemore et al., 2010 , Crone and Dahl, 2012 ), and whether changes observed during adolescence are a consequence of chronological age or of pubertal stage, has been the topic of a few recent studies investigating structural changes ( Goddings et al., 2014 ) and functional changes during a social cognition task ( Goddings et al., 2012 ). Although in the latter study the functional changes observed in the MPFC were related to age rather than puberty level (in contrast to the functional changes observed in the temporal cortex), very little is known about the effect of pubertal stage on the development of abstract thinking and of the lateral parts of the prefrontal cortex during adolescence. More generally, there is currently little evidence of gender differences in this age range in functional imaging data (e.g. Hatcher et al., 1990 , Wendelken et al., 2011 ); however, the available data are limited, as some studies only included participants of one gender (e.g. Dumontheil et al., 2010b , Dumontheil et al., 2010c ) and others did not test for potential gender differences (e.g. DeMaster and Ghetti, 2013 , Crone et al., 2009 ), likely because of sample size limitations. Structural neuroimaging studies, however, have shown that the RPFC is the region with the greatest difference in rates of cortical thinning between males and females between the ages of 9 and 22 years ( Raznahan et al., 2010 ), and that there are sex differences in the relationship between cortical thickness maturation in the RPFC and in the superior frontal cortex in the same age range ( Raznahan et al., 2011 ). These structural findings suggest that investigating the possible consequences of such differences, over chronological and pubertal development, for the maturation of RLPFC function is warranted.

6.2. Investigation of the role of RLPFC in the development of temporally abstract thinking

As mentioned above, RLPFC has been implicated in prospective memory, episodic memory retrieval and mindwandering, i.e. cognitive processes associated with the manipulation of temporally extended abstract information. Although recent neuroimaging work has started to investigate the neural correlates of episodic memory retrieval, only a couple of ERP studies have investigated PM, and no research has been done on mindwandering development. Future research on these topics will broaden our understanding of the development of adolescents’ ability to retrieve past experience and think about the future, and how these abilities relate to the control of attention towards perceptually-derived vs. self-generated thoughts.

6.3. Abstract thinking in the social domain: the role of medial RPFC

Anatomical studies investigating the cytoarchitectonic properties of RPFC (e.g. Öngür et al., 2003 ) and meta-analyses of fMRI data ( Gilbert et al., 2006b , Van Overwalle, 2009 ) suggest a distinction between the medial and lateral aspects of RPFC. Activations along the medial wall have mainly been observed in social cognition tasks, in particular those involving theory of mind, or mentalising, i.e. our ability to understand our own and other people's mental states (except in the most polar part of Brodmann area 10, see Gilbert et al., 2006b , Van Overwalle, 2009 ). In some situations another person's intention may be quite apparent from their overt behaviour, and our own mental states or feelings may be salient, e.g. via increased heart rate, sweating or stomach-ache in response to stress. In such cases, mentalising would rely on perceptually-derived information. In other situations, one may need to retrieve a friend's past behaviour from episodic memory, or to retrieve social scripts and semantic information, in order to judge how one should respond to a friend's comment or behave in a novel social situation. In such cases, one would need to manipulate and integrate self-generated information. Along these lines, Van Overwalle (2009) in his review describes MPFC “as a module that integrates social information across time and allows reflection and representation of traits and norms, and presumably also of intentionality, at a more abstract cognitive level”.

Of particular interest for further research would therefore be the functional relationship between RLPFC and MPFC during abstract thinking, and whether there is anything special about the reasoning and manipulation of social vs. non-social information. A couple of recent studies speak to this. In one study, the storage and manipulation of social information in working memory was associated with activations in both the typical lateral fronto-parietal network associated with working memory and regions of the social brain, including the MPFC and temporo-parietal junction ( Meyer et al., 2012 ). In contrast, the other study, using a relational reasoning task on social information (how pleasant or unpleasant the participant or a participant's friend finds a particular concept), did not observe greater medial PFC activation during relational integration compared to the manipulation of single relations, but did observe left RLPFC activation, consistent with the relational integration studies reported above ( Raposo et al., 2011 ). Note however that neither study included a non-social comparison condition, which would be needed to assess activation patterns that are specific to the manipulation of self-generated information of a social nature.

In terms of development, adolescents typically show increased MPFC activation during social cognition tasks ( Blakemore, 2008 , Crone and Dahl, 2012 ), although we recently showed that a pattern of increasing specialisation for perspective taking, compared to the processing of social stimuli, could be observed between adolescence and adulthood ( Dumontheil et al., 2012 ). Touching on the relationship between abstract thinking about social vs. non-social information, an older study of participants aged 10, 13 and 17 years reported complex links between abstract reasoning and self- or other-mentalising measures, which were found to differ according to sex ( Hatcher et al., 1990 ). Finally, results of a recent qualitative study suggest that older teenagers coordinate an increasing number of psychological components while telling stories about their family and themselves, and in so doing create increasingly abstract and coherent psychological profiles of themselves and others ( Mckeough and Malcolm, 2010 ). A better understanding of the link between abstract thinking and social cognition during development may thus inform our understanding of the development of the self-concept during adolescence.

7. Training studies and implications for education

Fluid intelligence can be defined as the use of deliberate mental operations to solve novel problems. These mental operations include drawing inferences, concept formation, classification, generating and testing hypotheses, identifying relations, comprehending implications, problem solving, extrapolating, and transforming information. Fluid intelligence is thus tightly linked to abstract thinking and relational integration ( Ferrer et al., 2009 ). It is thought to be an essential component of cognitive development ( Goswami, 1992 ) and the basis for the acquisition of abilities in various domains during childhood and adolescence ( Blair, 2006 ; see Ferrer et al., 2009 for review). Fluid intelligence in childhood predicts achievement at school (e.g. in maths during early adolescence ( Primi et al., 2010 )), at university and in cognitively demanding occupations ( Gottfredson, 1997 ). Fluid intelligence is therefore a predictor of learning, especially in novel and complex situations. Consequently, a better understanding of the development of abstract thinking and reasoning during late childhood and adolescence, both in terms of behaviour and neuroscience, may have implications for education.

Of particular relevance are recent studies assessing the training of abstract thinking or reasoning skills. A few studies have investigated fluid reasoning training during childhood. For example, computerised non-verbal reasoning training was shown to improve fluid intelligence in a large sample of 4-year-olds ( Bergman Nutley et al., 2011 ), and fluid reasoning training emphasising planning and relational integration led to substantial improvement in performance IQ, but not speed of reasoning, in children aged 7–9 years from low socioeconomic backgrounds ( Mackey et al., 2011 ). A couple of studies in young adults further report that students taking a US Law School Admission Test (LSAT) course offering 70 h of reasoning training showed a strengthening of fronto-parietal and parietal-striatal resting state connectivity compared to matched control participants ( Mackey et al., 2013 ), as well as changes in white matter structure in the frontal and parietal lobes ( Mackey et al., 2012 ). Very little work has been done investigating the training of reasoning in adolescents, although Chapman and Gamino (2008) have developed the Strategic Memory and Reasoning Training (SMART) programme, designed to improve top-down reasoning skills. The aim of this programme is to teach children how to learn rather than what to learn, by supporting higher-order abstraction of meaning from incoming details and world knowledge, and there is promising evidence that this training programme leads to improved gist-reasoning and fact-learning ability ( Gamino et al., 2010 ).

Whether children and adolescents may benefit more from training than adults will be an important area of research. Relatively little is currently known about developmental differences in brain plasticity in response to training interventions; however, research in this domain has great potential for tailoring appropriate training interventions to different age groups (see Jolles and Crone, 2012 for discussion). Both childhood and adolescence may be “sensitive periods” for teaching, as significant brain reorganisation takes place during these periods. Perhaps the aims of adolescents’ education might usefully include a focus on abilities that are controlled by the parts of the brain that undergo most change during adolescence, including those described in this review: abstract thinking and reasoning, and the ability to focus on one's own thoughts in spite of environmental distraction. However, training interventions may be limited by the current level of structural brain development and cognitive capacity (as pointed out in Jolles and Crone, 2012 ), in particular those interventions based on strategy rather than repeated performance.

8. Conclusion

Rostrolateral prefrontal cortex supports a wide range of cognitive processes, which may have in common a requirement for the retrieval, maintenance, manipulation and/or integration of self-generated, or stimulus-independent, thoughts, considered broadly here as abstract thoughts, either relationally or temporally abstract. This review focused on summarising the evidence from behavioural and neuroimaging studies of the development of RLPFC and its associated functions. Behavioural studies have shown prolonged changes in the speed and accuracy of attending towards and processing self-generated information, in particular in reasoning tasks. These developmental changes appear to build on working memory and inhibitory control functions, as well as on the acquisition of domain-specific knowledge. This dependence on the maturation of other aspects of cognition, including working memory and inhibitory control, which rely on more posterior regions of the frontal cortex, reinforces the idea that the maturation of RLPFC function is relatively more protracted. Certain aspects of episodic memory and prospective memory, namely those that rely on the implementation of strategies for recollecting source memory, and for time-checking in prospective memory tasks, also continue to develop during adolescence. Neuroimaging evidence suggests a possible developmental pattern of increasing specialisation of RLPFC for the integration of relational information, with complex relationships between developmental changes in structure, performance and brain activation, and increasing specialisation for the retrieval of source and item memory information compared to the processing of new items.
A strong relationship between RLPFC and the parietal cortex was apparent across tasks, and further work, in particular using connectivity analyses, may inform our understanding of how the interplay between these brain regions permits the increasingly successful integration of relationally and temporally abstract thoughts over development. Future research could inform our understanding of development of reasoning and abstract thinking in the social domain, and whether functions associated with the RPFC could be trained, with potential benefits in the domain of education.

Acknowledgements

I thank Prof. Uta Frith for inviting this review and Prof. Sarah-Jayne Blakemore for her continuing support.

Available online 12 August 2014

  • Aberle I., Kliegel M. Time-based prospective memory performance in young children. Eur. J. Dev. Psychol. 2010; 7:419–431.
  • Al-Namlah A.S., Fernyhough C., Meins E. Sociocultural influences on the development of verbal mediation: private speech and phonological recoding in Saudi Arabian and British samples. Dev. Psychol. 2006; 42:117–131.
  • Amaral D.G., Price J.L. Amygdalo-cortical projections in the monkey (Macaca fascicularis). J. Comp. Neurol. 1984; 230:465–496.
  • Amati D., Shallice T. On the emergence of modern humans. Cognition. 2007; 103:358–385.
  • Amodio D.M., Frith C.D. Meeting of minds: the medial frontal cortex and social cognition. Nat. Rev. Neurosci. 2006; 7:268–277.
  • Andersen R.A., Asanuma C., Cowan W.M. Callosal and prefrontal associational projecting cell populations in area 7A of the macaque monkey: a study using retrogradely transported fluorescent dyes. J. Comp. Neurol. 1985; 232:443–455.
  • Anderson V.A., Anderson P., Northam E., Jacobs R., Catroppa C. Development of executive functions through late childhood and adolescence in an Australian sample. Dev. Neuropsychol. 2001; 20:385–406.
  • Arikuni T., Sako H., Murata A. Ipsilateral connections of the anterior cingulate cortex with the frontal and medial temporal cortices in the macaque monkey. Neurosci. Res. 1994; 21:19–39.
  • Azuar C., Reyes P., Slachevsky A., Volle E., Kinkingnehun S., Kouneiher F., Bravo E., Dubois B., Koechlin E., Levy R. Testing the model of caudo-rostral organization of cognitive control in the human with frontal lesions. Neuroimage. 2014; 84:1053–1060.
  • Bachevalier J., Meunier M., Lu M.X., Ungerleider L.G. Thalamic and temporal cortex input to medial prefrontal cortex in rhesus monkeys. Exp. Brain Res. 1997; 115:430–444.
  • Baddeley A., Chincotta D., Adlam A. Working memory and the control of action: evidence from task switching. J. Exp. Psychol. Gen. 2001; 130:641–657.
  • Badre D. Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cognit. Sci. 2008; 12:193–200.
  • Badre D., D’Esposito M. Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex. J. Cognit. Neurosci. 2007; 19:2082–2099.
  • Badre D., D’Esposito M. Is the rostro-caudal axis of the frontal lobe hierarchical? Nat. Rev. Neurosci. 2009; 10:659–669.
  • Badre D., Wagner A.D. Selection, integration, and conflict monitoring; assessing the nature and generality of prefrontal cognitive control mechanisms. Neuron. 2004; 41:473–487.
  • Baldo J.V., Bunge S.A., Wilson S.M., Dronkers N.F. Is relational reasoning dependent on language? A voxel-based lesion symptom mapping study. Brain Lang. 2010; 113:59–64.
  • Barban F., Carlesimo G.A., Macaluso E., Caltagirone C., Costa A. Functional brain activity within the medial and lateral portion of BA10 during a prospective memory task. Behav. Neurol. 2013; 26:207–209.
  • Barnea-Goraly N., Menon V., Eckert M., Tamm L., Bammer R., Karchemskiy A., Dant C.C., Reiss A.L. White matter development during childhood and adolescence: a cross-sectional diffusion tensor imaging study. Cereb. Cortex. 2005; 15:1848–1854.
  • Bazargani N., Hillebrandt H., Christoff K., Dumontheil I. Developmental changes in effective connectivity associated with relational reasoning. Hum. Brain Mapp. 2014; 35:3262–3276.
  • Bengtsson S.L., Haynes J.-D., Sakai K., Buckley M.J., Passingham R.E. The representation of abstract task rules in the human prefrontal cortex. Cereb. Cortex. 2009; 19:1929–1936.
  • Benoit R.G., Gilbert S.J., Frith C.D., Burgess P.W. Rostral prefrontal cortex and the focus of attention in prospective memory. Cereb. Cortex. 2011; 22:1876–1886.
  • Bergman Nutley S., Söderqvist S., Bryde S., Thorell L.B., Humphreys K., Klingberg T. Gains in fluid intelligence after training non-verbal reasoning in 4-year-old children: a controlled, randomized study. Dev. Sci. 2011; 14:591–601.
  • Billingsley R., Smith M., McAndrews M. Developmental patterns in priming and familiarity in explicit recollection. 2002; 82:251–277.
  • Blair C. How similar are fluid cognition and general intelligence? A developmental neuroscience perspective on fluid cognition as an aspect of human cognitive ability. Behav. Brain Sci. 2006; 29:109–125 (discussion 125–160).
  • Blakemore S.J. The social brain in adolescence. Nat. Rev. Neurosci. 2008; 9:267–277.
  • Blakemore S.-J., Burnett S., Dahl R.E. The role of puberty in the developing adolescent brain. Hum. Brain Mapp. 2010; 31:926–933.
  • Booth J.R., Burman D.D., Meyer J.R., Lei Z., Trommer B.L., Davenport N.D., Li W., Parrish T.B., Gitelman D.R., Mesulam M.M. Neural development of selective attention and response inhibition. Neuroimage. 2003; 20:737–751.
  • Botvinick M.M. Hierarchical models of behavior and prefrontal function. Trends Cognit. Sci. 2008; 12:201–208.
  • Braver T.S., Bongiolatti S.R. The role of frontopolar cortex in subgoal processing during working memory. Neuroimage. 2002; 15:523–536.
  • Braver T.S., Reynolds J.R., Donaldson D.I. Neural mechanisms of transient and sustained cognitive control during task switching. Neuron. 2003; 39:713–726.
  • Bunge S.A., Dudukovic N.M., Thomason M.E., Vaidya C.J., Gabrieli J.D. Immature frontal lobe contributions to cognitive control in children: evidence from fMRI. Neuron. 2002; 33:301–311.
  • Bunge S.A., Helskog E.H., Wendelken C. Left, but not right, rostrolateral prefrontal cortex meets a stringent test of the relational integration hypothesis. Neuroimage. 2009; 46:338–342.
  • Burgess P.W. Strategy application disorder: the role of the frontal lobes in human multitasking. Psychol. Res. 2000; 63:279–288.
  • Burgess P.W., Alderman N., Volle E., Benoit R.G., Gilbert S.J. Mesulam's frontal lobe mystery re-examined. Restor. Neurol. Neurosci. 2009; 27:493–506.
  • Burgess P.W., Dumontheil I., Gilbert S.J. The gateway hypothesis of rostral prefrontal cortex (area 10) function. Trends Cognit. Sci. 2007; 11:290–298.
  • Burgess P.W., Dumontheil I., Gilbert S.J., Okuda J., Schölvinck M.L., Simons J.S. On the role of rostral prefrontal cortex (area 10) in prospective memory. In: Kliegel M., McDaniel M.A., Einstein G.O., editors. Prospective Memory: Cognitive, Neuroscience, Developmental, and Applied Perspectives. Erlbaum; Mahwah: 2007.
  • Carpenter P.A., Just M.A., Shell P. What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices Test. Psychol. Rev. 1990; 97 :404–431. [ PubMed ] [ Google Scholar ]
  • Case R. Academic Press; New York: 1985. Intellectual Development: Birth to Adulthood. [ Google Scholar ]
  • Ceci S.J., Bronfenbrenner U. Don’t forget to take the cupcakes out of the oven: prospective memory, strategic time-monitoring, and context. Child Dev. 1985; 56 :152–164. [ PubMed ] [ Google Scholar ]
  • Chapman S.B., Gamino J.F. Center for Brain Health; Dallas, TX: 2008. Strategic Memory and Reasoning Training (SMART) [ Google Scholar ]
  • Chen Z., Daehler M.W. Intention and outcome: key components of causal structure facilitating mapping in children's analogical transfer. J. Exp. Child Psychol. 1992; 53 :237–257. [ PubMed ] [ Google Scholar ]
  • Chen Z., Sanchez R.P., Campbell T. From beyond to within their grasp: the rudiments of analogical problem solving in 10- and 13-month-olds. Dev. Psychol. 1997; 33 :790–801. [ PubMed ] [ Google Scholar ]
  • Chiu C.-Y.P., Schmithorst V.J., Brown R.D., Holland S.K., Dunn S. Making memories: a cross-sectional investigation of episodic memory encoding in childhood using FMRI. Dev. Neuropsychol. 2006; 29 :321–340. [ PubMed ] [ Google Scholar ]
  • Christoff K., Gabrieli J.D.E. The frontopolar cortex and human cognition: evidence for a rostrocaudal hierarchical organization within the human prefrontal cortex. Psychobiology. 2000; 28 :168–186. [ Google Scholar ]
  • Christoff K., Gordon A.M., Smallwood J., Smith R., Schooler J.W. Experience sampling during fMRI reveals default network and executive system contributions to mind wandering. Proc. Natl. Acad. Sci. USA. 2009; 106 :8719–8724. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Christoff K., Keramatian K., Gordon A.M., Smith R., Mädler B. Prefrontal organization of cognitive control according to levels of abstraction. Brain Res. 2009; 1286 :94–105. [ PubMed ] [ Google Scholar ]
  • Christoff K., Ream J.M., Gabrieli J.D. Neural basis of spontaneous thought processes. Cortex. 2004; 40 :623–630. [ PubMed ] [ Google Scholar ]
  • Christoff K., Ream J.M., Geddes L.P., Gabrieli J.D. Evaluating self-generated information: anterior prefrontal contributions to human cognition. Behav. Neurosci. 2003; 117 :1161–1168. [ PubMed ] [ Google Scholar ]
  • Conrad R. The chronology of the development of covert speech in children. Dev. Psychol. 1971; 5 :398–405. [ Google Scholar ]
  • Crone E.A. Executive functions in adolescence: inferences from brain and behavior. Dev. Sci. 2009; 12 :825–830. [ PubMed ] [ Google Scholar ]
  • Crone E.A., Dahl R.E. Understanding adolescence as a period of social-affective engagement and goal flexibility. Nat. Rev. Neurosci. 2012; 13 :636–650. [ PubMed ] [ Google Scholar ]
  • Crone E.A., Wendelken C., Donohue S., van Leijenhorst L., Bunge S.A. Neurocognitive development of the ability to manipulate information in working memory. Proc. Natl. Acad. Sci. USA. 2006; 103 :9315–9320. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Crone E.A., Wendelken C., van Leijenhorst L., Honomichl R.D., Christoff K., Bunge S.A. Neurocognitive development of relational reasoning. Dev. Sci. 2009; 12 :55–66. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Daehler M.W., Chen Z. Protagonist, theme, and goal object: effects of surface features on analogical transfer. Cognit. Dev. 1993; 8 :211–229. [ Google Scholar ]
  • De Chastelaine M., Friedman D., Cycowicz Y.M. The development of control processes supporting source memory discrimination as revealed by event-related potentials. J. Cognit. Neurosci. 2007; 19 :1286–1301. [ PubMed ] [ Google Scholar ]
  • De Luca C.R., Wood S.J., Anderson V., Buchanan J.-A., Proffitt T.M., Mahony K., Pantelis C. Normative data from the CANTAB. I: Development of executive function over the lifespan. J. Clin. Exp. Neuropsychol. 2003; 25 :242–254. [ PubMed ] [ Google Scholar ]
  • De Neys W., Everaerts D. Developmental trends in everyday conditional reasoning: the retrieval and inhibition interplay. J. Exp. Child Psychol. 2008; 100 :252–263. [ PubMed ] [ Google Scholar ]
  • DeMaster D., Pathman T., Ghetti S. Development of memory for spatial context: hippocampal and cortical contributions. Neuropsychologia. 2013; 51 :2415–2426. [ PubMed ] [ Google Scholar ]
  • DeMaster D.M., Ghetti S. Developmental differences in hippocampal and cortical contributions to episodic retrieval. Cortex. 2013; 49 :1482–1493. [ PubMed ] [ Google Scholar ]
  • Dobbins I.G., Simons J.S., Schacter D.L. fMRI evidence for separable and lateralized prefrontal memory monitoring processes. J. Cognit. Neurosci. 2004; 16 :908–920. [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Burgess P.W., Blakemore S.-J. Development of rostral prefrontal cortex and cognitive and behavioural disorders. Dev. Med. Child Neurol. 2008; 50 :168–181. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Gilbert S.J., Frith C.D., Burgess P.W. Recruitment of lateral rostral prefrontal cortex in spontaneous and task-related thoughts. Q. J. Exp. Psychol. 2010; 63 :1740–1756. [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Hassan B., Gilbert S.J., Blakemore S.-J. Development of the selection and manipulation of self-generated thoughts in adolescence. J. Neurosci. 2010; 30 :7664–7671. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Hillebrandt H., Apperly I.A., Blakemore S.-J. Developmental differences in the control of action selection by social information. J. Cognit. Neurosci. 2012; 24 :2080–2095. [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Houlton R., Christoff K., Blakemore S.-J. Development of relational reasoning during adolescence. Dev. Sci. 2010; 13 :F15–F24. [ PubMed ] [ Google Scholar ]
  • Dumontheil I., Thompson R., Duncan J. Assembly and use of new task rules in fronto-parietal cortex. J. Cognit. Neurosci. 2011; 23 :168–182. [ PubMed ] [ Google Scholar ]
  • Eslinger P.J., Blair C., Wang J., Lipovsky B., Realmuto J., Baker D., Thorne S., Gamson D., Zimmerman E., Rohrer L., Yang Q.X. Developmental shifts in fMRI activations during visuospatial relational reasoning. Brain Cognit. 2009; 69 :1–10. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ferrer E., O’Hare E.D., Bunge S.A. Fluid reasoning and the developing brain. Front. Neurosci. 2009; 3 :46–51. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ferrer E., Whitaker K.J., Steele J.S., Green C.T., Wendelken C., Bunge S.A. White matter maturation supports the development of reasoning ability through its influence on processing speed. Dev. Sci. 2013; 16 :941–951. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fischer K.W. A theory of cognitive development: the control and construction of hierarchies of skills. Psychol. Rev. 1980; 87 :477–531. [ Google Scholar ]
  • Flavell J.H., Beach D.R., Chinsky J.M. Spontaneous verbal rehearsal in a memory task as a function of age. Child Dev. 1966; 37 :283–299. [ PubMed ] [ Google Scholar ]
  • Fleming S.M., Dolan R.J. The neural basis of metacognitive ability. Philos. Trans. R. Soc. B Biol. Sci. 2012; 367 :1338–1349. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fleming S.M., Weil R.S., Nagy Z., Dolan R.J., Rees G. Relating introspective accuracy to individual differences in brain structure. Science. 2010; 329 :1541–1543. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fletcher P.C., Henson R.N. Frontal lobes and human memory: insights from functional neuroimaging. Brain. 2001; 124 :849–881. [ PubMed ] [ Google Scholar ]
  • Ford S., Silber K.P. Working memory in children: a developmental approach to the phonological coding of pictorial material. Br. J. Dev. Psychol. 1994; 12 :165–175. [ Google Scholar ]
  • Gamino J.F., Chapman S.B., Hull E.L., Lyon G.R. Effects of higher-order cognitive strategy training on gist-reasoning and fact-learning in adolescents. Front. Psychol. 2010; 1 :188. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gathercole S.E. The development of memory. J. Child Psychol. Psychiatry. 1998; 39 :3–27. [ PubMed ] [ Google Scholar ]
  • Geake J.G., Hansen P.C. Neural correlates of intelligence as revealed by fMRI of fluid analogies. Neuroimage. 2005; 26 :555–564. [ PubMed ] [ Google Scholar ]
  • Giedd J.N., Blumenthal J., Jeffries N.O., Castellanos F.X., Liu H., Zijdenbos A., Paus T., Evans A.C., Rapoport J.L. Brain development during childhood and adolescence: a longitudinal MRI study. Nat. Neurosci. 1999; 2 :861–863. [ PubMed ] [ Google Scholar ]
  • Gilbert S.J., Bird G., Brindley R., Frith C.D., Burgess P.W. Atypical recruitment of medial prefrontal cortex in autism spectrum disorders: an fMRI study of two executive function tasks. Neuropsychologia. 2008; 46 :2281–2291. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gilbert S.J., Frith C.D., Burgess P.W. Involvement of rostral prefrontal cortex in selection between stimulus-oriented and stimulus-independent thought. Eur. J. Neurosci. 2005; 21 :1423–1431. [ PubMed ] [ Google Scholar ]
  • Gilbert S.J., Spengler S., Simons J.S., Frith C.D., Burgess P.W. Differential functions of lateral and medial rostral prefrontal cortex (area 10) revealed by brain-behavior associations. Cereb. Cortex. 2006; 16 :1783–1789. [ PubMed ] [ Google Scholar ]
  • Gilbert S.J., Spengler S., Simons J.S., Steele J.D., Lawrie S.M., Frith C.D., Burgess P.W. Functional specialization within rostral prefrontal cortex (area 10): a meta-analysis. J. Cognit. Neurosci. 2006; 18 :932–948. [ PubMed ] [ Google Scholar ]
  • Gilbert S.J., Williamson I.D.M., Dumontheil I., Simons J.S., Frith C.D., Burgess P.W. Distinct regions of medial rostral prefrontal cortex supporting social and nonsocial functions. Soc. Cognit. Affect. Neurosci. 2007; 2 :217–226. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ghetti S., DeMaster D.M., Yonelinas A.P., Bunge S.A. Developmental differences in medial temporal lobe function during memory encoding. J. Neurosci. 2010; 30 :9548–9556. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gläscher J., Rudrauf D., Colom R., Paul L.K., Tranel D., Damasio H., Adolphs R. Distributed neural system for general intelligence revealed by lesion mapping. Proc. Natl. Acad. Sci. USA. 2010; 107 :4705–4709. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Goddings A.-L., Burnett Heyes S., Bird G., Viner R.M., Blakemore S.-J. The relationship between puberty and social emotion processing. Dev. Sci. 2012; 15 :801–811. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Goddings A.-L., Mills K.L., Clasen L.S., Giedd J.N., Viner R.M., Blakemore S.-J. The influence of puberty on subcortical brain development. Neuroimage. 2014; 88 :242–251. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Goswami U. Lawrence Erlbaum; Hillsdale, NJ: 1992. Analogical Reasoning in Children. [ Google Scholar ]
  • Gottfredson L.S. Why g matters: the complexity of everyday life. Intelligence. 1997; 24 :79–132. [ Google Scholar ]
  • Green A.E., Fugelsang J.A., Kraemer D.J., Shamosh N.A., Dunbar K.N. Frontopolar cortex mediates abstract integration in analogy. Brain Res. 2006; 1096 :125–137. [ PubMed ] [ Google Scholar ]
  • Guajardo N.R., Best D.L. Do preschoolers remember what to do? Incentive and external cues in prospective memory. Cognit. Dev. 2000; 15 :75–97. [ Google Scholar ]
  • Güler O.E., Thomas K.M. Developmental differences in the neural correlates of relational encoding and recall in children: an event-related fMRI study. Dev. Cogn. Neurosci. 2013; 3 :106–116. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Halford G.S. Erlbaum; Hillsdale, NJ: 1982. The Development of Thought. [ Google Scholar ]
  • Halford G.S., Wilson W.H., Phillips S. Processing capacity defined by relational complexity: implications for comparative, developmental, and cognitive psychology. Behav. Brain Sci. 1998; 21 :803–831. discussion 831. [ PubMed ] [ Google Scholar ]
  • Hampshire A., Thompson R., Duncan J., Owen A.M. Lateral prefrontal cortex subregions make dissociable contributions during fluid reasoning. Cereb. Cortex. 2011; 21 :1–10. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hatcher R., Hatcher S., Berlin M., Okla K., Richards J. Psychological mindedness and abstract reasoning in late childhood and adolescence: an exploration using new instruments. J. Youth Adolesc. 1990; 19 :307–326. [ PubMed ] [ Google Scholar ]
  • Hitch G.J., Halliday M.S., Schaafstal A.M., Heffernan T.M. Speech, inner speech, and the development of short-term memory: effects of picture labeling on recall. J. Exp. Child Psychol. 1991; 51 :220–234. [ PubMed ] [ Google Scholar ]
  • Holyoak K.J., Junn E.N., Billman D.O. Development of analogical problem-solving skill. Child Dev. 1984; 55 :2042–2055. [ PubMed ] [ Google Scholar ]
  • Huizinga M., Dolan C.V., van der Molen M.W. Age-related change in executive function: developmental trends and a latent variable analysis. Neuropsychologia. 2006; 44 :2017–2036. [ PubMed ] [ Google Scholar ]
  • Jolles D.D., Crone E.A. Training the developing brain: a neurocognitive perspective. Front. Hum. Neurosci. 2012; 6 :76. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kerns K.A. The CyberCruiser: an investigation of development of prospective memory in children. J. Int. Neuropsychol. Soc. 2000; 6 :62–70. [ PubMed ] [ Google Scholar ]
  • Kliegel M., Mackinlay R., Jäger T. Complex prospective memory: development across the lifespan and the role of task interruption. Dev. Psychol. 2008; 44 :612–617. [ PubMed ] [ Google Scholar ]
  • Klingberg T., Forssberg H., Westerberg H. Increased brain activity in frontal and parietal cortex underlies the development of visuospatial working memory capacity during childhood. J. Cognit. Neurosci. 2002; 14 :1–10. [ PubMed ] [ Google Scholar ]
  • Koechlin E., Jubault T. Broca's area and the hierarchical organization of human behavior. Neuron. 2006; 50 :963–974. [ PubMed ] [ Google Scholar ]
  • Koechlin E., Ody C., Kouneiher F. The architecture of cognitive control in the human prefrontal cortex. Science. 2003; 302 :1181–1185. [ PubMed ] [ Google Scholar ]
  • Koechlin E., Summerfield C. An information theoretical approach to prefrontal executive function. Trends Cognit. Sci. 2007; 11 :229–235. [ PubMed ] [ Google Scholar ]
  • Krawczyk D.C. The cognition and neuroscience of relational reasoning. Brain Res. 2012; 1428 :13–23. [ PubMed ] [ Google Scholar ]
  • Lee K.H., Choi Y.Y., Gray J.R., Cho S.H., Chae J.-H., Lee S., Kim K. Neural correlates of superior intelligence: stronger recruitment of posterior parietal cortex. Neuroimage. 2006; 29 :578–586. [ PubMed ] [ Google Scholar ]
  • Lorsbach T.C., Reimer J.F. Feature binding in children and young adults. J. Genet. Psychol. 2005; 166 :313–327. [ PubMed ] [ Google Scholar ]
  • Lu L.H., Dapretto M., O’Hare E.D., Kan E., McCourt S.T., Thompson P.M., Toga A.W., Bookheimer S.Y., Sowell E.R. Relationships between brain activation and brain structure in normally developing children. Cereb. Cortex. 2009; 19 :2595–2604. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Luna B., Padmanabhan A., O’Hearn K. What has fMRI told us about the development of cognitive control through adolescence? Brain Cognit. 2010; 72 :101–113. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Mackey A.P., Hill S.S., Stone S.I., Bunge S.A. Differential effects of reasoning and speed training in children. Dev. Sci. 2011; 14 :582–590. [ PubMed ] [ Google Scholar ]
  • Mackey A.P., Miller Singley A.T., Bunge S.A. Intensive reasoning training alters patterns of brain connectivity at rest. J. Neurosci. 2013; 33 :4796–4803. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Mackey A.P., Whitaker K.J., Bunge S.A. Experience-dependent plasticity in white matter microstructure: reasoning training alters structural connectivity. Front. Neuroanat. 2012; 6 :32. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Mackinlay R.J., Kliegel M., Mäntylä T. Predictors of time-based prospective memory in children. J. Exp. Child Psychol. 2009; 102 :251–264. [ PubMed ] [ Google Scholar ]
  • Mäntylä T., Carelli M.G., Forman H. Time monitoring and executive functioning in children and adults. J. Exp. Child Psychol. 2007; 96 :1–19. [ PubMed ] [ Google Scholar ]
  • Marini Z., Case R. The development of abstract reasoning about the physical and social world. Child Dev. 1994; 65 :147–159. [ Google Scholar ]
  • Mattli F., Zöllig J., West R. Age-related differences in the temporal dynamics of prospective memory retrieval: a lifespan approach. Neuropsychologia. 2011; 49 :3494–3504. [ PubMed ] [ Google Scholar ]
  • Maylor E.A., Logie R.H. A large-scale comparison of prospective and retrospective memory development from childhood to middle age. Q. J. Exp. Psychol. 2010; 63 :442–451. [ PubMed ] [ Google Scholar ]
  • McDaniel M.A., Einstein G.O. SAGE Publications; Los Angeles: 2007. Prospective Memory: An Overview and Synthesis of an Emerging Field. [ Google Scholar ]
  • McKeough A., Malcolm J. Stories of family, stories of self: developmental pathways to interpretive thought during adolescence. New Dir. Child Adolesc. Dev. 2010; 131 :59–71. [ PubMed ] [ Google Scholar ]
  • Meyer M.L., Spunt R.P., Berkman E.T., Taylor S.E., Lieberman M.D. Evidence for social working memory from a parametric functional MRI study. Proc. Natl. Acad. Sci. USA. 2012; 109 :1883–1888. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Miller E.K., Cohen J.D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 2001; 24 :167–202. [ PubMed ] [ Google Scholar ]
  • Moran M.A., Mufson E.J., Mesulam M.M. Neural inputs into the temporopolar cortex of the rhesus monkey. J. Comp. Neurol. 1987; 256 :88–103. [ PubMed ] [ Google Scholar ]
  • Morecraft R.J., Van Hoesen G.W. Frontal granular cortex input to the cingulate (M3), supplementary (M2) and primary (M1) motor cortices in the rhesus monkey. J. Comp. Neurol. 1993; 337 :669–689. [ PubMed ] [ Google Scholar ]
  • Morrison R.G., Doumas L.A.A., Richland L.E. A computational account of children's analogical reasoning: balancing inhibitory control in working memory and relational representation. Dev. Sci. 2011; 14 :516–529. [ PubMed ] [ Google Scholar ]
  • Narr K.L., Woods R.P., Thompson P.M., Szeszko P., Robinson D., Dimtcheva T., Gurbani M., Toga A.W., Bilder R.M. Relationships between IQ and regional cortical gray matter thickness in healthy adults. Cereb. Cortex. 2007; 17 :2163–2171. [ PubMed ] [ Google Scholar ]
  • Nee D.E., Jahn A., Brown J.W. Prefrontal cortex organization: dissociating effects of temporal abstraction, relational abstraction, and integration with fMRI. Cereb. Cortex. 2014; 24 :2377–2387. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ofen N., Kao Y.-C., Sokol-Hessner P., Kim H., Whitfield-Gabrieli S., Gabrieli J.D.E. Development of the declarative memory system in the human brain. Nat. Neurosci. 2007; 10 :1198–1205. [ PubMed ] [ Google Scholar ]
  • Olesen P.J., Macoveanu J., Tegnér J., Klingberg T. Brain activity related to working memory and distraction in children and adults. Cereb. Cortex. 2007; 17 :1047–1054. [ PubMed ] [ Google Scholar ]
  • Olesen P.J., Nagy Z., Westerberg H., Klingberg T. Combined analysis of DTI and fMRI data reveals a joint maturation of white and grey matter in a fronto-parietal network. Cognit. Brain Res. 2003; 18 :48–57. [ PubMed ] [ Google Scholar ]
  • Öngür D., Ferry A.T., Price J.L. Architectonic subdivision of the human orbital and medial prefrontal cortex. J. Comp. Neurol. 2003; 460 :425–449. [ PubMed ] [ Google Scholar ]
  • Passingham R.E. The frontal cortex: does size matter? Nat. Neurosci. 2002; 5 :190–192. [ PubMed ] [ Google Scholar ]
  • Paz-Alonso P.M., Ghetti S., Donohue S.E., Goodman G.S., Bunge S.A. Neurodevelopmental correlates of true and false recognition. Cereb. Cortex. 2008; 18 :2208–2216. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Perfetti B., Saggino A., Ferretti A., Caulo M., Romani G.L., Onofrj M. Differential patterns of cortical activation as a function of fluid reasoning complexity. Hum. Brain Mapp. 2009; 30 :497–510. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Petrides M. Lateral prefrontal cortex: architectonic and functional organization. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2005; 360 :781–795. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Petrides M., Pandya D.N. Dorsolateral prefrontal cortex: comparative cytoarchitectonic analysis in the human and the macaque brain and corticocortical connection patterns. Eur. J. Neurosci. 1999; 11 :1011–1036. [ PubMed ] [ Google Scholar ]
  • Picard L., Cousin S., Guillery-Girard B., Eustache F., Piolino P. How do the different components of episodic memory develop? Role of executive functions and short-term feature-binding abilities. Child Dev. 2012; 83 :1037–1050. [ PubMed ] [ Google Scholar ]
  • Primi R., Ferrão M.E., Almeida L.S. Fluid intelligence as a predictor of learning: a longitudinal multilevel approach applied to math. Learn. Individ. Differ. 2010; 20 :446–451. [ Google Scholar ]
  • Raj V., Bell M.A. Cognitive processes supporting episodic memory formation in childhood: the role of source memory, binding, and executive functioning. Dev. Rev. 2010; 30 :384–402. [ Google Scholar ]
  • Rajan V., Cuevas K., Bell M.A. The contribution of executive function to source memory development in early childhood. J. Cognit. Dev. 2014; 15 :304–324. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ramnani N., Owen A.M. Anterior prefrontal cortex: insights into function from anatomy and neuroimaging. Nat. Rev. Neurosci. 2004; 5 :184–194. [ PubMed ] [ Google Scholar ]
  • Raposo A., Vicens L., Clithero J.A., Dobbins I.G., Huettel S.A. Contributions of frontopolar cortex to judgments about self, others and relations. Soc. Cognit. Affect. Neurosci. 2011; 6 :260–269. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rattermann M.J., Gentner D. More evidence for a relational shift in the development of analogy: children's performance on a causal-mapping task. Cognit. Dev. 1998; 13 :453–478. [ Google Scholar ]
  • Raven J.C. Oxford Psychologists Press; Oxford: 1998. Manual for Raven's Progressive Matrices. [ Google Scholar ]
  • Raznahan A., Lee Y., Stidd R., Long R., Greenstein D., Clasen L., Addington A. Longitudinally mapping the influence of sex and androgen signaling on the dynamics of human cortical maturation in adolescence. Proc. Natl. Acad. Sci. USA. 2010; 107 :16988–16993. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Raznahan A., Lerch J.P., Lee N., Greenstein D., Wallace G.L., Stockman M., Clasen L., Shaw P.W., Giedd J.N. Patterns of coordinated anatomical change in human cortical development: a longitudinal neuroimaging study of maturational coupling. Neuron. 2011; 72 :873–884. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rhodes S.M., Murphy D., Hancock P.J.B. Developmental changes in the engagement of episodic retrieval processes and their relationship with working memory during the period of middle childhood. Br. J. Dev. Psychol. 2011; 29 :865–882. [ PubMed ] [ Google Scholar ]
  • Richland L.E., Morrison R.G., Holyoak K.J. Children's development of analogical reasoning: insights from scene analogy problems. J. Exp. Child Psychol. 2006; 94 :249–273. [ PubMed ] [ Google Scholar ]
  • Riggins T. Longitudinal investigation of source memory reveals different developmental trajectories for item memory and binding. Dev. Psychol. 2014; 50 :449–459. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Roca M., Parr A., Thompson R., Woolgar A., Torralva T., Antoun N., Manes F., Duncan J. Executive function and fluid intelligence after frontal lobe lesions. Brain. 2010; 133 :234–247. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rosso I.M., Young A.D., Femia L.A., Yurgelun-Todd D.A. Cognitive and emotional components of frontal lobe functioning in childhood and adolescence. Ann. N. Y. Acad. Sci. 2004; 1021 :355–362. [ PubMed ] [ Google Scholar ]
  • Ruffman T., Rustin C., Garnham W., Parkin A.J. Source monitoring and false memories in children: relation to certainty and executive functioning. J. Exp. Child Psychol. 2001; 80 :95–111. [ PubMed ] [ Google Scholar ]
  • Sakai K., Passingham R.E. Prefrontal interactions reflect future task operations. Nat. Neurosci. 2003; 6 :75–81. [ PubMed ] [ Google Scholar ]
  • Sakai K., Passingham R.E. Prefrontal set activity predicts rule-specific neural processing during subsequent cognitive performance. J. Neurosci. 2006; 26 :1211–1218. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sander M.C., Werkle-Bergner M., Gerjets P., Shing Y.L., Lindenberger U. The two-component model of memory development, and its potential implications for educational settings. Dev. Cognit. Neurosci. 2012; 2 :S67–S77. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sawyer S.M., Afifi R.A., Bearinger L.H., Blakemore S.-J., Dick B., Ezeh A.C., Patton G.C. Adolescence: a foundation for future health. Lancet. 2012; 379 :1630–1640. [ PubMed ] [ Google Scholar ]
  • Schneider W. The development of metacognitive knowledge in children and adolescents: major trends and implications for education. Mind Brain Educ. 2008; 2 :114–121. [ Google Scholar ]
  • Schooler J.W., Smallwood J., Christoff K., Handy T.C., Reichle E.D., Sayette M.A. Meta-awareness, perceptual decoupling and the wandering mind. Trends Cognit. Sci. 2011; 15 :319–326. [ PubMed ] [ Google Scholar ]
  • Semendeferi K., Armstrong E., Schleicher A., Zilles K., Van Hoesen G.W. Prefrontal cortex in humans and apes: a comparative study of area 10. Am. J. Phys. Anthropol. 2001; 114 :224–241. [ PubMed ] [ Google Scholar ]
  • Semendeferi K., Teffer K., Buxhoeveden D.P., Park M.S., Bludau S., Amunts K., Travis K., Buckwalter J. Spatial organization of neurons in the frontal pole sets humans apart from great apes. Cereb. Cortex. 2011; 21 :1485–1497. [ PubMed ] [ Google Scholar ]
  • Shallice T., Burgess P.W. Deficits in strategy application following frontal lobe damage in man. Brain. 1991; 114 (Pt 2):727–741. [ PubMed ] [ Google Scholar ]
  • Shaw P., Greenstein D., Lerch J., Clasen L., Lenroot R., Gogtay N., Evans A., Rapoport J., Giedd J. Intellectual ability and cortical development in children and adolescents. Nature. 2006; 440 :676–679. [ PubMed ] [ Google Scholar ]
  • Shaw P., Kabani N.J., Lerch J.P., Eckstrand K., Lenroot R., Gogtay N., Greenstein D., Clasen L., Evans A., Rapoport J.L., Giedd J.N., Wise S.P. Neurodevelopmental trajectories of the human cerebral cortex. J. Neurosci. 2008; 28 :3586–3594. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Simons J.S., Henson R.N.A., Gilbert S.J., Fletcher P.C. Separable forms of reality monitoring supported by anterior prefrontal cortex. J. Cognit. Neurosci. 2008; 20 :447–457. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Simons J.S., Scholvinck M.L., Gilbert S.J., Frith C.D., Burgess P.W. Differential components of prospective memory? Evidence from fMRI. Neuropsychologia. 2006; 44 :1388–1397. [ PubMed ] [ Google Scholar ]
  • Simons J.S., Spiers H.J. Prefrontal and medial temporal lobe interactions in long-term memory. Nat. Rev. Neurosci. 2003; 4 :637–648. [ PubMed ] [ Google Scholar ]
  • Smith R., Keramatian K., Christoff K. Localizing the rostrolateral prefrontal cortex at the individual level. Neuroimage. 2007; 36 :1387–1396. [ PubMed ] [ Google Scholar ]
  • Sowell E.R., Thompson P.M., Holmes C.J., Jernigan T.L., Toga A.W. In vivo evidence for post-adolescent brain maturation in frontal and striatal regions. Nat. Neurosci. 1999; 2 :859–861. [ PubMed ] [ Google Scholar ]
  • Sowell E.R., Thompson P.M., Leonard C.M., Welcome S.E., Kan E., Toga A.W. Longitudinal mapping of cortical thickness and brain growth in normal children. J. Neurosci. 2004; 24 :8223–8231. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Spaniol J., Davidson P.S.R., Kim A.S.N., Han H., Moscovitch M., Grady C.L. Event-related fMRI studies of episodic encoding and retrieval: meta-analyses using activation likelihood estimation. Neuropsychologia. 2009; 47 :1765–1779. [ PubMed ] [ Google Scholar ]
  • Spear L.P. The adolescent brain and age-related behavioral manifestations. Neurosci. Biobehav. Rev. 2000; 24 :417–463. [ PubMed ] [ Google Scholar ]
  • Steinberg L. Cognitive and affective development in adolescence. Trends Cognit. Sci. 2005; 9 :69–74. [ PubMed ] [ Google Scholar ]
  • Sternberg R.J., Rifkin B. The development of analogical reasoning processes. J. Exp. Child Psychol. 1979; 27 :195–232. [ PubMed ] [ Google Scholar ]
  • Tamm L., Menon V., Reiss A.L. Maturation of brain function associated with response inhibition. J. Am. Acad. Child Adolesc. Psychiatry. 2002; 41 :1231–1238. [ PubMed ] [ Google Scholar ]
  • Tamnes C.K., Ostby Y., Fjell A.M., Westlye L.T., Due-Tønnessen P., Walhovd K.B. Brain maturation in adolescence and young adulthood: regional age-related changes in cortical thickness and white matter volume and microstructure. Cereb. Cortex. 2010; 20 :534–548. [ PubMed ] [ Google Scholar ]
  • Tulving E. Clarendon Press; Oxford: 1983. Elements of Episodic Memory. [ Google Scholar ]
  • Turner M.S., Simons J.S., Gilbert S.J., Frith C.D., Burgess P.W. Distinct roles for lateral and medial rostral prefrontal cortex in source monitoring of perceived and imagined events. Neuropsychologia. 2008; 46 :1442–1453. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Van den Bos W., Crone E.A., Güroğlu B. Brain function during probabilistic learning in relation to IQ and level of education. Dev. Cognit. Neurosci. 2012; 2 :S78–S89. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Van Overwalle F. Social cognition and the brain: a meta-analysis. Hum. Brain Mapp. 2009; 30 :829–858. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Voigt B., Aberle I., Schonfeld J., Kliegel M. Time-based prospective memory in schoolchildren. Zeitschrift für Psychol./J. Psychol. 2011; 219 :92–99. [ Google Scholar ]
  • Volle E., Gilbert S.J., Benoit R.G., Burgess P.W. Specialization of the rostral prefrontal cortex for distinct analogy processes. Cereb. Cortex. 2010; 20 :2647–2659. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Volle E., Gonen-Yaacovi G., de Lacy Costello A., Gilbert S.J., Burgess P.W. The role of rostral prefrontal cortex in prospective memory: a voxel-based lesion study. Neuropsychologia. 2011; 49 :2185–2198. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wallace G.L., Silvers J.A., Martin A., Kenworthy L.E. Brief report: further evidence for inner speech deficits in autism spectrum disorders. J. Autism Dev. Disord. 2009; 39 :1735–1739. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang L., Altgassen M., Liu W., Xiong W., Akgün C., Kliegel M. Prospective memory across adolescence: the effects of age and cue focality. Dev. Psychol. 2011; 47 :226–232. [ PubMed ] [ Google Scholar ]
  • Wang L., Kliegel M., Yang Z., Liu W. Prospective memory performance across adolescence. J. Genet. Psychol. 2006; 167 :179–188. [ PubMed ] [ Google Scholar ]
  • Ward H., Shum D., McKinlay L., Baker-Tweney S., Wallace G. Development of prospective memory: tasks based on the prefrontal-lobe model. Child Neuropsychol. 2005; 11 :527–549. [ PubMed ] [ Google Scholar ]
  • Wechsler D. Psychol. Corp.; San Antonio: 1997. Wechsler Adult Intelligence Scale-III (WAIS-III) [ Google Scholar ]
  • Wendelken C., Chung D., Bunge S.A. Rostrolateral prefrontal cortex: domain-general or domain-sensitive? Hum. Brain Mapp. 2012; 33 :1952–1963. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wendelken C., Nakhabenko D., Donohue S.E., Carter C.S., Bunge S.A. Brain is to thought as stomach is to??: investigating the role of rostrolateral prefrontal cortex in relational reasoning. J. Cognit. Neurosci. 2008; 20 :682–693. [ PubMed ] [ Google Scholar ]
  • Wendelken C., O’Hare E.D., Whitaker K.J., Ferrer E., Bunge S.A. Increased functional selectivity over development in rostrolateral prefrontal cortex. J. Neurosci. 2011; 31 :17260–17268. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Westlye L.T., Walhovd K.B., Dale A.M., Bjørnerud A., Due-Tønnessen P., Engvig A., Grydeland H., Tamnes C.K., Ostby Y., Fjell A.M. Life-span changes of the human brain white matter: diffusion tensor imaging (DTI) and volumetry. Cereb. Cortex. 2010; 20 :2055–2068. [ PubMed ] [ Google Scholar ]
  • Whitehouse A.J.O., Maybery M.T., Durkin K. Inner speech impairments in autism. J. Child Psychol. Psychiatry. 2006; 47 :857–865. [ PubMed ] [ Google Scholar ]
  • Wolfensteller U., von Cramon D.Y. Strategy-effects in prefrontal cortex during learning of higher-order S-R rules. Neuroimage. 2011; 57 :598–607. [ PubMed ] [ Google Scholar ]
  • Wright S.B., Matlen B.J., Baym C.L., Ferrer E., Bunge S.A. Neural correlates of fluid reasoning in children and adults. Front. Hum. Neurosci. 2008; 1 :8. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yang T., Chan R.C.K., Shum D. The development of prospective memory in typically developing children. Neuropsychology. 2011; 25 :342–352. [ PubMed ] [ Google Scholar ]
  • Zimmermann T.D., Meier B. The rise and decline of prospective memory performance across the lifespan. Q. J. Exp. Psychol. 2006; 59 :2040–2046. [ PubMed ] [ Google Scholar ]
  • Zöllig J., West R., Martin M., Altgassen M., Lemke U., Kliegel M. Neural correlates of prospective memory across the lifespan. Neuropsychologia. 2007; 45 :3299–3314. [ PubMed ] [ Google Scholar ]

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • Review Article
  • Published: 02 August 2022

The science of effective learning with spacing and retrieval practice

  • Shana K. Carpenter   ORCID: orcid.org/0000-0003-0784-9026 1 ,
  • Steven C. Pan   ORCID: orcid.org/0000-0001-9080-5651 2 &
  • Andrew C. Butler   ORCID: orcid.org/0000-0002-6367-0795 3 , 4  

Nature Reviews Psychology volume 1, pages 496–511 (2022)


Research on the psychology of learning has highlighted straightforward ways of enhancing learning. However, effective learning strategies are underused by learners. In this Review, we discuss key research findings on two specific learning strategies: spacing and retrieval practice. We focus on how these strategies enhance learning in various domains across the lifespan, with an emphasis on research in applied educational settings. We also discuss key findings from research on metacognition — learners’ awareness and regulation of their own learning. The underuse of effective learning strategies by learners could stem from false beliefs about learning, lack of awareness of effective learning strategies or the counter-intuitive nature of these strategies. Findings in learner metacognition highlight the need to improve learners’ subjective mental models of how to learn effectively. Overall, the research discussed in this Review has important implications for the increasingly common situations in which learners must effectively monitor and regulate their own learning.
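The two strategies discussed in this Review are often operationalised in flashcard software as a Leitner-style scheduler: each successful retrieval promotes an item to a box with a longer spacing interval, and each failure demotes it back to frequent review. The sketch below is a minimal illustration of that idea only; the `Card` class, the box intervals (1, 3 and 7 days) and the promotion rule are hypothetical assumptions, not parameters taken from the Review.

```python
# Minimal Leitner-box sketch combining spacing and retrieval practice.
# Box intervals and the promotion/demotion rule are illustrative assumptions.

REVIEW_INTERVALS = {1: 1, 2: 3, 3: 7}  # box number -> days until next review

class Card:
    def __init__(self, prompt, answer):
        self.prompt = prompt
        self.answer = answer
        self.box = 1       # new cards start in box 1 (reviewed most often)
        self.due_day = 0   # day the card next comes up for retrieval

    def review(self, recalled, today):
        # Retrieval practice: successful recall promotes the card to a box
        # with a longer spacing interval; failure demotes it back to box 1.
        if recalled:
            self.box = min(self.box + 1, max(REVIEW_INTERVALS))
        else:
            self.box = 1
        self.due_day = today + REVIEW_INTERVALS[self.box]

def due_cards(cards, today):
    """Return the cards whose spacing interval has elapsed."""
    return [c for c in cards if c.due_day <= today]

cards = [
    Card("spacing effect", "distributing study over time aids retention"),
    Card("testing effect", "retrieval practice strengthens memory"),
]

# Day 0: both cards are due; simulate one successful and one failed recall.
for card, recalled in zip(due_cards(cards, today=0), [True, False]):
    card.review(recalled, today=0)

print([c.due_day for c in cards])  # -> [3, 1]
```

The recalled card is next due after three days, while the missed card returns the next day, so review effort concentrates on the weakest items while intervals lengthen for well-learned ones.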


Witherby, A. E. & Tauber, S. K. The current status of students’ note-taking: why and how do students take notes? J. Appl. Res. Mem. Cogn. 8 , 139–153 (2019).


Feitosa de Moura, V., Alexandre de Souza, C. & Noronha Viana, A. B. The use of massive open online courses (MOOCs) in blended learning courses and the functional value perceived by students. Comput. Educ. 161 , 104077 (2021).

Hew, K. F. & Cheung, W. S. Students’ and instructors’ use of massive open online courses (MOOCs): motivations and challenges. Educ. Res. Rev. 12 , 45–58 (2014).

Adesope, O. O., Trevisan, D. A. & Sundararajan, N. Rethinking the use of tests: a meta-analysis of practice testing. Rev. Educ. Res. 87 , 659–701 (2017).

Carpenter, S. K. in Learning and Memory: A Comprehensive Reference 2nd edn (ed. Byrne, J. H.) 465–485 (Academic, 2017).

Carpenter, S. K. Distributed practice or spacing effect. Oxford Research Encyclopedia of Education https://oxfordre.com/education/view/10.1093/acrefore/9780190264093.001.0001/acrefore-9780190264093-e-859 (2020).

Yang, C., Luo, L., Vadillo, M. A., Yu, R. & Shanks, D. R. Testing (quizzing) boosts classroom learning: a systematic and meta-analytic review. Psychol. Bull. 147 , 399–435 (2021).


Agarwal, P. K., Nunes, L. D. & Blunt, J. R. Retrieval practice consistently benefits student learning: a systematic review of applied research in schools and classrooms. Educ. Psychol. Rev. 33 , 1409–1453 (2021).

Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T. & Rohrer, D. Distributed practice in verbal recall tasks: a review and quantitative synthesis. Psychol. Bull. 132 , 354–380 (2006).

Chi, M. T. H. & Ohlsson, S. in The Cambridge Handbook of Thinking and Reasoning 371–399 (Cambridge Univ. Press, 2005).

Bransford, J. D. & Schwartz, D. L. Chapter 3: Rethinking transfer: a simple proposal with multiple implications. Rev. Res. Educ. 24 , 61–100 (1999).


Barnett, S. M. & Ceci, S. J. When and where do we apply what we learn?: a taxonomy for far transfer. Psychol. Bull. 128 , 612–637 (2002).

Ebbinghaus, H. Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie [German] (Duncker & Humblot, 1885).

Vlach, H. A., Sandhofer, C. M. & Kornell, N. The spacing effect in children’s memory and category induction. Cognition 109 , 163–167 (2008).

Jackson, C. E., Maruff, P. T. & Snyder, P. J. Massed versus spaced visuospatial memory in cognitively healthy young and older adults. Alzheimer’s Dement. 9 , S32–S38 (2013).

Emeny, W. G., Hartwig, M. K. & Rohrer, D. Spaced mathematics practice improves test scores and reduces overconfidence. Appl. Cognit. Psychol. 35 , 1082–1089 (2021). This study demonstrates significant benefits of spacing over massed learning on 11–12-year-old students’ mathematics knowledge .

Vlach, H. A. & Sandhofer, C. M. Distributing learning over time: the spacing effect in children’s acquisition and generalization of science concepts: spacing and generalization. Child. Dev. 83 , 1137–1144 (2012).


Foot-Seymour, V., Foot, J. & Wiseheart, M. Judging credibility: can spaced lessons help students think more critically online? Appl. Cognit. Psychol. 33 , 1032–1043 (2019). This study demonstrates significant long-term benefits of spacing on 9–12-year-old children’s ability to evaluate the credibility of information on websites .

Rohrer, D., Dedrick, R. F., Hartwig, M. K. & Cheung, C.-N. A randomized controlled trial of interleaved mathematics practice. J. Educ. Psychol. 112 , 40–52 (2020).

Yazdani, M. A. & Zebrowski, E. Spaced reinforcement: an effective approach to enhance the achievement in plane geometry. J. Math. Sci. 7 , 37–43 (2006).

Samani, J. & Pan, S. C. Interleaved practice enhances memory and problem-solving ability in undergraduate physics. npj Sci. Learn. 6 , 32 (2021). This study demonstrates significant benefits of distributing homework problems on retention and transfer of university students’ physics knowledge over an academic term .

Raman, M. et al. Teaching in small portions dispersed over time enhances long-term knowledge retention. Med. Teach. 32 , 250–255 (2010).

Moulton, C.-A. E. et al. Teaching surgical skills: what kind of practice makes perfect?: a randomized, controlled trial. Ann. Surg. 244 , 400–409 (2006).

Van Dongen, K. W., Mitra, P. J., Schijven, M. P. & Broeders, I. A. M. J. Distributed versus massed training: efficiency of training psychomotor skills. Surg. Tech. Dev. 1 , e17 (2011).

Spruit, E. N., Band, G. P. H. & Hamming, J. F. Increasing efficiency of surgical training: effects of spacing practice on skill acquisition and retention in laparoscopy training. Surg. Endosc. 29 , 2235–2243 (2015).

Lyle, K. B., Bego, C. R., Hopkins, R. F., Hieb, J. L. & Ralston, P. A. S. How the amount and spacing of retrieval practice affect the short- and long-term retention of mathematics knowledge. Educ. Psychol. Rev. 32 , 277–295 (2020).

Kapler, I. V., Weston, T. & Wiseheart, M. Spacing in a simulated undergraduate classroom: long-term benefits for factual and higher-level learning. Learn. Instr. 36 , 38–45 (2015).

Sobel, H. S., Cepeda, N. J. & Kapler, I. V. Spacing effects in real-world classroom vocabulary learning. Appl. Cognit. Psychol. 25 , 763–767 (2011).

Carpenter, S. K., Pashler, H. & Cepeda, N. J. Using tests to enhance 8th grade students’ retention of US history facts. Appl. Cognit. Psychol. 23 , 760–771 (2009). This study finds that spacing and retrieval practice can improve eighth- grade students’ knowledge of history facts across a 9-month period .

Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T. & Pashler, H. Spacing effects in learning: a temporal ridgeline of optimal retention. Psychol. Sci. 19 , 1095–1102 (2008).

Delaney, P. F., Spirgel, A. S. & Toppino, T. C. A deeper analysis of the spacing effect after “deep” encoding. Mem. Cogn. 40 , 1003–1015 (2012).

Hintzman, D. L., Block, R. A. & Summers, J. J. Modality tags and memory for repetitions: locus of the spacing effect. J. Verbal Learn. Verbal Behav. 12 , 229–238 (1973).

Glenberg, A. M. Component-levels theory of the effects of spacing of repetitions on recall and recognition. Mem. Cogn. 7 , 95–112 (1979).

Verkoeijen, P. P. J. L., Rikers, R. M. J. P. & Schmidt, H. G. Detrimental influence of contextual change on spacing effects in free recall. J. Exp. Psychol. Learn. Mem. Cogn. 30 , 796–800 (2004).

Benjamin, A. S. & Tullis, J. What makes distributed practice effective? Cognit. Psychol. 61 , 228–247 (2010).

Thios, S. J. & D’Agostino, P. R. Effects of repetition as a function of study-phase retrieval. J. Verbal Learn. Verbal Behav. 15 , 529–536 (1976).

Smolen, P., Zhang, Y. & Byrne, J. H. The right time to learn: mechanisms and optimization of spaced learning. Nat. Rev. Neurosci. 17 , 77–88 (2016).

Goossens, N. A. M. C., Camp, G., Verkoeijen, P. P. J. L., Tabbers, H. K. & Zwaan, R. A. Spreading the words: a spacing effect in vocabulary learning. J. Cognit. Psychol. 24 , 965–971 (2012).

Zulkiply, N., McLean, J., Burt, J. S. & Bath, D. Spacing and induction: application to exemplars presented as auditory and visual text. Learn. Instr. 22 , 215–221 (2012).

Küpper-Tetzel, C. E. & Erdfelder, E. Encoding, maintenance, and retrieval processes in the lag effect: a multinomial processing tree analysis. Memory 20 , 37–47 (2012).

Verkoeijen, P. P. J. L., Rikers, R. M. J. P. & Schmidt, H. G. Limitations to the spacing effect: demonstration of an inverted U-shaped relationship between interrepetition spacing and free recall. Exp. Psychol. 52 , 257–263 (2005).

Randler, C., Kranich, K. & Eisele, M. Block scheduled versus traditional biology teaching—an educational experiment using the water lily. Instr. Sci. 36 , 17–25 (2008).

Abbott, E. E. On the analysis of the factor of recall in the learning process. Psychol. Rev. Monogr. Suppl. 11 , 159–177 (1909).

Roediger, H. L. & Butler, A. C. The critical role of retrieval practice in long-term retention. Trends Cognit. Sci. 15 , 20–27 (2011).

Rowland, C. A. The effect of testing versus restudy on retention: a meta-analytic review of the testing effect. Psychol. Bull. 140 , 1432–1463 (2014).

Pan, S. C. & Rickard, T. C. Transfer of test-enhanced learning: meta-analytic review and synthesis. Psychol. Bull. 144 , 710–756 (2018).

Sheffield, E. & Hudson, J. You must remember this: effects of video and photograph reminders on 18-month-olds’ event memory. J. Cogn. Dev. 7 , 73–93 (2006).

Fazio, L. K. & Marsh, E. J. Retrieval-based learning in children. Curr. Dir. Psychol. Sci. 28 , 111–116 (2019). This brief review highlights evidence that retrieval practice can benefit learning as early as infancy .

Coane, J. H. Retrieval practice and elaborative encoding benefit memory in younger and older adults. J. Appl. Res. Mem. Cogn. 2 , 95–100 (2013).

Bahrick, H. P., Bahrick, L. E., Bahrick, A. S. & Bahrick, P. E. Maintenance of foreign language vocabulary and the spacing effect. Psychol. Sci. 4 , 316–321 (1993). This classic study demonstrates benefits of spaced retrieval practice (successive relearning) on the learning of foreign language vocabulary in adults over a period of 5 years .

Bahrick, H. P. & Phelps, E. Retention of Spanish vocabulary over 8 years. J. Exp. Psychol. Learn. Mem. Cogn. 13 , 344–349 (1987).

Kulhavy, R. W. & Stock, W. A. Feedback in written instruction: the place of response certitude. Educ. Psychol. Rev. 1 , 279–308 (1989).

Pan, S. C., Hutter, S. A., D’Andrea, D., Unwalla, D. & Rickard, T. C. In search of transfer following cued recall practice: the case of process-based biology concepts. Appl. Cogn. Psychol. 33 , 629–645 (2019).

Pashler, H., Cepeda, N. J., Wixted, J. T. & Rohrer, D. When does feedback facilitate learning of words? J. Exp. Psychol. Learn. Mem. Cogn. 31 , 3–8 (2005).

Kang, S. H. K., McDermott, K. B. & Roediger, H. L. Test format and corrective feedback modify the effect of testing on long-term retention. Eur. J. Cognit. Psychol. 19 , 528–558 (2007).

Jaeger, A., Eisenkraemer, R. E. & Stein, L. M. Test-enhanced learning in third-grade children. Educ. Psychol. 35 , 513–521 (2015).

Pan, S. C., Rickard, T. C. & Bjork, R. A. Does spelling still matter — and if so, how should it be taught? Perspectives from contemporary and historical research. Educ. Psychol. Rev. 33 , 1523–1552 (2021).

Jones, A. C. et al. Beyond the rainbow: retrieval practice leads to better spelling than does rainbow writing. Educ. Psychol. Rev. 28 , 385–400 (2016).

McDermott, K. B., Agarwal, P. K., D’Antonio, L., Roediger, H. L. & McDaniel, M. A. Both multiple-choice and short-answer quizzes enhance later exam performance in middle and high school classes. J. Exp. Psychol. Appl. 20 , 3–21 (2014).

Roediger, H., Agarwal, P., McDaniel, M. & McDermott, K. Test-enhanced learning in the classroom: long-term improvements from quizzing. J. Exp. Psychol. Appl. 17 , 382–395 (2011).

Bobby, Z. & Meiyappan, K. “Test-enhanced” focused self-directed learning after the teaching modules in biochemistry. Biochem. Mol. Biol. Educ. 46 , 472–477 (2018).

Pan, S. C. et al. Online and clicker quizzing on jargon terms enhances definition-focused but not conceptually focused biology exam performance. CBE Life Sci. Educ. 18 , ar54 (2019).

Thomas, A. K., Smith, A. M., Kamal, K. & Gordon, L. T. Should you use frequent quizzing in your college course? Giving up 20 minutes of lecture time may pay off. J. Appl. Res. Mem. Cogn. 9 , 83–95 (2020).

Lyle, K. B. & Crawford, N. A. Retrieving essential material at the end of lectures improves performance on statistics exams. Teach. Psychol. 38 , 94–97 (2011).

Larsen, D. P., Butler, A. C. & Roediger, H. L. III Comparative effects of test-enhanced learning and self-explanation on long-term retention. Med. Educ. 47 , 674–682 (2013).

Eglington, L. G. & Kang, S. H. K. Retrieval practice benefits deductive inference. Educ. Psychol. Rev. 30 , 215–228 (2018).

Butler, A. C. Repeated testing produces superior transfer of learning relative to repeated studying. J. Exp. Psychol. Learn. Mem. Cogn. 36 , 1118–1133 (2010). This study demonstrates that retrieval practice can promote the ability to answer inferential questions involving a new knowledge domain (far transfer) .

Brabec, J. A., Pan, S. C., Bjork, E. L. & Bjork, R. A. True–false testing on trial: guilty as charged or falsely accused? Educ. Psychol. Rev. 33 , 667–692 (2021).

McDaniel, M. A., Wildman, K. M. & Anderson, J. L. Using quizzes to enhance summative-assessment performance in a web-based class: an experimental study. J. Appl. Res. Mem. Cogn. 1 , 18–26 (2012).

Rawson, K. A., Dunlosky, J. & Sciartelli, S. M. The power of successive relearning: improving performance on course exams and long-term retention. Educ. Psychol. Rev. 25 , 523–548 (2013).

Morris, P. E. & Fritz, C. O. The name game: using retrieval practice to improve the learning of names. J. Exp. Psychol. Appl. 6 , 124–129 (2000).

Smith, M. A., Roediger, H. L. & Karpicke, J. D. Covert retrieval practice benefits retention as much as overt retrieval practice. J. Exp. Psychol. Learn. Mem. Cogn. 39 , 1712–1725 (2013).

Rummer, R., Schweppe, J., Gerst, K. & Wagner, S. Is testing a more effective learning strategy than note-taking? J. Exp. Psychol. Appl. 23 , 293–300 (2017).

Karpicke, J. D. & Blunt, J. R. Retrieval practice produces more learning than elaborative studying with concept mapping. Science 331 , 772–775 (2011).

Ebersbach, M., Feierabend, M. & Nazari, K. B. B. Comparing the effects of generating questions, testing, and restudying on students’ long-term recall in university learning. Appl. Cognit. Psychol. 34 , 724–736 (2020).

Roelle, J. & Nückles, M. Generative learning versus retrieval practice in learning from text: the cohesion and elaboration of the text matters. J. Educ. Psychol. 111 , 1341–1361 (2019).

Endres, T., Carpenter, S., Martin, A. & Renkl, A. Enhancing learning by retrieval: enriching free recall with elaborative prompting. Learn. Instr. 49 , 13–20 (2017).

Glover, J. A. The ‘testing’ phenomenon: not gone but nearly forgotten. J. Educ. Psychol. 81 , 392–399 (1989).

Karpicke, J. D., Lehman, M. & Aue, W. R. in Psychology of Learning and Motivation Vol. 61 Ch. 7 (ed. Ross, B. H.) 237–284 (Academic, 2014).

Carpenter, S. K. Cue strength as a moderator of the testing effect: the benefits of elaborative retrieval. J. Exp. Psychol. Learn. Mem. Cogn. 35 , 1563–1569 (2009).

Carpenter, S. K. Semantic information activated during retrieval contributes to later retention: support for the mediator effectiveness hypothesis of the testing effect. J. Exp. Psychol. Learn. Mem. Cogn. 37 , 1547–1552 (2011).

Rickard, T. C. & Pan, S. C. A dual memory theory of the testing effect. Psychon. Bull. Rev. 25 , 847–869 (2018).

Bjork, R. A. Retrieval as a Memory Modifier: An Interpretation of Negative Recency and Related Phenomena (CiteSeer X , 1975).

Arnold, K. M. & McDermott, K. B. Test-potentiated learning: distinguishing between direct and indirect effects of tests. J. Exp. Psychol. Learn. Mem. Cogn. 39 , 940–945 (2013).

Roediger, H. L. & Karpicke, J. D. The power of testing memory: basic research and implications for educational practice. Perspect. Psychol. Sci. 1 , 181–210 (2006). This review details the history of psychology research on the retrieval practice effect and is contributing heavily to the resurgence of researcher interest in the topic .

Carpenter, S. K. Testing enhances the transfer of learning. Curr. Dir. Psychol. Sci. 21 , 279–283 (2012).

Pan, S. C. & Agarwal, P. K. Retrieval Practice and Transfer of Learning: Fostering Students’ Application of Knowledge (Univ. of California, 2018).

Tran, R., Rohrer, D. & Pashler, H. Retrieval practice: the lack of transfer to deductive inferences. Psychon. Bull. Rev. 22 , 135–140 (2015).

Wissman, K. T., Zamary, A. & Rawson, K. A. When does practice testing promote transfer on deductive reasoning tasks? J. Appl. Res. Mem. Cogn. 7 , 398–411 (2018).

van Gog, T. & Sweller, J. Not new, but nearly forgotten: the testing effect decreases or even disappears as the complexity of learning materials increases. Educ. Psychol. Rev. 27 , 247–264 (2015).

Carpenter, S. K., Endres, T. & Hui, L. Students’ use of retrieval in self-regulated learning: implications for monitoring and regulating effortful learning experiences. Educ. Psychol. Rev. 32 , 1029–1054 (2020).

Yeo, D. J. & Fazio, L. K. The optimal learning strategy depends on learning goals and processes: retrieval practice versus worked examples. J. Educ. Psychol. 111 , 73–90 (2019).

Peterson, D. J. & Wissman, K. T. The testing effect and analogical problem-solving. Memory 26 , 1460–1466 (2018).

Hostetter, A. B., Penix, E. A., Norman, M. Z., Batsell, W. R. & Carr, T. H. The role of retrieval practice in memory and analogical problem-solving. Q. J. Exp. Psychol. 72 , 858–871 (2019).

Karpicke, J. D., Blunt, J. R., Smith, M. A. & Karpicke, S. S. Retrieval-based learning: the need for guided retrieval in elementary school children. J. Appl. Res. Mem. Cogn. 3 , 198–206 (2014).

Smith, M. A. & Karpicke, J. D. Retrieval practice with short-answer, multiple-choice, and hybrid tests. Memory 22 , 784–802 (2014).

Latimier, A., Peyre, H. & Ramus, F. A meta-analytic review of the benefit of spacing out retrieval practice episodes on retention. Educ. Psychol. Rev. 33 , 959–987 (2021).

Higham, P. A., Zengel, B., Bartlett, L. K. & Hadwin, J. A. The benefits of successive relearning on multiple learning outcomes. J. Educ. Psychol. https://doi.org/10.1037/edu0000693 (2021).

Hopkins, R. F., Lyle, K. B., Hieb, J. L. & Ralston, P. A. S. Spaced retrieval practice increases college students’ short- and long-term retention of mathematics knowledge. Educ. Psychol. Rev. 28 , 853–873 (2016).

Bahrick, H. P. Maintenance of knowledge: questions about memory we forgot to ask. J. Exp. Psychol. Gen. 108 , 296–308 (1979).

Rawson, K. A. & Dunlosky, J. Successive relearning: an underexplored but potent technique for obtaining and maintaining knowledge. Curr. Dir. Psychol. Sci. https://doi.org/10.1177/09637214221100484 (2022). This brief review discusses the method of successive relearning — an effective learning technique that combines spacing and retrieval — and its benefits .

Rawson, K. A. & Dunlosky, J. When is practice testing most effective for improving the durability and efficiency of student learning? Educ. Psychol. Rev. 24 , 419–435 (2012).

Janes, J. L., Dunlosky, J., Rawson, K. A. & Jasnow, A. Successive relearning improves performance on a high-stakes exam in a difficult biopsychology course. Appl. Cognit. Psychol. 34 , 1118–1132 (2020).

Rawson, K. A., Dunlosky, J. & Janes, J. L. All good things must come to an end: a potential boundary condition on the potency of successive relearning. Educ. Psychol. Rev. 32 , 851–871 (2020).

Rawson, K. A. & Dunlosky, J. Optimizing schedules of retrieval practice for durable and efficient learning: how much is enough? J. Exp. Psychol. Gen. 140 , 283–302 (2011).

Flavell, J. H. Metacognition and cognitive monitoring: a new area of cognitive–developmental inquiry. Am. Psychol. 34 , 906–911 (1979). This classic paper introduces ideas that are now foundational to research on metacognition .

Kuhn, D. Metacognition matters in many ways. Educ. Psychol. 57 , 73–86 (2021).

Norman, E. et al. Metacognition in psychology. Rev. Gen. Psychol. 23 , 403–424 (2019).

Was, C. A. & Al-Harthy, I. S. Persistence of overconfidence in young children: factors that lead to more accurate predictions of memory performance. Eur. J. Dev. Psychol. 15 , 156–171 (2018).

Forsberg, A., Blume, C. L. & Cowan, N. The development of metacognitive accuracy in working memory across childhood. Dev. Psychol. 57 , 1297–1317 (2021).

Kuhn, D. Metacognitive development. Curr. Dir. Psychol. Sci. 9 , 178–181 (2000).

Bell, P. & Volckmann, D. Knowledge surveys in general chemistry: confidence, overconfidence, and performance. J. Chem. Educ. 88 , 1469–1476 (2011).

Saenz, G. D., Geraci, L. & Tirso, R. Improving metacognition: a comparison of interventions. Appl. Cognit. Psychol. 33 , 918–929 (2019).

Morphew, J. W. Changes in metacognitive monitoring accuracy in an introductory physics course. Metacogn. Learn. 16 , 89–111 (2021).

Geller, J. et al. Study strategies and beliefs about learning as a function of academic achievement and achievement goals. Memory 26 , 683–690 (2018).

Kornell, N. & Bjork, R. A. The promise and perils of self-regulated study. Psychon. Bull. Rev. 14 , 219–224 (2007).

Yan, V. X., Thai, K.-P. & Bjork, R. A. Habits and beliefs that guide self-regulated learning: do they vary with mindset? J. Appl. Res. Mem. Cogn. 3 , 140–152 (2014).

Rivers, M. L. Metacognition about practice testing: a review of learners’ beliefs, monitoring, and control of test-enhanced learning. Educ. Psychol. Rev. 33 , 823–862 (2021).

Carpenter, S. K. et al. Students’ use of optional online reviews and its relationship to summative assessment outcomes in introductory biology. CBE Life Sci. Educ. 16 , ar23 (2017).

Corral, D., Carpenter, S. K., Perkins, K. & Gentile, D. A. Assessing students’ use of optional online lecture reviews. Appl. Cognit. Psychol. 34 , 318–329 (2020).

Blasiman, R. N., Dunlosky, J. & Rawson, K. A. The what, how much, and when of study strategies: comparing intended versus actual study behaviour. Memory 25 , 784–792 (2017).

Karpicke, J. D., Butler, A. C. & Roediger, H. L. III Metacognitive strategies in student learning: do students practise retrieval when they study on their own? Memory 17 , 471–479 (2009).

Hamman, D., Berthelot, J., Saia, J. & Crowley, E. Teachers’ coaching of learning and its relation to students’ strategic learning. J. Educ. Psychol. 92 , 342–348 (2000).

Kistner, S. et al. Promotion of self-regulated learning in classrooms: investigating frequency, quality, and consequences for student performance. Metacogn. Learn. 5 , 157–171 (2010).

Morehead, K., Rhodes, M. G. & DeLozier, S. Instructor and student knowledge of study strategies. Memory 24 , 257–271 (2016).

Pomerance, L., Greenberg, J. & Walsh, K. Learning about Learning: What Every New Teacher Needs to Know (National Council on Teacher Quality, 2016).

Dinsmore, D. L., Alexander, P. A. & Loughlin, S. M. Focusing the conceptual lens on metacognition, self-regulation, and self-regulated learning. Educ. Psychol. Rev. 20 , 391–409 (2008). This conceptual review paper explores the relationship between metacognition, self-regulation and self-regulated learning .

Winne, P. H. in Handbook of Self-regulation of Learning and Performance 2nd edn 36–48 (Routledge/Taylor & Francis, 2018).

Pintrich, P. R. A conceptual framework for assessing motivation and self-regulated learning in college students. Educ. Psychol. Rev. 16 , 385–407 (2004).

Zimmerman, B. J. Self-efficacy: an essential motive to learn. Contemp. Educ. Psychol. 25 , 82–91 (2000).

McDaniel, M. A. & Butler, A. C. in Successful Remembering and Successful Forgetting: A Festschrift in Honor of Robert A. Bjork 175–198 (Psychology Press, 2011).

Bjork, R. A., Dunlosky, J. & Kornell, N. Self-regulated learning: beliefs, techniques, and illusions. Annu. Rev. Psychol. 64 , 417–444 (2013). This review provides an overview of the cognitive psychology perspective on the metacognition of strategy planning and use .

Nelson, T. O. & Narens, L. in Psychology of Learning and Motivation Vol. 26 (ed. Bower, G. H.) 125–173 (Academic, 1990).

Fiechter, J. L., Benjamin, A. S. & Unsworth, N. in The Oxford Handbook of Metamemory (eds Dunlosky, J. & Tauber, S. K.) 307–324 (Oxford Univ. Press, 2016).

Efklides, A. Interactions of metacognition with motivation and affect in self-regulated learning: the MASRL model. Educ. Psychol. 46 , 6–25 (2011).

Zimmerman, B. J. in Handbook of Self-regulation (eds Boekaerts, M. & Pintrich, P. R.) 13–39 (Academic, 2000). This paper lays out a prominent theory of self-regulated learning and exemplifies the educational psychology perspective on the metacognition of strategy planning and use .

Wolters, C. A. Regulation of motivation: evaluating an underemphasized aspect of self-regulated learning. Educ. Psychol. 38 , 189–205 (2003).

Wolters, C. A. & Benzon, M. Assessing and predicting college students’ use of strategies for the self-regulation of motivation. J. Exp. Educ. 18 , 199–221 (2013).

Abel, M. & Bäuml, K.-H. T. Would you like to learn more? Retrieval practice plus feedback can increase motivation to keep on studying. Cognition 201 , 104316 (2020).

Kang, S. H. K. & Pashler, H. Is the benefit of retrieval practice modulated by motivation? J. Appl. Res. Mem. Cogn. 3 , 183–188 (2014).

Vermunt, J. D. & Verloop, N. Congruence and friction between learning and teaching. Learn. Instr. 9 , 257–280 (1999).

Coertjens, L., Donche, V., De Maeyer, S., Van Daal, T. & Van Petegem, P. The growth trend in learning strategies during the transition from secondary to higher education in Flanders. High. Educ.: Int. J. High. Education Educ. Plan. 3 , 499–518 (2017).

Severiens, S., Ten Dam, G. & Van Hout Wolters, B. Stability of processing and regulation strategies: two longitudinal studies on student learning. High. Educ. 42 , 437–453 (2001).

Watkins, D. & Hattie, J. A longitudinal study of the approaches to learning of Australian tertiary students. Hum. Learn. J. Practical Res. Appl. 4 , 127–141 (1985).

Russell, J. M., Baik, C., Ryan, A. T. & Molloy, E. Fostering self-regulated learning in higher education: making self-regulation visible. Act. Learn. Higher Educ. 23 , 97–113 (2020).

Schraw, G. Promoting general metacognitive awareness. Instr. Sci. 26 , 113–125 (1998).

Lundeberg, M. A. & Fox, P. W. Do laboratory findings on test expectancy generalize to classroom outcomes? Rev. Educ. Res. 61 , 94–106 (1991).

Rivers, M. L. & Dunlosky, J. Are test-expectancy effects better explained by changes in encoding strategies or differential test experience? J. Exp. Psychol. Learn. Mem. Cogn. 47 , 195–207 (2021).

Chi, M. in Handbook of Research on Conceptual Change (ed. Vosniadou, S.) 61–82 (Lawrence Erlbaum, 2009).

Susser, J. A. & McCabe, J. From the lab to the dorm room: metacognitive awareness and use of spaced study. Instr. Sci. 41 , 345–363 (2013).

Yan, V. X., Bjork, E. L. & Bjork, R. A. On the difficulty of mending metacognitive illusions: a priori theories, fluency effects, and misattributions of the interleaving benefit. J. Exp. Psychol. Gen. 145 , 918–933 (2016).

Ariel, R. & Karpicke, J. D. Improving self-regulated learning with a retrieval practice intervention. J. Exp. Psychol. Appl. 24 , 43–56 (2018).

Biwer, F., oude Egbrink, M. G. A., Aalten, P. & de Bruin, A. B. H. Fostering effective learning strategies in higher education — a mixed-methods study. J. Appl. Res. Mem. Cogn. 9 , 186–203 (2020).

McDaniel, M. A. & Einstein, G. O. Training learning strategies to promote self-regulation and transfer: the knowledge, belief, commitment, and planning framework. Perspect. Psychol. Sci. 15 , 1363–1381 (2020). This paper provides a framework for training students on how to use learning strategies .

Cleary, A. M. et al. Wearable technology for automatizing science-based study strategies: reinforcing learning through intermittent smartwatch prompting. J. Appl. Res. Mem. Cogn. 10 , 444–457 (2021).

Fazio, L. K. Repetition increases perceived truth even for known falsehoods. Collabra: Psychology 6 , 38 (2020).


Acknowledgements

This material is based upon work supported by the James S. McDonnell Foundation 21st Century Science Initiative in Understanding Human Cognition, Collaborative Grant 220020483. The authors thank C. Phua for assistance with verifying references.

Author information

Authors and Affiliations

Department of Psychology, Iowa State University, Ames, IA, USA

Shana K. Carpenter

Department of Psychology, National University of Singapore, Singapore City, Singapore

Steven C. Pan

Department of Education, Washington University in St. Louis, St. Louis, MO, USA

Andrew C. Butler

Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA


Contributions

All authors contributed to the design of the article. S.K.C. drafted the sections on measuring learning, spacing, successive relearning and future directions; S.C.P. drafted the section on retrieval practice, developed the figures and drafted the tables; A.C.B. drafted the section on metacognition. All authors edited and approved the final draft of the complete manuscript.

Corresponding author

Correspondence to Shana K. Carpenter.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Psychology thanks Veronica Yan, who co-reviewed with Brendan Schuetze; Mirjam Ebersbach; and Nate Kornell for their contribution to the peer review of this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Carpenter, S.K., Pan, S.C. & Butler, A.C. The science of effective learning with spacing and retrieval practice. Nat. Rev. Psychol. 1, 496–511 (2022). https://doi.org/10.1038/s44159-022-00089-1


Accepted: 23 June 2022

Published: 02 August 2022

Issue Date: September 2022

DOI: https://doi.org/10.1038/s44159-022-00089-1



  • Tutorial Review
  • Open access
  • Published: 24 January 2018

Teaching the science of learning

  • Yana Weinstein (ORCID: orcid.org/0000-0002-5144-968X),
  • Christopher R. Madan &
  • Megan A. Sumeracki

Cognitive Research: Principles and Implications, volume 3, Article number: 2 (2018)


The science of learning has made a considerable contribution to our understanding of effective teaching and learning strategies. However, few instructors outside of the field are privy to this research. In this tutorial review, we focus on six specific cognitive strategies that have received robust support from decades of research: spaced practice, interleaving, retrieval practice, elaboration, concrete examples, and dual coding. We describe the basic research behind each strategy and relevant applied research, present examples of existing and suggested implementation, and make recommendations for further research that would broaden the reach of these strategies.

Significance

Education does not currently adhere to the medical model of evidence-based practice (Roediger, 2013 ). However, over the past few decades, our field has made significant advances in applying cognitive processes to education. From this work, specific recommendations can be made for students to maximize their learning efficiency (Dunlosky, Rawson, Marsh, Nathan, & Willingham, 2013 ; Roediger, Finn, & Weinstein, 2012 ). In particular, a review published 10 years ago identified a limited number of study techniques that have received solid evidence from multiple replications testing their effectiveness in and out of the classroom (Pashler et al., 2007 ). A recent textbook analysis (Pomerance, Greenberg, & Walsh, 2016 ) took the six key learning strategies from this report by Pashler and colleagues, and found that very few teacher-training textbooks cover any of these six principles – and none cover them all, suggesting that these strategies are not systematically making their way into the classroom. This is the case in spite of multiple recent academic (e.g., Dunlosky et al., 2013 ) and general audience (e.g., Dunlosky, 2013 ) publications about these strategies. In this tutorial review, we present the basic science behind each of these six key principles, along with more recent research on their effectiveness in live classrooms, and suggest ideas for pedagogical implementation. The target audience of this review is (a) educators who might be interested in integrating the strategies into their teaching practice, (b) science of learning researchers who are looking for open questions to help determine future research priorities, and (c) researchers in other subfields who are interested in the ways that principles from cognitive psychology have been applied to education.

While the typical teacher may not be exposed to this research during teacher training, a small cohort of teachers intensely interested in cognitive psychology has recently emerged. These teachers are mainly based in the UK, and, anecdotally (e.g., Dennis (2016), personal communication), appear to have taken an interest in the science of learning after reading Make it Stick (Brown, Roediger, & McDaniel, 2014 ; see Clark ( 2016 ) for an enthusiastic review of this book on a teacher’s blog, and “Learning Scientists” ( 2016c ) for a collection). In addition, a grassroots teacher movement has led to the creation of “researchED” – a series of conferences on evidence-based education (researchED, 2013 ). The teachers who form part of this network frequently discuss cognitive psychology techniques and their applications to education on social media (mainly Twitter; e.g., Fordham, 2016 ; Penfound, 2016 ) and on their blogs, such as Evidence Into Practice ( https://evidenceintopractice.wordpress.com/ ), My Learning Journey ( http://reflectionsofmyteaching.blogspot.com/ ), and The Effortful Educator ( https://theeffortfuleducator.com/ ). In general, the teachers who write about these issues pay careful attention to the relevant literature, often citing some of the work described in this review.

These informal writings, while allowing teachers to explore their approach to teaching practice (Luehmann, 2008 ), give us a unique window into the application of the science of learning to the classroom. By examining these blogs, we can not only observe how basic cognitive research is being applied in the classroom by teachers who are reading it, but also how it is being misapplied, and what questions teachers may be posing that have gone unaddressed in the scientific literature. Throughout this review, we illustrate each strategy with examples of how it can be implemented (see Table  1 and Figs.  1 , 2 , 3 , 4 , 5 , 6 and 7 ), as well as with relevant teacher blog posts that reflect on its application, and draw upon this work to pin-point fruitful avenues for further basic and applied research.

Fig. 1 Spaced practice schedule for one week. This schedule is designed to represent a typical timetable of a high-school student. The schedule includes four one-hour study sessions, one longer study session on the weekend, and one rest day. Notice that each subject is studied one day after it is covered in school, to create spacing between classes and study sessions. Copyright note: this image was produced by the authors

Fig. 2 a Blocked practice and interleaved practice with fraction problems. In the blocked version, students answer four multiplication problems consecutively. In the interleaved version, students answer a multiplication problem followed by a division problem and then an addition problem, before returning to multiplication. For an experiment with a similar setup, see Patel et al. (2016). Copyright note: this image was produced by the authors. b Illustration of interleaving and spacing. Each color represents a different homework topic. Interleaving involves alternating between topics, rather than blocking. Spacing involves distributing practice over time, rather than massing. Interleaving inherently involves spacing as other tasks naturally “fill” the spaces between interleaved sessions. Copyright note: this image was produced by the authors, adapted from Rohrer (2012)

Fig. 3 Concept map illustrating the process and resulting benefits of retrieval practice. Retrieval practice involves the process of withdrawing learned information from long-term memory into working memory, which requires effort. This produces direct benefits via the consolidation of learned information, making it easier to remember later and causing improvements in memory, transfer, and inferences. Retrieval practice also produces indirect benefits of feedback to students and teachers, which in turn can lead to more effective study and teaching practices, with a focus on information that was not accurately retrieved. Copyright note: this figure originally appeared in a blog post by the first and third authors ( http://www.learningscientists.org/blog/2016/4/1-1 )

Fig. 4 Illustration of “how” and “why” questions (i.e., elaborative interrogation questions) students might ask while studying the physics of flight. To help figure out how physics explains flight, students might ask themselves the following questions: “How does a plane take off?”; “Why does a plane need an engine?”; “How does the upward force (lift) work?”; “Why do the wings have a curved upper surface and a flat lower surface?”; and “Why is there a downwash behind the wings?”. Copyright note: the image of the plane was downloaded from Pixabay.com and is free to use, modify, and share

Fig. 5 Three examples of physics problems that would be categorized differently by novices and experts. The problems in (a) and (c) look similar on the surface, so novices would group them together into one category. Experts, however, will recognize that the problems in (b) and (c) both relate to the principle of energy conservation, and so will group those two problems into one category instead. Copyright note: the figure was produced by the authors, based on figures in Chi et al. (1981)

Fig. 6 Example of how to enhance learning through use of a visual example. Students might view this visual representation of neural communications with the words provided, or they could draw a similar visual representation themselves. Copyright note: this figure was produced by the authors

Fig. 7 Example of word properties associated with visual, verbal, and motor coding for the word “SPOON”. A word can evoke multiple types of representation (“codes” in dual coding theory). Viewing a word will automatically evoke verbal representations related to its component letters and phonemes. Words representing objects (i.e., concrete nouns) will also evoke visual representations, including information about similar objects, component parts of the object, and information about where the object is typically found. In some cases, additional codes can also be evoked, such as motor-related properties of the represented object, where contextual information related to the object’s functional intention and manipulation action may also be processed automatically when reading the word. Copyright note: this figure was produced by the authors and is based on Aylwin (1990; Fig. 2) and Madan and Singhal (2012a, Fig. 3)

Spaced practice

The benefits of spaced (or distributed) practice to learning are arguably one of the strongest contributions that cognitive psychology has made to education (Kang, 2016 ). The effect is simple: the same amount of repeated studying of the same information spaced out over time will lead to greater retention of that information in the long run, compared with repeated studying of the same information for the same amount of time in one study session. The benefits of distributed practice were first empirically demonstrated in the 19 th century. As part of his extensive investigation into his own memory, Ebbinghaus ( 1885/1913 ) found that when he spaced out repetitions across 3 days, he could almost halve the number of repetitions necessary to relearn a series of 12 syllables in one day (Chapter 8). He thus concluded that “a suitable distribution of [repetitions] over a space of time is decidedly more advantageous than the massing of them at a single time” (Section 34). For those who want to read more about Ebbinghaus’s contribution to memory research, Roediger ( 1985 ) provides an excellent summary.

Since then, hundreds of studies have examined spacing effects both in the laboratory and in the classroom (Kang, 2016 ). Spaced practice appears to be particularly useful at large retention intervals: in the meta-analysis by Cepeda, Pashler, Vul, Wixted, and Rohrer ( 2006 ), all studies with a retention interval longer than a month showed a clear benefit of distributed practice. The “new theory of disuse” (Bjork & Bjork, 1992 ) provides a helpful mechanistic explanation for the benefits of spacing to learning. This theory posits that memories have both retrieval strength and storage strength. Whereas retrieval strength is thought to measure the ease with which a memory can be recalled at a given moment, storage strength (which cannot be measured directly) represents the extent to which a memory is truly embedded in the mind. When studying is taking place, both retrieval strength and storage strength receive a boost. However, the extent to which storage strength is boosted depends upon retrieval strength, and the relationship is negative: the greater the current retrieval strength, the smaller the gains in storage strength. Thus, the information learned through “cramming” will be rapidly forgotten due to high retrieval strength and low storage strength (Bjork & Bjork, 2011 ), whereas spacing out learning increases storage strength by allowing retrieval strength to wane before restudy.
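The negative relationship between current retrieval strength and gains in storage strength can be illustrated with a toy simulation. This is a deliberately simplified sketch: the update rules and constants below are illustrative assumptions invented for this review, not part of Bjork and Bjork's formal theory.

```python
def study(retrieval, storage):
    """One study episode: the storage-strength gain shrinks as
    current retrieval strength grows (the negative relationship)."""
    storage += (1.0 - retrieval) * 0.5  # illustrative gain rule
    retrieval = 1.0                     # studying restores retrieval strength
    return retrieval, storage

def wane(retrieval, storage):
    """Between sessions retrieval strength wanes; decay is slower
    when storage strength is higher."""
    return retrieval * (0.5 + 0.4 * min(storage, 1.0))

# Massed ("cramming"): three study episodes back to back
r, s = 0.0, 0.0
for _ in range(3):
    r, s = study(r, s)
massed_storage = s

# Spaced: the same three episodes, but retrieval strength is
# allowed to wane before each restudy
r, s = 0.0, 0.0
for _ in range(3):
    r, s = study(r, s)
    r = wane(r, s)
spaced_storage = s

print(massed_storage, spaced_storage)  # spacing yields greater storage strength
```

In this toy model, massed study leaves retrieval strength at ceiling, so the second and third episodes add nothing to storage strength, whereas letting retrieval strength wane between sessions makes each restudy count, mirroring the verbal account above.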

Teachers can introduce spacing to their students in two broad ways. One involves creating opportunities to revisit information throughout the semester, or even in future semesters. This does involve some up-front planning, and can be difficult to achieve, given time constraints and the need to cover a set curriculum. However, spacing can be achieved with no great costs if teachers set aside a few minutes per class to review information from previous lessons. The second method involves putting the onus to space on the students themselves. Of course, this would work best with older students – high school and above. Because spacing requires advance planning, it is crucial that the teacher helps students plan their studying. For example, teachers could suggest that students schedule study sessions on days that alternate with the days on which a particular class meets (e.g., schedule review sessions for Tuesday and Thursday when the class meets Monday and Wednesday; see Fig. 1 for a more complete weekly spaced practice schedule). It is important to note that the spacing effect refers to information that is repeated multiple times, rather than the idea of studying different material in one long session versus spaced out in small study sessions over time. However, for teachers and particularly for students planning a study schedule, the subtle difference between the two situations (spacing out restudy opportunities, versus spacing out studying of different information over time) may be lost. Future research should address the effects of spacing out studying of different information over time, whether the same considerations apply in this situation as compared to spacing out restudy opportunities, and how important it is for teachers and students to understand the difference between these two types of spaced practice.
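The alternating-day suggestion can be sketched as a small scheduling helper. This is a hypothetical utility written for illustration, assuming one review session on the day after each class meeting and a designated rest day:

```python
WEEK = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def review_days(class_days, rest_days=("Sun",)):
    """Schedule a review session on the day after each class meeting,
    skipping rest days and other class days."""
    sessions = []
    for day in class_days:
        i = (WEEK.index(day) + 1) % 7
        while WEEK[i] in rest_days or WEEK[i] in class_days:
            i = (i + 1) % 7
        sessions.append(WEEK[i])
    return sessions

print(review_days(["Mon", "Wed"]))  # → ['Tue', 'Thu']
```

A class that meets Monday and Wednesday thus gets review sessions on Tuesday and Thursday, the pattern suggested above.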

It is important to note that students may feel less confident when they space their learning (Bjork, 1999 ) than when they cram. This is because spaced learning is harder – but it is this “desirable difficulty” that helps learning in the long term (Bjork, 1994 ). Students tend to cram for exams rather than space out their learning. One explanation for this is that cramming does “work”, if the goal is only to pass an exam. In order to change students’ minds about how they schedule their studying, it might be important to emphasize the value of retaining information beyond a final exam in one course.

Ideas for how to apply spaced practice in teaching have appeared in numerous teacher blogs (e.g., Fawcett, 2013 ; Kraft, 2015 ; Picciotto, 2009 ). In England in particular, as of 2013, high-school students need to be able to remember content from up to 3 years back on cumulative exams (General Certificate of Secondary Education (GCSE) and A-level exams; see CIFE, 2012 ). A-levels in particular determine what subject students study in university and which programs they are accepted into, and thus shape the path of their academic career. A common approach for dealing with these exams has been to include a “revision” (i.e., studying or cramming) period of a few weeks leading up to the high-stakes cumulative exams. Now, teachers who follow cognitive psychology are advocating a shift of priorities to spacing learning over time across the 3 years, rather than teaching a topic once and then intensely reviewing it weeks before the exam (Cox, 2016a ; Wood, 2017 ). For example, some teachers have suggested using homework assignments as an opportunity for spaced practice by giving students homework on previous topics (Rose, 2014 ). However, questions remain, such as whether spaced practice can ever be effective enough to completely alleviate the need or utility of a cramming period (Cox, 2016b ), and how one can possibly figure out the optimal lag for spacing (Benney, 2016 ; Firth, 2016 ).

There has been considerable research on the question of optimal lag, and much of it is quite complex; in general, retention is best when the two study sessions are neither too close together (i.e., cramming) nor too far apart. In a large-scale study, Cepeda, Vul, Rohrer, Wixted, and Pashler ( 2008 ) examined the effects of the gap between study sessions and the interval between study and test across long periods, and found that the optimal gap between study sessions was contingent on the retention interval. Thus, it is not clear how teachers can apply the complex findings on lag to their own classrooms.

A useful avenue of research would be to simplify the research paradigms that are used to study optimal lag, with the goal of creating a flexible, spaced-practice framework that teachers could apply and tailor to their own teaching needs. For example, an Excel macro spreadsheet was recently produced to help teachers plan for lagged lessons (Weinstein-Jones & Weinstein, 2017 ; see Weinstein & Weinstein-Jones ( 2017 ) for a description of the algorithm used in the spreadsheet), and has been used by teachers to plan their lessons (Penfound, 2017 ). However, one teacher who found this tool helpful also wondered whether the more sophisticated plan was any better than his own method of manually selecting poorly understood material from previous classes for later review (Lovell, 2017 ). This direction is being actively explored within personalized online learning environments (Kornell & Finn, 2016 ; Lindsey, Shroyer, Pashler, & Mozer, 2014 ), but teachers in physical classrooms might need less technologically-driven solutions to teach cohorts of students.
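For concreteness, a lagged-lesson planner can be as simple as an expanding-interval rule. The doubling schedule below is an arbitrary assumption chosen for illustration; the actual algorithm in the Weinstein-Jones and Weinstein spreadsheet may differ.

```python
from datetime import date, timedelta

def lagged_reviews(first_lesson, n_reviews=4, first_gap=2):
    """Plan review dates whose gap from the previous session doubles
    each time (2, 4, 8, ... days between sessions)."""
    reviews, gap, current = [], first_gap, first_lesson
    for _ in range(n_reviews):
        current += timedelta(days=gap)
        reviews.append(current)
        gap *= 2
    return reviews

plan = lagged_reviews(date(2018, 1, 8))
print([d.isoformat() for d in plan])
# → ['2018-01-10', '2018-01-14', '2018-01-22', '2018-02-07']
```

A teacher could tailor `n_reviews` and `first_gap` to the course calendar; evaluating whether any such fixed rule outperforms manually selecting poorly understood material remains an open empirical question.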

It seems teachers would greatly appreciate a set of guidelines for how to implement spacing in the curriculum in the most effective, but also the most efficient manner. While the cognitive field has made great advances in terms of understanding the mechanisms behind spacing, what teachers need more of are concrete evidence-based tools and guidelines for direct implementation in the classroom. These could include more sophisticated and experimentally tested versions of the software described above (Weinstein-Jones & Weinstein, 2017 ), or adaptable templates of spaced curricula. Moreover, researchers need to evaluate the effectiveness of these tools in a real classroom environment, over a semester or academic year, in order to give pedagogically relevant evidence-based recommendations to teachers.

Interleaving

Another scheduling technique that has been shown to increase learning is interleaving. Interleaving occurs when different ideas or problem types are tackled in a sequence, as opposed to the more common method of attempting multiple versions of the same problem in a given study session (known as blocking). Interleaving as a principle can be applied in many different ways. One such way involves interleaving different types of problems during learning, which is particularly applicable to subjects such as math and physics (see Fig. 2a for an example with fractions, based on a study by Patel, Liu, & Koedinger, 2016). For example, in a study with college students, Rohrer and Taylor (2007) found that shuffling math problems that involved calculating the volume of different shapes resulted in better test performance 1 week later than when students answered multiple problems about the same type of shape in a row. This pattern of results has also been replicated with younger students, for example 7th-grade students learning to solve graph and slope problems (Rohrer, Dedrick, & Stershic, 2015). The proposed explanation for the benefit of interleaving is that switching between different problem types allows students to learn not only the solution method itself but also when to apply it, i.e., to choose the right method for each type of problem.

Do the benefits of interleaving extend beyond problem solving? The answer appears to be yes. Interleaving can be helpful in other situations that require discrimination, such as inductive learning. Kornell and Bjork ( 2008 ) examined the effects of interleaving in a task that might be pertinent to a student of the history of art: the ability to match paintings to their respective painters. Students who studied different painters’ paintings interleaved at study were more successful on a later identification test than were participants who studied the paintings blocked by painter. Birnbaum, Kornell, Bjork, and Bjork ( 2013 ) proposed the discriminative-contrast hypothesis to explain that interleaving enhances learning by allowing the comparison between exemplars of different categories. They found support for this hypothesis in a set of experiments with bird categorization: participants benefited from interleaving and also from spacing, but not when the spacing interrupted side-by-side comparisons of birds from different categories.

Another type of interleaving involves the interleaving of study and test opportunities. This type of interleaving has been applied, once again, to problem solving, whereby students alternate between attempting a problem and viewing a worked example (Trafton & Reiser, 1993 ); this pattern appears to be superior to answering a string of problems in a row, at least with respect to the amount of time it takes to achieve mastery of a procedure (Corbett, Reed, Hoffmann, MacLaren, & Wagner, 2010 ). The benefits of interleaving study and test opportunities – rather than blocking study followed by attempting to answer problems or questions – might arise due to a process known as “test-potentiated learning”. That is, a study opportunity that immediately follows a retrieval attempt may be more fruitful than when that same studying was not preceded by retrieval (Arnold & McDermott, 2013 ).

For problem-based subjects, the interleaving technique is straightforward: simply mix questions on homework and quizzes with previous materials (which takes care of spacing as well); for languages, mix vocabulary themes rather than blocking by theme (Thomson & Mehring, 2016 ). But interleaving as an educational strategy ought to be presented to teachers with some caveats. Research has focused on interleaving material that is somewhat related (e.g., solving different mathematical equations, Rohrer et al., 2015 ), whereas students sometimes ask whether they should interleave material from different subjects – a practice that has not received empirical support (Hausman & Kornell, 2014 ). When advising students how to study independently, teachers should thus proceed with caution. Since it is easy for younger students to confuse this type of unhelpful interleaving with the more helpful interleaving of related information, it may be best for teachers of younger grades to create opportunities for interleaving in homework and quiz assignments rather than putting the onus on the students themselves to make use of the technique. Technology can be very helpful here, with apps such as Quizlet, Memrise, Anki, Synap, Quiz Champ, and many others (see also “Learning Scientists”, 2017 ) that not only allow instructor-created quizzes to be taken by students, but also provide built-in interleaving algorithms so that the burden does not fall on the teacher or the student to carefully plan which items are interleaved when.

An important point to consider is that in educational practice, the distinction between spacing and interleaving can be difficult to delineate. The gap between the scientific and classroom definitions of interleaving is demonstrated by teachers’ own writings about this technique. When they write about interleaving, teachers often extend the term to connote a curriculum that involves returning to topics multiple times throughout the year (e.g., Kirby, 2014 ; see “Learning Scientists” ( 2016a ) for a collection of similar blog posts by several other teachers). The “interleaving” of topics throughout the curriculum produces an effect that is more akin to what cognitive psychologists call “spacing” (see Fig.  2 b for a visual representation of the difference between interleaving and spacing). However, cognitive psychologists have not examined the effects of structuring the curriculum in this way, and open questions remain: does repeatedly circling back to previous topics throughout the semester interrupt the learning of new information? What are some effective techniques for interleaving old and new information within one class? And how does one determine the balance between old and new information?

Retrieval practice

While tests are most often used in educational settings for assessment, a lesser-known benefit of tests is that they actually improve memory of the tested information. If we think of our memories as libraries of information, then it may seem surprising that retrieval (which happens when we take a test) improves memory; however, we know from a century of research that retrieving knowledge actually strengthens it (see Karpicke, Lehman, & Aue, 2014 ). Testing was shown to strengthen memory as early as 100 years ago (Gates, 1917 ), and there has been a surge of research in the last decade on the mnemonic benefits of testing, or retrieval practice . Most of the research on the effectiveness of retrieval practice has been done with college students (see Roediger & Karpicke, 2006 ; Roediger, Putnam, & Smith, 2011 ), but retrieval-based learning has been shown to be effective at producing learning for a wide range of ages, including preschoolers (Fritz, Morris, Nolan, & Singleton, 2007 ), elementary-aged children (e.g., Karpicke, Blunt, & Smith, 2016 ; Karpicke, Blunt, Smith, & Karpicke, 2014 ; Lipko-Speed, Dunlosky, & Rawson, 2014 ; Marsh, Fazio, & Goswick, 2012 ; Ritchie, Della Sala, & McIntosh, 2013 ), middle-school students (e.g., McDaniel, Thomas, Agarwal, McDermott, & Roediger, 2013 ; McDermott, Agarwal, D’Antonio, Roediger, & McDaniel, 2014 ), and high-school students (e.g., McDermott et al., 2014 ). In addition, the effectiveness of retrieval-based learning has been extended beyond simple testing to other activities in which retrieval practice can be integrated, such as concept mapping (Blunt & Karpicke, 2014 ; Karpicke, Blunt, et al., 2014 ; Ritchie et al., 2013 ).

A debate is currently ongoing as to the effectiveness of retrieval practice for more complex materials (Karpicke & Aue, 2015 ; Roelle & Berthold, 2017 ; Van Gog & Sweller, 2015 ). Practicing retrieval has been shown to improve the application of knowledge to new situations (e.g., Butler, 2010 ; Dirkx, Kester, & Kirschner, 2014 ; McDaniel et al., 2013 ; Smith, Blunt, Whiffen, & Karpicke, 2016 ; but see Tran, Rohrer, and Pashler ( 2015 ) and Wooldridge, Bugg, McDaniel, and Liu ( 2014 ), for retrieval practice studies that showed limited or no increased transfer compared to restudy). Retrieval practice effects on higher-order learning may be more sensitive than fact learning to encoding factors, such as the way material is presented during study (Eglington & Kang, 2016 ). In addition, retrieval practice may be more beneficial for higher-order learning if it includes more scaffolding (Fiechter & Benjamin, 2017 ; but see Smith, Blunt, et al., 2016 ) and targeted practice with application questions (Son & Rivas, 2016 ).

How does retrieval practice help memory? Figure  3 illustrates both the direct and indirect benefits of retrieval practice identified by the literature. The act of retrieval itself is thought to strengthen memory (Karpicke, Blunt, et al., 2014 ; Roediger & Karpicke, 2006 ; Smith, Roediger, & Karpicke, 2013 ). For example, Smith et al. ( 2013 ) showed that if students brought information to mind without actually producing it (covert retrieval), they remembered the information just as well as if they overtly produced the retrieved information (overt retrieval). Importantly, both overt and covert retrieval practice improved memory over control groups without retrieval practice, even when feedback was not provided. The fact that bringing information to mind in the absence of feedback or restudy opportunities improves memory leads researchers to conclude that it is the act of retrieval – thinking back to bring information to mind – that improves memory of that information.

The benefit of retrieval practice depends to a certain extent on successful retrieval (see Karpicke, Lehman, et al., 2014 ). For example, in Experiment 4 of Smith et al. ( 2013 ), students successfully retrieved 72% of the information during retrieval practice. Of course, retrieving 72% of the information was compared to a restudy control group, during which students were re-exposed to 100% of the information, creating a bias in favor of the restudy condition. Yet retrieval led to superior memory later compared to the restudy control. However, if retrieval success is extremely low, then it is unlikely to improve memory (e.g., Karpicke, Blunt, et al., 2014 ), particularly in the absence of feedback. On the other hand, if retrieval-based learning situations are constructed in such a way that ensures high levels of success, the act of bringing the information to mind may be undermined, thus making it less beneficial. For example, if a student reads a sentence and then immediately covers the sentence and recites it out loud, they are likely not retrieving the information but rather just keeping the information in their working memory long enough to recite it again (see Smith, Blunt, et al., 2016 for a discussion of this point). Thus, it is important to balance success of retrieval with overall difficulty in retrieving the information (Smith & Karpicke, 2014 ; Weinstein, Nunes, & Karpicke, 2016 ). If initial retrieval success is low, then feedback can help improve the overall benefit of practicing retrieval (Kang, McDermott, & Roediger, 2007 ; Smith & Karpicke, 2014 ). Kornell, Klein, and Rawson ( 2015 ), however, found that it was the retrieval attempt and not the correct production of information that produced the retrieval practice benefit – as long as the correct answer was provided after an unsuccessful attempt, the benefit was the same as for a successful retrieval attempt in this set of studies. 
From a practical perspective, it would be helpful for teachers to know when retrieval attempts in the absence of success are helpful, and when they are not. There may also be additional reasons beyond retrieval benefits that would push teachers towards retrieval practice activities that produce some success amongst students; for example, teachers may hesitate to give students retrieval practice exercises that are too difficult, as this may negatively affect self-efficacy and confidence.

In addition to the fact that bringing information to mind directly improves memory for that information, engaging in retrieval practice can produce indirect benefits as well (see Roediger et al., 2011 ). For example, research by Weinstein, Gilmore, Szpunar, and McDermott ( 2014 ) demonstrated that when students expected to be tested, the increased test expectancy led to better-quality encoding of new information. Frequent testing can also serve to decrease mind-wandering – that is, thoughts that are unrelated to the material that students are supposed to be studying (Szpunar, Khan, & Schacter, 2013 ).

Practicing retrieval is a powerful way to improve meaningful learning of information, and it is relatively easy to implement in the classroom. For example, requiring students to practice retrieval can be as simple as asking students to put their class materials away and try to write out everything they know about a topic. Retrieval-based learning strategies are also flexible. Instructors can give students practice tests (e.g., short-answer or multiple-choice, see Smith & Karpicke, 2014 ), provide open-ended prompts for the students to recall information (e.g., Smith, Blunt, et al., 2016 ) or ask their students to create concept maps from memory (e.g., Blunt & Karpicke, 2014 ). In one study, Weinstein et al. ( 2016 ) looked at the effectiveness of inserting simple short-answer questions into online learning modules to see whether they improved student performance. Weinstein and colleagues also manipulated the placement of the questions. For some students, the questions were interspersed throughout the module, and for other students the questions were all presented at the end of the module. Initial success on the short-answer questions was higher when the questions were interspersed throughout the module. However, on a later test of learning from that module, the original placement of the questions in the module did not matter for performance. As with spaced practice, where the optimal gap between study sessions is contingent on the retention interval, the optimum difficulty and level of success during retrieval practice may also depend on the retention interval. Both groups of students who answered questions performed better on the delayed test compared to a control group without question opportunities during the module. Thus, the important thing is for instructors to provide opportunities for retrieval practice during learning. Based on previous research, any activity that promotes the successful retrieval of information should improve learning.

Retrieval practice has received a lot of attention in teacher blogs (see “Learning Scientists” ( 2016b ) for a collection). A common theme seems to be an emphasis on low-stakes (Young, 2016 ) and even no-stakes (Cox, 2015 ) testing, the goal of which is to increase learning rather than assess performance. In fact, one well-known charter school in the UK has an official homework policy grounded in retrieval practice: students are to test themselves on subject knowledge for 30 minutes every day in lieu of standard homework (Michaela Community School, 2014 ). The utility of homework, particularly for younger children, is often a hotly debated topic outside of academia (e.g., Shumaker, 2016 ; but see Jones ( 2016 ) for an opposing viewpoint and Cooper ( 1989 ) for the original research the blog posts were based on). Whereas some research shows clear links between homework and academic achievement (Valle et al., 2016 ), other researchers have questioned the effectiveness of homework (Dettmers, Trautwein, & Lüdtke, 2009 ). Perhaps amending homework to involve retrieval practice might make it more effective; this remains an open empirical question.

One final consideration is that of test anxiety. While retrieval practice can be very powerful at improving memory, some research shows that pressure during retrieval can undermine some of the learning benefit. For example, Hinze and Rapp ( 2014 ) manipulated pressure during quizzing to create high-pressure and low-pressure conditions. On the quizzes themselves, students performed equally well. However, those in the high-pressure condition did not perform as well on a criterion test later compared to the low-pressure group. Thus, test anxiety may reduce the learning benefit of retrieval practice. Eliminating all high-pressure tests is probably not possible, but instructors can provide a number of low-stakes retrieval opportunities for students to help increase learning. The use of low-stakes testing can serve to decrease test anxiety (Khanna, 2015 ), and has recently been shown to negate the detrimental impact of stress on learning (Smith, Floerke, & Thomas, 2016 ). This is a particularly important line of inquiry to pursue for future research, because many teachers who are not familiar with the effectiveness of retrieval practice may be put off by the implied pressure of “testing”, which evokes the much maligned high-stakes standardized tests (e.g., McHugh, 2013 ).

Elaboration

Elaboration involves connecting new information to pre-existing knowledge. Anderson ( 1983 , p.285) made the following claim about elaboration: “One of the most potent manipulations that can be performed in terms of increasing a subject’s memory for material is to have the subject elaborate on the to-be-remembered material.” Postman ( 1976 , p. 28) defined elaboration most parsimoniously as “additions to nominal input”, and Hirshman ( 2001 , p. 4369) provided an elaboration on this definition (pun intended!), defining elaboration as “A conscious, intentional process that associates to-be-remembered information with other information in memory.” However, in practice, elaboration could mean many different things. The common thread in all the definitions is that elaboration involves adding features to an existing memory.

One possible instantiation of elaboration is thinking about information on a deeper level. The levels (or “depth”) of processing framework, proposed by Craik and Lockhart ( 1972 ), predicts that information will be remembered better if it is processed more deeply in terms of meaning, rather than shallowly in terms of form. The levels of processing framework has, however, received a number of criticisms (Craik, 2002 ). One major problem with this framework is that it is difficult to measure “depth”. And if we are not able to actually measure depth, then the argument can become circular: is it that something was remembered better because it was studied more deeply, or do we conclude that it must have been studied more deeply because it is remembered better? (See Lockhart & Craik, 1990 , for further discussion of this issue).

Another mechanism by which elaboration can confer a benefit to learning is via improvement in organization (Bellezza, Cheesman, & Reddy, 1977 ; Mandler, 1979 ). By this view, elaboration involves making information more integrated and organized with existing knowledge structures. By connecting and integrating the to-be-learned information with other concepts in memory, students can increase the extent to which the ideas are organized in their minds, and this increased organization presumably facilitates the reconstruction of the past at the time of retrieval.

Elaboration is such a broad term and can include so many different techniques that it is hard to claim that elaboration will always help learning. There is, however, a specific technique under the umbrella of elaboration for which there is relatively strong evidence in terms of effectiveness (Dunlosky et al., 2013 ; Pashler et al., 2007 ). This technique is called elaborative interrogation, and involves students questioning the materials that they are studying (Pressley, McDaniel, Turnure, Wood, & Ahmad, 1987 ). More specifically, students using this technique would ask “how” and “why” questions about the concepts they are studying (see Fig.  4 for an example on the physics of flight). Then, crucially, students would try to answer these questions – either from their materials or, eventually, from memory (McDaniel & Donnelly, 1996 ). The process of figuring out the answer to the questions – with some amount of uncertainty (Overoye & Storm, 2015 ) – can help learning. When using this technique, however, it is important that students check their answers with their materials or with the teacher; when the content generated through elaborative interrogation is poor, it can actually hurt learning (Clinton, Alibali, & Nathan, 2016 ).

Students can also be encouraged to self-explain concepts to themselves while learning (Chi, De Leeuw, Chiu, & LaVancher, 1994 ). This might involve students simply saying out loud what steps they need to perform to solve an equation. Aleven and Koedinger ( 2002 ) conducted two classroom studies in which students were either prompted by a “cognitive tutor” to provide self-explanations during a problem-solving task or not, and found that the self-explanations led to improved performance. According to the authors, this approach could scale well to real classrooms. If possible and relevant, students could even perform actions alongside their self-explanations (Cohen, 1981 ; see also the enactment effect, Hainselin, Picard, Manolli, Vankerkore-Candas, & Bourdin, 2017 ). Instructors can scaffold students in these types of activities by providing self-explanation prompts throughout to-be-learned material (O’Neil et al., 2014 ). Ultimately, the greatest potential benefit of accurate self-explanation or elaboration is that the student will be able to transfer their knowledge to a new situation (Rittle-Johnson, 2006 ).

The technical term “elaborative interrogation” has not made it into the vernacular of educational bloggers (a search on https://educationechochamberuncut.wordpress.com , which consolidates over 3,000 UK-based teacher blogs, yielded zero results for that term). However, a few teachers have blogged about elaboration more generally (e.g., Hobbiss, 2016 ) and deep questioning specifically (e.g., Class Teaching, 2013 ), just without using the specific terminology. This strategy in particular may benefit from a more open dialog between researchers and teachers to facilitate the use of elaborative interrogation in the classroom and to address possible barriers to implementation. In terms of advancing the scientific understanding of elaborative interrogation in a classroom setting, it would be informative to conduct a larger-scale intervention to see whether having students elaborate during reading actually helps their understanding. It would also be useful to know whether the students really need to generate their own elaborative interrogation (“how” and “why”) questions, versus answering questions provided by others. How long should students persist to find the answers? When is the right time to have students engage in this task, given the levels of expertise required to do it well (Clinton et al., 2016 )? Without knowing the answers to these questions, it may be too early for us to instruct teachers to use this technique in their classes. Finally, elaborative interrogation takes a long time. Is this time efficiently spent? Or, would it be better to have the students try to answer a few questions, pool their information as a class, and then move to practicing retrieval of the information?

Concrete examples

Providing supporting information can improve the learning of key ideas and concepts. Specifically, using concrete examples to supplement content that is more conceptual in nature can make the ideas easier to understand and remember. Concrete examples can provide several advantages to the learning process: (a) they can concisely convey information, (b) they can provide students with more concrete information that is easier to remember, and (c) they can take advantage of the superior memorability of pictures relative to words (see “Dual Coding”).

Words that are more concrete are both recognized and recalled better than abstract words (Gorman, 1961 ; e.g., “button” and “bound,” respectively). Furthermore, it has been demonstrated that information that is more concrete and imageable enhances the learning of associations, even with abstract content (Caplan & Madan, 2016 ; Madan, Glaholt, & Caplan, 2010 ; Paivio, 1971 ). Following from this, providing concrete examples during instruction should improve retention of related abstract concepts, rather than the concrete examples alone being remembered better. Concrete examples can be useful both during instruction and during practice problems. Having students actively explain how two examples are similar and encouraging them to extract the underlying structure on their own can also help with transfer. In a laboratory study, Berry ( 1983 ) demonstrated that students performed well when given concrete practice problems, regardless of the use of verbalization (akin to elaborative interrogation), but that verbalization helped students transfer understanding from concrete to abstract problems. One particularly important area of future research is determining how students can best make the link between concrete examples and abstract ideas.

Since abstract concepts are harder to grasp than concrete information (Paivio, Walsh, & Bons, 1994 ), it follows that teachers ought to illustrate abstract ideas with concrete examples. However, care must be taken when selecting the examples. LeFevre and Dixon ( 1986 ) provided students with both concrete examples and abstract instructions and found that when these were inconsistent, students followed the concrete examples rather than the abstract instructions, potentially constraining the application of the abstract concept being taught. Lew, Fukawa-Connelly, Mejía-Ramos, and Weber ( 2016 ) used an interview approach to examine why students may have difficulty understanding a lecture. Responses indicated that some issues were related to understanding the overarching topic rather than the component parts, and to the use of informal colloquialisms that did not clearly follow from the material being taught. Both of these issues could have potentially been addressed through the inclusion of a greater number of relevant concrete examples.

One concern with using concrete examples is that students might only remember the examples – especially if they are particularly memorable, such as fun or gimmicky examples – and will not be able to transfer their understanding from one example to another, or more broadly to the abstract concept. However, there does not seem to be any evidence that fun relevant examples actually hurt learning by harming memory for important information. Instead, fun examples and jokes tend to be more memorable, but this boost in memory for the joke does not seem to come at a cost to memory for the underlying concept (Baldassari & Kelley, 2012 ). However, two important caveats need to be highlighted. First, to the extent that the more memorable content is not relevant to the concepts of interest, learning of the target information can be compromised (Harp & Mayer, 1998 ). Thus, care must be taken to ensure that all examples and gimmicks are, in fact, related to the core concepts that the students need to acquire, and do not contain irrelevant perceptual features (Kaminski & Sloutsky, 2013 ).

The second issue is that novices often notice and remember the surface details of an example rather than the underlying structure. Experts, on the other hand, can extract the underlying structure from examples that have divergent surface features (Chi, Feltovich, & Glaser, 1981 ; see Fig.  5 for an example from physics). Gick and Holyoak ( 1983 ) tried to get students to apply a rule from one problem to another problem that appeared different on the surface, but was structurally similar. They found that providing multiple examples helped with this transfer process compared to only using one example – especially when the examples provided had different surface details. More work is also needed to determine how many examples are sufficient for generalization to occur (and this, of course, will vary with contextual factors and individual differences). Further research on the continuum between concrete/specific examples and more abstract concepts would also be informative. That is, if an example is not concrete enough, it may be too difficult to understand. On the other hand, if the example is too concrete, that could be detrimental to generalization to the more abstract concept (although a diverse set of very concrete examples may be able to help with this). In fact, in a controversial article, Kaminski, Sloutsky, and Heckler ( 2008 ) claimed that abstract examples were more effective than concrete examples. Later rebuttals of this paper contested whether the abstract versus concrete distinction was clearly defined in the original study (see Reed, 2008 , for a collection of letters on the subject). This ideal point along the concrete-abstract continuum might also interact with development.

Finding teacher blog posts on concrete examples proved to be more difficult than for the other strategies in this review. One optimistic possibility is that teachers frequently use concrete examples in their teaching, and thus do not think of this as a specific contribution from cognitive psychology; the one blog post we were able to find that discussed concrete examples suggests that this might be the case (Boulton, 2016 ). The idea of “linking abstract concepts with concrete examples” is also covered in 25% of teacher-training textbooks used in the US, according to the report by Pomerance et al. ( 2016 ); this is the second most frequently covered of the six strategies, after “posing probing questions” (i.e., elaborative interrogation). A useful direction for future research would be to establish how teachers are using concrete examples in their practice, and whether we can make any suggestions for improvement based on research into the science of learning. For example, if two examples are better than one (Bauernschmidt, 2017 ), are additional examples also needed, or are there diminishing returns from providing more examples? And, how can teachers best ensure that concrete examples are consistent with prior knowledge (Reed, 2008 )?

Dual coding

Both the memory literature and folk psychology support the notion of visual examples being beneficial—the adage of “a picture is worth a thousand words” (traced back to an advertising slogan from the 1920s; Meider, 1990 ). Indeed, it is well-understood that more information can be conveyed through a simple illustration than through several paragraphs of text (e.g., Barker & Manji, 1989 ; Mayer & Gallini, 1990 ). Illustrations can be particularly helpful when the described concept involves several parts or steps and is intended for individuals with low prior knowledge (Eitel & Scheiter, 2015 ; Mayer & Gallini, 1990 ). Figure  6 provides a concrete example of this, illustrating how information can flow through neurons and synapses.

In addition to conveying information more succinctly, pictures are also more memorable than words (Paivio & Csapo, 1969, 1973). In the memory literature this is referred to as the picture superiority effect, and dual coding theory was developed in part to explain it. Dual coding follows from the notion that text accompanied by complementary visual information enhances learning. Paivio (1971, 1986) proposed dual coding theory as a mechanistic account of how multiple information “codes” are integrated to process information. In this theory, a code corresponds to a modal or otherwise distinct representation of a concept; for example, “mental images for ‘book’ have visual, tactual, and other perceptual qualities similar to those evoked by the referent objects on which the images are based” (Clark & Paivio, 1991, p. 152). Aylwin (1990) provides a clear example of how the word “dog” can evoke verbal, visual, and enactive representations (see Fig. 7 for a similar example for the word “SPOON”, based on Aylwin, 1990, Fig. 2, and Madan & Singhal, 2012a, Fig. 3). Codes can also correspond to emotional properties (Clark & Paivio, 1991; Paivio, 2013). Clark and Paivio (1991) provide a thorough review of dual coding theory and its relation to education, while Paivio (2007) provides a comprehensive treatise on the theory. Broadly, dual coding theory suggests that providing multiple representations of the same information enhances learning and memory, and that information which more readily evokes additional representations (through automatic imagery processes) receives a similar benefit.

Paivio and Csapo (1973) suggest that verbal and imaginal codes have independent and additive effects on memory recall. Using visuals to improve learning and memory has been applied particularly to vocabulary learning (Danan, 1992; Sadoski, 2005), but has also shown success in other domains such as health care (Hartland, Biddle, & Fallacaro, 2008). To take advantage of dual coding, verbal information should be accompanied by a visual representation when possible. However, while the studies discussed all indicate that multiple representations of information are favorable, it is important to acknowledge that each additional representation also increases cognitive load and can lead to over-saturation (Mayer & Moreno, 2003).

Given that pictures are generally remembered better than words, it is important to ensure that the pictures students are given are helpful and relevant to the content they are expected to learn. McNeill, Uttal, Jarvin, and Sternberg (2009) found that providing visual examples decreased conceptual errors. However, McNeill et al. also found that students given visually rich examples performed more poorly than students given no visual example at all, suggesting that visual details can at times become a distraction and hinder performance. Thus, it is important to ensure that images used in teaching are clear and unambiguous in their meaning (Schwartz, 2007).

Further broadening the scope of dual coding theory, Engelkamp and Zimmer (1984) suggest that motor movements, such as “turning the handle,” can provide an additional motor code that improves memory, linking studies of motor actions (enactment) with dual coding theory (Clark & Paivio, 1991; Engelkamp & Cohen, 1991; Madan & Singhal, 2012c). Enactment effects appear to occur primarily during learning, rather than during retrieval (Peterson & Mulligan, 2010). Along similar lines, Wammes, Meade, and Fernandes (2016) demonstrated that generating drawings can provide memory benefits beyond what could otherwise be explained by visual imagery, picture superiority, and other memory-enhancing effects. Providing convergent evidence, even when overt motor actions are not critical in themselves, words representing functional objects have been shown to enhance later memory (Madan & Singhal, 2012b; Montefinese, Ambrosini, Fairfield, & Mammarella, 2013). This indicates that motoric processes can improve memory much as visual imagery does, paralleling the memory advantage of concrete over abstract words. Further research suggests that automatic motor simulation of functional objects is likely responsible for this memory benefit (Madan, Chen, & Singhal, 2016).

When teachers combine visuals and words in their educational practice, however, they may not always be taking advantage of dual coding, at least not in the optimal manner. For example, a recent discussion on Twitter centered on one teacher’s decision to have 7th-grade students replace certain words in their science laboratory report with a picture of that word (e.g., the instructions read “using a syringe …” and a picture of a syringe replaced the word; Turner, 2016a). Other teachers argued that this was not dual coding (Beaven, 2016; Williams, 2016), because there were no longer two different representations of the information. The first teacher maintained that dual coding was preserved, because this laboratory report with pictures was to be used alongside the original, fully verbal report (Turner, 2016b). This particular implementation, having students replace individual words with pictures, has not been examined in the cognitive literature, presumably because no benefit would be expected. In any case, we need to be clearer about implementations of dual coding, and more research is needed to clarify how teachers can make use of the benefits conferred by multiple representations and picture superiority.

Critically, dual coding theory is distinct from the notion of “learning styles,” the idea that individuals benefit from instruction that matches their modality preference. While this idea is pervasive and individuals often subjectively feel that they have such a preference, empirical findings do not support learning styles theory (e.g., Kavale, Hirshoren, & Forness, 1998; Pashler, McDaniel, Rohrer, & Bjork, 2008; Rohrer & Pashler, 2012). That is, there is no evidence that instructing students in their preferred learning style leads to an overall improvement in learning (the “meshing” hypothesis). Moreover, learning styles have come to be described as a myth or urban legend within psychology (Coffield, Moseley, Hall, & Ecclestone, 2004; Hattie & Yates, 2014; Kirschner & van Merriënboer, 2013; Kirschner, 2017); skepticism about learning styles is a common stance amongst evidence-informed teachers (e.g., Saunders, 2016). Providing evidence against the notion of learning styles, Kraemer, Rosenberg, and Thompson-Schill (2009) found that individuals who scored as “verbalizers” and “visualizers” did not perform any better on experimental trials matching their preference. Indeed, learning through one’s preferred learning style has recently been shown to elevate subjective judgements of learning, but not objective performance (Knoll, Otani, Skeel, & Van Horn, 2017). In contrast to learning styles, dual coding is based on providing additional, complementary forms of information to enhance learning, rather than tailoring instruction to individuals’ preferences.

Genuine educational environments present many opportunities for combining the strategies outlined above. Spacing can be particularly potent for learning if it is combined with retrieval practice: the additive benefits of retrieval practice and spacing can be gained by engaging in retrieval practice multiple times (also known as distributed practice; see Cepeda et al., 2006). Interleaving naturally entails spacing if students interleave old and new material. Concrete examples can be both verbal and visual, making use of dual coding. In addition, the strategies of elaboration, concrete examples, and dual coding all work best when used as part of retrieval practice. For example, in the concept-mapping studies mentioned above (Blunt & Karpicke, 2014; Karpicke, Blunt, et al., 2014), creating concept maps while looking at course materials (e.g., a textbook) was not as effective for later memory as creating concept maps from memory. When practicing elaborative interrogation, students can start off answering the “how” and “why” questions they pose for themselves using class materials, and work their way up to answering them from memory. And when interleaving different problem types, students should practice answering them rather than just looking over worked examples.

But while these ideas for strategy combinations have empirical bases, it has not yet been established whether the benefits of the strategies to learning are additive, super-additive, or, in some cases, incompatible. Thus, future research needs to (a) better formalize the definition of each strategy (particularly critical for elaboration and dual coding), (b) identify best practices for implementation in the classroom, (c) delineate the boundary conditions of each strategy, and (d) strategically investigate interactions between the six strategies we outlined in this manuscript.

Aleven, V. A., & Koedinger, K. R. (2002). An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science, 26 , 147–179.

Anderson, J. R. (1983). A spreading activation theory of memory. Journal of Verbal Learning and Verbal Behavior, 22 , 261–295.

Arnold, K. M., & McDermott, K. B. (2013). Test-potentiated learning: distinguishing between direct and indirect effects of tests. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39 , 940–945.

Aylwin, S. (1990). Imagery and affect: big questions, little answers. In P. J. Thompson, D. E. Marks, & J. T. E. Richardson (Eds.), Imagery: Current developments . New York: International Library of Psychology.

Baldassari, M. J., & Kelley, M. (2012). Make’em laugh? The mnemonic effect of humor in a speech. Psi Chi Journal of Psychological Research, 17 , 2–9.

Barker, P. G., & Manji, K. A. (1989). Pictorial dialogue methods. International Journal of Man-Machine Studies, 31 , 323–347.

Bauernschmidt, A. (2017). GUEST POST: two examples are better than one. [Blog post]. The Learning Scientists Blog . Retrieved from http://www.learningscientists.org/blog/2017/5/30-1 . Accessed 25 Dec 2017.

Beaven, T. (2016). @doctorwhy @FurtherEdagogy @doc_kristy Right, I thought the whole point of dual coding was to use TWO codes: pics + words of the SAME info? [Tweet]. Retrieved from https://twitter.com/TitaBeaven/status/807504041341308929 . Accessed 25 Dec 2017.

Bellezza, F. S., Cheesman, F. L., & Reddy, B. G. (1977). Organization and semantic elaboration in free recall. Journal of Experimental Psychology: Human Learning and Memory, 3 , 539–550.

Benney, D. (2016). (Trying to apply) spacing in a content heavy subject [Blog post]. Retrieved from https://mrbenney.wordpress.com/2016/10/16/trying-to-apply-spacing-in-science/ . Accessed 25 Dec 2017.

Berry, D. C. (1983). Metacognitive experience and transfer of logical reasoning. Quarterly Journal of Experimental Psychology, 35A , 39–49.

Birnbaum, M. S., Kornell, N., Bjork, E. L., & Bjork, R. A. (2013). Why interleaving enhances inductive learning: the roles of discrimination and retrieval. Memory & Cognition, 41 , 392–402.

Bjork, R. A. (1999). Assessing our own competence: heuristics and illusions. In D. Gopher & A. Koriat (Eds.), Attention and performance XVII. Cognitive regulation of performance: Interaction of theory and application (pp. 435–459). Cambridge, MA: MIT Press.

Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 185–205). Cambridge, MA: MIT Press.

Bjork, R. A., & Bjork, E. L. (1992). A new theory of disuse and an old theory of stimulus fluctuation. From learning processes to cognitive processes: Essays in honor of William K. Estes, 2 , 35–67.

Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: creating desirable difficulties to enhance learning. In Psychology and the real world: Essays illustrating fundamental contributions to society (pp. 56–64).

Blunt, J. R., & Karpicke, J. D. (2014). Learning with retrieval-based concept mapping. Journal of Educational Psychology, 106 , 849–858.

Boulton, K. (2016). What does cognitive overload look like in the humanities? [Blog post]. Retrieved from https://educationechochamberuncut.wordpress.com/2016/03/05/what-does-cognitive-overload-look-like-in-the-humanities-kris-boulton-2/ . Accessed 25 Dec 2017.

Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). Make it stick . Cambridge, MA: Harvard University Press.

Butler, A. C. (2010). Repeated testing produces superior transfer of learning relative to repeated studying. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36 , 1118–1133.

Caplan, J. B., & Madan, C. R. (2016). Word-imageability enhances association-memory by recruiting hippocampal activity. Journal of Cognitive Neuroscience, 28 , 1522–1538.

Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: a review and quantitative synthesis. Psychological Bulletin, 132 , 354–380.

Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008). Spacing effects in learning: a temporal ridgeline of optimal retention. Psychological Science, 19 , 1095–1102.

Chi, M. T., De Leeuw, N., Chiu, M. H., & LaVancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18 , 439–477.

Chi, M. T., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5 , 121–152.

CIFE. (2012). No January A level and other changes. Retrieved from http://www.cife.org.uk/cife-general-news/no-january-a-level-and-other-changes/ . Accessed 25 Dec 2017.

Clark, D. (2016). One book on learning that every teacher, lecturer & trainer should read (7 reasons) [Blog post]. Retrieved from http://donaldclarkplanb.blogspot.com/2016/03/one-book-on-learning-that-every-teacher.html . Accessed 25 Dec 2017.

Clark, J. M., & Paivio, A. (1991). Dual coding theory and education. Educational Psychology Review, 3 , 149–210.

Class Teaching. (2013). Deep questioning [Blog post]. Retrieved from https://classteaching.wordpress.com/2013/07/12/deep-questioning/ . Accessed 25 Dec 2017.

Clinton, V., Alibali, M. W., & Nathan, M. J. (2016). Learning about posterior probability: do diagrams and elaborative interrogation help? The Journal of Experimental Education, 84 , 579–599.

Coffield, F., Moseley, D., Hall, E., & Ecclestone, K. (2004). Learning styles and pedagogy in post-16 learning: a systematic and critical review . London: Learning & Skills Research Centre.

Cohen, R. L. (1981). On the generality of some memory laws. Scandinavian Journal of Psychology, 22 , 267–281.

Cooper, H. (1989). Synthesis of research on homework. Educational Leadership, 47 , 85–91.

Corbett, A. T., Reed, S. K., Hoffmann, R., MacLaren, B., & Wagner, A. (2010). Interleaving worked examples and cognitive tutor support for algebraic modeling of problem situations. In Proceedings of the Thirty-Second Annual Meeting of the Cognitive Science Society (pp. 2882–2887).

Cox, D. (2015). No stakes testing – not telling students their results [Blog post]. Retrieved from https://missdcoxblog.wordpress.com/2015/06/06/no-stakes-testing-not-telling-students-their-results/ . Accessed 25 Dec 2017.

Cox, D. (2016a). Ditch revision. Teach it well [Blog post]. Retrieved from https://missdcoxblog.wordpress.com/2016/01/09/ditch-revision-teach-it-well/ . Accessed 25 Dec 2017.

Cox, D. (2016b). ‘They need to remember this in three years time’: spacing & interleaving for the new GCSEs [Blog post]. Retrieved from https://missdcoxblog.wordpress.com/2016/03/25/they-need-to-remember-this-in-three-years-time-spacing-interleaving-for-the-new-gcses/ . Accessed 25 Dec 2017.

Craik, F. I. (2002). Levels of processing: past, present… future? Memory, 10 , 305–318.

Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: a framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11 , 671–684.

Danan, M. (1992). Reversed subtitling and dual coding theory: new directions for foreign language instruction. Language Learning, 42 , 497–527.

Dettmers, S., Trautwein, U., & Lüdtke, O. (2009). The relationship between homework time and achievement is not universal: evidence from multilevel analyses in 40 countries. School Effectiveness and School Improvement, 20 , 375–405.

Dirkx, K. J., Kester, L., & Kirschner, P. A. (2014). The testing effect for learning principles and procedures from texts. The Journal of Educational Research, 107 , 357–364.

Dunlosky, J. (2013). Strengthening the student toolbox: study strategies to boost learning. American Educator, 37 (3), 12–21.

Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013). Improving students’ learning with effective learning techniques: promising directions from cognitive and educational psychology. Psychological Science in the Public Interest, 14 , 4–58.

Ebbinghaus, H. (1913). Memory (H. A. Ruger & C. E. Bussenius, Trans.). New York: Columbia University, Teachers College. (Original work published 1885) . Retrieved from http://psychclassics.yorku.ca/Ebbinghaus/memory8.htm . Accessed 25 Dec 2017.

Eglington, L. G., & Kang, S. H. (2016). Retrieval practice benefits deductive inference. Educational Psychology Review , 1–14.

Eitel, A., & Scheiter, K. (2015). Picture or text first? Explaining sequential effects when learning with pictures and text. Educational Psychology Review, 27 , 153–180.

Engelkamp, J., & Cohen, R. L. (1991). Current issues in memory of action events. Psychological Research, 53 , 175–182.

Engelkamp, J., & Zimmer, H. D. (1984). Motor programme information as a separable memory unit. Psychological Research, 46 , 283–299.

Fawcett, D. (2013). Can I be that little better at……using cognitive science/psychology/neurology to plan learning? [Blog post]. Retrieved from http://reflectionsofmyteaching.blogspot.com/2013/09/can-i-be-that-little-better-atusing.html . Accessed 25 Dec 2017.

Fiechter, J. L., & Benjamin, A. S. (2017). Diminishing-cues retrieval practice: a memory-enhancing technique that works when regular testing doesn’t. Psychonomic Bulletin & Review , 1–9.

Firth, J. (2016). Spacing in teaching practice [Blog post]. Retrieved from http://www.learningscientists.org/blog/2016/4/12-1 . Accessed 25 Dec 2017.

Fordham, M. [mfordhamhistory]. (2016). Is there a meaningful distinction in psychology between ‘thinking’ & ‘critical thinking’? [Tweet]. Retrieved from https://twitter.com/mfordhamhistory/status/809525713623781377 . Accessed 25 Dec 2017.

Fritz, C. O., Morris, P. E., Nolan, D., & Singleton, J. (2007). Expanding retrieval practice: an effective aid to preschool children’s learning. The Quarterly Journal of Experimental Psychology, 60 , 991–1004.

Gates, A. I. (1917). Recitation as a factor in memorizing. Archives of Psychology, 6.

Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15 , 1–38.

Gorman, A. M. (1961). Recognition memory for nouns as a function of abstractedness and frequency. Journal of Experimental Psychology, 61 , 23–39.

Hainselin, M., Picard, L., Manolli, P., Vankerkore-Candas, S., & Bourdin, B. (2017). Hey teacher, don’t leave them kids alone: action is better for memory than reading. Frontiers in Psychology , 8 .

Harp, S. F., & Mayer, R. E. (1998). How seductive details do their damage. Journal of Educational Psychology, 90 , 414–434.

Hartland, W., Biddle, C., & Fallacaro, M. (2008). Audiovisual facilitation of clinical knowledge: A paradigm for dispersed student education based on Paivio’s dual coding theory. AANA Journal, 76 , 194–198.

Hattie, J., & Yates, G. (2014). Visible learning and the science of how we learn . New York: Routledge.

Hausman, H., & Kornell, N. (2014). Mixing topics while studying does not enhance learning. Journal of Applied Research in Memory and Cognition, 3 , 153–160.

Hinze, S. R., & Rapp, D. N. (2014). Retrieval (sometimes) enhances learning: performance pressure reduces the benefits of retrieval practice. Applied Cognitive Psychology, 28 , 597–606.

Hirshman, E. (2001). Elaboration in memory. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopedia of the social & behavioral sciences (pp. 4369–4374). Oxford: Pergamon.

Hobbiss, M. (2016). Make it meaningful! Elaboration [Blog post]. Retrieved from https://hobbolog.wordpress.com/2016/06/09/make-it-meaningful-elaboration/ . Accessed 25 Dec 2017.

Jones, F. (2016). Homework – is it really that useless? [Blog post]. Retrieved from http://www.learningscientists.org/blog/2016/4/5-1 . Accessed 25 Dec 2017.

Kaminski, J. A., & Sloutsky, V. M. (2013). Extraneous perceptual information interferes with children’s acquisition of mathematical knowledge. Journal of Educational Psychology, 105 (2), 351–363.

Kaminski, J. A., Sloutsky, V. M., & Heckler, A. F. (2008). The advantage of abstract examples in learning math. Science, 320 , 454–455.

Kang, S. H. (2016). Spaced repetition promotes efficient and effective learning: policy implications for instruction. Policy Insights from the Behavioral and Brain Sciences, 3 , 12–19.

Kang, S. H. K., McDermott, K. B., & Roediger, H. L. (2007). Test format and corrective feedback modify the effects of testing on long-term retention. European Journal of Cognitive Psychology, 19 , 528–558.

Karpicke, J. D., & Aue, W. R. (2015). The testing effect is alive and well with complex materials. Educational Psychology Review, 27 , 317–326.

Karpicke, J. D., Blunt, J. R., Smith, M. A., & Karpicke, S. S. (2014). Retrieval-based learning: The need for guided retrieval in elementary school children. Journal of Applied Research in Memory and Cognition, 3 , 198–206.

Karpicke, J. D., Lehman, M., & Aue, W. R. (2014). Retrieval-based learning: an episodic context account. In B. H. Ross (Ed.), Psychology of Learning and Motivation (Vol. 61, pp. 237–284). San Diego, CA: Elsevier Academic Press.

Karpicke, J. D., Blunt, J. R., & Smith, M. A. (2016). Retrieval-based learning: positive effects of retrieval practice in elementary school children. Frontiers in Psychology, 7 .

Kavale, K. A., Hirshoren, A., & Forness, S. R. (1998). Meta-analytic validation of the Dunn and Dunn model of learning-style preferences: a critique of what was Dunn. Learning Disabilities Research & Practice, 13 , 75–80.

Khanna, M. M. (2015). Ungraded pop quizzes: test-enhanced learning without all the anxiety. Teaching of Psychology, 42 , 174–178.

Kirby, J. (2014). One scientific insight for curriculum design [Blog post]. Retrieved from https://pragmaticreform.wordpress.com/2014/05/05/scientificcurriculumdesign/ . Accessed 25 Dec 2017.

Kirschner, P. A. (2017). Stop propagating the learning styles myth. Computers & Education, 106 , 166–171.

Kirschner, P. A., & van Merriënboer, J. J. G. (2013). Do learners really know best? Urban legends in education. Educational Psychologist, 48 , 169–183.

Knoll, A. R., Otani, H., Skeel, R. L., & Van Horn, K. R. (2017). Learning style, judgments of learning, and learning of verbal and visual information. British Journal of Psychology, 108 , 544–563.

Kornell, N., & Bjork, R. A. (2008). Learning concepts and categories: is spacing the “enemy of induction”? Psychological Science, 19 , 585–592.

Kornell, N., & Finn, B. (2016). Self-regulated learning: an overview of theory and data. In J. Dunlosky & S. Tauber (Eds.), The Oxford Handbook of Metamemory (pp. 325–340). New York: Oxford University Press.

Kornell, N., Klein, P. J., & Rawson, K. A. (2015). Retrieval attempts enhance learning, but retrieval success (versus failure) does not matter. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41 , 283–294.

Kraemer, D. J. M., Rosenberg, L. M., & Thompson-Schill, S. L. (2009). The neural correlates of visual and verbal cognitive styles. Journal of Neuroscience, 29 , 3792–3798.

Kraft, N. (2015). Spaced practice and repercussions for teaching. Retrieved from http://nathankraft.blogspot.com/2015/08/spaced-practice-and-repercussions-for.html . Accessed 25 Dec 2017.

Learning Scientists. (2016a). Weekly Digest #3: How teachers implement interleaving in their curriculum [Blog post]. Retrieved from http://www.learningscientists.org/blog/2016/3/28/weekly-digest-3 . Accessed 25 Dec 2017.

Learning Scientists. (2016b). Weekly Digest #13: how teachers implement retrieval in their classrooms [Blog post]. Retrieved from http://www.learningscientists.org/blog/2016/6/5/weekly-digest-13 . Accessed 25 Dec 2017.

Learning Scientists. (2016c). Weekly Digest #40: teachers’ implementation of principles from “Make It Stick” [Blog post]. Retrieved from http://www.learningscientists.org/blog/2016/12/18-1 . Accessed 25 Dec 2017.

Learning Scientists. (2017). Weekly Digest #54: is there an app for that? Studying 2.0 [Blog post]. Retrieved from http://www.learningscientists.org/blog/2017/4/9/weekly-digest-54 . Accessed 25 Dec 2017.

LeFevre, J.-A., & Dixon, P. (1986). Do written instructions need examples? Cognition and Instruction, 3 , 1–30.

Lew, K., Fukawa-Connelly, T., Mejía-Ramos, J. P., & Weber, K. (2016). Lectures in advanced mathematics: Why students might not understand what the mathematics professor is trying to convey. Journal for Research in Mathematics Education, 47 , 162–198.

Lindsey, R. V., Shroyer, J. D., Pashler, H., & Mozer, M. C. (2014). Improving students’ long-term knowledge retention through personalized review. Psychological Science, 25 , 639–647.

Lipko-Speed, A., Dunlosky, J., & Rawson, K. A. (2014). Does testing with feedback help grade-school children learn key concepts in science? Journal of Applied Research in Memory and Cognition, 3 , 171–176.

Lockhart, R. S., & Craik, F. I. (1990). Levels of processing: a retrospective commentary on a framework for memory research. Canadian Journal of Psychology, 44 , 87–112.

Lovell, O. (2017). How do we know what to put on the quiz? [Blog Post]. Retrieved from http://www.ollielovell.com/olliesclassroom/know-put-quiz/ . Accessed 25 Dec 2017.

Luehmann, A. L. (2008). Using blogging in support of teacher professional identity development: a case study. The Journal of the Learning Sciences, 17 , 287–337.

Madan, C. R., Glaholt, M. G., & Caplan, J. B. (2010). The influence of item properties on association-memory. Journal of Memory and Language, 63 , 46–63.

Madan, C. R., & Singhal, A. (2012a). Motor imagery and higher-level cognition: four hurdles before research can sprint forward. Cognitive Processing, 13 , 211–229.

Madan, C. R., & Singhal, A. (2012b). Encoding the world around us: motor-related processing influences verbal memory. Consciousness and Cognition, 21 , 1563–1570.

Madan, C. R., & Singhal, A. (2012c). Using actions to enhance memory: effects of enactment, gestures, and exercise on human memory. Frontiers in Psychology, 3 .

Madan, C. R., Chen, Y. Y., & Singhal, A. (2016). ERPs differentially reflect automatic and deliberate processing of the functional manipulability of objects. Frontiers in Human Neuroscience, 10 .

Mandler, G. (1979). Organization and repetition: organizational principles with special reference to rote learning. In L. G. Nilsson (Ed.), Perspectives on Memory Research (pp. 293–327). New York: Academic Press.

Marsh, E. J., Fazio, L. K., & Goswick, A. E. (2012). Memorial consequences of testing school-aged children. Memory, 20 , 899–906.

Mayer, R. E., & Gallini, J. K. (1990). When is an illustration worth ten thousand words? Journal of Educational Psychology, 82 , 715–726.

Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38 , 43–52.

McDaniel, M. A., & Donnelly, C. M. (1996). Learning with analogy and elaborative interrogation. Journal of Educational Psychology, 88 , 508–519.

McDaniel, M. A., Thomas, R. C., Agarwal, P. K., McDermott, K. B., & Roediger, H. L. (2013). Quizzing in middle-school science: successful transfer performance on classroom exams. Applied Cognitive Psychology, 27 , 360–372.

McDermott, K. B., Agarwal, P. K., D’Antonio, L., Roediger, H. L., & McDaniel, M. A. (2014). Both multiple-choice and short-answer quizzes enhance later exam performance in middle and high school classes. Journal of Experimental Psychology: Applied, 20 , 3–21.

McHugh, A. (2013). High-stakes tests: bad for students, teachers, and education in general [Blog post]. Retrieved from https://teacherbiz.wordpress.com/2013/07/01/high-stakes-tests-bad-for-students-teachers-and-education-in-general/ . Accessed 25 Dec 2017.

McNeill, N. M., Uttal, D. H., Jarvin, L., & Sternberg, R. J. (2009). Should you show me the money? Concrete objects both hurt and help performance on mathematics problems. Learning and Instruction, 19 , 171–184.

Meider, W. (1990). “A picture is worth a thousand words”: from advertising slogan to American proverb. Southern Folklore, 47 , 207–225.

Michaela Community School. (2014). Homework. Retrieved from http://mcsbrent.co.uk/homework-2/ . Accessed 25 Dec 2017.

Montefinese, M., Ambrosini, E., Fairfield, B., & Mammarella, N. (2013). The “subjective” pupil old/new effect: is the truth plain to see? International Journal of Psychophysiology, 89 , 48–56.

O’Neil, H. F., Chung, G. K., Kerr, D., Vendlinski, T. P., Buschang, R. E., & Mayer, R. E. (2014). Adding self-explanation prompts to an educational computer game. Computers In Human Behavior, 30 , 23–28.

Overoye, A. L., & Storm, B. C. (2015). Harnessing the power of uncertainty to enhance learning. Translational Issues in Psychological Science, 1 , 140–148.



Acknowledgements

Not applicable.

Funding

YW and MAS were partially supported by a grant from The IDEA Center.

Availability of data and materials

Not applicable.

Author information

Authors and affiliations

Department of Psychology, University of Massachusetts Lowell, Lowell, MA, USA

Yana Weinstein

Department of Psychology, Boston College, Chestnut Hill, MA, USA

Christopher R. Madan

School of Psychology, University of Nottingham, Nottingham, UK

Department of Psychology, Rhode Island College, Providence, RI, USA

Megan A. Sumeracki


Contributions

YW took the lead on writing the “Spaced practice”, “Interleaving”, and “Elaboration” sections. CRM took the lead on writing the “Concrete examples” and “Dual coding” sections. MAS took the lead on writing the “Retrieval practice” section. All authors edited each others’ sections. All authors were involved in the conception and writing of the manuscript. All authors gave approval of the final version.

Corresponding author

Correspondence to Yana Weinstein.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

YW and MAS run a blog, “The Learning Scientists Blog”, which is cited in the tutorial review. The blog does not make money. Free resources on the strategies described in this tutorial review are provided on the blog. Occasionally, YW and MAS are invited by schools/school districts to present research findings from cognitive psychology applied to education.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article.

Weinstein, Y., Madan, C.R. & Sumeracki, M.A. Teaching the science of learning. Cogn. Research 3, 2 (2018). https://doi.org/10.1186/s41235-017-0087-y


Received: 20 December 2016

Accepted: 02 December 2017

Published: 24 January 2018



Organizing Your Social Sciences Research Paper

3. The Abstract

An abstract summarizes, usually in one paragraph of 300 words or less, the major aspects of the entire paper in a prescribed sequence that includes: 1) the overall purpose of the study and the research problem(s) you investigated; 2) the basic design of the study; 3) major findings or trends found as a result of your analysis; and, 4) a brief summary of your interpretations and conclusions.

Writing an Abstract. The Writing Center. Clarion University, 2009; Writing an Abstract for Your Research Paper. The Writing Center, University of Wisconsin, Madison; Koltay, Tibor. Abstracts and Abstracting: A Genre and Set of Skills for the Twenty-first Century. Oxford, UK: Chandos Publishing, 2010.

Importance of a Good Abstract

Sometimes your professor will ask you to include an abstract, or general summary of your work, with your research paper. The abstract allows you to elaborate upon each major aspect of the paper and helps readers decide whether they want to read the rest of the paper. Therefore, enough key information [e.g., summary results, observations, trends, etc.] must be included to make the abstract useful to someone who may want to examine your work.

How do you know when you have enough information in your abstract? A simple rule-of-thumb is to imagine that you are another researcher doing a similar study. Then ask yourself: if your abstract was the only part of the paper you could access, would you be happy with the amount of information presented there? Does it tell the whole story about your study? If the answer is "no" then the abstract likely needs to be revised.

Farkas, David K. “A Scheme for Understanding and Writing Summaries.” Technical Communication 67 (August 2020): 45-60;  How to Write a Research Abstract. Office of Undergraduate Research. University of Kentucky; Staiger, David L. “What Today’s Students Need to Know about Writing Abstracts.” International Journal of Business Communication January 3 (1966): 29-33; Swales, John M. and Christine B. Feak. Abstracts and the Writing of Abstracts . Ann Arbor, MI: University of Michigan Press, 2009.

Structure and Writing Style

I.  Types of Abstracts

To begin, you need to determine which type of abstract you should include with your paper. There are four general types.

Critical Abstract A critical abstract provides, in addition to describing main findings and information, a judgment or comment about the study’s validity, reliability, or completeness. The researcher evaluates the paper and often compares it with other works on the same subject. Critical abstracts are generally 400-500 words in length due to the additional interpretive commentary. These types of abstracts are used infrequently.

Descriptive Abstract A descriptive abstract indicates the type of information found in the work. It makes no judgments about the work, nor does it provide results or conclusions of the research. It does incorporate key words found in the text and may include the purpose, methods, and scope of the research. Essentially, the descriptive abstract only describes the work being summarized. Some researchers consider it an outline of the work, rather than a summary. Descriptive abstracts are usually very short, 100 words or less.

Informative Abstract The majority of abstracts are informative. While they still do not critique or evaluate a work, they do more than describe it. A good informative abstract acts as a surrogate for the work itself. That is, the researcher presents and explains all the main arguments and the important results and evidence in the paper. An informative abstract includes the information that can be found in a descriptive abstract [purpose, methods, scope] but it also includes the results and conclusions of the research and the recommendations of the author. The length varies according to discipline, but an informative abstract is usually no more than 300 words in length.

Highlight Abstract A highlight abstract is specifically written to attract the reader’s attention to the study. No pretense is made of there being either a balanced or complete picture of the paper and, in fact, incomplete and leading remarks may be used to spark the reader’s interest. In that a highlight abstract cannot stand independent of its associated article, it is not a true abstract and, therefore, rarely used in academic writing.

II.  Writing Style

Use the active voice when possible , but note that much of your abstract may require passive sentence constructions. Regardless, write your abstract using concise, but complete, sentences. Get to the point quickly and always use the past tense because you are reporting on a study that has been completed.

Abstracts should be formatted as a single paragraph in a block format and with no paragraph indentations. In most cases, the abstract page immediately follows the title page. Do not number the page. Rules set forth in writing manuals vary but, in general, you should center the word "Abstract" at the top of the page with double spacing between the heading and the abstract. The final sentences of an abstract concisely summarize your study's conclusions, implications, or applications to practice and, if appropriate, can be followed by a statement about the need for additional research revealed from the findings.

Composing Your Abstract

Although it is the first section of your paper, the abstract should be written last since it will summarize the contents of your entire paper. A good strategy to begin composing your abstract is to take whole sentences or key phrases from each section of the paper and put them in a sequence that summarizes the contents. Then revise or add connecting phrases or words to make the narrative flow clearly and smoothly. Note that statistical findings should be reported parenthetically [i.e., written in parentheses].

Before handing in your final paper, check to make sure that the information in the abstract completely agrees with what you have written in the paper. Think of the abstract as a sequential set of complete sentences describing the most crucial information using the fewest necessary words. The abstract SHOULD NOT contain:

  • A catchy introductory phrase, provocative quote, or other device to grab the reader's attention,
  • Lengthy background or contextual information,
  • Redundant phrases, unnecessary adverbs and adjectives, and repetitive information,
  • Acronyms or abbreviations,
  • References to other literature [say something like, "current research shows that..." or "studies have indicated..."],
  • Ellipticals [i.e., ending with "..."] or incomplete sentences,
  • Jargon or terms that may be confusing to the reader,
  • Citations to other works, and
  • Any sort of image, illustration, figure, or table, or references to them.
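A checklist like this lends itself to a rough automated pass before you hand the paper in. The sketch below is an illustration only, not part of the guide: the 300-word limit and the patterns for citations and acronyms are assumptions, and a human read is still required.

```python
import re

MAX_WORDS = 300  # a common informative-abstract limit; the exact cap varies by venue


def check_abstract(text: str) -> list[str]:
    """Flag a few of the checklist problems in a plain-text abstract draft."""
    problems = []
    words = text.split()
    if len(words) > MAX_WORDS:
        problems.append(f"too long: {len(words)} words (limit {MAX_WORDS})")
    if "..." in text or "\u2026" in text:
        problems.append("contains an ellipsis / incomplete sentence")
    # Rough parenthetical-citation pattern, e.g. "(Smith, 2020)"
    if re.search(r"\(\w+(?: et al\.)?,? \d{4}\)", text):
        problems.append("contains a citation to other work")
    # Runs of capital letters are usually acronyms or abbreviations
    if re.search(r"\b[A-Z]{2,}\b", text):
        problems.append("contains an acronym or abbreviation")
    return problems
```

For example, `check_abstract("Results were mixed ... (Smith, 2020) per NASA.")` flags three problems, while a short, clean sentence returns an empty list.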

Abstract. Writing Center. University of Kansas; Abstract. The Structure, Format, Content, and Style of a Journal-Style Scientific Paper. Department of Biology. Bates College; Abstracts. The Writing Center. University of North Carolina; Borko, Harold and Seymour Chatman. "Criteria for Acceptable Abstracts: A Survey of Abstracters' Instructions." American Documentation 14 (April 1963): 149-160; Abstracts. The Writer’s Handbook. Writing Center. University of Wisconsin, Madison; Hartley, James and Lucy Betts. "Common Weaknesses in Traditional Abstracts in the Social Sciences." Journal of the American Society for Information Science and Technology 60 (October 2009): 2010-2018; Koltay, Tibor. Abstracts and Abstracting: A Genre and Set of Skills for the Twenty-first Century. Oxford, UK: Chandos Publishing, 2010; Procter, Margaret. The Abstract. University College Writing Centre. University of Toronto; Riordan, Laura. “Mastering the Art of Abstracts.” The Journal of the American Osteopathic Association 115 (January 2015 ): 41-47; Writing Report Abstracts. The Writing Lab and The OWL. Purdue University; Writing Abstracts. Writing Tutorial Services, Center for Innovative Teaching and Learning. Indiana University; Koltay, Tibor. Abstracts and Abstracting: A Genre and Set of Skills for the Twenty-First Century . Oxford, UK: 2010; Writing an Abstract for Your Research Paper. The Writing Center, University of Wisconsin, Madison.

Writing Tip

Never Cite Just the Abstract!

Citing just a journal article's abstract does not confirm for the reader that you have conducted a thorough or reliable review of the literature. If the full text is not available, go to the USC Libraries main page and enter the title of the article [NOT the title of the journal]. If the Libraries have a subscription to the journal, the article should appear with a link to the full text or to the journal publisher page where you can get the article. If the article does not appear, try searching Google Scholar using the link on the USC Libraries main page. If you still can't find the article after doing this, contact a librarian or you can request it from our free interlibrary loan and document delivery service.

  • Last Updated: May 22, 2024 12:03 PM
  • URL: https://libguides.usc.edu/writingguide

The Writing Center • University of North Carolina at Chapel Hill

What this handout is about

This handout provides definitions and examples of the two main types of abstracts: descriptive and informative. It also provides guidelines for constructing an abstract and general tips for you to keep in mind when drafting. Finally, it includes a few examples of abstracts broken down into their component parts.

What is an abstract?

An abstract is a self-contained, short, and powerful statement that describes a larger work. Components vary according to discipline. An abstract of a social science or scientific work may contain the scope, purpose, results, and contents of the work. An abstract of a humanities work may contain the thesis, background, and conclusion of the larger work. An abstract is not a review, nor does it evaluate the work being abstracted. While it contains key words found in the larger work, the abstract is an original document rather than an excerpted passage.

Why write an abstract?

You may write an abstract for various reasons. The two most important are selection and indexing. Abstracts allow readers who may be interested in a longer work to quickly decide whether it is worth their time to read it. Also, many online databases use abstracts to index larger works. Therefore, abstracts should contain keywords and phrases that allow for easy searching.

Say you are beginning a research project on how Brazilian newspapers helped Brazil’s ultra-liberal president Luiz Ignácio da Silva wrest power from the traditional, conservative power base. A good first place to start your research is to search Dissertation Abstracts International for all dissertations that deal with the interaction between newspapers and politics. “Newspapers and politics” returned 569 hits. A more selective search of “newspapers and Brazil” returned 22 hits. That is still a fair number of dissertations. Titles can sometimes help winnow the field, but many titles are not very descriptive. For example, one dissertation is titled “Rhetoric and Riot in Rio de Janeiro.” It is unclear from the title what this dissertation has to do with newspapers in Brazil. One option would be to download or order the entire dissertation on the chance that it might speak specifically to the topic. A better option is to read the abstract. In this case, the abstract reveals the main focus of the dissertation:

This dissertation examines the role of newspaper editors in the political turmoil and strife that characterized late First Empire Rio de Janeiro (1827-1831). Newspaper editors and their journals helped change the political culture of late First Empire Rio de Janeiro by involving the people in the discussion of state. This change in political culture is apparent in Emperor Pedro I’s gradual loss of control over the mechanisms of power. As the newspapers became more numerous and powerful, the Emperor lost his legitimacy in the eyes of the people. To explore the role of the newspapers in the political events of the late First Empire, this dissertation analyzes all available newspapers published in Rio de Janeiro from 1827 to 1831. Newspapers and their editors were leading forces in the effort to remove power from the hands of the ruling elite and place it under the control of the people. In the process, newspapers helped change how politics operated in the constitutional monarchy of Brazil.

From this abstract you now know that although the dissertation has nothing to do with modern Brazilian politics, it does cover the role of newspapers in changing traditional mechanisms of power. After reading the abstract, you can make an informed judgment about whether the dissertation would be worthwhile to read.

Besides selection, the other main purpose of the abstract is for indexing. Most article databases in the online catalog of the library enable you to search abstracts. This allows for quick retrieval by users and limits the extraneous items recalled by a “full-text” search. However, for an abstract to be useful in an online retrieval system, it must incorporate the key terms that a potential researcher would use to search. For example, if you search Dissertation Abstracts International using the keywords “France” “revolution” and “politics,” the search engine would search through all the abstracts in the database that included those three words. Without an abstract, the search engine would be forced to search titles, which, as we have seen, may not be fruitful, or else search the full text. It’s likely that a lot more than 60 dissertations have been written with those three words somewhere in the body of the entire work. By incorporating keywords into the abstract, the author emphasizes the central topics of the work and gives prospective readers enough information to make an informed judgment about the applicability of the work.
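The indexing point can be made concrete with a toy filter. In this hypothetical sketch (the records, titles, and field names are invented for illustration), requiring every keyword to appear in the abstract screens out a work that merely mentions the terms in passing somewhere in its full text:

```python
def search(docs: list[dict], keywords: list[str], field: str) -> list[str]:
    """Return titles of records whose chosen field contains every keyword."""
    return [d["title"] for d in docs
            if all(k.lower() in d[field].lower() for k in keywords)]


# Invented records: an abstract concentrates a work's key terms, while a
# full text may mention almost anything in passing.
docs = [
    {"title": "Rhetoric and Riot in Rio de Janeiro",
     "abstract": "Newspapers and politics in late First Empire Brazil.",
     "full_text": "... newspapers ... politics ... Brazil ... France ..."},
    {"title": "Vineyards of the Loire",
     "abstract": "An agricultural history of French wine.",
     "full_text": "... praised in newspapers ... village politics ... Brazil nuts ..."},
]

keywords = ["newspapers", "politics", "Brazil"]
```

Searching `full_text` returns both titles (the vineyard study is an extraneous hit), while searching `abstract` returns only the dissertation that is actually about the topic.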

When do people write abstracts?

  • when submitting articles to journals, especially online journals
  • when applying for research grants
  • when writing a book proposal
  • when completing the Ph.D. dissertation or M.A. thesis
  • when writing a proposal for a conference paper
  • when writing a proposal for a book chapter

Most often, the author of the entire work (or prospective work) writes the abstract. However, there are professional abstracting services that hire writers to draft abstracts of other people’s work. In a work with multiple authors, the first author usually writes the abstract. Undergraduates are sometimes asked to draft abstracts of books/articles for classmates who have not read the larger work.

Types of abstracts

There are two types of abstracts: descriptive and informative. They have different aims, so as a consequence they have different components and styles. There is also a third type called critical, but it is rarely used. If you want to find out more about writing a critique or a review of a work, see the UNC Writing Center handout on writing a literature review. If you are unsure which type of abstract you should write, ask your instructor (if the abstract is for a class) or read other abstracts in your field or in the journal where you are submitting your article.

Descriptive abstracts

A descriptive abstract indicates the type of information found in the work. It makes no judgments about the work, nor does it provide results or conclusions of the research. It does incorporate key words found in the text and may include the purpose, methods, and scope of the research. Essentially, the descriptive abstract describes the work being abstracted. Some people consider it an outline of the work, rather than a summary. Descriptive abstracts are usually very short—100 words or less.

Informative abstracts

The majority of abstracts are informative. While they still do not critique or evaluate a work, they do more than describe it. A good informative abstract acts as a surrogate for the work itself. That is, the writer presents and explains all the main arguments and the important results and evidence in the complete article/paper/book. An informative abstract includes the information that can be found in a descriptive abstract (purpose, methods, scope) but also includes the results and conclusions of the research and the recommendations of the author. The length varies according to discipline, but an informative abstract is rarely more than 10% of the length of the entire work. In the case of a longer work, it may be much less.

Here are examples of a descriptive and an informative abstract of this handout on abstracts . Descriptive abstract:

The two most common abstract types—descriptive and informative—are described and examples of each are provided.

Informative abstract:

Abstracts present the essential elements of a longer work in a short and powerful statement. The purpose of an abstract is to provide prospective readers the opportunity to judge the relevance of the longer work to their projects. Abstracts also include the key terms found in the longer work and the purpose and methods of the research. Authors abstract various longer works, including book proposals, dissertations, and online journal articles. There are two main types of abstracts: descriptive and informative. A descriptive abstract briefly describes the longer work, while an informative abstract presents all the main arguments and important results. This handout provides examples of various types of abstracts and instructions on how to construct one.

Which type should I use?

Your best bet in this case is to ask your instructor or refer to the instructions provided by the publisher. You can also make a guess based on the length allowed; i.e., 100-120 words = descriptive; 250+ words = informative.

How do I write an abstract?

The format of your abstract will depend on the work being abstracted. An abstract of a scientific research paper will contain elements not found in an abstract of a literature article, and vice versa. However, all abstracts share several mandatory components, and there are also some optional parts that you can decide to include or not. When preparing to draft your abstract, keep the following key process elements in mind:

  • Reason for writing: What is the importance of the research? Why would a reader be interested in the larger work?
  • Problem: What problem does this work attempt to solve? What is the scope of the project? What is the main argument/thesis/claim?
  • Methodology: An abstract of a scientific work may include specific models or approaches used in the larger study. Other abstracts may describe the types of evidence used in the research.
  • Results: Again, an abstract of a scientific work may include specific data that indicates the results of the project. Other abstracts may discuss the findings in a more general way.
  • Implications: What changes should be implemented as a result of the findings of the work? How does this work add to the body of knowledge on the topic?

(This list of elements is adapted with permission from Philip Koopman, “How to Write an Abstract.” )

All abstracts include:

  • A full citation of the source, preceding the abstract.
  • The most important information first.
  • The same type and style of language found in the original, including technical language.
  • Key words and phrases that quickly identify the content and focus of the work.
  • Clear, concise, and powerful language.

Abstracts may include:

  • The thesis of the work, usually in the first sentence.
  • Background information that places the work in the larger body of literature.
  • The same chronological structure as the original work.

How not to write an abstract:

  • Do not refer extensively to other works.
  • Do not add information not contained in the original work.
  • Do not define terms.

If you are abstracting your own writing

When abstracting your own work, it may be difficult to condense a piece of writing that you have agonized over for weeks (or months, or even years) into a 250-word statement. There are some tricks that you could use to make it easier, however.

Reverse outlining:

This technique is commonly used when you are having trouble organizing your own writing. The process involves writing down the main idea of each paragraph on a separate piece of paper. For the purposes of writing an abstract, try grouping the main ideas of each section of the paper into a single sentence. Practice grouping ideas using webbing or color coding.

For a scientific paper, you may have sections titled Purpose, Methods, Results, and Discussion. Each one of these sections will be longer than one paragraph, but each is grouped around a central idea. Use reverse outlining to discover the central idea in each section and then distill these ideas into one statement.

Cut and paste:

To create a first draft of an abstract of your own work, you can read through the entire paper and cut and paste sentences that capture key passages. This technique is useful for social science research with findings that cannot be encapsulated by neat numbers or concrete results. A well-written humanities draft will have a clear and direct thesis statement and informative topic sentences for paragraphs or sections. Isolate these sentences in a separate document and work on revising them into a unified paragraph.

If you are abstracting someone else’s writing

When abstracting something you have not written, you cannot summarize key ideas just by cutting and pasting. Instead, you must determine what a prospective reader would want to know about the work. There are a few techniques that will help you in this process:

Identify key terms:

Search through the entire document for key terms that identify the purpose, scope, and methods of the work. Pay close attention to the Introduction (or Purpose) and the Conclusion (or Discussion). These sections should contain all the main ideas and key terms in the paper. When writing the abstract, be sure to incorporate the key terms.

Highlight key phrases and sentences:

Instead of cutting and pasting the actual words, try highlighting sentences or phrases that appear to be central to the work. Then, in a separate document, rewrite the sentences and phrases in your own words.

Don’t look back:

After reading the entire work, put it aside and write a paragraph about the work without referring to it. In the first draft, you may not remember all the key terms or the results, but you will remember what the main point of the work was. Remember not to include any information you did not get from the work being abstracted.

Revise, revise, revise

No matter what type of abstract you are writing, or whether you are abstracting your own work or someone else’s, the most important step in writing an abstract is to revise early and often. When revising, delete all extraneous words and incorporate meaningful and powerful words. The idea is to be as clear and complete as possible in the shortest possible amount of space. The Word Count feature of Microsoft Word can help you keep track of how long your abstract is and help you hit your target length.

Example 1: Humanities abstract

Kenneth Tait Andrews, “‘Freedom is a constant struggle’: The dynamics and consequences of the Mississippi Civil Rights Movement, 1960-1984” Ph.D. State University of New York at Stony Brook, 1997 DAI-A 59/02, p. 620, Aug 1998

This dissertation examines the impacts of social movements through a multi-layered study of the Mississippi Civil Rights Movement from its peak in the early 1960s through the early 1980s. By examining this historically important case, I clarify the process by which movements transform social structures and the constraints movements face when they try to do so. The time period studied includes the expansion of voting rights and gains in black political power, the desegregation of public schools and the emergence of white-flight academies, and the rise and fall of federal anti-poverty programs. I use two major research strategies: (1) a quantitative analysis of county-level data and (2) three case studies. Data have been collected from archives, interviews, newspapers, and published reports. This dissertation challenges the argument that movements are inconsequential. Some view federal agencies, courts, political parties, or economic elites as the agents driving institutional change, but typically these groups acted in response to the leverage brought to bear by the civil rights movement. The Mississippi movement attempted to forge independent structures for sustaining challenges to local inequities and injustices. By propelling change in an array of local institutions, movement infrastructures had an enduring legacy in Mississippi.

Now let’s break down this abstract into its component parts to see how the author has distilled his entire dissertation into a ~200 word abstract.

What the dissertation does This dissertation examines the impacts of social movements through a multi-layered study of the Mississippi Civil Rights Movement from its peak in the early 1960s through the early 1980s. By examining this historically important case, I clarify the process by which movements transform social structures and the constraints movements face when they try to do so.

How the dissertation does it The time period studied in this dissertation includes the expansion of voting rights and gains in black political power, the desegregation of public schools and the emergence of white-flight academies, and the rise and fall of federal anti-poverty programs. I use two major research strategies: (1) a quantitative analysis of county-level data and (2) three case studies.

What materials are used Data have been collected from archives, interviews, newspapers, and published reports.

Conclusion This dissertation challenges the argument that movements are inconsequential. Some view federal agencies, courts, political parties, or economic elites as the agents driving institutional change, but typically these groups acted in response to movement demands and the leverage brought to bear by the civil rights movement. The Mississippi movement attempted to forge independent structures for sustaining challenges to local inequities and injustices. By propelling change in an array of local institutions, movement infrastructures had an enduring legacy in Mississippi.

Keywords: social movements, Civil Rights Movement, Mississippi, voting rights, desegregation

Example 2: Science Abstract

Luis Lehner, “Gravitational radiation from black hole spacetimes” Ph.D. University of Pittsburgh, 1998 DAI-B 59/06, p. 2797, Dec 1998

The problem of detecting gravitational radiation is receiving considerable attention with the construction of new detectors in the United States, Europe, and Japan. The theoretical modeling of the wave forms that would be produced in particular systems will expedite the search for and analysis of detected signals. The characteristic formulation of GR is implemented to obtain an algorithm capable of evolving black holes in 3D asymptotically flat spacetimes. Using compactification techniques, future null infinity is included in the evolved region, which enables the unambiguous calculation of the radiation produced by some compact source. A module to calculate the waveforms is constructed and included in the evolution algorithm. This code is shown to be second-order convergent and to handle highly non-linear spacetimes. In particular, we have shown that the code can handle spacetimes whose radiation is equivalent to a galaxy converting its whole mass into gravitational radiation in one second. We further use the characteristic formulation to treat the region close to the singularity in black hole spacetimes. The code carefully excises a region surrounding the singularity and accurately evolves generic black hole spacetimes with apparently unlimited stability.

This science abstract covers much of the same ground as the humanities one, but it asks slightly different questions.

Why do this study The problem of detecting gravitational radiation is receiving considerable attention with the construction of new detectors in the United States, Europe, and Japan. The theoretical modeling of the wave forms that would be produced in particular systems will expedite the search for and analysis of detected signals.

What the study does The characteristic formulation of GR is implemented to obtain an algorithm capable of evolving black holes in 3D asymptotically flat spacetimes. Using compactification techniques, future null infinity is included in the evolved region, which enables the unambiguous calculation of the radiation produced by some compact source. A module to calculate the waveforms is constructed and included in the evolution algorithm.

Results This code is shown to be second-order convergent and to handle highly non-linear spacetimes. In particular, we have shown that the code can handle spacetimes whose radiation is equivalent to a galaxy converting its whole mass into gravitational radiation in one second. We further use the characteristic formulation to treat the region close to the singularity in black hole spacetimes. The code carefully excises a region surrounding the singularity and accurately evolves generic black hole spacetimes with apparently unlimited stability.

Keywords: gravitational radiation (GR), spacetimes, black holes

Works consulted

We consulted these works while writing this handout. This is not a comprehensive list of resources on the handout’s topic, and we encourage you to do your own research to find additional publications. Please do not use this list as a model for the format of your own reference list, as it may not match the citation style you are using. For guidance on formatting citations, please see the UNC Libraries citation tutorial. We revise these tips periodically and welcome feedback.

Belcher, Wendy Laura. 2009. Writing Your Journal Article in Twelve Weeks: A Guide to Academic Publishing Success. Thousand Oaks, CA: Sage Press.

Koopman, Philip. 1997. “How to Write an Abstract.” Carnegie Mellon University. October 1997. http://users.ece.cmu.edu/~koopman/essays/abstract.html .

Lancaster, F.W. 2003. Indexing And Abstracting in Theory and Practice , 3rd ed. London: Facet Publishing.

You may reproduce it for non-commercial use if you use the entire handout and attribute the source: The Writing Center, University of North Carolina at Chapel Hill



How We Use Abstract Thinking

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


Abstract thinking, also known as abstract reasoning, involves the ability to understand and think about complex concepts that, while real, are not tied to concrete experiences, objects, people, or situations.

Abstract thinking is considered a type of higher-order thinking, usually about ideas and principles that are often symbolic or hypothetical. This type of thinking is more complex than the type of thinking that is centered on memorizing and recalling information and facts.

Examples of Abstract Thinking

Examples of abstract concepts include ideas such as imagination, freedom, and justice.

While these things are real, they aren't concrete, physical things that people can experience directly via their traditional senses.

You likely encounter examples of abstract thinking every day. Stand-up comedians use abstract thinking when they observe absurd or illogical behavior in our world and come up with theories as to why people act the way they do.

You use abstract thinking when you're in a philosophy class or when you're contemplating what would be the most ethical way to conduct your business. If you write a poem or an essay, you're also using abstract thinking.

With all of these examples, concepts that are theoretical and intangible are being translated into a joke, a decision, or a piece of art. (You'll notice that creativity and abstract thinking go hand in hand.)

Abstract Thinking vs. Concrete Thinking

One way of understanding abstract thinking is to compare it with concrete thinking. Concrete thinking, also called concrete reasoning, is tied to specific experiences or objects that can be observed directly.

Research suggests that concrete thinkers tend to focus more on the procedures involved in how a task should be performed, while abstract thinkers are more focused on the reasons why a task should be performed.

It is important to remember that you need both concrete and abstract thinking skills to solve problems in day-to-day life. In many cases, you utilize aspects of both types of thinking to come up with solutions.

Other Types of Thinking

Depending on the type of problem we face, we draw from a number of different styles of thinking, such as:

  • Creative thinking: This involves coming up with new ideas, or using existing ideas or objects to come up with a solution or create something new.
  • Convergent thinking: Often called linear thinking, this is when a person follows a logical set of steps to select the best solution from already-formulated ideas.
  • Critical thinking: This is a type of thinking in which a person tests solutions and analyzes any potential drawbacks.
  • Divergent thinking: Often called lateral thinking, this style involves using new thoughts or ideas that are outside of the norm in order to solve problems.

How Abstract Thinking Develops

While abstract thinking is an essential skill, it isn’t something that people are born with. Instead, this cognitive ability develops throughout the course of childhood as children gain new abilities, knowledge, and experiences.

The psychologist Jean Piaget described a theory of cognitive development that outlined this process from birth through adolescence and early adulthood. According to his theory, children go through four distinct stages of intellectual development:

  • Sensorimotor stage: During this early period, children's knowledge is derived primarily from their senses.
  • Preoperational stage: At this point, children develop the ability to think symbolically.
  • Concrete operational stage: At this stage, kids become more logical but their understanding of the world tends to be very concrete.
  • Formal operational stage: The ability to reason about concrete information continues to grow during this period, but abstract thinking skills also emerge.

This period of cognitive development when abstract thinking becomes more apparent typically begins around age 12. It is at this age that children become more skilled at thinking about things from the perspective of another person. They are also better able to mentally manipulate abstract ideas as well as notice patterns and relationships between these concepts.

Uses of Abstract Thinking

Abstract thinking is a skill that is essential for the ability to think critically and solve problems. This type of thinking is also related to what is known as fluid intelligence, or the ability to reason and solve problems in unique ways.

Fluid intelligence involves thinking abstractly about problems without relying solely on existing knowledge.

Abstract thinking is used in a number of ways in different aspects of your daily life. Some examples of times you might use this type of thinking:

  • When you describe something with a metaphor
  • When you talk about something figuratively
  • When you come up with creative solutions to a problem
  • When you analyze a situation
  • When you notice relationships or patterns
  • When you form a theory about why something happens
  • When you think about a problem from another point of view

Research also suggests that abstract thinking plays a role in the actions people take. Abstract thinkers have been found to be more likely to engage in risky behaviors, whereas concrete thinkers are more likely to avoid risks.

Impact of Abstract Thinking

People who have strong abstract thinking skills tend to score well on intelligence tests. Because this type of thinking is associated with creativity, abstract thinkers also tend to excel in fields such as art and writing, and in other pursuits that benefit from divergent thinking abilities.

Abstract thinking can have both positive and negative effects. It can be used as a tool to promote innovative problem-solving, but it can also lead to problems in some cases:

  • Bias: Research suggests that abstract thinking can sometimes promote different types of bias. As people seek to understand events, abstract thinking can cause them to seek out patterns, themes, and relationships that may not exist.
  • Catastrophic thinking: Sometimes these inferences, imagined scenarios, and predictions about the future can lead to feelings of fear and anxiety. Instead of making realistic predictions, people may catastrophize and imagine the worst possible outcomes.
  • Anxiety and depression: Research has also found that abstract thinking styles are sometimes associated with worry and rumination. This thinking style is also associated with a range of conditions including depression, anxiety, and post-traumatic stress disorder (PTSD).

Conditions That Impact Abstract Thinking

The presence of learning disabilities and mental health conditions can affect abstract thinking abilities. Conditions that are linked to impaired abstract thinking skills include:

  • Learning disabilities
  • Schizophrenia
  • Traumatic brain injury (TBI)

The natural aging process can also have an impact on abstract thinking skills. Research suggests that the thinking skills associated with fluid intelligence peak around the ages of 30 or 40 and begin to decline with age.

Tips for Reasoning Abstractly

While some psychologists believe that abstract thinking skills are a natural product of normal development, others suggest that these abilities are influenced by genetics, culture, and experiences. Some people may come by these skills naturally, but you can also strengthen these abilities with practice.

Some strategies that you might use to help improve your abstract thinking skills:

  • Think about why and not just how: Abstract thinkers tend to focus on the meaning of events or on hypothetical outcomes. Instead of concentrating only on the steps needed to achieve a goal, consider some of the reasons why that goal might be valuable or what might happen if you reach that goal.
  • Reframe your thinking: When you are approaching a problem, it can be helpful to purposefully try to think about the problem in a different way. How might someone else approach it? Is there an easier way to accomplish the same thing? Are there any elements you haven't considered?
  • Consider the big picture: Rather than focusing on the specifics of a situation, try taking a step back in order to view the big picture. Where concrete thinkers are more likely to concentrate on the details, abstract thinkers focus on how something relates to other things or how it fits into the grand scheme of things.

Abstract thinking allows people to think about complex relationships, recognize patterns, solve problems, and utilize creativity. While some people tend to be naturally better at this type of reasoning, it is a skill that you can learn to utilize and strengthen with practice. 

It is important to remember that both concrete and abstract thinking are skills that you need to solve problems and function successfully. 

Gilead M, Liberman N, Maril A. From mind to matter: neural correlates of abstract and concrete mindsets. Soc Cogn Affect Neurosci. 2014;9(5):638-45. doi:10.1093/scan/nst031

American Psychological Association. Creative thinking.

American Psychological Association. Convergent thinking.

American Psychological Association. Critical thinking.

American Psychological Association. Divergent thinking.

Lermer E, Streicher B, Sachs R, Raue M, Frey D. The effect of abstract and concrete thinking on risk-taking behavior in women and men. SAGE Open. 2016;6(3):215824401666612. doi:10.1177/2158244016666127

Namkoong J-E, Henderson MD. Responding to causal uncertainty through abstract thinking. Curr Dir Psychol Sci. 2019;28(6):547-551. doi:10.1177/0963721419859346

White R, Wild J. "Why" or "How": the effect of concrete versus abstract processing on intrusive memories following analogue trauma. Behav Ther. 2016;47(3):404-415. doi:10.1016/j.beth.2016.02.004

Williams DL, Mazefsky CA, Walker JD, Minshew NJ, Goldstein G. Associations between conceptual reasoning, problem solving, and adaptive ability in high-functioning autism. J Autism Dev Disord. 2014;44(11):2908-20. doi:10.1007/s10803-014-2190-y

Oh J, Chun JW, Joon Jo H, Kim E, Park HJ, Lee B, Kim JJ. The neural basis of a deficit in abstract thinking in patients with schizophrenia. Psychiatry Res. 2015;234(1):66-73. doi:10.1016/j.pscychresns.2015.08.007

Hartshorne JK, Germine LT. When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychol Sci. 2015;26(4):433-43. doi:10.1177/0956797614567339

By Kendra Cherry, MSEd

Kolb’s Learning Styles and Experiential Learning Cycle

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


David Kolb published his learning styles model in 1984, from which he developed his learning style inventory.

Kolb’s experiential learning theory works on two levels: a four-stage learning cycle and four separate learning styles. Much of Kolb’s theory concerns the learner’s internal cognitive processes.

Kolb states that learning involves the acquisition of abstract concepts that can be applied flexibly in a range of situations. In Kolb’s theory, the impetus for the development of new concepts is provided by new experiences.

“Learning is the process whereby knowledge is created through the transformation of experience” (Kolb, 1984, p. 38).

The Experiential Learning Cycle

Kolb’s experiential learning style theory is typically represented by a four-stage learning cycle in which the learner “touches all the bases”:


The terms “Reflective Cycle” and “Experiential Learning Cycle” are often used interchangeably when referring to this four-stage learning process. The main idea behind both terms is that effective learning occurs through a continuous cycle of experience, reflection, conceptualization, and experimentation.

  • Concrete Experience – the learner encounters a concrete experience. This might be a new experience or situation, or a reinterpretation of existing experience in the light of new concepts.
  • Reflective Observation of the New Experience – the learner reflects on the new experience in the light of their existing knowledge. Of particular importance are any inconsistencies between experience and understanding.
  • Abstract Conceptualization – reflection gives rise to a new idea, or a modification of an existing abstract concept (the person has learned from their experience).
  • Active Experimentation – the newly created or modified concepts give rise to experimentation. The learner applies their idea(s) to the world around them to see what happens.

Effective learning is seen when a person progresses through a cycle of four stages: (1) having a concrete experience, followed by (2) observation of and reflection on that experience, which leads to (3) the formation of abstract concepts (analysis) and generalizations (conclusions), which are then (4) used to test hypotheses in future situations, resulting in new experiences.


Kolb (1984) views learning as an integrated process, with each stage mutually supporting and feeding into the next. It is possible to enter the cycle at any stage and follow it through its logical sequence.

However, effective learning only occurs when a learner can execute all four stages of the model. Therefore, no one stage of the cycle is effective as a learning procedure on its own.

The process of going through the cycle results in the formation of increasingly complex and abstract ‘mental models’ of whatever the learner is learning about.

Learning Styles

Kolb’s learning theory (1984) sets out four distinct learning styles, which are based on the four-stage learning cycle (see above). Kolb explains that different people naturally prefer a single learning style.

Various factors influence a person’s preferred style. For example, social environment, educational experiences, or the basic cognitive structure of the individual.

Whatever influences the choice of style, the learning style preference itself is actually the product of two pairs of variables, or two separate “choices” that we make, which Kolb presented as lines of an axis, each with “conflicting” modes at either end.

A typical presentation of Kolb’s two continuums is that the east-west axis is called the Processing Continuum (how we approach a task), and the north-south axis is called the Perception Continuum (our emotional response, or how we think or feel about it).


Kolb believed that we cannot perform both variables on a single axis simultaneously (e.g., think and feel). Our learning style is a product of these two choice decisions.

It’s often easier to see the construction of Kolb’s learning styles in terms of a two-by-two matrix. Each learning style represents a combination of two preferred styles.

The matrix also highlights Kolb’s terminology for the four learning styles: diverging, assimilating, converging, and accommodating.

Knowing a person’s (and your own) learning style enables learning to be orientated according to the preferred method.

That said, everyone responds to and needs the stimulus of all types of learning styles to one extent or another – it’s a matter of using emphasis that fits best with the given situation and a person’s learning style preferences.


Here are brief descriptions of the four Kolb learning styles:

Diverging (feeling and watching – CE/RO)

These people are able to look at things from different perspectives. They are sensitive. They prefer to watch rather than do, tending to gather information and use imagination to solve problems. They are best at viewing concrete situations from several different viewpoints.

Kolb called this style “diverging” because these people perform better in situations that require ideas-generation, for example, brainstorming. People with a diverging learning style have broad cultural interests and like to gather information.

They are interested in people, tend to be imaginative and emotional, and tend to be strong in the arts. People with the diverging style prefer to work in groups, to listen with an open mind and to receive personal feedback.

Assimilating (watching and thinking – AC/RO)

The assimilating learning preference involves a concise, logical approach. Ideas and concepts are more important than people.

These people require good, clear explanations rather than a practical opportunity. They excel at understanding wide-ranging information and organizing it in a clear, logical format.

People with an assimilating learning style are less focused on people and more interested in ideas and abstract concepts.  People with this style are more attracted to logically sound theories than approaches based on practical value.

This learning style is important for effectiveness in information and science careers. In formal learning situations, people with this style prefer readings, lectures, exploring analytical models, and having time to think things through.

Converging (doing and thinking – AC/AE)

People with a converging learning style can solve problems and will use their learning to find solutions to practical issues. They prefer technical tasks, and are less concerned with people and interpersonal aspects.

People with a converging learning style are best at finding practical uses for ideas and theories. They can solve problems and make decisions by finding solutions to questions and problems.

People with a converging learning style are more attracted to technical tasks and problems than social or interpersonal issues. A converging learning style enables specialist and technology abilities.

People with a converging style like to experiment with new ideas, to simulate, and to work with practical applications.

Accommodating (doing and feeling – CE/AE)

The Accommodating learning style is “hands-on,” and relies on intuition rather than logic. These people use other people’s analysis, and prefer to take a practical, experiential approach. They are attracted to new challenges and experiences, and to carrying out plans.

They commonly act on “gut” instinct rather than logical analysis. People with an accommodating learning style will tend to rely on others for information rather than carry out their own analysis. This learning style is prevalent within the general population.

Educational Implications

Both Kolb’s (1984) learning stages and the cycle could be used by teachers to critically evaluate the learning provision typically available to students, and to develop more appropriate learning opportunities.


Educators should ensure that activities are designed and carried out in ways that offer each learner the chance to engage in the manner that suits them best.

Also, individuals can be helped to learn more effectively by the identification of their lesser preferred learning styles and the strengthening of these through the application of the experiential learning cycle.

Ideally, activities and material should be developed in ways that draw on abilities from each stage of the experiential learning cycle and take the students through the whole process in sequence.

Kolb, D. A. (1976). The Learning Style Inventory: Technical Manual. Boston, MA: McBer.

Kolb, D. A. (1981). Learning styles and disciplinary differences. In A. W. Chickering (Ed.), The Modern American College (pp. 232–255). San Francisco, CA: Jossey-Bass.

Kolb, D. A. (1984). Experiential learning: Experience as the source of learning and development (Vol. 1). Englewood Cliffs, NJ: Prentice-Hall.

Kolb, D. A., & Fry, R. (1975). Toward an applied theory of experiential learning. In C. Cooper (Ed.), Studies of group process (pp. 33–57). New York: Wiley.

Kolb, D. A., Rubin, I. M., & McIntyre, J. M. (1984). Organizational psychology: Readings on human behavior in organizations. Englewood Cliffs, NJ: Prentice-Hall.

Further Reading

  • David Kolb’s Website
  • Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence. Psychological Science in the Public Interest, 9(3), 105-119.

Writing an Abstract for Your Research Paper

Definition and Purpose of Abstracts

An abstract is a short summary of your (published or unpublished) research paper, usually about a paragraph (c. 6-7 sentences, 150-250 words) long. A well-written abstract serves multiple purposes:

  • an abstract lets readers get the gist or essence of your paper or article quickly, in order to decide whether to read the full paper;
  • an abstract prepares readers to follow the detailed information, analyses, and arguments in your full paper;
  • and, later, an abstract helps readers remember key points from your paper.

It’s also worth remembering that search engines and bibliographic databases use abstracts, as well as the title, to identify key terms for indexing your published paper. So what you include in your abstract and in your title are crucial for helping other researchers find your paper or article.

If you are writing an abstract for a course paper, your professor may give you specific guidelines for what to include and how to organize your abstract. Similarly, academic journals often have specific requirements for abstracts. So in addition to following the advice on this page, you should be sure to look for and follow any guidelines from the course or journal you’re writing for.

The Contents of an Abstract

Abstracts contain most of the following kinds of information in brief form. The body of your paper will, of course, develop and explain these ideas much more fully. As you will see in the samples below, the proportion of your abstract that you devote to each kind of information—and the sequence of that information—will vary, depending on the nature and genre of the paper that you are summarizing in your abstract. And in some cases, some of this information is implied, rather than stated explicitly. The Publication Manual of the American Psychological Association, which is widely used in the social sciences, gives specific guidelines for what to include in the abstract for different kinds of papers—for empirical studies, literature reviews or meta-analyses, theoretical papers, methodological papers, and case studies.

Here are the typical kinds of information found in most abstracts:

  • the context or background information for your research; the general topic under study; the specific topic of your research
  • the central questions or statement of the problem your research addresses
  • what’s already known about this question, what previous research has done or shown
  • the main reason(s), the exigency, the rationale, the goals for your research—Why is it important to address these questions? Are you, for example, examining a new topic? Why is that topic worth examining? Are you filling a gap in previous research? Applying new methods to take a fresh look at existing ideas or data? Resolving a dispute within the literature in your field? . . .
  • your research and/or analytical methods
  • your main findings, results, or arguments
  • the significance or implications of your findings or arguments.

Your abstract should be intelligible on its own, without a reader’s having to read your entire paper. And in an abstract, you usually do not cite references—most of your abstract will describe what you have studied in your research and what you have found and what you argue in your paper. In the body of your paper, you will cite the specific literature that informs your research.

When to Write Your Abstract

Although you might be tempted to write your abstract first because it will appear as the very first part of your paper, it’s a good idea to wait to write your abstract until after you’ve drafted your full paper, so that you know what you’re summarizing.

What follows are some sample abstracts in published papers or articles, all written by faculty at UW-Madison who come from a variety of disciplines. We have annotated these samples to help you see the work that these authors are doing within their abstracts.

Choosing Verb Tenses within Your Abstract

The social science sample (Sample 1) below uses the present tense to describe general facts and interpretations that have been and are currently true, including the prevailing explanation for the social phenomenon under study. That abstract also uses the present tense to describe the methods, the findings, the arguments, and the implications of the findings from their new research study. The authors use the past tense to describe previous research.

The humanities sample (Sample 2) below uses the past tense to describe completed events in the past (the texts created in the pulp fiction industry in the 1970s and 80s) and uses the present tense to describe what is happening in those texts, to explain the significance or meaning of those texts, and to describe the arguments presented in the article.

The science samples (Samples 3 and 4) below use the past tense to describe what previous research studies have done and the research the authors have conducted, the methods they have followed, and what they have found. In their rationale or justification for their research (what remains to be done), they use the present tense. They also use the present tense to introduce their study (in Sample 3, “Here we report . . .”) and to explain the significance of their study (in Sample 3, “This reprogramming . . . provides a scalable cell source for . . .”).

Sample Abstract 1

From the social sciences.

Reporting new findings about the reasons for increasing economic homogamy among spouses

Gonalons-Pons, Pilar, and Christine R. Schwartz. “Trends in Economic Homogamy: Changes in Assortative Mating or the Division of Labor in Marriage?” Demography, vol. 54, no. 3, 2017, pp. 985-1005.

“The growing economic resemblance of spouses has contributed to rising inequality by increasing the number of couples in which there are two high- or two low-earning partners. [Annotation for the previous sentence: The first sentence introduces the topic under study (the “economic resemblance of spouses”). This sentence also implies the question underlying this research study: what are the various causes—and the interrelationships among them—for this trend?] The dominant explanation for this trend is increased assortative mating. Previous research has primarily relied on cross-sectional data and thus has been unable to disentangle changes in assortative mating from changes in the division of spouses’ paid labor—a potentially key mechanism given the dramatic rise in wives’ labor supply. [Annotation for the previous two sentences: These next two sentences explain what previous research has demonstrated. By pointing out the limitations in the methods that were used in previous studies, they also provide a rationale for new research.] We use data from the Panel Study of Income Dynamics (PSID) to decompose the increase in the correlation between spouses’ earnings and its contribution to inequality between 1970 and 2013 into parts due to (a) changes in assortative mating, and (b) changes in the division of paid labor. [Annotation for the previous sentence: The data, research and analytical methods used in this new study.] Contrary to what has often been assumed, the rise of economic homogamy and its contribution to inequality is largely attributable to changes in the division of paid labor rather than changes in sorting on earnings or earnings potential. Our findings indicate that the rise of economic homogamy cannot be explained by hypotheses centered on meeting and matching opportunities, and they show where in this process inequality is generated and where it is not.” (p. 985) [Annotation for the previous two sentences: The major findings from and implications and significance of this study.]

Sample Abstract 2

From the humanities.

Analyzing underground pulp fiction publications in Tanzania, this article makes an argument about the cultural significance of those publications

Emily Callaci. “Street Textuality: Socialism, Masculinity, and Urban Belonging in Tanzania’s Pulp Fiction Publishing Industry, 1975-1985.” Comparative Studies in Society and History, vol. 59, no. 1, 2017, pp. 183-210.

“From the mid-1970s through the mid-1980s, a network of young urban migrant men created an underground pulp fiction publishing industry in the city of Dar es Salaam. [Annotation for the previous sentence: The first sentence introduces the context for this research and announces the topic under study.] As texts that were produced in the underground economy of a city whose trajectory was increasingly charted outside of formalized planning and investment, these novellas reveal more than their narrative content alone. These texts were active components in the urban social worlds of the young men who produced them. They reveal a mode of urbanism otherwise obscured by narratives of decolonization, in which urban belonging was constituted less by national citizenship than by the construction of social networks, economic connections, and the crafting of reputations. This article argues that pulp fiction novellas of socialist era Dar es Salaam are artifacts of emergent forms of male sociability and mobility. In printing fictional stories about urban life on pilfered paper and ink, and distributing their texts through informal channels, these writers not only described urban communities, reputations, and networks, but also actually created them.” (p. 210) [Annotation for the previous sentences: The remaining sentences in this abstract interweave other essential information for an abstract for this article. The implied research questions: What do these texts mean? What is their historical and cultural significance, produced at this time, in this location, by these authors? The argument and the significance of this analysis in microcosm: these texts “reveal a mode or urbanism otherwise obscured . . .”; and “This article argues that pulp fiction novellas. . . .” This section also implies what previous historical research has obscured. And through the details in its argumentative claims, this section of the abstract implies the kinds of methods the author has used to interpret the novellas and the concepts under study (e.g., male sociability and mobility, urban communities, reputations, network. . . ).]

Sample Abstract/Summary 3

From the sciences.

Reporting a new method for reprogramming adult mouse fibroblasts into induced cardiac progenitor cells

Lalit, Pratik A., Max R. Salick, Daryl O. Nelson, Jayne M. Squirrell, Christina M. Shafer, Neel G. Patel, Imaan Saeed, Eric G. Schmuck, Yogananda S. Markandeya, Rachel Wong, Martin R. Lea, Kevin W. Eliceiri, Timothy A. Hacker, Wendy C. Crone, Michael Kyba, Daniel J. Garry, Ron Stewart, James A. Thomson, Karen M. Downs, Gary E. Lyons, and Timothy J. Kamp. “Lineage Reprogramming of Fibroblasts into Proliferative Induced Cardiac Progenitor Cells by Defined Factors.” Cell Stem Cell , vol. 18, 2016, pp. 354-367.

“Several studies have reported reprogramming of fibroblasts into induced cardiomyocytes; however, reprogramming into proliferative induced cardiac progenitor cells (iCPCs) remains to be accomplished. [Annotation for the previous sentence: The first sentence announces the topic under study, summarizes what’s already known or been accomplished in previous research, and signals the rationale and goals are for the new research and the problem that the new research solves: How can researchers reprogram fibroblasts into iCPCs?] Here we report that a combination of 11 or 5 cardiac factors along with canonical Wnt and JAK/STAT signaling reprogrammed adult mouse cardiac, lung, and tail tip fibroblasts into iCPCs. The iCPCs were cardiac mesoderm-restricted progenitors that could be expanded extensively while maintaining multipo-tency to differentiate into cardiomyocytes, smooth muscle cells, and endothelial cells in vitro. Moreover, iCPCs injected into the cardiac crescent of mouse embryos differentiated into cardiomyocytes. iCPCs transplanted into the post-myocardial infarction mouse heart improved survival and differentiated into cardiomyocytes, smooth muscle cells, and endothelial cells. [Annotation for the previous four sentences: The methods the researchers developed to achieve their goal and a description of the results.] Lineage reprogramming of adult somatic cells into iCPCs provides a scalable cell source for drug discovery, disease modeling, and cardiac regenerative therapy.” (p. 354) [Annotation for the previous sentence: The significance or implications—for drug discovery, disease modeling, and therapy—of this reprogramming of adult somatic cells into iCPCs.]

Sample Abstract 4, a Structured Abstract

Reporting results about the effectiveness of antibiotic therapy in managing acute bacterial sinusitis, from a rigorously controlled study

Note: This journal requires authors to organize their abstract into four specific sections, with strict word limits. Because the headings for this structured abstract are self-explanatory, we have chosen not to add annotations to this sample abstract.

Wald, Ellen R., David Nash, and Jens Eickhoff. “Effectiveness of Amoxicillin/Clavulanate Potassium in the Treatment of Acute Bacterial Sinusitis in Children.” Pediatrics, vol. 124, no. 1, 2009, pp. 9-15.

“OBJECTIVE: The role of antibiotic therapy in managing acute bacterial sinusitis (ABS) in children is controversial. The purpose of this study was to determine the effectiveness of high-dose amoxicillin/potassium clavulanate in the treatment of children diagnosed with ABS.

METHODS: This was a randomized, double-blind, placebo-controlled study. Children 1 to 10 years of age with a clinical presentation compatible with ABS were eligible for participation. Patients were stratified according to age (<6 or ≥6 years) and clinical severity and randomly assigned to receive either amoxicillin (90 mg/kg) with potassium clavulanate (6.4 mg/kg) or placebo. A symptom survey was performed on days 0, 1, 2, 3, 5, 7, 10, 20, and 30. Patients were examined on day 14. Children’s conditions were rated as cured, improved, or failed according to scoring rules.

RESULTS: Two thousand one hundred thirty-five children with respiratory complaints were screened for enrollment; 139 (6.5%) had ABS. Fifty-eight patients were enrolled, and 56 were randomly assigned. The mean age was 66 ± 30 months. Fifty (89%) patients presented with persistent symptoms, and 6 (11%) presented with nonpersistent symptoms. In 24 (43%) children, the illness was classified as mild, whereas in the remaining 32 (57%) children it was severe. Of the 28 children who received the antibiotic, 14 (50%) were cured, 4 (14%) were improved, 4 (14%) experienced treatment failure, and 6 (21%) withdrew. Of the 28 children who received placebo, 4 (14%) were cured, 5 (18%) improved, and 19 (68%) experienced treatment failure. Children receiving the antibiotic were more likely to be cured (50% vs 14%) and less likely to have treatment failure (14% vs 68%) than children receiving the placebo.

CONCLUSIONS: ABS is a common complication of viral upper respiratory infections. Amoxicillin/potassium clavulanate results in significantly more cures and fewer failures than placebo, according to parental report of time to resolution.” (9)

Some Excellent Advice about Writing Abstracts for Basic Science Research Papers, by Professor Adriano Aguzzi from the Institute of Neuropathology at the University of Zurich.


Research Article

Abstract concept learning in a simple neural network inspired by the insect brain

  • Alex J. Cope, 
  • Eleni Vasilaki, 
  • Dorian Minors, 
  • Chelsea Sabo, 
  • James A. R. Marshall, 
  • Andrew B. Barron

  • Published: September 17, 2018
  • https://doi.org/10.1371/journal.pcbi.1006435

The capacity to learn abstract concepts such as ‘sameness’ and ‘difference’ is considered a higher-order cognitive function, typically thought to be dependent on top-down neocortical processing. It is therefore surprising that honey bees apparently have this capacity. Here we report a model of the structures of the honey bee brain that can learn sameness and difference, as well as a range of complex and simple associative learning tasks. Our model is constrained by the known connections and properties of the mushroom body, including the protocerebral tract, and provides a good fit to the learning rates and performances of real bees in all tasks, including learning sameness and difference. The model proposes a novel mechanism for learning the abstract concepts of ‘sameness’ and ‘difference’ that is compatible with the insect brain, and is not dependent on top-down or executive control processing.

Author summary

Is it necessary to have advanced neural mechanisms to learn abstract concepts such as sameness or difference? Such tasks are usually considered a higher-order cognitive capacity, dependent on complex cognitive processes located in the mammalian neocortex. It has therefore always been astonishing that honey bees have been shown capable of learning sameness and difference, and other relational concepts. To explore how an animal like a bee might do this, here we present a simple neural network model that is capable of learning sameness and difference, that is constrained by the known neural systems of the insect brain, and that lacks any advanced neural mechanisms. The circuit model we propose was able to replicate bees’ performance in concept learning and a range of other associative learning tasks when tested in simulations. Our model proposes a revision of what is assumed necessary for learning abstract concepts. We caution against ranking cognitive abilities by anthropomorphic assumptions of their complexity, and argue that neural modelling can show comparatively simple neural structures to be sufficient to explain different cognitive capacities, broadening the range of animals that might be capable of them.

Citation: Cope AJ, Vasilaki E, Minors D, Sabo C, Marshall JAR, Barron AB (2018) Abstract concept learning in a simple neural network inspired by the insect brain. PLoS Comput Biol 14(9): e1006435. https://doi.org/10.1371/journal.pcbi.1006435

Editor: Joseph Ayers, Northeastern University, UNITED STATES

Received: April 24, 2018; Accepted: August 15, 2018; Published: September 17, 2018

Copyright: © 2018 Cope et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All files are available on GitHub at https://github.com/BrainsOnBoard/bee-concept-learning .

Funding: JARM and EV acknowledge support from the Engineering and Physical Sciences Research Council (grant numbers EP/J019534/1 and EP/P006094/1), ( https://epsrc.ukri.org/ ). JARM and ABB acknowledge support from a Royal Society International Exchanges Grant ( https://royalsociety.org/ ). ABB is supported by an Australian Research Council Future Fellowship Grant 140100452 and Australian Research Council Discovery Project Grant DP150101172 ( http://www.arc.gov.au/ ). The funders played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Abstract concepts involve the relationships between things. Two simple and classic examples of abstract concepts are ‘sameness’ and ‘difference’. These categorise the relative similarity of things: they are properties of a relationship between objects, but they are independent of, and unrelated to, the features of the objects themselves. The capacity to identify and act on abstract relationships is a higher-order cognitive capacity, and one that is considered critical for any operation involving equivalence or general quantitative comparison [ 1 – 4 ]. The capacity to recognise abstract concepts such as sameness has even been considered to form the “very keel and backbone of our thinking” [ 5 ]. Several non-verbal animals have been shown to be able to recognise ‘sameness’ and ‘difference’ including, notably, the honey bee [ 6 – 9 ].

The ability of the honey bee to recognise ‘sameness’ and ‘difference’ is interesting, as the learning of abstract concepts is interpreted as a property of the mammalian neocortex, or of regions of the avian pallium [ 10 – 12 ], and as a form of top-down executive modulation of lower-order learning mechanisms (complex neural processing affecting less complex processing; see [ 13 ] for more details) [ 4 , 12 ]. This interpretation has been reinforced by the finding that activity of neurons in the prefrontal cortex of rhesus monkeys (Macaca mulatta) correlates with success in recognising sameness in tasks [ 11 , 12 ]. The honey bee, however, has nothing like a prefrontal cortex in its much smaller brain.

In this paper we use a modelling approach to explore how an animal like a honey bee might be able to solve an abstract concept learning task. To consider this issue we must outline in more detail how learning of sameness and difference has been demonstrated in honey bees, and originally in other animals.

A family of ‘match-to-sample’ tasks has been developed to evaluate sameness and difference learning in non-verbal animals. In these tasks animals are shown a sample stimulus followed, after a delay, by two stimuli: one that matches the sample and one that does not. Sometimes delays of varying duration have been imposed between the presentation of the sample and the matching stimuli to test the duration of the ‘working memory’ required to perform the task [ 1 , 14 ]. This working memory concept is likened to a neural scratchpad that can store a short-term memory of a fixed number of items, previously seen but no longer present [ 15 ]. Tests in which animals are trained to choose the matching stimulus are described as Match-To-Sample (MTS) or Delayed-Match-To-Sample (DMTS) tasks, and tests in which animals are trained to choose the non-matching stimulus are Not-Match-To-Sample (NMTS) or Delayed-Not-Match-To-Sample (DNMTS) tasks.
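
The reward contingency distinguishing these four task variants can be captured in a few lines. Here is a minimal sketch (the function name and stimulus labels are illustrative, not from the paper):

```python
def rewarded_option(sample, options, task="DMTS"):
    """Return the option an animal must choose to be rewarded.

    (D)MTS tasks reward choosing the option that matches the sample;
    (D)NMTS tasks reward choosing the non-matching option.
    """
    match = next(o for o in options if o == sample)
    non_match = next(o for o in options if o != sample)
    return match if task in ("MTS", "DMTS") else non_match

# A bee sees a blue sample, then chooses between a blue and a yellow arm:
print(rewarded_option("blue", ["blue", "yellow"], task="DMTS"))   # blue
print(rewarded_option("blue", ["blue", "yellow"], task="DNMTS"))  # yellow
```

Note that the rule refers only to the relationship between sample and option, never to the features of the stimuli themselves; this independence from stimulus features is what makes sameness/difference an abstract concept.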

On their own, match-to-sample tasks are not sufficient to show concept learning of sameness or difference. For this it is necessary to show, having been trained to select matching or non-matching stimuli, that the animal can apply the concept of sameness or difference in a new context [ 4 ]. Typically this is done by training animals with one set of stimuli and testing whether they can perform the task with a new set of stimuli [ 6 – 9 ]; this is referred to as a transfer test.

A landmark study [ 8 ] showed that honey bees can learn both sameness and difference: they could learn both DMTS and DNMTS tasks and generalise performance in both tasks to tests with new, previously unseen, stimuli [ 8 ]. In this study free-flying bees were trained and tested using a Y-maze in which the sample and matching stimuli were experienced sequentially during flight, with the sample at the entrance to the maze and the match stimuli at each of the Y-maze arms. Bees could solve and generalise both DMTS and DNMTS tasks when trained with visual stimuli, and could even transfer the concept of sameness learned in an olfactory DMTS task to a visual DMTS task, showing cross-modal transfer of the learned concept of sameness [ 8 ]. Bees took 60 trials to learn these tasks [ 8 ]; this is much longer than a simple olfactory or visual associative learning task, which bees can learn in 3 trials [ 16 ]. Nor was their performance in DMTS and DNMTS perfect: the population average in test and transfer tests was around 75%, but clearly better than chance in both [ 8 ].

The concept of working memory is crucial for solving a DMTS/DNMTS task, as information about the sample stimulus is no longer available externally to the animal when it chooses between the match stimuli. If there is no neural information that can identify the match, then the task cannot be solved. We must therefore identify in the honey bee a candidate source of this information in order to produce a model that can solve the task.

A previous model [ 17 ] demonstrates DMTS and DNMTS with transfer; however, it contains many biologically unfounded mechanisms added solely for the purpose of solving these tasks, and the outcome of these additions disagrees with neurophysiological and behavioural evidence. We instead take the approach of constraining our model strongly to established neurophysiology and neuroanatomy, and demonstrating behaviour that matches that of real bees. We compare the model of [ 17 ] with the model presented here further in the Discussion. As the problem (without transfer) is a binary XOR, it can be solved by a network with at least one hidden layer; however, this alone would not solve the generalisation aspect of the problem.
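
The XOR observation is easy to make concrete. If the two stimuli in a training set are coded as 0 and 1, then "the option differs from the sample" is the XOR of the two codes, which one hidden layer of threshold units can compute. The following is the textbook OR-minus-AND construction, not the paper's network:

```python
def step(x):
    """Heaviside threshold unit."""
    return 1 if x > 0 else 0

def differs(sample, option):
    """XOR of two binary stimulus codes via one hidden layer."""
    h_or = step(sample + option - 0.5)    # fires if either code is 1
    h_and = step(sample + option - 1.5)   # fires only if both codes are 1
    return step(h_or - h_and - 0.5)       # 1 iff exactly one code is 1

# truth table: (0,0)->0, (0,1)->1, (1,0)->1, (1,1)->0
```

Such a network is tied to the coding of one particular stimulus pair, however: its weights say nothing about a new pair of stimuli, which is exactly the generalisation problem noted above.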

The honey bee brain is structured as discrete regions of neuropil (zones of synaptic contact). These are well described, as are the major tracts connecting them [ 18 ]. The learning pathways have been particularly intensely studied ( e.g . [ 19 – 22 ]). The mushroom bodies ( corpora pedunculata ) receive processed olfactory, visual and mechanosensory input [ 23 ] and are a locus of multimodal associative learning in honey bees [ 19 ]. They are essential for reversal and configural learning [ 4 , 24 , 25 ]. Avarguès-Weber and Giurfa [ 4 ] have argued the mushroom bodies to be the most likely brain region supporting concept learning, because of their roles in stimulus identification, classification and elemental learning [ 19 , 22 , 26 ]. Yet it is not clear how mushroom bodies and associated structures might be able to learn abstract concepts that are independent of any of the specific features of learned stimuli and, crucially, how the identity of the sample stimulus could be represented. Solving such a problem requires two computational components. First, a means of storing the identity of the sample stimulus, a form of working memory; second, a mechanism that can learn to use this stored identity to influence the behaviour at the decision point. Below we propose a model of the circuitry of the honey bee mushroom bodies that can perform these computations and is able to solve DMTS and DNMTS tasks.

Key model principles: A circuit model inspired by the honey bee mushroom bodies

We explored whether a neural circuit model, inspired and constrained by the known connections of the honey bee mushroom bodies, is capable of learning sameness and difference in a DMTS and DNMTS task ( Fig 1 ). Full details of the models can be found in Methods. A sufficiency table for the key mechanisms and the following results is shown in Table 1 .

A Neuroanatomy: MB Mushroom Bodies; AL Antennal Lobe glomeruli (circles); ME & LO Medulla and Lobula optic neuropils. The relevant neural pathways are shown and labelled for comparison with the model. B Reduced model; neuron classes indicated at right-hand side of sub-figure. C Full model, showing the model connectivity and indicating the approximate relative numbers of each neuron type. Colour coding and labels are preserved throughout all the diagrams for clarity. Excitatory and inhibitory connections indicated as in figure legend. Key of neuron types: KC, Kenyon Cells; PCT, Protocerebral Tract neurons; IN, Input Neurons (olfactory or visual); EN, Extrinsic MB Neurons from the GO and NOGO subpopulations, where the subpopulation with the highest sum activity defines the behavioural choice in the experimental protocol ( Fig 4 ).

https://doi.org/10.1371/journal.pcbi.1006435.g001

https://doi.org/10.1371/journal.pcbi.1006435.t001

The mushroom body has previously been modelled as an associative network consisting of three neural network layers [ 26 , 27 ]: input neurons (IN) providing processed olfactory, visual and mechanosensory inputs [ 23 , 28 ]; an expansive middle layer of Kenyon cells (KC), which enables sparse coding of sensory information for effective stimulus classification [ 22 ]; and finally mushroom body extrinsic neurons (EN), which output to premotor regions of the brain and can be considered (at this level of abstraction) to activate different possible behavioural responses [ 22 , 26 ]. Here, for simplicity, we consider the EN as simply two subpopulations controlling either ‘go’ or ‘no-go’ behavioural responses, which allow choice between different options via sequential presentations, where ‘go’ chooses the currently presented option. Connections between the KC output and ENs are modifiable by synaptic plasticity [ 26 , 29 – 31 ], supporting learned changes in behavioural responses to stimuli.
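
This three-layer scheme can be sketched as a toy network. In the sketch below, the cell counts, connection probability, coding level and learning rate are illustrative placeholders, not the parameters of the full model:

```python
import numpy as np

rng = np.random.default_rng(42)
N_IN, N_KC, K_ACTIVE = 50, 1000, 50   # expansive KC layer, sparse code

W_in_kc = (rng.random((N_KC, N_IN)) < 0.1).astype(float)  # fixed random IN->KC fan-in
w_go = np.zeros(N_KC)     # plastic KC->EN weights, GO subpopulation
w_nogo = np.zeros(N_KC)   # plastic KC->EN weights, NOGO subpopulation

def kc_code(stimulus):
    """Sparse KC representation: k-winners-take-all over the random projection."""
    drive = W_in_kc @ stimulus
    code = np.zeros(N_KC)
    code[np.argsort(drive)[-K_ACTIVE:]] = 1.0
    return code

def behaviour(kc):
    """The EN subpopulation receiving the larger summed KC drive wins."""
    return "go" if w_go @ kc >= w_nogo @ kc else "no-go"

def reinforce(kc, rewarded, lr=0.2):
    """Reward strengthens active KC->GO synapses; punishment, KC->NOGO."""
    (w_go if rewarded else w_nogo)[kc > 0] += lr

# Simple associative learning: stimulus A rewarded, stimulus B punished.
A, B = rng.random(N_IN), rng.random(N_IN)
for _ in range(3):
    reinforce(kc_code(A), rewarded=True)
    reinforce(kc_code(B), rewarded=False)
print(behaviour(kc_code(A)), behaviour(kc_code(B)))  # go no-go
```

On its own, this architecture supports only elemental associative learning; the text below explains the two further mechanisms needed before such a circuit can address DMTS/DNMTS.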

As outlined in the introduction, we require two computational mechanisms for solving the DMTS/DNMTS task. First is a means of storing the identity of the sample stimulus. Second is learning to use this identity to drive behaviour and solve the task. Moreover this learning must generalise to other stimuli. The computational complexity of this problem should not be underestimated; either the means of storing the identity of the sample, or the behavioural learning, must generalise to other stimulus sets. The bees were not given any reward with the transfer stimuli in [ 8 ]’s study, so no post-training learning mechanism can explain the transfer performance. In addition, during the course of the experiment of [ 8 ] each of the two stimuli was used as the match, i.e. for stimuli A and B the stimulus at the maze entrance was alternated between A and B throughout the training phase of the experiment. This requires, therefore, that the bees have a sense of stimulus ‘novelty’, and can associate novelty with a behaviour: either approach for DNMTS, or avoid for DMTS. With one training set the problem is solvable as a delayed paired non-elemental learning task (e.g. responding to stimuli A and B together, but not individually); with the transfer of learning to new stimulus sets, however, such an approach does not solve the whole task.

There is one feature of the Kenyon Cells which can fulfill this computational requirement for novelty detection, that of sensory accommodation. In honey bees, even in the absence of reward or punishment, the KC show a stark decrease in activity between initial and repeated stimulus presentations of up to 50%, an effect that persists over several minutes [ 32 ]. This effect is also found in Drosophila melanogaster [ 33 ], where there is additionally a set of mushroom body output neurons that show even starker decreases in response to repeated stimuli, and which respond to many stimuli with stimulus-specific decreases (thus making them clear novelty detectors); however, such neurons have not been found in bees to date. This response decrease in Kenyon Cells found in flies and bees is sufficient to influence behaviour during a trial but, given the decay time of this effect, not likely to influence subsequent trials. The mechanism behind this accommodation property is not known, and therefore we are only able to model it phenomenologically, which we do by reducing the strength of the KC synapses for the sample stimulus by a fixed factor, tuned to reproduce the reduction in total KC output found by Szyszka et al. [ 32 ] (see Fig 2 panel E). However, it should be noted that stimulus-specific adaptation is found in many species and brain areas, and can be explained by short-term plasticity mechanisms [ 34 – 36 ] and architectural constraints alone; see for instance [ 37 ]. Exploiting the faculty of the units to adapt to repeated stimuli allows the network to distinguish between repeated (“same”) and non-repeated (i.e. novel, “different”) stimuli.
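A minimal phenomenological sketch of this accommodation mechanism, assuming a fixed 50% scaling of the sample stimulus's KC output in line with the reduction reported in [ 32 ] (function and variable names are ours):

```python
ACCOMMODATION = 0.5        # ~50% reduction on repetition, after [32]

def kc_output(kc_activity, stimulus, seen):
    """Scale the KC output driven by a previously seen (sample) stimulus
    by a fixed factor; novel stimuli pass through unchanged."""
    scale = ACCOMMODATION if stimulus in seen else 1.0
    seen.add(stimulus)
    return scale * kc_activity

seen = set()
first = kc_output(40.0, "A", seen)    # novel presentation of stimulus A
second = kc_output(40.0, "A", seen)   # repeated presentation: accommodated
novel = kc_output(40.0, "B", seen)    # a different stimulus is unaffected
```

The scaling is stimulus-specific, so within a trial a repeated sample drives half the total KC output of a novel alternative.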


A & B The average percentage of correct choices made by the model and real bees within blocks of ten trials as the task is learned (lines), along with the transfer of learning onto novel stimulus sets (bars). Both versions of the model reproduce the pattern of learning acquisition for DMTS (Full: N = 338, Reduced: N = 360) and DNMTS (Full & Reduced: N = 360) found when testing real bees (test for learning: P <0.0001), along with the transfer of learning (P <0.0001). For DMTS Giurfa A & B are the data from Experiments 1 & 2 respectively from Giurfa et al. [ 8 ], and for DNMTS Giurfa A & B are the data from Experiments 5 & 6 respectively from the same source. For an explanation of the initial offsets from chance for the model please see the text for panel D. C The blockade of plasticity from the MB and PCT pathways shows that the PCT pathway is necessary and sufficient for sameness and difference learning in the full model. All non-overlapping SEM error bars are significantly different. D PCT pathway learning in the absence of associative learning leads to preference for non-matching stimuli following pre-training, demonstrating that learning in the associative pathway changes the form of the sameness and difference acquisition curves. The equivalent offsets and error ranges for the first two blocks of Giurfa Experiments 1, 2, 5 & 6 along with the averages for DMTS and DNMTS for these blocks are shown alongside the model data for comparison as overlapping grey boxes—overlapping boxes create darker regions, thus the area of greatest darkness is the point where most of the error ranges overlap. E The average activity of the model KC neurons when presented with repeated stimuli.

https://doi.org/10.1371/journal.pcbi.1006435.g002

Having identified our first computational mechanism, a memory trace in the form of reduced KC output for the repeated stimulus, we need only to identify the second, a learning mechanism that can use this reduced KC output to drive behaviour to choose the correct (matching or non-matching) arm of the y-maze. If this learning mechanism exists at the output synapses of the KC it is either specific to the stimulus—if using a pre-postsynaptic learning rule—and therefore cannot transfer, or it utilises a postsynaptic-only learning rule. Initially the postsynaptic learning rule appears a plausible solution, however we must consider that bees can learn both DMTS and DNMTS, and that learning can only occur when the bee chooses to ‘go’. This creates a contradiction, as postsynaptic learning will proportionally raise both the weaker (repeated) stimulus activity, as well as the stronger (non-repeated) stimulus activity in the GO EN subpopulation. To select ‘go’ the GO activity for the currently presented stimulus must be larger than the activity in the NOGO subpopulation, which is fixed. Therefore we face the contradiction that in the DMTS case the weaker stimulus response must be higher than the stronger one in the GO subpopulation with respect to their NOGO subpopulation counterpart responses, yet in the DNMTS case the converse must apply. No single postsynaptic learning rule can fulfil this requirement.

A separate set of neurons that can act as a relay between the KC and behaviour is therefore required to solve both DMTS and DNMTS tasks. A plausible candidate is the inhibitory neurons that form the protocerebral tract (PCT). These neurons have been implicated in both non-elemental olfactory learning [ 25 ] and regulatory processes at the KC input regions. They also project to the KC output regions [ 38 – 41 ], where there are reward-linked neuromodulators and learning-related changes [ 20 , 42 ]. These neurons are few in number in comparison to the KC population, and some take input from large numbers of KC [ 43 ]. We therefore propose that, in addition to their posited role in modulating and regulating the input to the KC based on overall KC activity [ 43 ], these neurons could also regulate and modulate the activity of the EN populations at the KC output regions. Such a role would allow, via synaptic plasticity, a single summation of activity from the KCs to differentially affect both their inputs and outputs. If we assume a high activity threshold for the PCT neurons (again consistent with their proposed role), such that repeated stimuli would not activate the PCT neurons but non-repeated stimuli would, it is then possible for synaptic plasticity from the PCT neurons to the EN to solve the DMTS and DNMTS tasks and, vitally, transfer that learning to novel stimuli. We do not propose that this is the purpose of these neurons, but instead that it is a consequence of their regulatory role.
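The proposed high-threshold pooling can be illustrated as follows; the threshold and activity values are assumptions, chosen only so that an accommodated (repeated) stimulus falls below threshold while a novel one exceeds it:

```python
import numpy as np

PCT_THRESHOLD = 30.0   # assumed: set between repeated and novel totals

def pct_activity(kc_activity):
    """A PCT-like unit pools total KC activity and responds only above a
    high threshold, so only novel (non-accommodated) stimuli recruit it."""
    total = float(np.sum(kc_activity))
    return max(0.0, total - PCT_THRESHOLD)

novel_kc = np.full(200, 0.2)       # summed activity 40: novel stimulus
repeated_kc = 0.5 * novel_kc       # accommodated repeat: summed activity 20
```

Because the unit sees only summed KC activity, its response carries no stimulus identity, which is precisely what lets plasticity from it to the EN transfer to novel stimuli.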

We present two models inspired by the anatomy and properties of the honey bee brain that are computationally capable of learning in DMTS and DNMTS tasks, and the generalisation of this learning to novel stimuli ( Fig 1 ).

Our first, reduced, model is a simple demonstration that the key principles outlined above can solve DMTS and DNMTS tasks, and generalise the learning to novel stimulus sets. By simplifying the model in this way the computational principles are readily apparent. Such a simple model, however, cannot demonstrate that associative learning in the KC to EN synapses does not interfere with learning in the PCT to EN synapses or vice versa. For this we present a full model that includes the associative learning pathway from the KC to the EN, and demonstrate that this model can not only solve DMTS and DNMTS with transfer to novel stimuli, but can also solve a suite of associative learning tasks in which the MB have been implicated. The results of computational experiments performed with these models are presented below. The full model addresses the interaction of the PCT to EN learning and the KC to EN learning, as well as suggesting a possible computational role of the PCT to EN synaptic pathway in regulating the behavioural choices driven by the MB output, which we present in the Discussion.

A reduced model of the core computational principles produces sameness and difference learning, and transfers this learning to novel stimuli

The reduced model is shown in Fig 1 Panel B and model equations are presented in Methods. The input nodes S 1 and S 2 represent the two alternative stimuli, where we have reduced the sparse KC representation into two non-overlapping single nodes for simplicity, and as such we do not need to model the IN input neurons separately. Node I (which corresponds to the PCT neurons, again reduced to a single node for simplicity) represents the inhibitory input to the output neurons GO and NOGO. Nodes S 1 and S 2 project to nodes I, GO and NOGO with fixed excitatory weighted connections. Finally, node I projects to GO and NOGO with plastic inhibitory weighted connections. Node I is thresholded so that it only responds to novel stimuli.

Fig 2 panels A and B show the performance of the reduced model bees (as well as the full model) for task learning and transfer to novel input stimuli, alongside experimental data from [ 8 ]. While the reduced model solves the transfer of sameness and difference learning, the pretraining process strongly biases the model towards non-repeated stimuli, in proportion to the number of pretraining trials. Notably, this bias in the reduced model is different to that found in the full model, which we discuss below.

The model operates by adjusting the weights between the I and GO nodes to change the likelihood of choosing the non-matching stimulus. Since only connections from the I (representing the PCT neuron) to the GO neurons are changed, the I to GO/NOGO weights are initialised to half the maximum weight value. Note that the I node is only active for the non-repeated stimulus, and this pathway has no effect for repeated stimuli. This means that if the weights are increased then non-repeated stimuli will receive greater inhibition to the GO neuron, and therefore be less likely to be chosen. If the weights are decreased then non-repeated stimuli will receive less inhibition to the GO neuron and therefore will be more likely to be chosen. As the conditions for changing the weights are only met when the non-repeated stimulus is chosen for ‘go’, the model only learns on unsuccessful trials for DMTS (increasing the weight), or on successful trials for DNMTS (decreasing the weight).

A full model is capable of sameness and difference learning, and transfers this learning to novel stimuli

The full model is shown in Fig 1 Panel C and model equations are presented in Methods. Fig 2 panel D shows the performance of the full model for the first block of learning following different numbers of pretraining repetitions. When only the PCT pathway is plastic there is a large bias towards the non-repeated stimulus due to the pretraining, as found in the reduced model. This bias is reduced by the presence of the associative learning pathway, and is independent of the number of pretraining trials beyond five trials. It should be noted that the experimental data [ 8 ] show indications of such a bias, in line with the results from the full model. The reduced model requires fewer pretraining trials than the full model to produce a similar bias, and would acquire large maladaptive behavioural biases towards non-repeated stimuli if all stimuli were rewarded. This is important, as it suggests a role for the PCT pathway in modulating the behavioural choice of the bee. This possible role is explored further in the Discussion.

Fig 2 panels A and B show the performance of the model bees compared with the performance of real bees from [ 8 ], and the reduced model. In both cases the trends found in the performance of the model bees match the trends found in the real bees for both task learning and transfer to novel stimuli. It is important to note the different forms of the learning in the DMTS and DNMTS paradigms, with DNMTS slower to learn. This is a direct consequence of the inhibitory nature of the PCT neurons; excitatory neurons performing the role of the PCT neurons in the model would lead to a reversal of this feature, with DMTS learning more slowly.

Learning in the PCT pathway of the full model is essential for transfer of learning to novel stimuli

We next sought to confirm that learning in the PCT neuron to EN pathway enabled generalisable learning of sameness and difference. Computational modelling provides powerful tools with which to do this: the performance of the full model can be compared with that of variants in which different elements are suppressed. We selectively suppressed the KC associative learning pathway, the PCT pathway learning, and all learning in the model. When a learning mechanism is suppressed the synaptic weights stay the same throughout training, but the pathway is otherwise active.

The results are summarised in Fig 2 panel C. It can clearly be seen that within our model learning in the PCT pathway is necessary for transfer of the sameness and difference learning to novel stimuli. Associative learning via the KC pathway alone has no effect on the transfer task performance compared to the fully learning-suppressed model. Unsuppressed associative learning leads to a preference for the matched stimulus, which has weaker KC activity, but this learning is specific to the trained stimuli, and does not transfer to novel stimuli.

Validation: The full model is capable of performing a range of conditioning tasks

Many models have reproduced the input neuron to Kenyon Cell to Extrinsic Neuron pathway [ 26 , 27 , 44 , 45 ], and these models demonstrate many forms of elementary and complex associative learning that have been attributed to the mushroom bodies. It is therefore important to demonstrate that in our model the PCT neuron pathway does not affect the reproduction of such learning behaviours. We therefore tested elemental and non-elemental associative learning undertaken by conditioning the Proboscis Extension Reflex (PER) in restrained bees, and reversal learning in free flying bees, as described in Methods. Our model is capable of reproducing the results found in experiments involving real bees, with the model’s acquisition curves showing performance similar to that of the real bees. The results are shown in Fig 3 , and details of the experimental paradigms used can be found in Methods.


With modification of only the experimental protocol, our full model can successfully perform a range of conditioning tasks which can be performed by restrained (using the Proboscis Extension Reflex (PER) paradigm) and free flying bees. Performance closely matches experimental data with real bees (e.g. A : [ 46 ], B : [ 47 ], C & D : [ 48 ]).

https://doi.org/10.1371/journal.pcbi.1006435.g003

We have presented a simple neural model that is capable of learning the concepts of sameness and difference in Delayed Match to Sample (DMTS) and Delayed Not Match to Sample (DNMTS) tasks. Our model is inspired by the known neurobiology of the honey bee, and is capable of reproducing the performance of honey bees in a simulation of DMTS and DNMTS tasks. Our model therefore proposes a hypothesis for how animals like the honey bee might apparently be able to learn abstract concepts.

Abstract concept learning is typically described as a higher-order cognitive capacity [ 1 , 4 ], and one that is dependent on a top-down modulation of simpler learning mechanisms by a higher cognitive process [ 49 ]. By contrast our model proposes a solution to sameness and difference learning in DMTS-style tasks with no top-down structure (that is, no complex mechanisms imposing themselves on simpler processes). The actions of the PCT neurons are integrated with the KC learning pathway and provide a parallel processing pathway sensitive to stimulus magnitude, rather than a top-down imposition of a learned concept of sameness or difference ( Fig 1 ). This is a radical new proposal for how an abstract concept might be learned by an animal brain.

The first question we must ask when constructing a model regards plausibility. Our model ( Fig 1 ) shows a close match to the neuroanatomical data for the mushroom bodies. Several computational requirements of our model match with experimental data, notably the sensory accommodation in the response of the KC neurons. Previous neural models based on this structure have proposed mechanisms for various forms of associative learning, including extinction of learning, and positive and negative patterning [ 17 , 26 , 45 ]. Our model is also capable of solving a range of stimulus-specific learning tasks, including patterning ( Fig 3 ). No plausible previous model of the MB or the insect brain has been capable of learning abstract concepts, however.

As mentioned in the Introduction, a previous model by [ 17 ] demonstrates DMTS and DNMTS with transfer. Their motivation is the creation of a model for robotic implementation, rather than reproduction of behavioural observations from honey bees. While we suggest a role for the PCT neurons given experimental evidence of changes in the response of Kenyon Cells to repeated stimuli, Arena et al.’s model assumes resonance between brain regions that is dependent upon the time after stimulus onset and the addition of specific neurons for ‘Match’ and ‘Non-match’; there is no biological evidence for either of these assumptions. Furthermore, the outcome of these additions is an increase in Kenyon cell firing in response to repeated stimuli; this is in opposition to neurophysiological evidence from multiple insect species, including honey bees [ 32 , 33 ]. In addition, Arena et al.’s proposed mechanism does not replicate the difficulty honey bees have in learning DMTS/DNMTS tasks, exhibiting learning in three trials, as opposed to 60 in real bees. In contrast, our model captures the rate and form of the learning found in real honey bees.

To enable a capacity for learning the stimulus-independent abstract concept of sameness or difference our model uniquely includes two interacting pathways. The KC pathway of the mushroom bodies retains stimulus-specific information and supports stimulus-dependent learning. The PCT pathway responds to summed activity across the KC population and is therefore largely independent of any stimulus-specific information. This allows information on stimulus magnitude, independent of stimulus specifics, to influence learning. Including a sensory accommodation property in the KCs [ 32 ] makes the summed KC activity in response to a stimulus sensitive to repetition, so that stimuli encountered successively (same) cause a different magnitude of KC response from novel stimuli (different), irrespective of stimulus specifics. This model is capable of learning sameness and difference rules in a simulation of the Y-maze DMTS and DNMTS tasks applied to honey bees ( Fig 2 ), but in theory it could also learn other abstract concepts related to stimulus magnitude, such as quantitative comparisons [ 4 , 50 ].

Our model demonstrates a bias towards non-repeated stimuli, induced by the combination of sensory accommodation in the KC neurons and PCT learning during the pretraining phase, and largely mitigated by associative learning in the KC to EN synapses. This bias (see Fig 2 ) is indicated in the data from [ 8 ], and could be confirmed by further experimentation.

We note, however, that our model only supports a rather limited form of concept learning of sameness and difference. Learning in the model is dependent on sensory accommodation of the KCs to repeated stimuli [ 32 ]. This effect is transient, and hence the capacity to learn sameness or difference will be limited to situations with a relatively short delay between sample and matching stimuli. This limitation holds for honey bee learning of DMTS tasks [ 51 ], but many higher vertebrates do not share it [ 52 ]. For example, in capuchins learning of sameness and difference is independent of the time between sample and match [ 53 ]. We would expect that for animals with larger brains and a developed neocortex (or equivalent) many other neural mechanisms are likely to be at play to reinforce and enhance concept learning, enabling performance that exceeds that demonstrated for honey bees. Monkey pre-frontal cortex (PFC) neurons demonstrate considerable stimulus-specificity in matching tasks, and different regions appear to have different roles in coding the salience of these stimuli [ 54 , 55 ]. Recurrent neural activity between these selective PFC neurons and lower-order neural mechanisms could support such time independence. Language-trained primates perform particularly well on complex identity matching tasks, possibly because of their ability to form a language-related mental representation of a concept [ 56 – 58 ].

Wright and Katz [ 1 ] have utilised a more elaborate form of a MTS task in which vertebrates simultaneously learn to respond to sameness and difference, and are trained with large sets of stimuli rather than just two. They argue this gives less-ambiguous evidence of true concept learning since both sameness and difference are learned during training, and the large size of the training stimulus set encourages true generalisation of the concept. In theory our model could also solve this form of task, but it is unlikely a honey bee could. Capuchins, rhesus monkeys and pigeons required hundreds of learning trials to learn and generalise the sameness and difference concepts [ 1 ]. Bees would not live long enough to complete this training.

Finally, as a consequence of our model, we question whether it is necessary to consider abstract concept learning to be a higher cognitive process. The mechanisms necessary to support it may not be much more complex than those needed for simple associative learning. This is important because many behavioural scientists still adhere to something like Lloyd Morgan’s Canon [ 59 ], which proposes that “in no case is an animal activity to be interpreted in terms of higher psychological processes if it can be fairly interpreted in terms of processes which stand lower in the scale of psychological evolution and development” ([ 59 ] p59). The Canon is therefore reliant on an unambiguous stratification of cognitive processes according to evolutionary history and development [ 60 ]. If abstract concept learning is in fact developmentally quite simple, evolutionarily old and phylogenetically widespread, then Morgan’s Canon would simply beg the question of why even more animals do not have this capacity [ 61 ]. We argue that far more information on the precise neural mechanisms of different cognitive processes, and on the distribution of cognitive abilities across animal groups, is needed in order to properly rank capacities as higher or lower. In other words, so-called higher cognitive processes may be solved by relatively simple structures.

Model parameter selection

Many of the parameters of the model were fixed by the neuroanatomy of the honey bee, as well as the previous values and procedures described in [ 26 ], with the following modifications.

First, we increased the sparseness of the connectivity from the projection neurons (PN) to the KC.

Second, the reduction in the magnitude of the KC output to repeated stimuli was tuned to replicate the magnitude of reduction described in [ 32 ].

Third, the learning rates were set so that acquisition of a single stimulus is rapid. In addition there are two ratios from this initial value that must be set. These are the ratio of the speed of excitatory associative learning in the Kenyon Cell to Extrinsic Neuron pathway to that of the inhibitory learning in the protocerebral tract to Extrinsic Neuron pathway, and the ratio of the speed of acquisition when rewarded to the speed of extinction when no reward is given. We conservatively set both of these ratios to 2:1, with excitatory learning faster than inhibitory learning, and extinction faster than acquisition. The learning parameters used in the reduced model are used in the fitting of the full model.

Finally, we tuned the threshold value for the PCT neurons so that they only responded to a new stimulus, and not a repeated one.

A full list of the parameters can be found in Table 2 .


https://doi.org/10.1371/journal.pcbi.1006435.t002

Reduced model

The reduced model is shown in Fig 1 , and described in the text in Results. Its governing equations (Eqs 1 , 2 and 3 ) define the activities of the stimulus nodes S 1 and S 2, the thresholded inhibitory node I, and the GO and NOGO output nodes, together with the plasticity rule for the inhibitory connections from I.

Additional neuronal inputs with similar connectivity as S 1 and S 2 , not shown explicitly in the diagram, are also present in the model simulations, and constructing the equations for these simply requires substitution of S i for T i in Eqs 1 , 2 and 3 . These represent the transfer stimuli and can be used following training to demonstrate transfer of learning. Details of training the model can be found in the Experiment subsection of the Methods.
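The mechanics described above can be sketched as a toy simulation. The learning rate, noise range and weight bounds below are illustrative assumptions, not the published parameters of Eqs 1 – 3 , and the sequential choice between arms is collapsed into a single accept/reject decision on the novel stimulus:

```python
import random

random.seed(1)

class ReducedModel:
    """Sketch of the reduced model: stimulus nodes drive GO/NOGO, the
    thresholded I node inhibits them, and only the I -> GO weight is
    plastic. Parameter values are assumptions for illustration."""
    W_MAX = 1.0

    def __init__(self, dnmts=False):
        self.dnmts = dnmts
        self.w_i_go = 0.5 * self.W_MAX    # plastic, initialised to half max
        self.w_i_nogo = 0.5 * self.W_MAX  # fixed
        self.lr = 0.2

    def go_for_novel(self):
        # Accommodation halves the drive of the repeated stimulus, so the
        # thresholded I node is active only when the novel one is presented.
        go = 1.0 - self.w_i_go + random.uniform(-0.25, 0.25)
        nogo = 1.0 - self.w_i_nogo
        return go > nogo

    def trial(self):
        went_novel = self.go_for_novel()
        # Weights change only when the non-repeated stimulus is chosen:
        # DMTS punishes that choice (raise inhibition), DNMTS rewards it.
        if went_novel:
            delta = -self.lr if self.dnmts else self.lr
            self.w_i_go = min(max(self.w_i_go + delta, 0.0), self.W_MAX)
        return went_novel == self.dnmts   # True if the choice was correct

model = ReducedModel(dnmts=False)         # match-to-sample
accuracy = sum(model.trial() for _ in range(60)) / 60
```

Run for 60 DMTS trials, the sketch settles on rejecting the novel stimulus, mirroring the direction (though not the published rate) of the learning shown in Fig 2 .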

The full model is shown in Fig 1 . Our model builds on a well established abstraction of the mushroom body circuit (see [ 26 , 27 , 44 ]) to model simple learning tasks.


Similarly to the reduced model, a decision is made when the GO EN subpopulation activity is greater than the NOGO EN subpopulation activity by a bias Rd , where R is a uniform random number in the range [−0.5, 0.5] and d increases by 10.0 every time a NOGO decision is made. To prevent early decisions, the sum of the whole EN population activity must be greater than 0.1.
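A sketch of this decision rule follows; the EN activity values in the example are arbitrary, chosen only to show how the growing bias eventually forces a GO:

```python
import random

random.seed(0)

def decide(go_activity, nogo_activity, d):
    """GO is chosen when GO activity exceeds NOGO activity by a bias R*d,
    with R uniform in [-0.5, 0.5]; very low total EN activity means it is
    too early to decide."""
    if go_activity + nogo_activity <= 0.1:
        return None                       # no decision yet
    r = random.uniform(-0.5, 0.5)
    return go_activity > nogo_activity + r * d

# d grows by 10.0 after every NOGO decision, widening the random bias
# until a GO is eventually produced even when NOGO activity is higher.
d = 0.0
history = []
for _ in range(20):
    choice = decide(1.0, 1.2, d)
    history.append(choice)
    if choice is False:
        d += 10.0
```

This prevents the model from refusing every option indefinitely: a run of NOGO decisions makes the random bias large enough that a GO becomes likely.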

Our challenge is to reproduce Giurfa et al.’s data demonstrating bees solving DMTS and DNMTS tasks [ 8 ]. To aid exploration of our model we simplify the task it must face, while retaining the key elements of the problem as faced by the honey bee. We therefore embody our model in a world described by a state machine. This simple world sidesteps several navigation problems associated with the real world; however, we believe that for the sufficiency proof we present here such simplifications are acceptable—the ability of the honey bee to form distinct and consistent neural representations of the training set stimuli as it flies through the maze is a prerequisite of solving the task, and is therefore assumed.

The experimental paradigm for our Y-maze task is shown in Fig 4 . The model bee is moved between a set of states which describe different locations in the Y-maze apparatus: at the entrance, in the central chamber facing the left arm, in the central chamber facing the right arm, in the left arm, in the right arm. When at the entrance or in the main chamber the bee is presented with a sensory input corresponding to one of the test stimuli. We can then set the test stimuli presented to match the requirements of a given trial (e.g. entrance (A), main chamber left (A), main chamber right (B) for DMTS when rewarding the left arm, or DNMTS when rewarding the right arm).
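The state machine can be sketched as follows. State names are ours, and for determinism this sketch fixes which arm is faced first after entering, where the model chooses at random:

```python
# States of the simplified Y-maze world (names are illustrative).
ENTRANCE, FACING_LEFT, FACING_RIGHT, LEFT_ARM, RIGHT_ARM = range(5)

def step(state, go):
    """One transition: GO enters the maze or the faced arm; NOGO waits at
    the entrance or turns to face the other arm."""
    if state == ENTRANCE:
        return FACING_LEFT if go else ENTRANCE   # first-faced arm fixed here
    if state == FACING_LEFT:
        return LEFT_ARM if go else FACING_RIGHT
    if state == FACING_RIGHT:
        return RIGHT_ARM if go else FACING_LEFT
    return state                                 # arm states end the trial

# Example DMTS trial with sample A: A at the entrance, A left, B right.
stimulus_at = {ENTRANCE: "A", FACING_LEFT: "A", FACING_RIGHT: "B"}
```

Each location maps to the stimulus the model perceives there, so the trial requirements are set simply by changing this mapping.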

Experimental environment


The IN neurons are divided into non-overlapping groups of 8 neurons, each representing a stimulus. These are:

  • Z: Stimulus for pretraining
  • A: Stimulus for training pair
  • B: Stimulus for training pair
  • C: Stimulus for transfer test pair
  • D: Stimulus for transfer test pair
  • E: Stimulus for second transfer test pair
  • F: Stimulus for second transfer test pair


DMTS / DNMTS experimental procedure

Models as animals.

We use the ‘models as animals’ principle to reproduce the experimental findings of [ 8 ], creating many variants of the model which perform differently in the experiments. To do this we change the random seed used to generate the connectivity c ij between the IN and the KC neurons. For these experiments we use 360 model bee variants, each of which is individually tested, as this matches the number of bees in [ 8 ].

Pretraining familiarisation.

As is undertaken in the experiments with real bees, we first familiarise our naive model bees with the experimental apparatus. This is done by first training ten rewarded repetitions of the bee entering the Y-maze with a stimulus not used in the experiment. In these cases the model does not choose between go and no-go; it is assumed that the first repetition represents the model finding the Y-maze and being rewarded heavily enough to complete the remaining repetitions. Following these ten repetitions the bee is trained with ten repetitions to travel to each of the two arms of the Y-maze. This procedure ensures that the bees will enter the maze and the two arms when the training begins, allowing them to learn the task.

The training procedure comprises 60 trials in total, divided into blocks of 10 trials. The protocol involves a repeated set of four trials: two trials with each stimulus at the maze entrance, with the stimulus matching the entrance presented on a different arm of the apparatus in each of the two trials. In the case of match-to-sample the entrance stimulus is rewarded and the non-entrance stimulus is punished, and vice versa for not-match-to-sample.
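The trial schedule above can be sketched as follows; the dictionary keys and function names are ours:

```python
from itertools import product

def training_schedule(n_trials=60):
    """Cycle through the repeated set of four trial types: each stimulus
    at the entrance, crossed with the matching stimulus on either arm."""
    trial_types = list(product("AB", ("left", "right")))
    return [dict(zip(("entrance", "match_arm"), trial_types[t % 4]))
            for t in range(n_trials)]

def rewarded_arm(trial, dnmts=False):
    """DMTS rewards the arm showing the entrance stimulus; DNMTS the other."""
    other = "right" if trial["match_arm"] == "left" else "left"
    return other if dnmts else trial["match_arm"]

schedule = training_schedule()
```

Cycling through the four trial types guarantees that, within every block, each stimulus serves equally often as the sample and the matching arm alternates sides.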

Transfer test.

For the transfer test we do not provide a reward or punishment, and test the models using the procedure for Training, substituting the transfer test stimuli for the training stimuli. Two sets of transfer stimuli are used, and four repetitions (left and right arm with each stimulus) are used for each set of stimuli.

Testing performance of the full model in other conditioning tasks

In addition to solving the DMTS and DNMTS tasks, we must validate that our proposed model can also perform a set of conditioning tasks that are associated with the mushroom bodies in bees, without our additional PCT circuits affecting performance. Importantly, these tasks are all performed with exactly the same model parameters that are used in the DMTS and DNMTS tasks, yet match the timescales and relative performances found in experiments performed on real bees. We choose four tasks, which comprise olfactory learning experiments using the proboscis extension reflex (PER) performed on restrained bees, as well as visual learning experiments performed with free flying bees ( Fig 3 ).


The model bee is moved between a set of states which describe different locations in the Y-maze apparatus ( A ): at the entrance ( B ), in the central chamber facing the left arm ( C ), in the central chamber facing the right arm ( D ), in the left arm or in the right arm ( E,F ). When at the entrance or in the main chamber the bee is presented with a sensory input corresponding to one of the test stimuli; GO selection leads the bee to enter the maze when at the entrance, and to enter an arm and experience a potential reward when facing that arm; NOGO leads the bee to delay entering the maze, or to choose another maze arm uniformly at random, respectively. We can then set the test stimuli presented to match the requirements of a given trial (e.g. entrance (A), main chamber left (A), main chamber right (B) for DMTS when rewarding the left arm, or DNMTS when rewarding the right arm).

https://doi.org/10.1371/journal.pcbi.1006435.g004
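The state transitions described in the legend above can be sketched as a minimal state engine. This is an illustrative reconstruction in plain Python; the state names and the `step` helper are our own, not taken from the paper's actual script:

```python
import random

# Hypothetical sketch of the Y-maze state engine described in the legend.
STATES = ["entrance", "facing_left", "facing_right", "left_arm", "right_arm"]

def step(state, action, rewarded_arm):
    """Advance the maze state given a GO/NOGO action.

    Returns (next_state, reward). GO enters the maze (at the entrance) or the
    faced arm; NOGO delays at the entrance, or re-samples a faced arm at random.
    """
    if state == "entrance":
        if action == "GO":
            return random.choice(["facing_left", "facing_right"]), 0
        return "entrance", 0  # NOGO: delay entering the maze
    if state in ("facing_left", "facing_right"):
        arm = "left_arm" if state == "facing_left" else "right_arm"
        if action == "GO":
            return arm, (1 if arm == rewarded_arm else 0)
        # NOGO: face a maze arm chosen uniformly at random
        return random.choice(["facing_left", "facing_right"]), 0
    return state, 0  # the arms are terminal states for a trial

state, reward = step("facing_left", "GO", rewarded_arm="left_arm")
```

Which arm the bee faces after entering is sampled uniformly here; the legend does not specify that detail, so it is an assumption of this sketch.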

Differential learning / reversal experimental procedure ( Fig 4 , panel B).

These experiments follow the same protocol as the DMTS experiments, except that for the first fifteen trials one stimulus is always rewarded when the associated arm is chosen (no reward or punishment is given for choosing the non-associated arm), and subsequent to trial fifteen the other stimulus is rewarded when the associated arm is chosen. No pretraining or transfer trials are performed and the data is analysed for each trial rather than in blocks of 10 due to the speed of learning acquisition. 200 virtual bees are used for this experiment (see Fig 4 , panel B for results, to be compared with [ 47 ]).
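The differential/reversal reward schedule above can be sketched as follows; the stimulus labels, helper names, and 1-indexed trial numbering are illustrative assumptions, not taken from the paper's code:

```python
def rewarded_stimulus(trial, first="A", second="B", switch_after=15):
    """Differential/reversal schedule: stimulus `first` is rewarded for the
    first `switch_after` trials, then reward switches to `second`."""
    return first if trial <= switch_after else second

def reward(trial, chosen_stimulus):
    # Choosing the non-associated arm yields neither reward nor punishment.
    return 1 if chosen_stimulus == rewarded_stimulus(trial) else 0
```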

Proboscis Extension Reflex (PER) experiments.

The Proboscis Extension Reflex (PER) is a classical conditioning experimental paradigm used with restrained bees. In this paradigm the bees are immobilised in small metal tubes with only the head and antennae exposed. Olfactory stimuli (conditioned stimuli) are then presented to the restrained bees in association with a sucrose solution reward (unconditioned stimulus) (see [ 46 ] for full details).

For the PER experiments we separate the IN neurons as described in Section 1; however, as the bees are restrained in these experiments, we present odors following a pre-defined protocol, and the choices of the bee do not affect this protocol.

Single odor learning experimental procedure ( Fig 4 , panel A).

In the single odor experiments we use the procedure outlined in Bitterman et al. [ 46 ]. In this procedure acquisition and testing occur simultaneously. The real bees are presented with an odor and, after a delay, rewarded with sucrose solution. If the animal extends its proboscis within the delay period, it is rewarded directly and considered to have responded; if it does not, the PER is invoked by touching the sucrose solution to the antennae and the animal is rewarded but considered not to have responded. To match this protocol the performance of the model was recorded at each trial, with NOGO considered a failure to respond to the stimulus, and GO a response. At each trial a reward was given regardless of the model’s performance.

Positive / negative patterning learning experimental procedure ( Fig 4 , panels C & D).

In these experiments we follow the protocol described in [ 48 ]. We divide the training into blocks, each containing four presentations of an odor or odor combination. For positive patterning we do not reward individual odors A and B, but reward the combination AB (A-B-AB+). In negative patterning we reward the odors A and B, but not the combination AB (A+B+AB-). In both cases the combined odor is presented twice for each presentation of the individual odors, so a block for positive patterning is [A-,AB+,B-,AB+] for example, while for negative patterning a block is [A+,AB-,B+,AB-]. Performance is assessed as for the single odor learning experiment, with the two combined odor responses averaged within each block.
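The block construction and within-block averaging described above might be sketched as follows (the function names and the (stimulus, reward) tuple representation are illustrative):

```python
def patterning_block(positive=True):
    """Build one training block of four presentations.

    Positive patterning: A- B- AB+ (only the compound rewarded);
    negative patterning: A+ B+ AB- (only the elements rewarded).
    The compound appears twice per block, matching the protocol above.
    """
    if positive:
        return [("A", 0), ("AB", 1), ("B", 0), ("AB", 1)]
    return [("A", 1), ("AB", 0), ("B", 1), ("AB", 0)]

def average_compound(responses):
    """Average the two compound-odor (AB) responses recorded within a block,
    as in the performance analysis; `responses` is a list of
    (stimulus, model_response) pairs."""
    ab = [resp for stim, resp in responses if stim == "AB"]
    return sum(ab) / len(ab)
```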

Software and implementation

The reduced model was simulated in GNU Octave [ 64 ]. The full model was created and simulated using the SpineML toolchain [ 65 ] and the SpineCreator graphical user interface [ 66 ]. These tools are open source and installation and usage information can be found on the SpineML website at http://spineml.github.io/ . Input vectors for the IN neurons and the state engine for navigation of the Y-maze apparatus are simulated using a custom script written in the Python programming language (Python Software Foundation, https://www.python.org/ ) interfaced to the model over a TCP/IP connection.

Statistical tests were performed as in [ 8 ] using 2×2 χ² tests carried out in R [ 67 ] using the chisq.test() function.
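For readers working in Python rather than R, the Pearson statistic for a 2×2 table can be computed directly. This minimal sketch omits Yates' continuity correction, which R's chisq.test() applies to 2×2 tables by default, so its value will differ slightly from R's corrected output:

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 contingency table
    [[a, b], [c, d]], without Yates' continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for obs, row, col in ((a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        stat += (obs - expected) ** 2 / expected
    return stat

chi2_2x2(10, 20, 30, 40)  # ≈ 0.794
```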

The code is available online at http://github.com/BrainsOnBoard/bee-concept-learning .

Acknowledgments

We thank Martin Giurfa, Thomas Nowotny and James Bennett for their constructive comments on the manuscript.

  • 2. Piaget J, Inhelder B. The psychology of the child. Basic books; 1969.
  • 3. Daehler MW, Greco C. Memory in very young children. In: Cognitive Learning and Memory in Children. Springer New York; 1985. p. 49–79. Available from: http://link.springer.com/10.1007/978-1-4613-9544-7_2 .
  • 5. James W. The principles of psychology. vol. 1. Holt; 1890.
  • 15. Baddeley AD, Hitch G. Working Memory. In: Psychology of Learning and Motivation. Elsevier; 1974. p. 47–89. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0079742108604521 .
  • 18. Strausfeld NJ. Arthropod brains: evolution, functional elegance, and historical significance. Cambridge, MA: Belknap Press of Harvard University Press; 2012.
  • 57. Premack D, Premack AJ. The mind of an ape; 1983.
  • 59. Lloyd Morgan C. An introduction to comparative psychology. London: W Scott Publishing Co; 1903.
  • 60. Sober E. Ockham’s razors. Cambridge University Press; 2015.
  • 64. Eaton JW, Bateman D, Hauberg S, Wehbring R. GNU Octave version 4.0.0 manual: a high-level interactive language for numerical computations; 2015. Available from: http://www.gnu.org/software/octave/doc/interpreter .
  • 66. Cope AJ, Richmond P, James SS, Gurney K, Allerton DJ. SpineCreator: a graphical user interface for the creation of layered neural models. In press; 2015.
  • 67. R Core Team. R: A Language and Environment for Statistical Computing; 2013. Available from: http://www.r-project.org/ .


Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions

  • Review Article
  • Published: 18 August 2021
  • Volume 2, article number 420 (2021)


  • Iqbal H. Sarker (ORCID: orcid.org/0000-0003-1740-5517)


Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI), is nowadays considered a core technology of the Fourth Industrial Revolution (4IR or Industry 4.0). Owing to its capability of learning from data, DL technology, which originated from artificial neural networks (ANNs), has become a hot topic in computing and is widely applied in application areas such as healthcare, visual recognition, text analytics, cybersecurity, and many more. However, building an appropriate DL model is a challenging task, due to the dynamic nature of and variations in real-world problems and data. Moreover, the lack of core understanding turns DL methods into black-box machines that hamper development at the standard level. This article presents a structured and comprehensive view of DL techniques, including a taxonomy considering various types of real-world tasks, such as supervised and unsupervised learning. In our taxonomy, we take into account deep networks for supervised or discriminative learning, unsupervised or generative learning, as well as hybrid learning and relevant others. We also summarize real-world application areas where deep learning techniques can be used. Finally, we point out ten potential aspects for future-generation DL modeling, with research directions. Overall, this article aims to draw a big picture of DL modeling that can be used as a reference guide by both academia and industry professionals.



Introduction

In the late 1980s, neural networks became a prevalent topic in the areas of Machine Learning (ML) and Artificial Intelligence (AI), owing to the invention of various efficient learning methods and network structures [ 52 ]. Multilayer perceptron networks trained by “Backpropagation”-type algorithms, self-organizing maps, and radial basis function networks were among these innovative methods [ 26 , 36 , 37 ]. Although neural networks were successfully used in many applications, interest in the topic later declined. Then, in 2006, “Deep Learning” (DL) was introduced by Hinton et al. [ 41 ], based on the concept of the artificial neural network (ANN). Deep learning became a prominent topic thereafter, resulting in a rebirth of neural network research; hence it is sometimes referred to as “new-generation neural networks”. This is because deep networks, when properly trained, have produced significant success in a variety of classification and regression challenges [ 52 ].

Nowadays, DL technology is considered one of the hot topics within machine learning, artificial intelligence, and data science and analytics, due to its capability of learning from the given data. Many corporations, including Google, Microsoft, and Nokia, study it actively, as it can provide significant results in different classification and regression problems and datasets [ 52 ]. In terms of working domain, DL is considered a subset of ML and AI, and thus DL can be seen as an AI function that mimics the human brain’s processing of data. The worldwide popularity of “deep learning” is increasing day by day, as shown in our earlier paper [ 96 ] based on historical data collected from Google Trends [ 33 ]. Deep learning differs from standard machine learning in terms of efficiency as the volume of data increases, discussed briefly in Section “ Why Deep Learning in Today's Research and Applications? ”. DL technology uses multiple layers to represent the abstractions of data to build computational models. While deep learning takes a long time to train a model due to its large number of parameters, it takes a short amount of time to run during testing compared to other machine learning algorithms [ 127 ].

While today’s Fourth Industrial Revolution (4IR or Industry 4.0) typically focuses on technology-driven “automation, smart and intelligent systems”, DL technology, which originated from ANNs, has become one of the core technologies for achieving this goal [ 103 , 114 ]. A typical neural network is mainly composed of many simple, connected processing elements or processors called neurons, each of which generates a series of real-valued activations for the target outcome. Figure 1 shows a schematic representation of the mathematical model of an artificial neuron, i.e., processing element, highlighting input ( \(X_i\) ), weight ( w ), bias ( b ), summation function ( \(\sum\) ), activation function ( f ) and corresponding output signal ( y ). Neural network-based DL technology is now widely applied in many fields and research areas such as healthcare, sentiment analysis, natural language processing, visual recognition, business intelligence, cybersecurity, and many more, as summarized in the latter part of this paper.

figure 1

Schematic representation of the mathematical model of an artificial neuron (processing element), highlighting input ( \(X_i\) ), weight ( w ), bias ( b ), summation function ( \(\sum\) ), activation function ( f ) and output signal ( y )
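The neuron model of Fig. 1 can be written directly from its components. A minimal sketch in plain Python, assuming a sigmoid activation for f (the activation choice is an assumption; any transfer function may be substituted):

```python
import math

def neuron(x, w, b, f=lambda s: 1.0 / (1.0 + math.exp(-s))):
    """Artificial neuron of Fig. 1: weighted sum of the inputs plus bias,
    passed through an activation (transfer) function, sigmoid by default."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b  # summation function
    return f(s)                                   # output signal y

y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)  # sigmoid(0.0) = 0.5
```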

Although DL models are successfully applied in various application areas, as mentioned above, building an appropriate deep learning model is a challenging task, due to the dynamic nature of and variations in real-world problems and data. Moreover, DL models are typically considered “black-box” machines that hamper the standard development of deep learning research and applications. Thus, for clear understanding, in this paper we present a structured and comprehensive view of DL techniques considering the variations in real-world problems and tasks. To achieve our goal, we briefly discuss various DL techniques and present a taxonomy that takes into account three major categories: (i) deep networks for supervised or discriminative learning, which are utilized to provide a discriminative function in supervised deep learning or classification applications; (ii) deep networks for unsupervised or generative learning, which are used to characterize the high-order correlation properties or features for pattern analysis or synthesis, and can thus be used as preprocessing for supervised algorithms; and (iii) deep networks for hybrid learning, which integrate both supervised and unsupervised models, and relevant others. We take into account such categories based on the nature and learning capabilities of different DL techniques and how they are used to solve problems in real-world applications [ 97 ]. Moreover, identifying key research issues and prospects, including effective data representation, new algorithm design, data-driven hyper-parameter learning and model optimization, integrating domain knowledge, adapting to resource-constrained devices, etc., is one of the key targets of this study, which can lead to “Future Generation DL-Modeling”. Thus, the goal of this paper is to serve as a reference guide for those in academia and industry who want to research and develop data-driven smart and intelligent systems based on DL techniques.

The overall contribution of this paper is summarized as follows:

This article focuses on different aspects of deep learning modeling, i.e., the learning capabilities of DL techniques in different dimensions such as supervised or unsupervised tasks, to function in an automated and intelligent manner, which can play as a core technology of today’s Fourth Industrial Revolution (Industry 4.0).

We explore a variety of prominent DL techniques and present a taxonomy by taking into account the variations in deep learning tasks and how they are used for different purposes. In our taxonomy, we divide the techniques into three major categories such as deep networks for supervised or discriminative learning, unsupervised or generative learning, as well as deep networks for hybrid learning, and relevant others.

We have summarized several potential real-world application areas of deep learning, to assist developers as well as researchers in broadening their perspectives on DL techniques. Different categories of DL techniques highlighted in our taxonomy can be used to solve various issues accordingly.

Finally, we point out and discuss ten potential aspects with research directions for future generation DL modeling in terms of conducting future research and system development.

This paper is organized as follows. Section “ Why Deep Learning in Today's Research and Applications? ” motivates why deep learning is important for building data-driven intelligent systems. In Section “ Deep Learning Techniques and Applications ”, we present our DL taxonomy by taking into account the variations of deep learning tasks and how they are used in solving real-world issues, and briefly discuss the techniques, summarizing their potential application areas. In Section “ Research Directions and Future Aspects ”, we discuss various research issues of deep learning-based modeling and highlight the promising topics for future research within the scope of our study. Finally, Section “ Concluding Remarks ” concludes this paper.

Why Deep Learning in Today’s Research and Applications?

The main focus of today’s Fourth Industrial Revolution (Industry 4.0) is typically technology-driven automation and smart, intelligent systems, in various application areas including smart healthcare, business intelligence, smart cities, cybersecurity intelligence, and many more [ 95 ]. Deep learning approaches have grown dramatically in terms of performance in a wide range of applications, including security technologies, particularly as an excellent solution for uncovering complex structure in high-dimensional data. Thus, DL techniques can play a key role in building intelligent data-driven systems according to today’s needs, because of their excellent learning capabilities from historical data. Consequently, DL can change the world, as well as humans’ everyday life, through its automation power and learning from experience. DL technology is therefore relevant to artificial intelligence [ 103 ], machine learning [ 97 ] and data science with advanced analytics [ 95 ], which are well-known areas in computer science, particularly today’s intelligent computing. In the following, we first discuss the position of deep learning in AI, i.e., how DL technology is related to these areas of computing.

The Position of Deep Learning in AI

Nowadays, artificial intelligence (AI), machine learning (ML), and deep learning (DL) are three popular terms that are sometimes used interchangeably to describe systems or software that behave intelligently. In Fig. 2 , we illustrate the position of deep learning compared with machine learning and artificial intelligence. According to Fig. 2 , DL is a part of ML as well as a part of the broader area of AI. In general, AI incorporates human behavior and intelligence into machines or systems [ 103 ], while ML is the method of learning from data or experience [ 97 ], which automates analytical model building. DL, in turn, represents methods that learn from data, where the computation is performed through multi-layer neural networks. The term “Deep” in the deep learning methodology refers to the concept of multiple levels or stages through which data is processed in building a data-driven model.

figure 2

An illustration of the position of deep learning (DL), comparing with machine learning (ML) and artificial intelligence (AI)

Thus, DL can be considered one of the core technologies of AI, a frontier for artificial intelligence, which can be used for building intelligent systems and automation. More importantly, it pushes AI to a new level, termed “Smarter AI”. As DL is capable of learning from data, it is also strongly related to “Data Science” [ 95 ]. Typically, data science represents the entire process of finding meaning or insights in data in a particular problem domain, where DL methods can play a key role for advanced analytics and intelligent decision-making [ 104 , 106 ]. Overall, we can conclude that DL technology is capable of changing the current world, particularly as a powerful computational engine, and of contributing to technology-driven automation and smart, intelligent systems, thereby meeting the goals of Industry 4.0.

Understanding Various Forms of Data

As DL models learn from data, an in-depth understanding and representation of data are important to build a data-driven intelligent system in a particular application area. In the real world, data can be in various forms, which typically can be represented as below for deep learning modeling:

Sequential Data Sequential data is any kind of data where the order matters, i.e., a set of sequences. Building a model on such data needs to explicitly account for the sequential nature of the input. Text streams, audio fragments, video clips, and time-series data are some examples of sequential data.

Image or 2D Data A digital image is made up of a matrix, i.e., a rectangular 2D array of numbers, symbols, or expressions arranged in rows and columns. Matrix, pixels, voxels, and bit depth are the four essential characteristics or fundamental parameters of a digital image.

Tabular Data A tabular dataset consists primarily of rows and columns. Thus tabular datasets contain data in a columnar format as in a database table. Each column (field) must have a name and each column may only contain data of the defined type. Overall, it is a logical and systematic arrangement of data in the form of rows and columns that are based on data properties or features. Deep learning models can learn efficiently on tabular data and allow us to build data-driven intelligent systems.

The above-discussed data forms are common in the real-world application areas of deep learning. Different categories of DL techniques perform differently depending on the nature and characteristics of the data, as discussed briefly in Section “ Deep Learning Techniques and Applications ” with a taxonomy presentation. However, in many real-world application areas, the standard machine learning techniques, particularly logic-rule or tree-based techniques [ 93 , 101 ], can perform well, depending on the nature of the application. Figure 3 shows the performance comparison of DL and ML modeling considering the amount of data. In the following, we highlight several cases where deep learning is useful for solving real-world problems, according to the main focus of this paper.

DL Properties and Dependencies

A DL model typically follows the same processing stages as machine learning modeling. In Fig. 4 , we show a deep learning workflow for solving real-world problems, which consists of three processing steps: data understanding and preprocessing; DL model building and training; and validation and interpretation. However, unlike ML modeling [ 98 , 108 ], feature extraction in the DL model is automated rather than manual. K-nearest neighbors, support vector machines, decision trees, random forests, naive Bayes, linear regression, association rules, and k-means clustering are some examples of machine learning techniques commonly used in various application areas [ 97 ]. On the other hand, DL models include the convolutional neural network, recurrent neural network, autoencoder, deep belief network, and many more, discussed briefly with their potential application areas in Section 3 . In the following, we discuss the key properties and dependencies of DL techniques that need to be taken into account before starting work on DL modeling for real-world applications.

figure 3

An illustration of the performance comparison between deep learning (DL) and other machine learning (ML) algorithms, where DL modeling from large amounts of data can increase the performance

Data Dependencies Deep learning is typically dependent on a large amount of data to build a data-driven model for a particular problem domain. The reason is that when the data volume is small, deep learning algorithms often perform poorly [ 64 ]. In such circumstances, however, the performance of the standard machine-learning algorithms will be improved if the specified rules are used [ 64 , 107 ].

Hardware Dependencies The DL algorithms require large computational operations while training a model with large datasets. The larger the computation, the greater the advantage of a GPU over a CPU, so GPUs are mostly used to carry out the operations efficiently. Thus, suitable GPU hardware is necessary for deep learning training to work properly. Therefore, DL relies more on high-performance machines with GPUs than standard machine learning methods do [ 19 , 127 ].

Feature Engineering Process Feature engineering is the process of extracting features (characteristics, properties, and attributes) from raw data using domain knowledge. A fundamental distinction between DL and other machine-learning techniques is the attempt to extract high-level characteristics directly from data [ 22 , 97 ]. Thus, DL decreases the time and effort required to construct a feature extractor for each problem.

Model Training and Execution time In general, training a deep learning algorithm takes a long time due to the large number of parameters in the DL algorithm. For instance, DL models can take more than one week to complete a training session, whereas training with ML algorithms takes relatively little time, from seconds to hours [ 107 , 127 ]. During testing, however, deep learning algorithms take extremely little time to run compared to certain machine learning methods [ 127 ].

Black-box Perception and Interpretability Interpretability is an important factor when comparing DL with ML. It is difficult to explain how a deep learning result was obtained, i.e., the model is a “black box”. On the other hand, machine learning algorithms, particularly rule-based machine learning techniques [ 97 ], provide explicit logic rules (IF-THEN) for making decisions that are easily interpretable by humans. For instance, in our earlier works we have presented several rule-based machine learning techniques [ 100 , 102 , 105 ], where the extracted rules are human-understandable and easier to interpret, update or delete according to the target applications.

The most significant distinction between deep learning and regular machine learning is how well it performs as data grows exponentially. An illustration of the performance comparison between DL and standard ML algorithms is shown in Fig. 3 , where DL performance increases with the amount of data. Thus, DL modeling is extremely useful when dealing with large amounts of data because of its capacity to process vast numbers of features to build an effective data-driven model. In terms of developing and training DL models, deep learning relies on parallelized matrix and tensor operations as well as on computing gradients and optimization. Several DL libraries and resources [ 30 ], such as PyTorch [ 82 ] (with a high-level API called Lightning) and TensorFlow [ 1 ] (which also offers Keras as a high-level API), provide these core utilities, including many pre-trained models, as well as many other functions necessary for implementation and DL model building.
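As a dependency-free illustration of the gradient computation and optimization that these libraries automate, the following sketches one stochastic-gradient-descent update for a single linear neuron under squared-error loss (the function and parameter names are our own):

```python
def sgd_step(w, b, x, y_true, lr=0.1):
    """One SGD step for a single linear neuron y = w.x + b with
    squared-error loss L = (y - y_true)^2 / 2; a hand-written stand-in
    for the automatic differentiation DL libraries provide."""
    y = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = y - y_true                                   # dL/dy
    w = [wi - lr * err * xi for wi, xi in zip(w, x)]   # dL/dwi = err * xi
    b = b - lr * err                                   # dL/db = err
    return w, b
```

Frameworks such as PyTorch and TensorFlow generalize exactly this pattern to millions of parameters via parallelized tensor operations.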

figure 4

A typical DL workflow to solve real-world problems, which consists of three sequential stages (i) data understanding and preprocessing (ii) DL model building and training (iii) validation and interpretation

Deep Learning Techniques and Applications

In this section, we go through the various types of deep neural network techniques, which typically consider several layers of information-processing stages in hierarchical structures to learn. A typical deep neural network contains multiple hidden layers, including input and output layers. Figure 5 shows the general structure of a deep neural network ( \(hidden \; layer=N\) and N \(\ge\) 2) compared with a shallow network ( \(hidden \; layer=1\) ). We also present our taxonomy of DL techniques in this section, based on how they are used to solve various problems. However, before exploring the details of the DL techniques, it is useful to review the various types of learning tasks: (i) supervised, a task-driven approach that uses labeled training data; (ii) unsupervised, a data-driven process that analyzes unlabeled datasets; (iii) semi-supervised, a hybridization of the supervised and unsupervised methods; and (iv) reinforcement, an environment-driven approach, discussed briefly in our earlier paper [ 97 ]. Thus, to present our taxonomy, we divide DL techniques broadly into three major categories: (i) deep networks for supervised or discriminative learning; (ii) deep networks for unsupervised or generative learning; and (iii) deep networks for hybrid learning, combining both, and relevant others, as shown in Fig. 6 . In the following, we briefly discuss each of these techniques, which can be used to solve real-world problems in various application areas according to their learning capabilities.

figure 5

A general architecture of a a shallow network with one hidden layer and b a deep neural network with multiple hidden layers

figure 6

A taxonomy of DL techniques, broadly divided into three major categories: (i) deep networks for supervised or discriminative learning, (ii) deep networks for unsupervised or generative learning, and (iii) deep networks for hybrid learning and relevant others

Deep Networks for Supervised or Discriminative Learning

This category of DL techniques is utilized to provide a discriminative function in supervised or classification applications. Discriminative deep architectures are typically designed to give discriminative power for pattern classification by describing the posterior distributions of classes conditioned on visible data [ 21 ]. Discriminative architectures mainly include Multi-Layer Perceptron (MLP), Convolutional Neural Networks (CNN or ConvNet), Recurrent Neural Networks (RNN), along with their variants. In the following, we briefly discuss these techniques.

Multi-layer Perceptron (MLP)

Multi-layer Perceptron (MLP), a supervised learning approach [ 83 ], is a type of feedforward artificial neural network (ANN). It is also known as the foundation architecture of deep neural networks (DNN), or deep learning. A typical MLP is a fully connected network that consists of an input layer that receives the input data, an output layer that makes a decision or prediction about the input signal, and one or more hidden layers between these two that are considered the network’s computational engine [ 36 , 103 ]. The output of an MLP network is determined using a variety of activation functions, also known as transfer functions, such as ReLU (Rectified Linear Unit), Tanh, Sigmoid, and Softmax [ 83 , 96 ]. For training, MLP employs the most extensively used algorithm, “Backpropagation” [ 36 ], a supervised learning technique that is also known as the most basic building block of a neural network. During the training process, various optimization approaches such as Stochastic Gradient Descent (SGD), Limited-memory BFGS (L-BFGS), and Adaptive Moment Estimation (Adam) are applied. MLP requires tuning of several hyperparameters, such as the number of hidden layers, neurons, and iterations, which can make solving a complicated model computationally expensive. However, through partial fitting, MLP offers the advantage of learning non-linear models in real time or online [ 83 ].
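A forward pass through a one-hidden-layer MLP of the kind described above can be sketched in plain Python. Weights are supplied by the caller, and training via backpropagation is omitted; this is an illustration of the layer structure, not a library implementation:

```python
import math

def relu(v):
    # ReLU activation applied element-wise to the hidden layer.
    return [max(0.0, s) for s in v]

def softmax(v):
    # Softmax turns output-layer scores into class probabilities.
    m = max(v)
    e = [math.exp(s - m) for s in v]
    t = sum(e)
    return [s / t for s in e]

def dense(x, W, b):
    # Fully connected layer: one weighted sum plus bias per output unit.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass: input -> dense -> ReLU -> dense -> softmax."""
    return softmax(dense(relu(dense(x, W1, b1)), W2, b2))
```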

Convolutional Neural Network (CNN or ConvNet)

The Convolutional Neural Network (CNN or ConvNet) [ 65 ] is a popular discriminative deep learning architecture that learns directly from the input without the need for human feature extraction. Figure 7 shows an example of a CNN including multiple convolution and pooling layers. The CNN thus enhances the design of traditional ANNs such as regularized MLP networks. Each layer in a CNN takes into account optimum parameters for a meaningful output as well as reducing model complexity. A CNN also uses ‘dropout’ [ 30 ], which can deal with the problem of over-fitting that may occur in a traditional network.

figure 7

An example of a convolutional neural network (CNN or ConvNet) including multiple convolution and pooling layers

CNNs are specifically intended to deal with a variety of 2D shapes and are thus widely employed in visual recognition, medical image analysis, image segmentation, natural language processing, and many more [ 65 , 96 ]. The capability of automatically discovering essential features from the input without the need for human intervention makes them more powerful than a traditional network. Several variants of CNN exist in the area, including the visual geometry group (VGG) [ 38 ], AlexNet [ 62 ], Xception [ 17 ], Inception [ 116 ], ResNet [ 39 ], etc., which can be used in various application domains according to their learning capabilities.
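The core convolution and pooling operations of Fig. 7 can be sketched in plain Python. This minimal 'valid' cross-correlation and 2×2 max-pooling pair is illustrative, not a library implementation (real CNN layers add channels, strides, padding, and learned kernels):

```python
def conv2d_valid(img, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image and
    take the weighted sum at each position (the core CNN layer operation)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(kernel[a][b] * img[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool2(img):
    """Non-overlapping 2x2 max pooling, as in the pooling layers of Fig. 7."""
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]
```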

Recurrent Neural Network (RNN) and its Variants

A Recurrent Neural Network (RNN) is another popular neural network, which employs sequential or time-series data and feeds the output from the previous step as input to the current stage [ 27 , 74 ]. Like feedforward networks and CNNs, recurrent networks learn from training input; however, they are distinguished by their “memory”, which allows information from previous inputs to influence the current input and output. Unlike a typical DNN, which assumes that inputs and outputs are independent of one another, the output of an RNN is reliant on prior elements within the sequence. However, standard recurrent networks suffer from vanishing gradients, which makes learning long data sequences challenging. In the following, we discuss several popular variants of the recurrent network that minimize these issues and perform well in many real-world application domains.

Long short-term memory (LSTM) This is a popular form of RNN architecture, introduced by Hochreiter and Schmidhuber [ 42 ], that uses special units to deal with the vanishing gradient problem. A memory cell in an LSTM unit can store data for long periods, and the flow of information into and out of the cell is managed by three gates. The 'Forget Gate' determines what information from the previous cell state will be kept and what will be discarded as no longer useful, the 'Input Gate' determines which information should enter the cell state, and the 'Output Gate' determines and controls the output. Because it addresses the difficulty of training a recurrent network, the LSTM is considered one of the most successful RNN variants.
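One time step of an LSTM cell can be sketched directly from the three-gate description above. This is a toy NumPy forward pass with random weights (no training loop), using the common convention of stacking the four gate transforms into single matrices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the four gate transforms."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations for all four parts
    f = sigmoid(z[0:n])             # forget gate: what to erase from the cell
    i = sigmoid(z[n:2*n])           # input gate: what to write to the cell
    o = sigmoid(z[2*n:3*n])         # output gate: what to expose as h
    g = np.tanh(z[3*n:4*n])         # candidate cell content
    c = f * c_prev + i * g          # new cell state (long-term memory)
    h = o * np.tanh(c)              # new hidden state (short-term output)
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in))
U = rng.normal(scale=0.1, size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # run over a 5-step toy sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

The additive update `c = f * c_prev + i * g` is what lets gradients flow over long time spans: when the forget gate is near 1, the cell state is carried forward almost unchanged.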

Bidirectional RNN/LSTM Bidirectional RNNs connect two hidden layers that run in opposite directions to a single output, allowing them to use information from both the past and the future. Unlike traditional recurrent networks, bidirectional RNNs are trained on the positive and negative time directions simultaneously. A Bidirectional LSTM, often known as a BiLSTM, is an extension of the standard LSTM that can improve model performance on sequence classification problems [ 113 ]. It is a sequence processing model comprising two LSTMs: one processes the input forward and the other backward. The bidirectional LSTM in particular is a popular choice in natural language processing tasks.

Gated recurrent units (GRUs) A Gated Recurrent Unit (GRU), introduced by Cho et al. [ 16 ], is another popular variant of the recurrent network that uses gating mechanisms to control and manage the flow of information between cells. A GRU is similar to an LSTM but has fewer parameters: it has a reset gate and an update gate but lacks the output gate, as shown in Fig. 8 . Thus, the key difference is that a GRU has two gates (reset and update) whereas an LSTM has three (input, output, and forget). The GRU's structure enables it to capture dependencies from long sequences of data in an adaptive manner, without discarding information from earlier parts of the sequence. The GRU is therefore a slightly more streamlined variant that often offers comparable performance and is significantly faster to compute [ 18 ]. Although GRUs have been shown to perform better on certain smaller and less frequent datasets [ 18 , 34 ], both RNN variants have proven effective in practice.

figure 8

Basic structure of a gated recurrent unit (GRU) cell consisting of reset and update gates
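The two-gate structure of Fig. 8 can likewise be written as a single time step. A toy NumPy sketch with random weights (no training), with the reset- and update-gate transforms stacked into one matrix for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    """One GRU time step with only reset and update gates (cf. Fig. 8)."""
    n = h_prev.shape[0]
    z_pre = W[:2*n] @ x + U[:2*n] @ h_prev + b[:2*n]
    r = sigmoid(z_pre[:n])              # reset gate: how much past to use
    z = sigmoid(z_pre[n:])              # update gate: how much to overwrite
    h_tilde = np.tanh(W[2*n:] @ x + U[2*n:] @ (r * h_prev) + b[2*n:])
    return (1 - z) * h_prev + z * h_tilde  # interpolate old and candidate state

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = rng.normal(scale=0.1, size=(3 * n_hid, n_in))
U = rng.normal(scale=0.1, size=(3 * n_hid, n_hid))
b = np.zeros(3 * n_hid)
h = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):    # 5-step toy sequence
    h = gru_step(x, h, W, U, b)
```

Compared with the LSTM step, there is no separate cell state: the hidden state itself is carried forward through the `(1 - z)` path, which is why the GRU needs fewer parameters.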

Overall, the basic property of a recurrent network is that it has at least one feedback connection, which enables activations to loop. This allows the network to do temporal processing and sequence learning, such as sequence recognition or reproduction, temporal association or prediction, etc. Popular application areas of recurrent networks include prediction problems, machine translation, natural language processing, text summarization, speech recognition, and many more.

Deep Networks for Generative or Unsupervised Learning

This category of DL techniques is typically used to characterize the high-order correlation properties or features for pattern analysis or synthesis, as well as the joint statistical distributions of the visible data and their associated classes [ 21 ]. The key idea of generative deep architectures is that precise supervisory information, such as target class labels, is not of concern during the learning process. As a result, the methods in this category are essentially applied to unsupervised learning, as they are typically used for feature learning or for data generation and representation [ 20 , 21 ]. Generative modeling can also serve as a preprocessing step for supervised learning tasks, improving the accuracy of the discriminative model. Commonly used deep neural network techniques for unsupervised or generative learning are the Generative Adversarial Network (GAN), Autoencoder (AE), Restricted Boltzmann Machine (RBM), Self-Organizing Map (SOM), and Deep Belief Network (DBN), along with their variants.

Generative Adversarial Network (GAN)

A Generative Adversarial Network (GAN), introduced by Goodfellow et al. [ 32 ], is a type of neural network architecture for generative modeling that creates new plausible samples on demand. It automatically discovers and learns regularities or patterns in the input data so that the model can generate new examples that plausibly belong to the original dataset. As shown in Fig. 9 , a GAN is composed of two neural networks: a generator G that creates new data with properties similar to the original data, and a discriminator D that predicts the likelihood that a given sample was drawn from the actual data rather than produced by the generator. Thus in GAN modeling, the generator and discriminator are trained to compete with each other: while the generator tries to fool the discriminator by creating ever more realistic data, the discriminator tries to distinguish the genuine data from the fake data generated by G .

figure 9

Schematic structure of a standard generative adversarial network (GAN)

GANs are generally designed for unsupervised learning tasks, but they have also proven useful for semi-supervised and reinforcement learning, depending on the task [ 3 ]. GANs are also used in state-of-the-art transfer learning research to enforce the alignment of the latent feature space [ 66 ]. Inverse models, such as the Bidirectional GAN (BiGAN) [ 25 ], can also learn a mapping from data to the latent space, similar to how the standard GAN model learns a mapping from a latent space to the data distribution. The potential application areas of GANs include healthcare, image analysis, data augmentation, video generation, voice generation, pandemics, traffic control, cybersecurity, and many more, and are expanding rapidly. Overall, GANs have established themselves as a comprehensive domain for independent data expansion and as a solution to problems requiring a generative approach.
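The adversarial game of Fig. 9 can be illustrated at toy scale. The sketch below is a deliberately simplified 1-D GAN under strong assumptions: the "generator" is a linear map G(z) = a·z + b, the "discriminator" is a single logistic unit, and the gradients are derived by hand rather than by a deep learning framework. It is meant only to show the alternating ascent on the two objectives, not a realistic deep GAN:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
a, b = 1.0, 0.0   # generator G(z) = a*z + b; real data is N(3, 1)
w, c = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + c)
lr = 0.05

for step in range(2000):
    real = rng.normal(3.0, 1.0, size=64)
    z = rng.normal(size=64)
    fake = a * z + b
    # --- discriminator ascent on log D(real) + log(1 - D(fake)) ---
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    grad_c = np.mean(1 - d_real) - np.mean(d_fake)
    w, c = w + lr * grad_w, c + lr * grad_c
    # --- generator ascent on log D(fake) (the 'non-saturating' loss) ---
    d_fake = sigmoid(w * fake + c)
    a = a + lr * np.mean((1 - d_fake) * w * z)
    b = b + lr * np.mean((1 - d_fake) * w)
    # over training, the generator offset b drifts toward the real mean
```

Even this toy version exhibits typical GAN behavior: the discriminator's boundary pushes the generator's output distribution toward the real one, and the two objectives settle into an oscillating equilibrium rather than a fixed minimum.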

Auto-Encoder (AE) and Its Variants

An auto-encoder (AE) [ 31 ] is a popular unsupervised learning technique in which neural networks are used to learn representations. Auto-encoders are typically used with high-dimensional data, where dimensionality reduction yields a compact representation of the data. An autoencoder has three parts: encoder, code, and decoder. The encoder compresses the input and generates the code, which the decoder subsequently uses to reconstruct the input. AEs have recently also been used to learn generative data models [ 69 ]. The auto-encoder is widely used in many unsupervised learning tasks, e.g., dimensionality reduction, feature extraction, efficient coding, generative modeling, denoising, anomaly or outlier detection, etc. [ 31 , 132 ]. Principal component analysis (PCA) [ 99 ], which is also used to reduce the dimensionality of huge datasets, is essentially equivalent to a single-layered AE with a linear activation function. Regularized autoencoders such as sparse, denoising, and contractive autoencoders are useful for learning representations for later classification tasks [ 119 ], while variational autoencoders can be used as generative models [ 56 ], as discussed below.
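The encoder-code-decoder idea, and its kinship with PCA, can be shown with a minimal linear autoencoder trained by plain gradient descent on synthetic 2-D data that lies close to a 1-D line. This is an illustrative sketch (tiny data, hand-written gradients), not a practical implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# 2-D data lying close to a 1-D line: a 1-unit code can capture most of it
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(200, 2))
n = len(X)

W_enc = rng.normal(scale=0.1, size=(2, 1))  # encoder: 2-D input -> 1-D code
W_dec = rng.normal(scale=0.1, size=(1, 2))  # decoder: 1-D code -> 2-D output
lr = 0.05
loss0 = np.mean((X @ W_enc @ W_dec - X) ** 2)  # reconstruction error at start

for _ in range(1000):
    code = X @ W_enc                  # encode
    err = code @ W_dec - X            # reconstruction residual
    g_dec = 2 * code.T @ err / n      # gradient of mean squared error w.r.t. decoder
    g_enc = 2 * X.T @ (err @ W_dec.T) / n
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

loss = np.mean((X @ W_enc @ W_dec - X) ** 2)  # error after training
```

With linear activations and a squared-error objective, the learned 1-D code spans (up to scaling) the top principal component of the data, which is exactly the PCA equivalence noted above.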

Sparse Autoencoder (SAE) A sparse autoencoder [ 73 ] adds a sparsity penalty on the coding layer as part of its training objective. SAEs may have more hidden units than inputs, but only a small number of hidden units are permitted to be active at the same time, resulting in a sparse model. Figure 10 shows a schematic structure of a sparse autoencoder with several active units in the hidden layer. Under this constraint, the model is obliged to respond to the unique statistical features of the training data.

Denoising Autoencoder (DAE) A denoising autoencoder is a variant of the basic autoencoder that attempts to improve the representation (to extract useful features) by altering the reconstruction criterion, and thus reduces the risk of learning the identity function [ 31 , 119 ]. In other words, it receives a corrupted data point as input and is trained to recover the original undistorted input as its output by minimizing the average reconstruction error over the training data, i.e., cleaning, or denoising, the corrupted input. DAEs can thus be considered powerful filters for automatic pre-processing; a denoising autoencoder could, for example, automatically pre-process an image, thereby boosting recognition accuracy.

Contractive Autoencoder (CAE) The idea behind a contractive autoencoder, proposed by Rifai et al. [ 90 ], is to make the autoencoder robust to small changes in the training dataset. In its objective function, a CAE includes an explicit regularizer that forces the model to learn an encoding that is robust to small changes in input values. As a result, the learned representation's sensitivity to the training input is reduced. While DAEs encourage the robustness of the reconstruction, as discussed above, CAEs encourage the robustness of the representation.

Variational Autoencoder (VAE) A variational autoencoder [ 55 ] has a fundamentally different character from the classical autoencoders discussed above, which makes it effective for generative modeling. Unlike traditional autoencoders, which map the input onto a latent vector, VAEs map the input data to the parameters of a probability distribution, such as the mean and variance of a Gaussian. A VAE assumes that the source data follows some underlying probability distribution and tries to discover the parameters of that distribution. Although this approach was initially designed for unsupervised learning, its use has also been demonstrated in semi-supervised learning [ 128 ] and supervised learning [ 51 ].
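Two VAE-specific ingredients can be shown concretely: sampling the latent code via the reparameterization trick, and the KL-divergence penalty that keeps the learned distribution close to a standard normal prior. The `encode` function below is a hypothetical stand-in (a real VAE learns it as a neural network); the sampling line and the KL formula are the standard ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    """Stand-in encoder: maps an input to the mean and log-variance of a
    Gaussian over a 2-D latent code (a real VAE learns this mapping)."""
    mu = np.tanh(x[:2])
    log_var = -np.abs(x[2:4])
    return mu, log_var

x = rng.normal(size=4)                  # a toy 4-D input
mu, log_var = encode(x)
eps = rng.normal(size=mu.shape)
# reparameterization trick: z ~ N(mu, sigma^2), but differentiable in mu, sigma
z = mu + np.exp(0.5 * log_var) * eps

# KL divergence of N(mu, sigma^2) from the standard normal prior N(0, I),
# the regularization term added to the reconstruction loss in the VAE objective
kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
```

Writing `z` as a deterministic function of `(mu, log_var)` plus independent noise `eps` is what allows gradients to flow through the sampling step during training.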

figure 10

Schematic structure of a sparse autoencoder (SAE) with several active units (filled circle) in the hidden layer

Although AEs were originally conceived for dimensionality reduction or feature learning, as mentioned above, they have more recently been brought to the forefront of generative modeling, alongside popular methods such as the generative adversarial network. AEs have been effectively employed in a variety of domains, including healthcare, computer vision, speech recognition, cybersecurity, natural language processing, and many more. Overall, we can conclude that the auto-encoder and its variants can play a significant role in unsupervised feature learning with neural network architectures.

Kohonen Map or Self-Organizing Map (SOM)

A Self-Organizing Map (SOM) or Kohonen Map [ 59 ] is another unsupervised learning technique, which creates a low-dimensional (usually two-dimensional) representation of a higher-dimensional data set while maintaining the topological structure of the data. The SOM can also be viewed as a neural network-based dimensionality reduction algorithm that is commonly used for clustering [ 118 ]. A SOM adapts to the topological form of a dataset by repeatedly moving its neurons closer to the data points, allowing us to visualize enormous datasets and find probable clusters. The first layer of a SOM is the input layer, and the second layer is the output layer or feature map. Unlike other neural networks that use error-correction learning, such as backpropagation with gradient descent [ 36 ], SOMs employ competitive learning, which uses a neighborhood function to retain the topological features of the input space. SOMs are widely utilized in a variety of applications, including pattern identification, health or medical diagnosis, anomaly detection, and virus or worm attack detection [ 60 , 87 ]. The primary benefit of employing a SOM is that it makes high-dimensional data easier to visualize and analyze, so that patterns can be understood. The reduction of dimensionality and the grid clustering make it easy to observe similarities in the data. As a result, SOMs can play a vital role in developing a data-driven effective model for a particular problem domain, depending on the data characteristics.
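The competitive-learning loop just described, including a best-matching unit and a decaying neighborhood function, fits in a short sketch. This toy version organizes a 1-D map of four neurons over two 2-D clusters; the decay schedules are illustrative choices, not canonical ones:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D clusters to organize onto a 1-D map of 4 neurons
data = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                  rng.normal(1.0, 0.1, (50, 2))])
weights = rng.uniform(0, 1, (4, 2))  # one weight vector per map neuron
grid = np.arange(4)                  # neuron coordinates on the 1-D map

for epoch in range(20):
    lr = 0.5 * (1 - epoch / 20)            # decaying learning rate
    radius = 2.0 * (1 - epoch / 20) + 0.5  # decaying neighborhood radius
    for x in rng.permutation(data):
        # competitive step: find the best-matching unit (BMU)
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # neighborhood function: neurons near the BMU on the grid move more
        h = np.exp(-((grid - bmu) ** 2) / (2 * radius ** 2))
        weights += lr * h[:, None] * (x - weights)  # pull neurons toward x
```

After training, different map neurons win for inputs near the two cluster centers, which is the topology-preserving clustering behavior described above.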

Restricted Boltzmann Machine (RBM)

A Restricted Boltzmann Machine (RBM) [ 75 ] is a generative stochastic neural network capable of learning a probability distribution over its inputs. Boltzmann machines typically consist of visible and hidden nodes, with each node connected to every other node; learning how the system behaves under normal circumstances helps in understanding irregularities. RBMs are a subset of Boltzmann machines in which the connections are restricted to those between the visible and the hidden layer [ 77 ]. This restriction permits training algorithms, such as the gradient-based contrastive divergence algorithm, to be more efficient than those for Boltzmann machines in general [ 41 ]. RBMs have found applications in dimensionality reduction, classification, regression, collaborative filtering, feature learning, topic modeling, and many others. In deep learning modeling, they can be trained in either a supervised or an unsupervised manner, depending on the task. Overall, RBMs can automatically recognize patterns in data and build probabilistic or stochastic models, which are utilized for feature selection or extraction, as well as for forming a deep belief network.
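The contrastive divergence (CD-1) training rule mentioned above can be sketched for a tiny binary RBM. This is a toy illustration (two complementary visible patterns, three hidden units, single-sample updates), not a tuned implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 3
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
a = np.zeros(n_vis)  # visible biases
b = np.zeros(n_hid)  # hidden biases
# Toy binary data: the first three visible units fire together, or the last three
data = np.array([[1, 1, 1, 0, 0, 0]] * 10 + [[0, 0, 0, 1, 1, 1]] * 10, float)
lr = 0.1

for epoch in range(100):
    for v0 in rng.permutation(data):
        # positive phase: hidden probabilities given the data
        ph0 = sigmoid(v0 @ W + b)
        h0 = (rng.random(n_hid) < ph0).astype(float)
        # negative phase: one step of Gibbs sampling (this is the '1' in CD-1)
        pv1 = sigmoid(h0 @ W.T + a)
        v1 = (rng.random(n_vis) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + b)
        # move toward the data statistics, away from the model's own samples
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        a += lr * (v0 - v1)
        b += lr * (ph0 - ph1)
```

After training, passing a training pattern up to the hidden layer and back down reconstructs it far better than chance, which is the sense in which the RBM has captured the data distribution.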

Deep Belief Network (DBN)

A Deep Belief Network (DBN) [ 40 ] is a multi-layer generative graphical model built by stacking several individual unsupervised networks, such as AEs or RBMs, connected sequentially so that each network's hidden layer serves as the input for the next layer. We can thus divide DBNs into (i) the AE-DBN, known as a stacked AE, composed of autoencoders, and (ii) the RBM-DBN, known as a stacked RBM, composed of the restricted Boltzmann machines discussed earlier. The ultimate goal is a fast unsupervised training technique for each sub-network, based on contrastive divergence [ 41 ]. Owing to its deep structure, a DBN can capture a hierarchical representation of the input data. The primary idea behind the DBN is to pre-train the unsupervised feed-forward networks with unlabeled data before fine-tuning the network with labeled input. One of the most important advantages of the DBN over typical shallow learning networks is that it permits the detection of deep patterns, which allows for reasoning abilities and for capturing the deep difference between normal and erroneous data [ 89 ]. A continuous DBN is simply an extension of a standard DBN that accepts a continuous range of decimals instead of binary data. Overall, the DBN model can play a key role in a wide range of high-dimensional data applications due to its strong feature extraction and classification capabilities, and it has become one of the significant topics in the field of neural networks.

In summary, the generative learning techniques discussed above typically allow us to generate a new representation of data through exploratory analysis. As a result, these deep generative networks can be utilized as preprocessing for supervised or discriminative learning tasks, as well as ensuring model accuracy, where unsupervised representation learning can allow for improved classifier generalization.

Deep Networks for Hybrid Learning and Other Approaches

In addition to the above-discussed deep learning categories, hybrid deep networks and several other approaches such as deep transfer learning (DTL) and deep reinforcement learning (DRL) are popular, which are discussed in the following.

Hybrid Deep Neural Networks

Generative models are adaptable, with the capacity to learn from both labeled and unlabeled data. Discriminative models, on the other hand, are unable to learn from unlabeled data yet outperform their generative counterparts in supervised tasks. A framework for training both deep generative and discriminative models simultaneously can enjoy the benefits of both models, which motivates hybrid networks.

Hybrid deep learning models are typically composed of multiple (two or more) deep basic learning models, where the basic model is a discriminative or generative deep learning model discussed earlier. Based on the integration of different basic generative or discriminative models, the below three categories of hybrid deep learning models might be useful for solving real-world problems. These are as follows:

Hybrid \(Model\_1\) : An integration of different generative or discriminative models to extract more meaningful and robust features. Examples could be CNN+LSTM, AE+GAN, and so on.

Hybrid \(Model\_2\) : An integration of generative model followed by a discriminative model. Examples could be DBN+MLP, GAN+CNN, AE+CNN, and so on.

Hybrid \(Model\_3\) : An integration of generative or discriminative model followed by a non-deep learning classifier. Examples could be AE+SVM, CNN+SVM, and so on.

Thus, in a broad sense, we can conclude that hybrid models can be either classification-focused or non-classification, depending on the target use. However, most hybrid learning-related studies in the area of deep learning are classification-focused or supervised learning tasks, as summarized in Table 1 . The unsupervised generative models with meaningful representations are employed to enhance the discriminative models: generative models with useful representations can provide more informative and lower-dimensional features for discrimination, and they can also help enhance the quality and quantity of the training data, providing additional information for classification.

Deep Transfer Learning (DTL)

Transfer learning is a technique for effectively using previously learned model knowledge to solve a new task with minimal training or fine-tuning. In comparison with typical machine learning techniques [ 97 ], DL requires a large amount of training data. The need for a substantial volume of labeled data is therefore a significant barrier for some essential domain-specific tasks, particularly in the medical sector, where creating large-scale, high-quality annotated medical or health datasets is both difficult and costly. Furthermore, a standard DL model demands substantial computational resources, such as a GPU-enabled server, even though researchers are working hard to improve this. As a result, Deep Transfer Learning (DTL), a DL-based transfer learning method, can help to address these issues. Figure 11 shows the general structure of the transfer learning process, in which knowledge from a pre-trained model is transferred into a new DL model. DTL is especially popular in deep learning at present, since it allows deep neural networks to be trained with very little data [ 126 ].

figure 11

A general structure of transfer learning process, where knowledge from pre-trained model is transferred into new DL model

Transfer learning is a two-stage approach for training a DL model, consisting of a pre-training step and a fine-tuning step in which the model is trained on the target task. Since deep neural networks have gained popularity in a variety of fields, a large number of DTL methods have been presented, making it crucial to categorize and summarize them. Based on the techniques used in the literature, DTL can be classified into four categories [ 117 ]: (i) instance-based deep transfer learning, which utilizes instances from the source domain with appropriate weights; (ii) mapping-based deep transfer learning, which maps instances from the two domains into a new data space with better similarity; (iii) network-based deep transfer learning, which reuses part of a network pre-trained in the source domain; and (iv) adversarial-based deep transfer learning, which uses adversarial techniques to find transferable features that are suitable for both domains. Due to its high effectiveness and practicality, adversarial-based deep transfer learning has exploded in popularity in recent years. Transfer learning can also be classified into inductive, transductive, and unsupervised transfer learning, depending on the circumstances of the source and target domains and activities [ 81 ]. While most current research focuses on supervised learning, how deep neural networks can transfer knowledge in unsupervised or semi-supervised learning may gain further interest in the future. DTL techniques are useful in a variety of fields including natural language processing, sentiment classification, visual recognition, speech recognition, spam filtering, and other relevant areas.
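The network-based variant of the pre-train/fine-tune recipe can be shown at toy scale: freeze a "pre-trained" feature extractor and train only a small new head on a small labeled target dataset. Here the frozen weight matrix merely stands in for a real pre-trained network, and the labels are synthetic; it is a sketch of the workflow, not of any particular DTL method:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Stand-in for weights learned on a large source dataset; kept FROZEN below
W_pretrained = rng.normal(size=(4, 3))

# Small labeled target dataset: the data-scarce setting DTL is meant for
X = rng.normal(size=(40, 4))
features = np.tanh(X @ W_pretrained)    # frozen feature extractor (no updates)
y = (features[:, 0] > 0).astype(float)  # toy labels, linearly tied to features

# Fine-tuning step: train only a lightweight logistic head on the frozen features
w_head, b_head = np.zeros(3), 0.0
lr = 0.5
for _ in range(500):
    p = sigmoid(features @ w_head + b_head)
    grad = p - y                               # logistic-loss gradient
    w_head -= lr * features.T @ grad / len(X)  # only the head is updated
    b_head -= lr * grad.mean()

accuracy = np.mean((sigmoid(features @ w_head + b_head) > 0.5) == (y == 1))
```

Because only the head's handful of parameters is trained, the 40 labeled examples suffice; this is the mechanism by which transfer learning sidesteps DL's usual appetite for labeled data.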

Deep Reinforcement Learning (DRL)

Reinforcement learning takes a different approach from the methods discussed so far: it addresses the sequential decision-making problem. Reinforcement learning typically begins with the concepts of an environment and an agent. The agent can perform a series of actions in the environment, each of which affects the environment's state and can yield rewards (feedback): "positive" for sequences of actions that lead to a "good" state, and "negative" for those that lead to a "bad" state. The purpose of reinforcement learning is to learn, through interaction with the environment, a mapping from states to good actions, typically referred to as a policy.

figure 12

Schematic structure of deep reinforcement learning (DRL) highlighting a deep neural network

Deep reinforcement learning (DRL or deep RL) [ 9 ] integrates neural networks with a reinforcement learning architecture to allow agents to learn the appropriate actions in a virtual environment, as shown in Fig. 12 . Within reinforcement learning, model-based RL learns a transition model that enables the environment to be modeled without interacting with it directly, whereas model-free RL methods learn directly from interactions with the environment. Q-learning is a popular model-free RL technique for determining the best action-selection policy for any (finite) Markov Decision Process (MDP) [ 86 , 97 ], where an MDP is a mathematical framework for modeling decisions based on states, actions, and rewards [ 86 ]. Deep Q-Networks, Double DQN, Bi-directional Learning, Monte Carlo Control, etc. are also used in the area [ 50 , 97 ]. DRL methods incorporate DL models, e.g., deep neural networks (DNNs), as policy and/or value function approximators, based on the MDP principle [ 71 ]. A CNN, for example, can be used as a component of an RL agent to learn directly from raw, high-dimensional visual inputs. In the real world, DRL-based solutions can be used in several application areas including robotics, video games, natural language processing, computer vision, and other relevant areas.
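Before any neural network is involved, the Q-learning update at the heart of DQN-style methods can be run in tabular form. The sketch below uses a made-up 5-state corridor MDP (reward only at the rightmost state); a deep Q-network simply replaces the table `Q` with a neural network approximator:

```python
import random

# A 5-state corridor: start at state 0, reward 1 only on reaching state 4.
# Actions: 0 = left, 1 = right. An episode ends at the goal state.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # the tabular value function
rng = random.Random(0)

def step(s, a):
    """Deterministic environment dynamics and reward."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

for episode in range(200):
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit current Q, sometimes explore
        a = rng.randrange(2) if rng.random() < EPSILON else Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Q-learning update: bootstrap from the best action in the next state
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

policy = [q.index(max(q)) for q in Q[:GOAL]]  # greedy action per state
```

After a couple of hundred episodes the reward has propagated back through the table and the greedy policy walks straight to the goal; values for states nearer the goal are higher, reflecting the discount factor.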

figure 13

Several potential real-world application areas of deep learning

Deep Learning Application Summary

During the past few years, deep learning has been successfully applied to numerous problems in many application areas, including natural language processing, sentiment analysis, cybersecurity, business, virtual assistants, visual recognition, healthcare, robotics, and many more. In Fig. 13 , we summarize several potential real-world application areas of deep learning. The various deep learning techniques of our taxonomy in Fig. 6 , which covers discriminative learning, generative learning, and the hybrid models discussed earlier, are employed in these application areas. In Table 1 , we also summarize the deep learning tasks and techniques used to solve the relevant tasks in several real-world application areas. Overall, from Fig. 13 and Table 1 , we can conclude that the future prospects of deep learning modeling in real-world application areas are huge, and there is ample scope for further work. In the next section, we summarize the research issues in deep learning modeling and point out potential aspects for future-generation DL modeling.

Research Directions and Future Aspects

While existing methods have established a solid foundation for deep learning systems and research, this section outlines the below ten potential future research directions based on our study.

Automation in Data Annotation According to the existing literature, discussed in Section 3 , most deep learning models are trained on publicly available, annotated datasets. However, to build a system for a new problem domain or a recent data-driven system, raw data must be collected from relevant sources. Data annotation, e.g., categorization, tagging, or labeling of a large amount of raw data, is therefore important for building discriminative deep learning models or supervised tasks, which is challenging. A technique capable of automatic and dynamic data annotation, rather than manual annotation or hiring annotators, particularly for large datasets, could be more effective for supervised learning and would minimize human effort. Therefore, a more in-depth investigation of data collection and annotation methods, or designing an unsupervised learning-based solution, could be one of the primary research directions in the area of deep learning modeling.

Data Preparation for Ensuring Data Quality As discussed throughout the paper, deep learning algorithms are highly affected by the quality and availability of the training data, and consequently so is the resultant model for a particular problem domain. Deep learning models may thus become worthless or yield decreased accuracy if the data is bad: data sparsity, non-representative or poor-quality data, ambiguous values, noise, data imbalance, irrelevant features, data inconsistency, insufficient quantity for training, and so on. Such issues in the data can lead to poor processing and inaccurate findings, which is a major problem when discovering insights from data. Deep learning models therefore also need to adapt to such issues in the data, to capture approximated information from observations. Effective data pre-processing techniques thus need to be designed, according to the nature of the data problem and its characteristics, to handle such emerging challenges; this could be another research direction in the area.

Black-box Perception and Proper DL/ML Algorithm Selection In general, it is difficult to explain how a deep learning result is obtained or how a particular model reaches its final decision. Although DL models achieve significant performance when learning from large datasets, as discussed in Section 2 , this "black-box" perception of DL modeling typically reflects weak statistical interpretability, which could be a major issue in the area. On the other hand, ML algorithms, particularly rule-based machine learning techniques, provide explicit logic rules (IF-THEN) for making decisions that are easier to interpret, update, or delete according to the target applications [ 97 , 100 , 105 ]. If the wrong learning algorithm is chosen, unanticipated results may occur, resulting in a loss of effort as well as of the model's efficacy and accuracy. Thus, selecting an appropriate model for the target application, by taking into account performance, complexity, model accuracy, and applicability, is challenging, and in-depth analysis is needed for better understanding and decision making.

Deep Networks for Supervised or Discriminative Learning: According to our designed taxonomy of deep learning techniques, as shown in Fig. 6 , discriminative architectures mainly include MLP, CNN, and RNN, along with their variants that are applied widely in various application domains. However, designing new techniques or their variants of such discriminative techniques by taking into account model optimization, accuracy, and applicability, according to the target real-world application and the nature of the data, could be a novel contribution, which can also be considered as a major future aspect in the area of supervised or discriminative learning.

Deep Networks for Unsupervised or Generative Learning As discussed in Section 3 , unsupervised learning or generative deep learning modeling is one of the major tasks in the area, as it allows us to characterize high-order correlation properties or features in data, or to generate a new representation of the data through exploratory analysis. Moreover, unlike supervised learning [ 97 ], it does not require labeled data, owing to its capability to derive insights directly from the data and support data-driven decision making. It can thus serve as a preprocessing step for supervised learning or discriminative modeling, as well as for semi-supervised learning tasks, ensuring learning accuracy and model efficiency. According to our designed taxonomy of deep learning techniques, as shown in Fig. 6 , generative techniques mainly include GAN, AE, SOM, RBM, DBN, and their variants. Thus, designing new techniques or variants for effective data modeling or representation, according to the target real-world application, could be a novel contribution, which can also be considered a major future aspect in the area of unsupervised or generative learning.

Hybrid/Ensemble Modeling and Uncertainty Handling According to our designed taxonomy of DL techniques, as shown in Fig. 6 , this is another major category of deep learning tasks. As hybrid modeling enjoys the benefits of both generative and discriminative learning, an effective hybridization can outperform other models in terms of performance as well as uncertainty handling in high-risk applications. In Section 3 , we have summarized various types of hybridization, e.g., AE+CNN/SVM. Since a group of neural networks can be trained with distinct parameters or with separate sub-sampled training datasets, hybridization or ensembles of such techniques, i.e., DL with DL/ML, can play a key role in the area. Thus, designing effective blended discriminative and generative models, rather than naive combinations, could be an important research opportunity for solving various real-world issues, including semi-supervised learning tasks and model uncertainty.

Dynamism in Selecting Threshold/Hyper-parameter Values and Network Structures with Computational Efficiency In general, the relationship among performance, model complexity, and computational requirements is a key issue in deep learning modeling and applications. A combination of algorithmic advances offering improved accuracy while maintaining computational efficiency, i.e., achieving the maximum throughput while consuming the least amount of resources without significant information loss, could lead to a breakthrough in the effectiveness of deep learning modeling for future real-world applications. The concept of incremental approaches or recency-based learning [ 100 ] might be effective in several cases, depending on the nature of the target applications. Moreover, assuming network structures with a static number of nodes and layers, static hyper-parameter values or threshold settings, or selecting them by trial and error, may not be effective in many cases, as these may need to change as the data changes. Thus, a data-driven approach to selecting them dynamically could be more effective when building a deep learning model, in terms of both performance and real-world applicability. Such data-driven automation can lead to future-generation deep learning modeling with additional intelligence, which could be a significant future aspect in the area as well as an important research direction.

Lightweight Deep Learning Modeling for Next-Generation Smart Devices and Applications: In recent years, the Internet of Things (IoT), consisting of billions of intelligent, communicating things, together with mobile communications technologies, has become popular for detecting and gathering human and environmental information (e.g., geo-information, weather data, bio-data, human behaviors, and so on) for a variety of intelligent services and applications. Every day, these ubiquitous smart things and devices generate large amounts of data, requiring rapid data processing on a variety of smart mobile devices [ 72 ]. Deep learning technologies can be incorporated to discover underlying properties and to effectively handle such large amounts of sensor data for a variety of IoT applications, including health monitoring and disease analysis, smart cities, traffic flow prediction and monitoring, smart transportation, manufacturing inspection, fault assessment, smart industry or Industry 4.0, and many more. Although the deep learning techniques discussed in Section 3 are considered powerful tools for processing big data, lightweight modeling is important for resource-constrained devices, due to their high computational cost and considerable memory overhead. Several techniques, such as optimization, simplification, compression, pruning, generalization, and important-feature extraction, might be helpful in such cases. Therefore, constructing lightweight deep learning techniques based on a baseline network architecture, to adapt the DL model for next-generation mobile, IoT, or resource-constrained devices and applications, could be considered a significant future aspect in the area.

Incorporating Domain Knowledge into Deep Learning Modeling

Domain knowledge, as opposed to general or domain-independent knowledge, is knowledge of a specific, specialized topic or field. For instance, in natural language processing, the properties of English typically differ from those of other languages such as Bengali, Arabic, or French. Integrating domain-based constraints into a deep learning model can therefore produce better results for a particular purpose. For example, a task-specific feature extractor that incorporates domain knowledge in smart manufacturing can resolve issues with traditional deep-learning-based fault diagnosis [28]. Similarly, domain knowledge in medical image analysis [58], financial sentiment analysis [49], and cybersecurity analytics [94, 103], as well as conceptual data models that include semantic information (i.e., information meaningful to a system, rather than merely correlational) [45, 121, 131], can play a vital role in the area. Transfer learning can be an effective way to get started on a new challenge with domain knowledge. Moreover, contextual information such as spatial, temporal, social, and environmental contexts [92, 104, 108] can be used to combine context-aware computing with domain knowledge, enabling smart decision making as well as adaptive and intelligent context-aware systems. Understanding domain knowledge and effectively incorporating it into deep learning models is thus another research direction.
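To make the idea concrete, the following hedged sketch (the lexicon, feature functions, and values are all hypothetical, invented for illustration) concatenates a hand-crafted, domain-informed feature, here a financial-sentiment lexicon score, with generic features, so a downstream model can learn over the enriched representation:

```python
import numpy as np

# Hypothetical domain lexicon for financial sentiment: terms that
# carry a strong signal in this domain but not in general English.
FINANCE_LEXICON = {"bullish": 1.0, "rally": 0.8, "bearish": -1.0, "default": -0.7}

def domain_features(text):
    """Hand-crafted feature encoding domain knowledge: net lexicon score."""
    tokens = text.lower().split()
    return np.array([sum(FINANCE_LEXICON.get(t, 0.0) for t in tokens)])

def generic_features(text):
    """Stand-in for domain-independent features (token count, mean length)."""
    tokens = text.split()
    return np.array([len(tokens), np.mean([len(t) for t in tokens])])

def combined_features(text):
    # Concatenate generic and domain-informed features; a downstream
    # deep model then learns over the enriched representation.
    return np.concatenate([generic_features(text), domain_features(text)])

vec = combined_features("Markets rally as bullish sentiment returns")
```

The same pattern, domain features injected alongside learned ones, is one simple way to encode the domain-based constraints discussed above; constraint losses and task-specific architectures are heavier-weight alternatives.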

Designing a General Deep Learning Framework for Target Application Domains

One promising research direction for deep learning-based solutions is to develop a general framework that can handle diversity in data, dimensions, stimulation types, and so on. Such a framework would require two key capabilities: an attention mechanism that focuses on the most valuable parts of the input signals, and the ability to capture latent features, i.e., the distinctive and informative features of the data. Attention models have been a popular research topic because of their intuition, versatility, and interpretability, and have been employed in application areas such as computer vision, natural language processing, text and image classification, sentiment analysis, recommender systems, and user profiling [13, 80]. An attention mechanism can also be implemented via learning algorithms such as reinforcement learning, which can find the most useful parts of the input through a policy search [133, 134]. Similarly, a CNN can be integrated with a suitable attention mechanism to form a general classification framework, with the CNN acting as a feature learning tool that captures features at various levels and ranges. Designing a general deep learning framework that combines attention with latent-feature learning for target application domains could be another area to contribute.
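As a minimal sketch of the attention ingredient of such a framework, the following NumPy implementation of scaled dot-product attention (the standard formulation; the shapes below are illustrative) computes softmax(QKᵀ/√d_k)V, where the softmax weights show which inputs each query attends to, which is also the source of attention's interpretability:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

    Returns both the attended output and the attention weights,
    so the weights can be inspected for interpretability.
    """
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(1)
Q = rng.normal(size=(2, 4))   # 2 queries, dimension 4
K = rng.normal(size=(5, 4))   # 5 keys, same dimension as queries
V = rng.normal(size=(5, 3))   # 5 values, dimension 3
out, attn = scaled_dot_product_attention(Q, K, V)
```

In a general classification framework, Q, K, and V would come from learned projections of CNN feature maps or token embeddings rather than random data.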

To summarize, deep learning is a fairly open topic to which academics can contribute by developing new methods, or improving existing ones, to address the above-mentioned concerns and tackle real-world problems in a variety of application areas. This can also help researchers conduct a thorough analysis of an application's hidden and unexpected challenges, producing more reliable and realistic outcomes. Overall, we conclude that addressing the above-mentioned issues and contributing effective and efficient techniques could lead to “Future Generation DL” modeling as well as more intelligent and automated applications.

Concluding Remarks

In this article, we have presented a structured and comprehensive view of deep learning technology, which is considered a core part of artificial intelligence as well as data science. We started with the history of artificial neural networks and moved on to recent deep learning techniques and breakthroughs in different applications. We then explored the key algorithms in the area and deep neural network modeling in its various dimensions, including a taxonomy that considers the variations of deep learning tasks and how they are used for different purposes. Our comprehensive study takes into account not only deep networks for supervised or discriminative learning, but also deep networks for unsupervised or generative learning, as well as hybrid learning, which can be used to solve a variety of real-world issues according to the nature of the problem.

Deep learning, unlike traditional machine learning and data mining algorithms, can produce extremely high-level data representations from enormous amounts of raw data, and has consequently provided an excellent solution to a variety of real-world problems. A successful deep learning technique requires a data-driven model suited to the characteristics of the raw data; the learning algorithms must then be trained on the collected data, and on knowledge related to the target application, before the system can assist with intelligent decision-making. Deep learning has proven useful in a wide range of applications and research areas, such as healthcare, sentiment analysis, visual recognition, business intelligence, and cybersecurity, as summarized in this paper.

Finally, we have summarized and discussed the challenges faced, potential research directions, and future aspects of the area. Although deep learning is often treated as a black-box solution because of its limited reasoning and interpretability, addressing the challenges and future aspects identified here could lead to a future generation of deep learning modeling and smarter systems. It can also help researchers carry out the in-depth analyses needed to produce more reliable and realistic outcomes. Overall, we believe that our study of neural networks and deep learning-based advanced analytics points in a promising direction and can serve as a reference guide for future research and implementation in relevant application domains by both academic and industry professionals.

References

Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. Tensorflow: a system for large-scale machine learning. In: 12th USENIX Symposium on operating systems design and implementation (OSDI 16), 2016; p. 265–283.

Abdel-Basset M, Hawash H, Chakrabortty RK, Ryan M. Energy-net: a deep learning approach for smart energy management in iot-based smart cities. IEEE Internet of Things J. 2021.

Aggarwal A, Mittal M, Battineni G. Generative adversarial network: an overview of theory and applications. Int J Inf Manag Data Insights. 2021; p. 100004.

Al-Qatf M, Lasheng Y, Al-Habib M, Al-Sabahi K. Deep learning approach combining sparse autoencoder with svm for network intrusion detection. IEEE Access. 2018;6:52843–56.

Ale L, Sheta A, Li L, Wang Y, Zhang N. Deep learning based plant disease detection for smart agriculture. In: 2019 IEEE Globecom Workshops (GC Wkshps), 2019; p. 1–6. IEEE.

Amarbayasgalan T, Lee JY, Kim KR, Ryu KH. Deep autoencoder based neural networks for coronary heart disease risk prediction. In: Heterogeneous data management, polystores, and analytics for healthcare. Springer; 2019. p. 237–48.

Anuradha J, et al. Big data based stock trend prediction using deep cnn with reinforcement-lstm model. Int J Syst Assur Eng Manag. 2021; p. 1–11.

Aqib M, Mehmood R, Albeshri A, Alzahrani A. Disaster management in smart cities by forecasting traffic plan using deep learning and gpus. In: International Conference on smart cities, infrastructure, technologies and applications. Springer; 2017. p. 139–54.

Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA. Deep reinforcement learning: a brief survey. IEEE Signal Process Mag. 2017;34(6):26–38.

Aslan MF, Unlersen MF, Sabanci K, Durdu A. Cnn-based transfer learning-bilstm network: a novel approach for covid-19 infection detection. Appl Soft Comput. 2021;98:106912.

Bu F, Wang X. A smart agriculture iot system based on deep reinforcement learning. Futur Gener Comput Syst. 2019;99:500–7.

Chang W-J, Chen L-B, Hsu C-H, Lin C-P, Yang T-C. A deep learning-based intelligent medicine recognition system for chronic patients. IEEE Access. 2019;7:44441–58.

Chaudhari S, Mithal V, Polatkan G, Ramanath R. An attentive survey of attention models. arXiv preprint arXiv:1904.02874, 2019.

Chaudhuri N, Gupta G, Vamsi V, Bose I. On the platform but will they buy? predicting customers’ purchase behavior using deep learning. Decis Support Syst. 2021; p. 113622.

Chen D, Wawrzynski P, Lv Z. Cyber security in smart cities: a review of deep learning-based applications and case studies. Sustain Cities Soc. 2020; p. 102655.

Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.

Chollet F. Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, 2017; p. 1251–258.

Chung J, Gulcehre C, Cho KH, Bengio Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.

Coelho IM, Coelho VN, da Eduardo J, Luz S, Ochi LS, Guimarães FG, Rios E. A gpu deep learning metaheuristic based model for time series forecasting. Appl Energy. 2017;201:412–8.

Da'u A, Salim N. Recommendation system based on deep learning methods: a systematic review and new directions. Artif Intel Rev. 2020;53(4):2709–48.

Deng L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans Signal Inf Process. 2014; p. 3.

Deng L, Yu D. Deep learning: methods and applications. Found Trends Signal Process. 2014;7(3–4):197–387.

Deng S, Li R, Jin Y, He H. Cnn-based feature cross and classifier for loan default prediction. In: 2020 International Conference on image, video processing and artificial intelligence, volume 11584, page 115841K. International Society for Optics and Photonics, 2020.

Dhyani M, Kumar R. An intelligent chatbot using deep learning with bidirectional rnn and attention model. Mater Today Proc. 2021;34:817–24.

Donahue J, Krähenbühl P, Darrell T. Adversarial feature learning. arXiv preprint arXiv:1605.09782, 2016.

Du K-L, Swamy MNS. Neural networks and statistical learning. Berlin: Springer Science & Business Media; 2013.

Dupond S. A thorough review on the current advance of neural network structures. Annu Rev Control. 2019;14:200–30.

Feng J, Yao Y, Lu S, Liu Y. Domain knowledge-based deep-broad learning framework for fault diagnosis. IEEE Trans Ind Electron. 2020;68(4):3454–64.

Garg S, Kaur K, Kumar N, Rodrigues JJPC. Hybrid deep-learning-based anomaly detection scheme for suspicious flow detection in sdn: a social multimedia perspective. IEEE Trans Multimed. 2019;21(3):566–78.

Géron A. Hands-on machine learning with Scikit-Learn, Keras. In: and TensorFlow: concepts, tools, and techniques to build intelligent systems. O’Reilly Media; 2019.

Goodfellow I, Bengio Y, Courville A, Bengio Y. Deep learning, vol. 1. Cambridge: MIT Press; 2016.

Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. In: Advances in neural information processing systems. 2014; p. 2672–680.

Google trends. 2021. https://trends.google.com/trends/ .

Gruber N, Jockisch A. Are gru cells more specific and lstm cells more sensitive in motive classification of text? Front Artif Intell. 2020;3:40.

Gu B, Ge R, Chen Y, Luo L, Coatrieux G. Automatic and robust object detection in x-ray baggage inspection using deep convolutional neural networks. IEEE Trans Ind Electron. 2020.

Han J, Pei J, Kamber M. Data mining: concepts and techniques. Amsterdam: Elsevier; 2011.

Haykin S. Neural networks and learning machines, 3/E. London: Pearson Education; 2010.

He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell. 2015;37(9):1904–16.

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, 2016; p. 770–78.

Hinton GE. Deep belief networks. Scholarpedia. 2009;4(5):5947.

Hinton GE, Osindero S, Teh Y-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006;18(7):1527–54.

Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.

Huang C-J, Kuo P-H. A deep cnn-lstm model for particulate matter (PM2.5) forecasting in smart cities. Sensors. 2018;18(7):2220.

Huang H-H, Fukuda M, Nishida T. Toward rnn based micro non-verbal behavior generation for virtual listener agents. In: International Conference on human-computer interaction, 2019; p. 53–63. Springer.

Hulsebos M, Hu K, Bakker M, Zgraggen E, Satyanarayan A, Kraska T, Demiralp Ça, Hidalgo C. Sherlock: a deep learning approach to semantic data type detection. In: Proceedings of the 25th ACM SIGKDD International Conference on knowledge discovery & data mining, 2019; p. 1500–508.

Imamverdiyev Y, Abdullayeva F. Deep learning method for denial of service attack detection based on restricted Boltzmann machine. Big Data. 2018;6(2):159–69.

Islam MZ, Islam MM, Asraf A. A combined deep cnn-lstm network for the detection of novel coronavirus (covid-19) using x-ray images. Inf Med Unlock. 2020;20:100412.

Ismail WN, Hassan MM, Alsalamah HA, Fortino G. Cnn-based health model for regular health factors analysis in internet-of-medical things environment. IEEE Access. 2020;8:52541–9.

Jangid H, Singhal S, Shah RR, Zimmermann R. Aspect-based financial sentiment analysis using deep learning. In: Companion Proceedings of the The Web Conference 2018, 2018; p. 1961–966.

Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res. 1996;4:237–85.

Kameoka H, Li L, Inoue S, Makino S. Supervised determined source separation with multichannel variational autoencoder. Neural Comput. 2019;31(9):1891–914.

Karhunen J, Raiko T, Cho KH. Unsupervised deep learning: a short review. In: Advances in independent component analysis and learning machines. 2015; p. 125–42.

Kawde P, Verma GK. Deep belief network based affect recognition from physiological signals. In: 2017 4th IEEE Uttar Pradesh Section International Conference on electrical, computer and electronics (UPCON), 2017; p. 587–92. IEEE.

Kim J-Y, Seok-Jun B, Cho S-B. Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf Sci. 2018;460:83–102.

Kingma DP, Welling M. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.

Kingma DP, Welling M. An introduction to variational autoencoders. arXiv preprint arXiv:1906.02691, 2019.

Kiran PKR, Bhasker B. Dnnrec: a novel deep learning based hybrid recommender system. Expert Syst Appl. 2020.

Kloenne M, Niehaus S, Lampe L, Merola A, Reinelt J, Roeder I, Scherf N. Domain-specific cues improve robustness of deep learning-based segmentation of ct volumes. Sci Rep. 2020;10(1):1–9.

Kohonen T. The self-organizing map. Proc IEEE. 1990;78(9):1464–80.

Kohonen T. Essentials of the self-organizing map. Neural Netw. 2013;37:52–65.

Kök İ, Şimşek MU, Özdemir S. A deep learning model for air quality prediction in smart cities. In: 2017 IEEE International Conference on Big Data (Big Data), 2017; p. 1983–990. IEEE.

Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems. 2012; p. 1097–105.

Latif S, Rana R, Younis S, Qadir J, Epps J. Transfer learning for improving speech emotion classification accuracy. arXiv preprint arXiv:1801.06353, 2018.

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.

LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–324.

Li B, François-Lavet V, Doan T, Pineau J. Domain adversarial reinforcement learning. arXiv preprint arXiv:2102.07097, 2021.

Li T-HS, Kuo P-H, Tsai T-N, Luan P-C. Cnn and lstm based facial expression analysis model for a humanoid robot. IEEE Access. 2019;7:93998–4011.

Liu C, Cao Y, Luo Y, Chen G, Vokkarane V, Yunsheng M, Chen S, Hou P. A new deep learning-based food recognition system for dietary assessment on an edge computing service infrastructure. IEEE Trans Serv Comput. 2017;11(2):249–61.

Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE. A survey of deep neural network architectures and their applications. Neurocomputing. 2017;234:11–26.

López AU, Mateo F, Navío-Marco J, Martínez-Martínez JM, Gómez-Sanchís J, Vila-Francés J, Serrano-López AJ. Analysis of computer user behavior, security incidents and fraud using self-organizing maps. Comput Secur. 2019;83:38–51.

Lopez-Martin M, Carro B, Sanchez-Esguevillas A. Application of deep reinforcement learning to intrusion detection for supervised problems. Expert Syst Appl. 2020;141:112963.

Ma X, Yao T, Menglan H, Dong Y, Liu W, Wang F, Liu J. A survey on deep learning empowered iot applications. IEEE Access. 2019;7:181721–32.

Makhzani A, Frey B. K-sparse autoencoders. arXiv preprint arXiv:1312.5663, 2013.

Mandic D, Chambers J. Recurrent neural networks for prediction: learning algorithms, architectures and stability. Hoboken: Wiley; 2001.

Marlin B, Swersky K, Chen B, Freitas N. Inductive principles for restricted boltzmann machine learning. In: Proceedings of the Thirteenth International Conference on artificial intelligence and statistics, p. 509–16. JMLR Workshop and Conference Proceedings, 2010.

Masud M, Muhammad G, Alhumyani H, Alshamrani SS, Cheikhrouhou O, Ibrahim S, Hossain MS. Deep learning-based intelligent face recognition in iot-cloud environment. Comput Commun. 2020;152:215–22.

Memisevic R, Hinton GE. Learning to represent spatial transformations with factored higher-order boltzmann machines. Neural Comput. 2010;22(6):1473–92.

Minaee S, Azimi E, Abdolrashidi AA. Deep-sentiment: sentiment analysis using ensemble of cnn and bi-lstm models. arXiv preprint arXiv:1904.04206, 2019.

Naeem M, Paragliola G, Coronato A. A reinforcement learning and deep learning based intelligent system for the support of impaired patients in home treatment. Expert Syst Appl. 2021;168:114285.

Niu Z, Zhong G, Hui Yu. A review on the attention mechanism of deep learning. Neurocomputing. 2021;452:48–62.

Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng. 2009;22(10):1345–59.

Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et al. Pytorch: An imperative style, high-performance deep learning library. Adv Neural Inf Process Syst. 2019;32:8026–37.

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.

Pi Y, Nath ND, Behzadan AH. Convolutional neural networks for object detection in aerial imagery for disaster response and recovery. Adv Eng Inf. 2020;43:101009.

Piccialli F, Giampaolo F, Prezioso E, Crisci D, Cuomo S. Predictive analytics for smart parking: A deep learning approach in forecasting of iot data. ACM Trans Internet Technol (TOIT). 2021;21(3):1–21.

Puterman ML. Markov decision processes: discrete stochastic dynamic programming. Hoboken: Wiley; 2014.

Qu X, Lin Y, Kai G, Linru M, Meng S, Mingxing K, Mu L, editors. A survey on the development of self-organizing maps for unsupervised intrusion detection. Mob Netw Appl. 2019; p. 1–22.

Rahman MW, Tashfia SS, Islam R, Hasan MM, Sultan SI, Mia S, Rahman MM. The architectural design of smart blind assistant using iot with deep learning paradigm. Internet of Things. 2021;13:100344.

Ren J, Green M, Huang X. From traditional to deep learning: fault diagnosis for autonomous vehicles. In: Learning control. Elsevier. 2021; p. 205–19.

Rifai S, Vincent P, Muller X, Glorot X, Bengio Y. Contractive auto-encoders: Explicit invariance during feature extraction. In: Icml, 2011.

Rosa RL, Schwartz GM, Ruggiero WV, Rodríguez DZ. A knowledge-based recommendation system that includes sentiment analysis and deep learning. IEEE Trans Ind Inf. 2018;15(4):2124–35.

Sarker IH. Context-aware rule learning from smartphone data: survey, challenges and future directions. J Big Data. 2019;6(1):1–25.

Sarker IH. A machine learning based robust prediction model for real-life mobile phone data. Internet of Things. 2019;5:180–93.

Sarker IH. Cyberlearning: effectiveness analysis of machine learning security modeling to detect cyber-anomalies and multi-attacks. Internet of Things. 2021;14:100393.

Sarker IH. Data science and analytics: an overview from data-driven smart computing, decision-making and applications perspective. SN Comput Sci. 2021.

Sarker IH. Deep cybersecurity: a comprehensive overview from neural network and deep learning perspective. SN Comput Sci. 2021;2(3):1–16.

Sarker IH. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. 2021;2(3):1–21.

Sarker IH, Abushark YB, Alsolami F, Khan AI. Intrudtree: a machine learning based cyber security intrusion detection model. Symmetry. 2020;12(5):754.

Sarker IH, Abushark YB, Khan AI. Contextpca: Predicting context-aware smartphone apps usage based on machine learning techniques. Symmetry. 2020;12(4):499.

Sarker IH, Colman A, Han J. Recencyminer: mining recency-based personalized behavior from contextual smartphone data. J Big Data. 2019;6(1):1–21.

Sarker IH, Colman A, Han J, Khan AI, Abushark YB, Salah K. Behavdt: a behavioral decision tree learning to build user-centric context-aware predictive model. Mob Netw Appl. 2020;25(3):1151–61.

Sarker IH, Colman A, Kabir MA, Han J. Individualized time-series segmentation for mining mobile phone user behavior. Comput J. 2018;61(3):349–68.

Sarker IH, Furhad MH, Nowrozy R. Ai-driven cybersecurity: an overview, security intelligence modeling and research directions. SN Comput Sci. 2021;2(3):1–18.

Sarker IH, Hoque MM, Uddin MK. Mobile data science and intelligent apps: concepts, ai-based modeling and research directions. Mob Netw Appl. 2021;26(1):285–303.

Sarker IH, Kayes ASM. Abc-ruleminer: User behavioral rule-based machine learning method for context-aware intelligent services. J Netw Comput Appl. 2020;168:102762.

Sarker IH, Kayes ASM, Badsha S, Alqahtani H, Watters P, Ng A. Cybersecurity data science: an overview from machine learning perspective. J Big data. 2020;7(1):1–29.

Sarker IH, Kayes ASM, Watters P. Effectiveness analysis of machine learning classification models for predicting personalized context-aware smartphone usage. J Big Data. 2019;6(1):1–28.

Sarker IH, Salah K. Appspred: predicting context-aware smartphone apps using random forest learning. Internet of Things. 2019;8:100106.

Satt A, Rozenberg S, Hoory R. Efficient emotion recognition from speech using deep learning on spectrograms. In: Interspeec, 2017; p. 1089–1093.

Sevakula RK, Singh V, Verma NK, Kumar C, Cui Y. Transfer learning for molecular cancer classification using deep neural networks. IEEE/ACM Trans Comput Biol Bioinf. 2018;16(6):2089–100.

Sujay Narumanchi H, Ananya Pramod Kompalli Shankar A, Devashish CK. Deep learning based large scale visual recommendation and search for e-commerce. arXiv preprint arXiv:1703.02344, 2017.

Shao X, Kim CS. Multi-step short-term power consumption forecasting using multi-channel lstm with time location considering customer behavior. IEEE Access. 2020;8:125263–73.

Siami-Namini S, Tavakoli N, Namin AS. The performance of lstm and bilstm in forecasting time series. In: 2019 IEEE International Conference on Big Data (Big Data), 2019; p. 3285–292. IEEE.

Ślusarczyk B. Industry 4.0: are we ready? Pol J Manag Stud. 2018; p. 17

Sumathi P, Subramanian R, Karthikeyan VV, Karthik S. Soil monitoring and evaluation system using edl-asqe: enhanced deep learning model for iot smart agriculture network. Int J Commun Syst. 2021; p. e4859.

Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, 2015; p. 1–9.

Tan C, Sun F, Kong T, Zhang W, Yang C, Liu C. A survey on deep transfer learning. In: International Conference on artificial neural networks, 2018; p. 270–279. Springer.

Vesanto J, Alhoniemi E. Clustering of the self-organizing map. IEEE Trans Neural Netw. 2000;11(3):586–600.

Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A, Bottou L. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res. 2010;11(12).

Wang J, Liang-Chih Yu, Robert Lai K, Zhang X. Tree-structured regional cnn-lstm model for dimensional sentiment analysis. IEEE/ACM Trans Audio Speech Lang Process. 2019;28:581–91.

Wang S, Wan J, Li D, Liu C. Knowledge reasoning with semantic data for real-time data processing in smart factory. Sensors. 2018;18(2):471.

Wang W, Zhao M, Wang J. Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network. J Ambient Intell Humaniz Comput. 2019;10(8):3035–43.

Wang X, Liu J, Qiu T, Chaoxu M, Chen C, Zhou P. A real-time collision prediction mechanism with deep learning for intelligent transportation system. IEEE Trans Veh Technol. 2020;69(9):9497–508.

Wang Y, Huang M, Zhu X, Zhao L. Attention-based lstm for aspect-level sentiment classification. In: Proceedings of the 2016 Conference on empirical methods in natural language processing, 2016; p. 606–615.

Wei P, Li Y, Zhang Z, Tao H, Li Z, Liu D. An optimization method for intrusion detection classification model based on deep belief network. IEEE Access. 2019;7:87593–605.

Weiss K, Khoshgoftaar TM, Wang DD. A survey of transfer learning. J Big data. 2016;3(1):9.

Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C. Machine learning and deep learning methods for cybersecurity. IEEE Access. 2018;6:35365–81.

Xu W, Sun H, Deng C, Tan Y. Variational autoencoder for semi-supervised text classification. In: Thirty-First AAAI Conference on artificial intelligence, 2017.

Xue Q, Chuah MC. New attacks on rnn based healthcare learning system and their detections. Smart Health. 2018;9:144–57.

Yousefi-Azar M, Hamey L. Text summarization using unsupervised deep learning. Expert Syst Appl. 2017;68:93–105.

Yuan X, Shi J, Gu L. A review of deep learning methods for semantic segmentation of remote sensing imagery. Expert Syst Appl. 2020;p. 114417.

Zhang G, Liu Y, Jin X. A survey of autoencoder-based recommender systems. Front Comput Sci. 2020;14(2):430–50.

Zhang X, Yao L, Huang C, Wang S, Tan M, Long Gu, Wang C. Multi-modality sensor data classification with selective attention. arXiv preprint arXiv:1804.05493, 2018.

Zhang X, Yao L, Wang X, Monaghan J, Mcalpine D, Zhang Y. A survey on deep learning based brain computer interface: recent advances and new frontiers. arXiv preprint arXiv:1905.04149, 2019; p. 66.

Zhang Y, Zhang P, Yan Y. Attention-based lstm with multi-task learning for distant speech recognition. In: Interspeech, 2017; p. 3857–861.

Author information

Authors and affiliations.

Swinburne University of Technology, Melbourne, VIC, 3122, Australia

Iqbal H. Sarker

Chittagong University of Engineering & Technology, Chittagong, 4349, Bangladesh

Corresponding author

Correspondence to Iqbal H. Sarker .

Ethics declarations

Conflict of interest.

The author declares no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest edited by Bhanu Prakash K. N. and M. Shivakumar.

About this article

Sarker, I.H. Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions. SN Comput Sci. 2, 420 (2021). https://doi.org/10.1007/s42979-021-00815-1

Received : 29 May 2021

Accepted : 07 August 2021

Published : 18 August 2021

DOI : https://doi.org/10.1007/s42979-021-00815-1

Keywords

  • Deep learning
  • Artificial neural network
  • Artificial intelligence
  • Discriminative learning
  • Generative learning
  • Hybrid learning
  • Intelligent systems

NeurIPS 2024

Conference Dates: (In person) 9 December - 15 December, 2024

Homepage: https://neurips.cc/Conferences/2024/

Call For Papers 

Abstract submission deadline: May 15, 2024

Author notification: Sep 25, 2024

Camera-ready, poster, and video submission: Oct 30, 2024 AOE

Submit at: https://openreview.net/group?id=NeurIPS.cc/2024/Conference  

The site will start accepting submissions on Apr 22, 2024 

Subscribe to these and other dates on the 2024 dates page.

The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024) is an interdisciplinary conference that brings together researchers in machine learning, neuroscience, statistics, optimization, computer vision, natural language processing, life sciences, natural sciences, social sciences, and other adjacent fields. We invite submissions presenting new and original research on topics including but not limited to the following:

  • Applications (e.g., vision, language, speech and audio, Creative AI)
  • Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
  • Evaluation (e.g., methodology, meta studies, replicability and validity, human-in-the-loop)
  • General machine learning (supervised, unsupervised, online, active, etc.)
  • Infrastructure (e.g., libraries, improved implementation and scalability, distributed solutions)
  • Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
  • Neuroscience and cognitive science (e.g., neural coding, brain-computer interfaces)
  • Optimization (e.g., convex and non-convex, stochastic, robust)
  • Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
  • Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
  • Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
  • Theory (e.g., control theory, learning theory, algorithmic game theory)

Machine learning is a rapidly evolving field, and so we welcome interdisciplinary submissions that do not fit neatly into existing categories.

Authors are asked to confirm that their submissions accord with the NeurIPS code of conduct.

Formatting instructions: All submissions must be in PDF format, and the single PDF file must include, in this order:

  • The submitted paper
  • Technical appendices that support the paper with additional proofs, derivations, or results 
  • The NeurIPS paper checklist  

Other supplementary materials, such as data and code, can be uploaded as a ZIP file.

The main text of a submitted paper is limited to nine content pages , including all figures and tables. Additional pages containing references don’t count as content pages. If your submission is accepted, you will be allowed an additional content page for the camera-ready version.

The main text and references may be followed by technical appendices, for which there is no page limit.

The maximum file size for a full submission, which includes technical appendices, is 50MB.

Authors are encouraged to submit a separate ZIP file that contains further supplementary material like data or source code, when applicable.

You must format your submission using the NeurIPS 2024 LaTeX style file which includes a “preprint” option for non-anonymous preprints posted online. Submissions that violate the NeurIPS style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review. Papers may be rejected without consideration of their merits if they fail to meet the submission requirements, as described in this document. 
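
For illustration, the style options mentioned above are selected in the LaTeX preamble. The sketch below assumes the standard `neurips_2024.sty` file distributed with the template; check the style file's own documentation for the authoritative option names:

```latex
% Anonymous submission (default, no option):
% \usepackage{neurips_2024}

% Non-anonymous preprint posted online:
\usepackage[preprint]{neurips_2024}

% Camera-ready version (accepted papers only):
% \usepackage[final]{neurips_2024}
```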

Paper checklist: In order to improve the rigor and transparency of research submitted to and published at NeurIPS, authors are required to complete a paper checklist . The paper checklist is intended to help authors reflect on a wide variety of issues relating to responsible machine learning research, including reproducibility, transparency, research ethics, and societal impact. The checklist forms part of the paper submission, but does not count towards the page limit.

Please join the NeurIPS 2024 Checklist Assistant Study, which will provide you with free verification of your checklist performed by an LLM. Please see details in our blog.

Supplementary material: While all technical appendices should be included as part of the main paper submission PDF, authors may submit up to 100MB of supplementary material, such as data or source code, in ZIP format. Supplementary material should be material created by the authors that directly supports the submission content. Like submissions, supplementary material must be anonymized. Looking at supplementary material is at the discretion of the reviewers.

We encourage authors to upload their code and data as part of their supplementary material in order to help reviewers assess the quality of the work. Check the policy as well as code submission guidelines and templates for further details.

Use of Large Language Models (LLMs): We welcome authors to use any tool that is suitable for preparing high-quality papers and research. However, we ask authors to keep in mind two important criteria. First, we expect papers to fully describe their methodology, and any tool that is important to that methodology, including the use of LLMs, should be described also. For example, authors should mention tools (including LLMs) that were used for data processing or filtering, visualization, facilitating or running experiments, and proving theorems. It may also be advisable to describe the use of LLMs in implementing the method (if this corresponds to an important, original, or non-standard component of the approach). Second, authors are responsible for the entire content of the paper, including all text and figures, so while authors are welcome to use any tool they wish for writing the paper, they must ensure that all text is correct and original.

Double-blind reviewing: All submissions must be anonymized and may not contain any identifying information that may violate the double-blind reviewing policy. This policy applies to any supplementary or linked material as well, including code. If you are including links to any external material, it is your responsibility to guarantee anonymous browsing. Please do not include acknowledgements at submission time. If you need to cite one of your own papers, you should do so with adequate anonymization to preserve double-blind reviewing. For instance, write "In the previous work of Smith et al. [1]…" rather than "In our previous work [1]...". If you need to cite one of your own papers that is in submission to NeurIPS and not available as a non-anonymous preprint, then include a copy of the cited anonymized submission in the supplementary material and write "Anonymous et al. [1] concurrently show...". Any papers found to be violating this policy will be rejected.

OpenReview: We are using OpenReview to manage submissions. The reviews and author responses will not be public initially (but may be made public later, see below). As in previous years, submissions under review will be visible only to their assigned program committee. We will not be soliciting comments from the general public during the reviewing process. Anyone who plans to submit a paper as an author or a co-author will need to create (or update) their OpenReview profile by the full paper submission deadline. Your OpenReview profile can be edited by logging in and clicking on your name in https://openreview.net/ . This takes you to a URL "https://openreview.net/profile?id=~[Firstname]_[Lastname][n]" where the last part is your profile name, e.g., ~Wei_Zhang1. The OpenReview profiles must be up to date, with all publications by the authors, and their current affiliations. The easiest way to import publications is through DBLP but it is not required, see FAQ . Submissions without updated OpenReview profiles will be desk rejected. The information entered in the profile is critical for ensuring that conflicts of interest and reviewer matching are handled properly. Because of the rapid growth of NeurIPS, we request that all authors help with reviewing papers, if asked to do so. We need everyone’s help in maintaining the high scientific quality of NeurIPS.  

Please be aware that OpenReview has a moderation policy for newly created profiles: New profiles created without an institutional email will go through a moderation process that can take up to two weeks. New profiles created with an institutional email will be activated automatically.

Venue home page: https://openreview.net/group?id=NeurIPS.cc/2024/Conference

If you have any questions, please refer to the FAQ: https://openreview.net/faq

Abstract Submission: There is a mandatory abstract submission deadline on May 15, 2024, six days before full paper submissions are due. While it will be possible to edit the title and abstract until the full paper submission deadline, submissions with “placeholder” abstracts that are rewritten for the full submission risk being removed without consideration. This includes titles and abstracts that either provide little or no semantic information (e.g., "We provide a new semi-supervised learning method.") or describe a substantively different claimed contribution.  The author list cannot be changed after the abstract deadline. After that, authors may be reordered, but any additions or removals must be justified in writing and approved on a case-by-case basis by the program chairs only in exceptional circumstances. 

Ethics review: Reviewers and ACs may flag submissions for ethics review . Flagged submissions will be sent to an ethics review committee for comments. Comments from ethics reviewers will be considered by the primary reviewers and AC as part of their deliberation. They will also be visible to authors, who will have an opportunity to respond.  Ethics reviewers do not have the authority to reject papers, but in extreme cases papers may be rejected by the program chairs on ethical grounds, regardless of scientific quality or contribution.  

Preprints: The existence of non-anonymous preprints (on arXiv or other online repositories, personal websites, social media) will not result in rejection. If you choose to use the NeurIPS style for the preprint version, you must use the “preprint” option rather than the “final” option. Reviewers will be instructed not to actively look for such preprints, but encountering them will not constitute a conflict of interest. Authors may submit anonymized work to NeurIPS that is already available as a preprint (e.g., on arXiv) without citing it. Note that public versions of the submission should not say "Under review at NeurIPS" or similar.

Dual submissions: Submissions that are substantially similar to papers that the authors have previously published or submitted in parallel to other peer-reviewed venues with proceedings or journals may not be submitted to NeurIPS. Papers previously presented at workshops are permitted, so long as they did not appear in a conference proceedings (e.g., CVPRW proceedings), a journal or a book.  NeurIPS coordinates with other conferences to identify dual submissions.  The NeurIPS policy on dual submissions applies for the entire duration of the reviewing process.  Slicing contributions too thinly is discouraged.  The reviewing process will treat any other submission by an overlapping set of authors as prior work. If publishing one would render the other too incremental, both may be rejected.

Anti-collusion: NeurIPS does not tolerate any collusion whereby authors secretly cooperate with reviewers, ACs or SACs to obtain favorable reviews. 

Author responses:   Authors will have one week to view and respond to initial reviews. Author responses may not contain any identifying information that may violate the double-blind reviewing policy. Authors may not submit revisions of their paper or supplemental material, but may post their responses as a discussion in OpenReview. This is to reduce the burden on authors to have to revise their paper in a rush during the short rebuttal period.

After the initial response period, authors will be able to respond to any further reviewer/AC questions and comments by posting on the submission’s forum page. The program chairs reserve the right to solicit additional reviews after the initial author response period.  These reviews will become visible to the authors as they are added to OpenReview, and authors will have a chance to respond to them.

After the notification deadline, accepted and opted-in rejected papers will be made public and open for non-anonymous public commenting. Their anonymous reviews, meta-reviews, author responses and reviewer responses will also be made public. Authors of rejected papers will have two weeks after the notification deadline to opt in to make their deanonymized rejected papers public in OpenReview.  These papers are not counted as NeurIPS publications and will be shown as rejected in OpenReview.

Publication of accepted submissions:   Reviews, meta-reviews, and any discussion with the authors will be made public for accepted papers (but reviewer, area chair, and senior area chair identities will remain anonymous). Camera-ready papers will be due in advance of the conference. All camera-ready papers must include a funding disclosure . We strongly encourage accompanying code and data to be submitted with accepted papers when appropriate, as per the code submission policy . Authors will be allowed to make minor changes for a short period of time after the conference.

Contemporaneous Work: For the purpose of the reviewing process, papers that appeared online within two months of a submission will generally be considered "contemporaneous" in the sense that the submission will not be rejected on the basis of the comparison to contemporaneous work. Authors are still expected to cite and discuss contemporaneous work and perform empirical comparisons to the degree feasible. Any paper that influenced the submission is considered prior work and must be cited and discussed as such. Submissions that are very similar to contemporaneous work will undergo additional scrutiny to prevent cases of plagiarism and missing credit to prior work.

Plagiarism is prohibited by the NeurIPS Code of Conduct .

Other Tracks: Similarly to earlier years, we will host multiple tracks, such as datasets, competitions, tutorials, and workshops, in addition to the main track for which this call for papers is intended. See the conference homepage for updates and calls for participation in these tracks.

Experiments: As in past years, the program chairs will be measuring the quality and effectiveness of the review process via randomized controlled experiments. All experiments are independently reviewed and approved by an Institutional Review Board (IRB).

Financial Aid: Each paper may designate up to one (1) NeurIPS.cc account email address of a corresponding student author who confirms that they would need the support to attend the conference, and agrees to volunteer if they get selected. To be considered for Financial Aid, the student will also need to fill out the Financial Aid application when it becomes available.


Chemistry Education Research and Practice

Comparing drawing tasks and elaborate single-choice questions in simulation-based learning: how do they facilitate students’ conceptual understanding of chemical equilibria?

Past research repeatedly revealed students’ struggles to understand chemical equilibria, especially concerning their dynamic nature. Black-box simulations have proven to be helpful here. However, the effect is strongly dependent on the quality of teaching, the design principles of which are not yet fully known. One aspect of debate concerns the nature of supportive learning tasks, which require students to activate, construct and reflect on their mental models to foster conceptual understanding. In this paper, we investigate how drawing-assisted simulation-based learning promotes conceptual understanding of chemical equilibria in comparison to single-choice tasks. Both types of supporting tasks involve simulation-based activities according to the German instructional design SIMMS ( S imulation-based I nstruction for M ental M odelling in S chool), which requires students to construct their own explanations and predictions on a chemical system before exploring it via molecular dynamics simulations and revising their explanations and predictions retrospectively. In a quasi-experimental intervention study with 174 German high school students of ten chemistry courses (tenth grade), two treatment groups (drawing group and single-choice group) were compared with a control group, assessing the progress in conceptual understanding during simulation-based learning via drawings and explanations as well as pre- and post-intervention via questionnaire. Our findings reveal similar effects of drawing tasks and elaborate single-choice tasks on conceptual understanding of chemical equilibria. For equilibrium dynamics specifically, simulation-based settings featuring drawing tasks seem to be slightly more effective than simulation-based settings featuring elaborate single-choice-tasks in fostering understanding. 
What is more, simulation-based settings on the divergent phenomenon of Le Chatelier (where different final states emerge from the same initial state, depending on the nature of external perturbation) seem to be more efficient than those on the convergent nature of chemical equilibria (where several initial states with different educt/product ratios yield the same final state in equilibrium) in fostering student understanding irrespective of the mode of the supportive learning task.

Supplementary files

  • Supplementary information PDF (257K)

Article information



Y. Peperkorn, J. Buschmann and S. Schwedler, Chem. Educ. Res. Pract. , 2024, Accepted Manuscript , DOI: 10.1039/D3RP00113J

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence . You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Read more about how to correctly acknowledge RSC content .


More From Forbes

Neurodiversity: America's Largest Untapped Resource and Learning to Leverage It

Forbes Coaches Council


Chief Academic & Learning Officer (HCI Academy); Chair/Professor, Organizational Leadership (UVU); OD Consultant (Human Capital Innovations)

With over 13 million American adults living with an autism spectrum disorder or attention-deficit/hyperactivity disorder, neurodivergent individuals represent the largest untapped talent pool in the United States workforce. However, many organizations still struggle to fully embrace neurodiversity and leverage the unique skills and perspectives of neurodivergent employees.

Today we will explore the research foundation supporting neurodiversity in the workplace and provide practical application strategies focused on creating an inclusive organizational culture that unleashes neurodivergent talent.

Defining Neurodiversity

To begin, it is important to define neurodiversity and understand who this encompasses. Neurodiversity recognizes that neurological differences, such as those seen in autism, ADHD, dyslexia, and other conditions, are natural human variations rather than disorders or diseases. Neurodivergent individuals think and experience the world in different ways compared to what has traditionally been considered "typical." This includes approximately 1 in 59 individuals on the autism spectrum and 3-7% of adults with ADHD.

While these conditions are commonly diagnosed during childhood, they persist into adulthood and impact an individual's engagement and success within society and the workforce. By embracing neurodiversity, organizations can leverage the strengths and perspectives of neurodivergent employees and create a fully inclusive work culture.

The Neurodivergent Mind: Hidden Talents

Research has revealed that neurodivergent minds possess unique talents and qualities that can provide benefits within many work environments when supported appropriately. A key area of strength lies in focused, detail-oriented thinking and skills. For example:

• Autistic individuals often excel at concentrated, narrow fields of interest where routine and rules are clearly defined. This focus and attention to detail make careers like software coding, engineering, and quality control natural fits.

• Those with ADHD frequently demonstrate high energy, creativity, and passion when motivated by stimulating work they find engaging. Careers in sales, marketing, design, and entrepreneurship can leverage these talents when provided flexibility.

Additionally, research suggests neurodivergent employees may bring innovative perspectives due to divergent thinking patterns. Their brains are wired to analyze problems and view situations from a different vantage point compared to neurotypical individuals. This fresh perspective has led to breakthrough innovations from tech giants like Microsoft, SAP, and Hewlett-Packard who actively recruit autistic talent.

Implementing Neurodiversity: Policies and Culture

With an understanding of neurodivergent minds and talents, organizations must take proactive steps to welcome this population into the workforce and support their success long-term. Key research-based recommendations include:

Hiring and Onboarding Practices

• Review job descriptions and interview questions for explicit, literal language with concrete examples to eliminate potential barriers.

• Train hiring managers and recruiters on how to identify transferable skills in neurodivergent candidates despite resume gaps.

• Provide accommodations like structured interviews or telework tryouts to allow candidates to demonstrate abilities.

• Assign mentors/buddies for neurodivergent new hires to help navigate organizational culture and social norms.

Flexible and Supportive Work Environments

• Allow flexibility in work schedules, locations, or roles to leverage individual strengths and accommodate challenges.

• Provide quiet spaces to limit distractions and private offices to avoid overstimulation.

• Establish open communication and specific contacts for assistance with accommodations, feedback, or workplace adjustments.

Education and Understanding

• Conduct organizational trainings and provide reference materials to raise awareness of neurodiversity and dispel misconceptions.

• Sponsor employee resource groups focused on neurological differences for support, education, and driving initiatives.

• Share internal success stories of neurodivergent employees to further acceptance and inclusion across the organization.

When these practical steps are taken, research has shown organizations experience increased neurodivergent employee retention, higher job satisfaction, better productivity, and reduced turnover costs. Beyond financial benefits, a culture of embracing neurodiversity fosters an innovative, inclusive environment where all talents and perspectives are valued. Leading organizations are already reaping these rewards.

Industry Successes: Leveraging Neurodivergent Talents

Forward-thinking organizations have implemented programs focused on hiring, supporting, and cultivating neurodivergent talent and are seeing significant positive impacts. Three success stories include:

SAP's Autism at Work Program

The global tech leader launched its recruitment program in 2012 focused on autistic talent. By providing specialized onboarding, mentoring circles, and assistive technologies, SAP has doubled the retention rate of autistic employees compared to neurotypical hires. Participating employees excel in quality assurance testing roles where their attention to detail and process-oriented skills are strengths. Over 300 autistic individuals have been hired across 30 countries, and autistic engineers have helped improve SAP products.

Virgin Atlantic's Autism Hiring Campaign

The international airline worked with the UK's National Autistic Society to design a specific recruitment and onboarding process tailored for potential autistic flight attendants in 2018. Modifications like structured interviews, social skills training, quiet zones, and designated support staff have helped create a welcoming, neurodiverse workplace. Not only have candidates flourished in their roles handling procedures and customer service, but the rate of customers recommending Virgin Atlantic also increased.

EY's Neurodiversity Program

One of the world's largest professional services organizations launched its neurodiversity program in 2014 to attract technical talent. Employees on the autism spectrum or with dyslexia, dyspraxia, and ADHD have filled roles performing software testing, compliance reviews, and data entry where their abilities to focus, pay attention to detail, and think systematically are strengths. EY provides customized support like mentors, sensory-friendly workspaces, and flexibility. Almost 70 neurodivergent individuals have been hired across the UK, contributing over $1 million in billings while achieving higher retention rates than typical recruits.

As someone who witnessed the struggles of neurodivergent family members and friends to find fulfilling work, this issue is deeply personal to me. The research on embracing neurodiversity at work gives me hope that more organizations are starting to recognize the untapped potential in neurodivergent minds.

While there is still progress to be made, more and more companies are implementing supportive programs and cultivating an inclusive culture. If done right, embracing neurodiversity should empower employees to bring their full, unique selves to work each day. It's encouraging that data shows such initiatives don't just benefit underrepresented groups—businesses see results like increased productivity and performance as well.

If more employers come to appreciate diverse perspectives instead of perceived "abnormalities," we can help ensure neurodivergent individuals obtain meaningful careers where they feel valued and able to contribute their strengths. With continued open-mindedness, compassion and customized support, I believe the workforce of tomorrow can be one where all people, regardless of neurotype, have an equal chance to thrive.

Forbes Coaches Council is an invitation-only community for leading business and career coaches. Do I qualify?

Jonathan H. Westover, Ph.D


Online Artificial Intelligence and Machine Learning Certificate

Gain a competitive edge with our graduate-level Artificial Intelligence and Machine Learning Certificate. This program equips both novices and seasoned professionals with the essential skills to harness the power of modern Artificial Intelligence and Machine Learning in their domain. Upon completion, participants will master statistical analysis and machine learning techniques, enabling them to dissect complex data sets. Armed with the ability to synthesize and evaluate AI models, graduates will confidently tackle real-world challenges, leveraging cutting-edge tools to derive actionable insights and drive innovation in their respective fields.



Certificate Overview

The Artificial Intelligence and Machine Learning certificate is a 12-credit program that equips novices and seasoned professionals with the essential skills to harness the power of modern Artificial Intelligence and Machine Learning in their respective fields of operation.

Technical Qualifications

To be successful in this program, prospective students must demonstrate an understanding of core concepts in computer science or equivalent covered in the categories below:

  • Program Design and Concepts : programming proficiency through problem-solving with a high-level programming language, emphasizing computational thinking, data types, object-oriented design, dynamic memory management, and error handling for robust program development.
  • Data Structures : implementing essential abstract data types and algorithms covering stacks, queues, sorting, searching, graphs, and hashing; examining performance trade-offs, analyzing runtime and memory usage.
  • Algorithms : computer algorithms for numeric and non-numeric problems; design paradigms; analysis of time and space requirements of algorithms; correctness of algorithms.
  • Discrete Structures for Computing : foundations from discrete mathematics for algorithm analysis, focusing on correctness and performance; introducing models like finite state machines and Turing machines.
  • Mathematical Foundations : Calculus, Probability, and Linear Algebra.
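
As a rough illustration of the abstract-data-type material described under Data Structures above (a hypothetical sketch, not course material), a stack with constant-time operations can be implemented as follows:

```python
class Stack:
    """Array-backed stack: push, pop, and peek run in amortized O(1) time."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Amortized O(1): the underlying list occasionally resizes.
        self._items.append(item)

    def pop(self):
        # Remove and return the most recently pushed item (LIFO order).
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        # Return the top item without removing it.
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)


s = Stack()
for x in (1, 2, 3):
    s.push(x)
print(s.pop())   # 3 (last in, first out)
print(len(s))    # 2
```

Analyzing why `push` is amortized rather than worst-case O(1), and comparing this array-backed design against a linked-list implementation, is the kind of performance trade-off study the description above refers to.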

Students must take four out of five possible courses to complete this certificate. See course information below.

Information

To qualify for this certificate, you must complete 12 semester credit hours (SCH) of coursework from the following list of courses. All courses must be completed with a grade of C or above. Each course is linked to its course description within the catalog.

Courses (12 credits):

Select four of the following:*

  • CSCE 625 - Artificial Intelligence
  • CSCE 633 - Machine Learning
  • CSCE 635 - AI Robotics
  • CSCE 636 - Deep Learning
  • CSCE 642 - Deep Reinforcement Learning

* Additional courses are available with the consultation of an academic advisor.

For more information, please see the course catalog .

Why choose Engineering Online

Advance your career with our Engineering Online program! Backed by the university's esteemed reputation and national recognition in engineering education, you'll engage directly with industry leaders and a rigorous curriculum. Beyond graduation, tap into the extensive Aggie Alumni Network, offering invaluable connections to propel your career forward.

Engineering Online Benefits


Certificate Highlights

Related Academics


Online Master of Computer Science


Online Master of Engineering in Computer Engineering

Frequently Asked Questions

Discover answers to frequently asked questions tailored to assist you in making informed decisions regarding your education with Engineering Online.

Graduate Admissions

Use EngineeringCAS to apply for the distance education version of the certificate. Follow the provided instructions, as they may differ from certificate to certificate.

Graduate Tuition Calculator

To calculate cost, select the semester you’ll start, choose “Engineering” from the drop-down menu, and slide “Hours” to how many you’ll take each semester. Your total cost is Tuition and Required Fees + Engineering Program Fee (Remote).

Questions? Email [email protected]!


COMMENTS

  1. The Use of Concrete Examples Enhances the Learning of Abstract Concepts

    The use of so-called 'concrete', 'illustrative' or 'real-world' examples has been repeatedly proposed as an evidence-based way of enhancing the learning of abstract concepts (e.g. Deans for Impact, 2015; Nebel, 2020; Weinstein et al., 2018). Abstract concepts are defined by not having a physical form and so can be difficult for learners to process and understand (Harpaintner et al ...

  2. Learning Styles: An overview of theories, models, and measures

    Abstract. Although its origins have been traced back much further, research in the area of learning style has been active for—at a conservative estimate—around four decades. During that period the intensity of activity has varied, with recent years seeing a particularly marked upturn in the number of researchers working in the area.

  3. Learning Styles: Concepts and Evidence

    The authors of the present review were charged with determining whether these practices are supported by scientific evidence. We concluded that any credible validation of learning-styles-based instruction requires robust documentation of a very particular type of experimental finding with several necessary criteria. First, students must be divided into groups on the basis of their learning ...

  4. Development of abstract thinking during childhood and adolescence: The

    The focus will first be on research on relationally abstract thinking, reviewing studies which have investigated the orientation of attention towards self-generated thoughts and the manipulation and integration of relations. Second, I will discuss findings related to the processing of temporally abstract thoughts, reviewing studies of episodic ...

  5. Learning strategies: a synthesis and conceptual model

    Surface learning includes subject matter vocabulary, the content of the lesson and knowing much more. Strategies include record keeping, summarisation, underlining and highlighting, note taking ...

  6. How to Write an Abstract

    Write clearly and concisely. A good abstract is short but impactful, so make sure every word counts. Each sentence should clearly communicate one main point. To keep your abstract or summary short and clear: Avoid passive sentences: Passive constructions are often unnecessarily long.

  7. Grounded understanding of abstract concepts: The case of STEM learning

    Characterizing the neural implementation of abstract conceptual representations has long been a contentious topic in cognitive science. At the heart of the debate is whether the "sensorimotor" machinery of the brain plays a central role in representing concepts, or whether the involvement of these perceptual and motor regions is merely peripheral or epiphenomenal. The domain of science ...

  8. The science of effective learning with spacing and retrieval practice

    Alexander Renkl. Educational Psychology Review (2023) Research on the psychology of learning has highlighted straightforward ways of enhancing learning. However, effective learning strategies are ...

  9. PDF Reading and Understanding Abstracts

    Reading Abstracts Benefits Your Learning Abstracts are usually a student's first point of contact with professional scientific research. Although reading a whole article can be daunting, reading an abstract is much simpler and the benefits to your learning are direct. Here are some ways reading abstracts helps you learn: Finding sources quickly

  10. How to Write an Abstract

    Write your paper first, then create the abstract as a summary. Check the journal requirements before you write your abstract, eg. required subheadings. Include keywords or phrases to help readers search for your work in indexing databases like PubMed or Google Scholar. Double and triple check your abstract for spelling and grammar errors.

  11. Teaching the science of learning

    The science of learning has made a considerable contribution to our understanding of effective teaching and learning strategies. However, few instructors outside of the field are privy to this research. In this tutorial review, we focus on six specific cognitive strategies that have received robust support from decades of research: spaced practice, interleaving, retrieval practice, elaboration ...

  12. Effectiveness of online and blended learning from schools: A systematic

    This systematic analysis examines effectiveness research on online and blended learning from schools, particularly relevant during the Covid-19 pandemic, and also educational games, computer-supported cooperative learning (CSCL) and computer-assisted instruction (CAI), largely used in schools but with potential for outside school.

  13. 3. The Abstract

    An abstract summarizes, usually in one paragraph of 300 words or less, the major aspects of the entire paper in a prescribed sequence that includes: 1) the overall purpose of the study and the research problem(s) you investigated; 2) the basic design of the study; 3) major findings or trends found as a result of your analysis; and, 4) a brief summary of your interpretations and conclusions.

  14. Abstract Writing: A Step-by-Step Guide With Tips & Examples

    The process of writing an abstract can be daunting, but with these guidelines, you will succeed. The most efficient method of writing an excellent abstract is to centre the primary points of your abstract, including the research question and goals methods, as well as key results. Interested in learning more about dedicated research solutions?

  15. Abstracts

    Methodology: An abstract of a scientific work may include specific models or approaches used in the larger study. Other abstracts may describe the types of evidence used in the research. Results: Again, an abstract of a scientific work may include specific data that indicates the results of the project. Other abstracts may discuss the findings ...

  16. Abstract Thinking: Definition, Examples, Uses, and Tips

    Abstract thinking, also known as abstract reasoning, involves the ability to understand and think about complex concepts that, while real, are not tied to concrete experiences, objects, people, or situations. Abstract thinking is considered a type of higher-order thinking, usually about ideas and principles that are often symbolic or hypothetical.

  17. Kolb's Learning Styles & Experiential Learning Cycle

    Kolb's experiential learning theory works on two levels: a four-stage learning cycle and four separate learning styles. Much of Kolb's theory concerns the learner's internal cognitive processes. Kolb states that learning involves the acquisition of abstract concepts that can be applied flexibly in a range of situations.

  18. Writing an Abstract for Your Research Paper

    Definition and Purpose of Abstracts An abstract is a short summary of your (published or unpublished) research paper, usually about a paragraph (c. 6-7 sentences, 150-250 words) long. A well-written abstract serves multiple purposes: an abstract lets readers get the gist or essence of your paper or article quickly, in order to decide whether to….

  19. How to Write a Comprehensive and Informative Research Abstract

    An abstract should be a stand-alone summary of a research project. 1 Although abstracts are most used to provide an overview of a research project, they may also be used to summarize an implementation project related to practice, policy, or education in nursing. The abstract may be a precursor to a scientific manuscript, chapter, thesis, or ...

  20. Abstract concept learning in a simple neural network inspired by the

    Author summary Is it necessary to have advanced neural mechanisms to learn abstract concepts such as sameness or difference? Such tasks are usually considered a higher order cognitive capacity, dependent on complex cognitive processes located in the mammalian neocortex. It has always been astonishing therefore that honey bees have been shown capable of learning sameness and difference, and ...

  21. PDF Tips for Writing a Successful Abstract and Learning Objectives

    There are two parts to submitting a strong proposal: abstract and learning objectives Preparing to write an abstract • What do I have to present? • Is my topic relevant to the audience? The audience at the MN Prevention Program Sharing Conference ... • Description of the work, research, project, experience, innovative idea, etc.

  22. Deep Learning: A Comprehensive Overview on Techniques ...

    Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI) is nowadays considered as a core technology of today's Fourth Industrial Revolution (4IR or Industry 4.0). Due to its learning capabilities from data, DL technology originated from artificial neural network (ANN), has become a hot topic in the context of computing, and is widely applied in various ...

  23. The Efficacy of Artificial Intelligence-Enabled Adaptive Learning

    Dr. Max Sommer is a Learning Designer at Elsevier. Max earned his Ph.D. in Educational Technology at the University of Florida in 2022. His background includes instructional design, teaching (face-to-face, online, and blended environments), accessible design, user experience/interface design, assessment development, curriculum development, and quantitative and mixed research methods.

  24. 15 Abstract Examples: A Comprehensive Guide

    When it comes to writing an abstract for a research paper, striking a balance between consciousness and informative detail is essential. Our examples of abstracts will help you grasp this balance better. ... Informative Abstract Example 2. Social learning takes place through observations of others within a community. In diverse urban landscapes ...

  25. Innovative learning environments: a learning experience with in-service

    abstract The challenges of modern society have led educators to reconceptualize formal learning spaces, into flexible spaces, imbued with technologies and active methodologies. To achieve this goal, a course was developed in an in-service teacher education Master program to support teachers in developing learning scenarios for innovative ...

  26. NeurIPS 2024 Call for Papers

    Abstract submission deadline: May 15, 2024. ... The paper checklist is intended to help authors reflect on a wide variety of issues relating to responsible machine learning research, including reproducibility, transparency, research ethics, and societal impact. The checklist forms part of the paper submission, but does not count towards the ...

  27. Chemistry Education Research and Practice

    One aspect of debate concerns the nature of supportive learning tasks, which require students to activate, construct and reflect on their mental models to foster conceptual understanding. In this paper, we investigate how drawing-assisted simulation-based learning promotes conceptual understanding of chemical equilibria in comparison to single ...

  28. America's Largest Untapped Resource And Learning To Leverage It

    Additionally, research suggests neurodivergent employees may bring innovative perspectives due to divergent thinking patterns. Their brains are wired to analyze problems and view situations from a ...

  29. Transfer Learning Reveals Cancer-Associated Fibroblasts Are Associated

    Transfer learning using transcriptional data from patient-derived organoid and CAF cocultures provided in silico validation of CAF induction of inflammatory and EMT epithelial cell states. Further experimental validation in cocultures demonstrated integrin beta 1 (ITGB1) and vascular endothelial factor A (VEGFA) interactions with neuropilin-1 ...

  30. Online Artificial Intelligence and Machine Learning Certificate

    Email [email protected]! Our graduate-level Artificial Intelligence and Machine Learning certificate equips both novices and seasoned professionals with the essential skills to harness the power of modern Artificial Intelligence and Machine Learning in their domain.