SYSTEMATIC REVIEW article

Differentiated instruction in secondary education: a systematic review of research evidence.

Annemieke E. Smale-Jacobse

  • Department of Teacher Education, University of Groningen, Groningen, Netherlands

Differentiated instruction is a pedagogical-didactical approach that provides teachers with a starting point for meeting students' diverse learning needs. Although differentiated instruction has gained a lot of attention in practice and research, not much is known about the status of the empirical evidence and its benefits for enhancing student achievement in secondary education. The current review sets out to provide an overview of the theoretical conceptualizations of differentiated instruction as well as prior findings on its effectiveness. Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic achievement is evaluated and summarized. After a rigorous search and selection process, only 14 papers about 12 unique empirical studies on the topic were selected for review. A narrative description of the selected papers shows that differentiated instruction has been operationalized in many different ways. The selection includes studies on generic teacher trainings for differentiated instruction, ability grouping and tiering, individualization, mastery learning, heterogeneous grouping, and remediation in flipped classroom lessons. The majority of the studies show small to moderate positive effects of differentiated instruction on student achievement. Summarized effect sizes across studies range from d = +0.509 to +0.741 (omitting an outlier). These empirical findings give some indication of the possible benefits of differentiated instruction. However, they also point out that there are still severe knowledge gaps. More research is needed before drawing convincing conclusions regarding the effectiveness and value of different approaches to differentiated instruction for secondary school classes.

Introduction

Differentiation is a hot topic in education nowadays. Policy-makers and researchers urge teachers to embrace diversity and to adapt their instruction to the diverse learning needs of students in their classrooms (Schleicher, 2016; Unesco, 2017). Differentiation is a philosophy of teaching rooted in deep respect for students, acknowledgment of their differences, and the drive to help all students thrive. Such ideas imply that teachers proactively modify curricula, teaching methods, resources, learning activities, or requirements for student products to better meet students' learning needs (Tomlinson et al., 2003). When teachers deliberately plan such adaptations to facilitate students' learning and execute these adaptations during their lessons, we call it differentiated instruction. A number of developments in education have boosted the need for differentiated instruction. First, contemporary classes are becoming relatively heterogeneous because of policies focused on detracking, the inclusion of students from culturally and linguistically diverse backgrounds, and inclusive education in which students with special educational needs (SEN) attend classes along with non-SEN students (Rock et al., 2008; Tomlinson, 2015). Since early stratification of students may have unintended effects on the educational opportunities of students with varying background characteristics, addressing students' learning needs by teaching adaptively within heterogeneous classrooms has been proposed as the best choice for a fair educational system (Oakes, 2008; Schütz et al., 2008; Schofield, 2010; OECD, 2012, 2018). In addition, even within relatively homogeneous classrooms, there are considerable differences between students that need attention (Wilkinson and Penney, 2014). Second, the idea that learners have different learning needs and that a one-size-fits-all approach does not suffice is gaining momentum (Subban, 2006). 
Policy makers stress that all students should be supported to develop their knowledge and skills at their own level ( Rock et al., 2008 ; Schleicher, 2016 ) and there is the wish to improve equity or equality among students ( Unesco, 2017 ; Kyriakides et al., 2018 ). When the aim is to decrease the gap between low and high achieving students, teachers could invest most in supporting low achieving students. This is called convergent differentiation ( Bosker, 2005 ). Alternatively, teachers may apply divergent differentiation in which they strive for equality by dividing their efforts equally across all students, allowing for variation between students in the learning goals they reach, time they use, and outcomes they produce ( Bosker, 2005 ).

Although the concept of differentiated instruction is quite well-known, teachers find it difficult to grasp how differentiated instruction should be implemented in their classrooms ( Van Casteren et al., 2017 ). A recent study found that teachers across different countries infrequently adapt their instruction to student characteristics ( Schleicher, 2016 ). Struggling students may work on too difficult tasks or, conversely, high ability students may practice skills they have already mastered ( Tomlinson et al., 2003 ). Clearly, more information about effective practices is needed. A recent review and meta-analysis of differentiated instruction practices in primary education shows that differentiated instruction has some potential for improving student outcomes, when implemented well ( Deunk et al., 2018 ). However, these results may not generalize directly to secondary education, since the situation in which teachers teach multiple classes in secondary education is rather different in nature compared to primary education ( Van Casteren et al., 2017 ). For secondary education, evidence for the benefits of differentiated instruction is scarce ( Coubergs et al., 2013 ). The bulk of studies in secondary education focus on differentiation of students between classes by means of streaming or tracking ( Slavin, 1990a ; Schofield, 2010 ). Alternatively, the current study seeks to scrutinize which empirical evidence there is on the effectiveness of within-class differentiated instruction in secondary education, how studies operationalize the approach, and in which contexts the studies were performed.

Theory and Operationalizations

Operationalizing Differentiated Instruction in the Classroom

Theories of differentiation are bound by several guiding principles. They include a focus on essential ideas and skills in each content area, responsiveness to individual differences, integration of assessment and instruction, and ongoing adjustment of content, process, and products to meet students' learning needs ( Rock et al., 2008 ). Differentiation typically includes pro-active and deliberate adaptations of the content, process, product, learning environment or learning time, based on the assessment of students' readiness or another relevant student characteristic such as learning preference or interest ( Roy et al., 2013 ; Tomlinson, 2014 ). In Table 1 , we have schematized the theoretical construct of differentiated instruction in the lesson within the broader definition of within-class differentiation.


Table 1. Theoretical model of within-class differentiation.

Differentiated instruction in the classroom entails two aspects. First is the pedagogy and didactics of differentiated instruction: which teaching practices and techniques do teachers use and what do they differentiate (McQuarrie et al., 2008; Valiande and Koutselini, 2009)? Teachers may offer students adapted content, offer various options in the learning process, use different assessment products, or adapt the learning environment to students' learning needs (Tomlinson, 2014). Teachers may also offer certain students more learning time or, conversely, encourage high achievers to speed up their learning process (Coubergs et al., 2013). Regarding the process, they may use pre-teaching or extended instruction to cater to the needs of students (Smets and Struyven, 2018), or they could adapt instructions throughout the lesson. Second, the organizational aspect of differentiated instruction entails the structure in which it is embedded. There are different approaches a teacher may choose (see Table 1). In macro-adaptive approaches, teachers use some form of homogeneous clustering to organize their differentiated instruction (Corno, 2008), including fixed or flexible grouping of students based on a common characteristic such as readiness or interest. Alternatively, teachers could use heterogeneous grouping to organize their differentiated instruction. Differentiation of the learning process may occur because students divide tasks within the group based on their learning preferences or abilities. Alternatively, a teacher may suggest a division of tasks or support based on assessment of learning needs (Coubergs et al., 2013). When adaptations are taken to the level at which individual students work at their own rate on their own level, this is called individualization (Education Endowment Foundation, n.d.). The learning goals are the same, but learning trajectories are tailored to individuals' needs. 
Some authors include individualized approaches into the theoretical construct of differentiated instruction ( Smit et al., 2011 ; Coubergs et al., 2013 ; Tomlinson, 2014 ), whereas others separate it from differentiated instruction ( Bray and McClaskey, 2013 ; Roy et al., 2013 ).

Lastly, there are teaching models or strategies in which differentiated instruction has a central place. One well-known example is group-based mastery learning . In this approach, subject matter is divided into small blocks or units. For each unit, the teacher gives uniform instructions to the whole group of students. Then, a formative assessment informs the teacher which students reach the desired level of mastery of the unit (usually set at 80–90% correct). Students below this criterion receive corrective instruction in small groups, or alternatively, forms of tutoring, peer tutoring or independent practice are also possible to differentiate the learning process ( Slavin, 1987 ). Differentiated instruction may also be embedded in other instructional approaches like peer tutoring, problem-based learning, flipped classroom models etc. ( Mastropieri et al., 2006 ; Coubergs et al., 2013 ; Altemueller and Lindquist, 2017 ).
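The routing step in group-based mastery learning described above can be illustrated with a minimal sketch. The `Student` class, the function name, and the scores are hypothetical, and the 0.8 criterion is only an assumption taken from the lower end of the 80–90% range mentioned in the text; it is not drawn from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    score: float  # proportion correct on the formative assessment (0.0 to 1.0)


def split_by_mastery(students, criterion=0.8):
    """Split students into those at or above the mastery criterion
    and those who would receive corrective instruction."""
    if not 0.0 < criterion <= 1.0:
        raise ValueError("criterion must be in the interval (0, 1]")
    mastered = [s for s in students if s.score >= criterion]
    corrective = [s for s in students if s.score < criterion]
    return mastered, corrective


# Hypothetical formative assessment results for one unit.
unit_results = [Student("A", 0.92), Student("B", 0.75), Student("C", 0.80)]
mastered, corrective = split_by_mastery(unit_results, criterion=0.8)
print("Mastered:", [s.name for s in mastered])
print("Corrective instruction:", [s.name for s in corrective])
```

In practice the corrective group could receive small-group instruction, tutoring, or independent practice, as noted above; the sketch only shows the assessment-based split that precedes those choices.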

Immediate, unplanned adaptations to student needs, so-called “micro-adaptations” ( Corno, 2008 ), are not included in the theoretical model in Table 1 , since differentiated instruction is—by nature—planned and deliberate ( Coubergs et al., 2013 ; Tomlinson, 2014 ; Keuning et al., 2017 ). Furthermore, we did not include the concept of “personalization” in our model since in personalized approaches students follow their own learning trajectories, pursue their own learning goals, and co-construct the learning trajectory, which makes it notably different from typical operationalizations of differentiated instruction ( Bray and McClaskey, 2013 ; Cavanagh, 2014 ).

Differentiation as a Sum of Its Parts

As noted above, differentiated instruction during the lesson is in fact only one piece of the mosaic ( Tomlinson, 1999 ). There are a lot of other steps that are crucial for successful implementation of differentiated instruction ( Keuning et al., 2017 ; Van Geel et al., 2019 ). Table 1 shows other behaviors that are related to what teachers do in the classroom. First, continuous monitoring and (formative) assessment and differentiated instruction are inseparable ( Hall, 1992 ; Valiande and Koutselini, 2009 ; Roy et al., 2013 ; Tomlinson, 2014 ; Denessen and Douglas, 2015 ; Prast et al., 2015 ). Some teachers may be inclined to use rather one-dimensional, fixed categorizations of students based on their learning needs at some point in time ( Smets and Struyven, 2018 ). Nevertheless, high quality differentiated instruction is based on the frequent assessment of learning needs and flexible adaptations to meet those needs. Prior to the lesson including differentiated instruction, teachers should have clear goals for their students, use some form of pre-assessment , and plan their adaptive instruction ( Prast et al., 2015 ; Keuning et al., 2017 ; Van Geel et al., 2019 ). Then, teachers proceed to the actual differentiated instruction during the lesson . After the lesson, teachers should evaluate students' progress toward their goals.

Besides these steps, more general high-quality teaching behaviors are preconditions to create a good context for differentiated instruction (Wang et al., 1990; Tomlinson, 2014). For instance, creating a safe and stimulating learning environment in which students feel welcomed and respected is essential (Tomlinson, 2014). In addition, good classroom management may help teachers to implement differentiated instruction in an orderly manner (Maulana et al., 2015; Prast et al., 2015). In empirical studies, differentiated instruction has been found to be a separate domain of teaching, while at the same time being strongly interrelated with other high quality teaching behaviors (Van de Grift et al., 2014; Maulana et al., 2015; Van der Lans et al., 2017, 2018). In turn, high quality teaching behaviors like questioning, explaining the lesson content, or giving examples can be applied in a differentiated way, stressing that high quality teaching is both a contextual factor and a direct source of input for teachers' differentiated instruction.

Prior Review Studies on Differentiated Instruction

Although studies on within-class differentiated instruction in secondary education are scarce, a number of reviews and meta-analyses have shed some light on the effects on student achievement. Subban (2006) discusses a number of studies showing that adapting content or processes can make learning more engaging for students than one-size-fits-all teaching, and some studies showed positive effects of differentiated instruction on student achievement. The narrative review by Tomlinson et al. (2003) revealed studies showing that students achieve better results in mixed-ability classrooms in which the teacher differentiates instruction than in homogeneous classes where a more single-size approach is used. In a recent narrative research synthesis on adaptive teaching, one study on differentiated instruction was included. The authors found positive results of different types of adaptive teaching on students' academic and non-academic outcomes in primary education (Parsons et al., 2018). In a large-scale meta-analysis by Scheerens (2016), adaptive teaching was operationalized with some relevant indicators such as using variable teaching methods, orientation toward individual learning processes, and considering students' prerequisites. In this meta-analysis, a very small effect of adaptive teaching on student achievement was found.

A number of reviews report on specific operationalizations of within-class differentiated instruction. One of the most frequently reviewed forms is ability grouping . In within-class ability grouping, teachers cluster students into different homogeneous groups based on their abilities or readiness. In her narrative review, Tieso (2003) summarizes that ability grouping has a potential influence on student achievement when grouping is flexible, and teachers adapt their instruction to the needs of different groups. Steenbergen-Hu et al. (2016) performed a meta-synthesis including five other meta-analyses of the effects of ability grouping in K-12 education. In their study, within-class grouping was found to have at least a small positive impact on students' academic achievement (Hedges g = + 0.25). In the study of Kulik (1992) , who also combined results from different meta-analyses, a comparable effect size of Glass's Δ = + 0.25 in favor of within-class ability grouping was found. In the meta-analysis of Lou et al. (1996) on grouping in secondary education, within-class grouping was found to have a small positive effect (Cohen's d = + 0.12) on student outcomes. Substantive achievement gains were found in studies in which teachers adapted their teaching to needs of the different ability groups (Cohen's d = + 0.25), but not in studies in which teachers provided the same instruction for the different groups (Cohen's d = + 0.02). In his large meta-analysis of effects of instructional approaches on student outcomes, Hattie (2009) reported a small positive effect of within-class ability grouping on students' academic achievement (Cohen's d = +0.16). Conversely, Slavin (1990a) did not find significant effects of (between and within-class) ability grouping on achievement in secondary education. 
In a meta-synthesis of multiple meta-analyses on ability grouping—including between-class ability grouping—no overall positive effects of the approach were found ( Sipe and Curlette, 1996 ). Some studies have found that ability grouping effects may differ for subgroups of students. For instance, Lou et al. (1996) found that low-ability students learned significantly more in heterogeneous (mixed-ability) groups, average-ability students benefitted most in homogeneous ability groups, and for high-ability students group composition made no significant difference. In primary education, Deunk et al. (2018) found a negative effect of within-class homogeneous grouping for low achieving pupils. Conversely, Steenbergen-Hu et al. (2016) concluded that high-, average-, and low-ability students all benefited equally from ability grouping. Thus, the findings on differential effects of ability grouping remain inconclusive.
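The reviews above report effects in three related metrics: Cohen's d, Hedges' g, and Glass's Δ. The sketch below uses the common textbook definitions of these standardized mean differences to show how they relate; it does not claim to reproduce how any cited meta-analysis computed its estimates. All function names and the group statistics are hypothetical.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Mean difference divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Cohen's d multiplied by the usual small-sample correction factor."""
    d = cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c)
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d * correction


def glass_delta(mean_t, mean_c, sd_c):
    """Mean difference divided by the control group's standard deviation."""
    return (mean_t - mean_c) / sd_c


# Hypothetical test scores: differentiated (t) versus control (c) classes.
stats = dict(mean_t=75.0, mean_c=72.5, sd_t=10.0, sd_c=10.0, n_t=30, n_c=30)
d = cohens_d(**stats)
g = hedges_g(**stats)
delta = glass_delta(stats["mean_t"], stats["mean_c"], stats["sd_c"])
print(f"d = {d:+.3f}, g = {g:+.3f}, delta = {delta:+.3f}")
```

With equal group standard deviations the three metrics nearly coincide, and g is slightly smaller than d because of the small-sample correction. They diverge more when group variances differ or samples are small, which is one reason effect sizes from different reviews are not always directly comparable.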

Another possible approach to differentiated instruction is tiering. Tiering refers to using the same curriculum material for all learners, but adjusting the depth of content, the learning activity process, and/or the type of product developed by the student to students' readiness, interest or learning style ( Pierce and Adams, 2005 ; Richards and Omdal, 2007 ). Teachers design a number of variations or tiers to a learning task, process or product, to which students are assigned based on assessed abilities. To our knowledge, there are no specific reviews of the literature or meta-analyses summarizing the effects of tiering on student achievement, but the approach is often combined with homogeneous (ability) grouping.

Alternatively, turning to heterogeneous grouping as an organizational structure for differentiated instruction, there is evidence that students of varying backgrounds working together may learn from each other's knowledge, from observing each other, and from commenting on each other's errors (Nokes-Malach et al., 2015). However, based on their narrative review about differentiated instruction in secondary schools, Coubergs et al. (2013) concluded that there is little known about the effectiveness of differentiated instruction in heterogeneous settings. They found that guiding heterogeneous groups is challenging for teachers, and that it is difficult to address the learning needs of all students in these mixed groups.

Reviews of effectiveness of individualized instruction indicate small effects on student outcomes. Hattie (2009) reports a small effect of individualization on student achievement (Cohen's d = +0.23). In addition, in another review a wide range of effects across meta-analyses was found of individualization on academic achievement of students (from −0.07 to +0.40; Education Endowment Foundation, n.d. ). Currently, mostly ICT-applications are used to individualize instruction. Review studies show that such adaptive ICT applications may considerably improve student achievement ( Ma et al., 2014 ; Van der Kleij et al., 2015 ; Kulik and Fletcher, 2016 ; Shute and Rahimi, 2017 ).

Guskey and Pigott (1988) performed a meta-analysis on the effects of group-based mastery learning on students' academic outcomes from grade one up to college. They reported positive effects on students' academic achievement as a result of the application of group-based mastery learning for, among others, high school students (Hedges g = +0.48). Later on, Kulik et al. (1990) and Hattie (2009) also reported relatively large positive effects of group-based mastery learning on student achievement (ES = +0.59 and Cohen's d = +0.58, respectively). Low ability students were generally found to profit most from the convergent approach ( Guskey and Pigott, 1988 ; Kulik et al., 1990 ). Mastery learning was among the most effective educational approaches in a meta-synthesis of multiple meta-analyses ( Sipe and Curlette, 1996 ). However, mastery learning may be particularly valuable to train specific skills but may yield fewer positive results for more general skills as measured by standardized tests ( Slavin, 1987 , 1990b ). Mastery learning has also been incorporated into broader interventions in secondary education such as the IMPROVE method ( Mevarech and Kramarski, 1997 ).

Overall, previous review studies provide some evidence that differentiated instruction has the potential to affect students' academic achievement positively, with small to medium effects. However, the evidence is limited and heterogeneous in nature. The effectiveness of some approaches to differentiated instruction, such as ability grouping, has been reviewed extensively, while other approaches have received less attention. Furthermore, most studies were conducted some time ago and in the context of primary education, while only a few studies focus specifically on secondary education.

Contextual and Personal Factors Influencing Differentiated Instruction

When analyzing the effectiveness of differentiated instruction, it is important to acknowledge that classroom processes do not occur in a vacuum. Both internal and external sources determine whether teachers will succeed in developing complex teaching skills ( Clarke and Hollingsworth, 2002 ). In the case of differentiated instruction, teacher-level variables like education, professional development and personal characteristics like knowledge, attitudes, beliefs, values and self-efficacy may influence their behavior ( Tomlinson, 1995 ; Tomlinson et al., 2003 ; Kiley, 2011 ; De Jager, 2013 ; Parsons et al., 2013 ; Dixon et al., 2014 ; De Neve and Devos, 2016 ; Suprayogi et al., 2017 ; Stollman, 2018 ). Teachers need thorough content knowledge and a broad range of pedagogical and didactic skills to plan and execute differentiated instruction ( Van Casteren et al., 2017 ). At the classroom level, diversity of the student population ( De Neve and Devos, 2016 ) and class-size ( Blatchford et al., 2011 ; Suprayogi et al., 2017 ; Stollman, 2018 ) influence interactions between teachers and their students. Moreover, school characteristics matter. For instance, a school principal's support can influence implementation of differentiated instruction ( Hertberg-Davis and Brighton, 2006 ). Additionally, structural organizational conditions, such as time and resources available for professional development, and cultural organizational conditions such as the learning environment, support from the school board, and a professional culture of collaboration may influence teaching ( Imants and Van Veen, 2010 ; Stollman, 2018 ). Teachers have reported that preparation time is a crucial factor determining the implementation of differentiated instruction ( De Jager, 2013 ; Van Casteren et al., 2017 ). 
Moreover, collaboration is key; a high pedagogical team culture influences both the learning climate and the implementation of differentiated instruction ( Smit and Humpert, 2012 ; Stollman, 2018 ). Lastly, country level requirements and (assessment) policies that stress differentiated instruction may influence implementation ( Mills et al., 2014 ).

Research Questions

Researchers and teachers lack a systematic overview of the current empirical evidence for different approaches to within-class differentiated instruction in secondary education. Therefore, we aim to (1) give an overview of the empirical literature on effects of differentiated instruction on student achievement in secondary education, and (2) consider the degree to which contextual and personal factors inhibit or enhance the effects of within-class differentiated instruction.

Our study is guided by the following research questions:

RQ1. What is the research base regarding the effects of within-class differentiated instruction on students' academic achievement in secondary education?

RQ2. How are the selected approaches to differentiated instruction operationalized?

RQ3. What are the overall effects of differentiated instruction on students' academic achievement?

RQ4. Which contextual and personal factors inhibit or enhance the effects of differentiated instruction on student achievement?

Based on previous research, we expect to find literature on multiple possible approaches to differentiated instruction in the classroom. There will probably be more evidence for some operationalizations (like ability grouping) than for others. Overall, we hypothesize that differentiated instruction will have a small to medium positive effect on students' academic achievement. Several contextual and personal factors may affect the implementation. In this review, we will include information about relevant contextual and personal variables, when provided, in the interpretation of the literature.

Study Design

In order to provide a systematic overview of the literature on within-class differentiated instruction, a best evidence synthesis (Slavin, 1986, 1995; Best Evidence Encyclopedia, n.d.) was applied. This was done by defining, a priori, consistent and transparent standards to identify relevant studies about within-class differentiated instruction. Each selected study is discussed in some detail and results are evaluated. If enough comparable papers are found, findings can be pooled across studies. The best-evidence strategy is particularly suitable for topics, such as differentiated instruction, for which the body of literature is expected to be rather small and diverse. In such cases, it is important to learn as much as possible from each study, not just to average quantitative outcomes and study characteristics (compare Slavin and Cheung, 2005). In a recent review study on differentiated instruction in primary schools, the best evidence synthesis approach was used as well (Deunk et al., 2018). In that study, the authors mentioned the benefits of selecting studies using strict pre-defined criteria (to avoid a garbage-in, garbage-out effect). Moreover, combining a meta-analysis with relatively extended descriptions of the included studies in order to make the information more fine-grained was found to improve the interpretability of the results.

Working Definition of Differentiated Instruction

To select relevant studies for our review, we used the following working definition of differentiated instruction: Differentiated teaching in the classroom consisting of planned adaptations in process, learning time, content, product or learning environment for groups of students or individual students. Adaptations can be based on achievement/readiness or another relevant student characteristic (such as prior knowledge, learning preferences, and interest) with the goal of meeting students' learning needs.

Adaptations that are merely organizational, such as placing students in homogeneous groups without adapting the teaching to relevant inter-learner differences, were excluded. Interventions using approaches like peer tutoring, project-based learning and other types of collaborative learning were eligible, but only when planned differentiated instruction was applied based on relevant student characteristics (e.g., by assigning specific roles based on students' abilities). Beyond the scope of this review were studies on differentiated instruction outside the classroom such as between-class differentiation (streaming or tracking), tutoring outside the classroom, or stratification of students between schools.

Search Strategy

The studies for our best evidence synthesis were identified in a number of steps. First, we performed a systematic search in the online databases ERIC, PsycINFO, and Web of Science (SSCI). Following the guidelines of Petticrew and Roberts (2006) , a set of keywords referring to the intervention (differentiation combined with keywords referring to instruction), the population (secondary education) and the outcomes of interest (academic outcomes) were used. We limited the findings to studies published between 2006 and 2016 that were published in academic journals. Although this first search yielded relevant studies, it failed to identify a number of important studies on differentiated instruction practices known from the literature. This was because search terms like “differentiation” and “adaptive” were not used in all relevant studies. Some authors used more specific terms such as ability grouping, tiered lessons, flexible grouping and mastery learning. Therefore, an additional search was performed in ERIC and PsycINFO with more specific keywords associated with differentiated instruction. We added keywords referring to various homogeneous or heterogeneous clustering approaches, to mastery learning approaches, or to convergent or divergent approaches (see Appendix A for the full search string) 1 .

In addition to this protocol-driven approach, we used more informal approaches to trace relevant studies. We cross-referenced the selected papers and recent review studies on related topics, used personal knowledge about relevant papers, and consulted experts in the field. We only used newly identified papers if they were from journals indexed in the online databases EBSCOhost, Web of Science, or Scopus, to avoid selecting predatory journal outputs.

Selection of Papers

The identified papers were screened in pre-designed Excel sheets in two stages. First, two independent coders applied a set of inclusion criteria (criteria 1–8) to all papers based on title, abstract, and keywords. The papers that met the following conditions were reviewed in full text: (1) one or both of the coders judged the paper to be included for full text review based on the inclusion criteria using the title, abstract, and keywords, or (2) the study fulfilled some of the inclusion criteria but not all criteria could be discerned clearly from the title, abstract or keywords. Second, in a full text review, two coders applied the inclusion criteria again after reading the full paper. If a study met the basic criteria 1–8, additional methodological criteria (9–13) were checked in order to make the final selection. To assure the quality of the coding process, full-text coding of both coders was compared. Differences between coders about whether the study met certain inclusion criteria were resolved by discussion and consensus. The dual coding process by two reviewers was used since this substantially increases the chance that eligible studies are rightfully included ( Edwards et al., 2002 ). Only studies that met all 13 inclusion criteria were included in the review.
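The two-stage screening logic described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' actual Excel-based workflow; the function names and the dictionary representation of the criteria are hypothetical.

```python
def needs_full_text_review(coder_a_includes, coder_b_includes, criteria_unclear):
    """Stage 1 (title, abstract, keywords): a paper proceeds to full-text
    review if at least one coder judges it eligible, or if some criteria
    could not be discerned clearly."""
    return coder_a_includes or coder_b_includes or criteria_unclear


def include_in_review(criteria_met):
    """Stage 2 (full text): criteria_met maps criterion numbers 1-13 to the
    consensus judgment (True/False) reached by the two coders.
    Basic criteria 1-8 are checked first, then methodological criteria 9-13."""
    basic_ok = all(criteria_met.get(i, False) for i in range(1, 9))
    if not basic_ok:
        return False
    return all(criteria_met.get(i, False) for i in range(9, 14))


# Hypothetical paper: only one coder flags it at stage 1, so it is read in full.
print(needs_full_text_review(False, True, False))

# Hypothetical full-text outcome: all basic criteria met, criterion 11 not met.
judgments = {i: True for i in range(1, 14)}
judgments[11] = False
print(include_in_review(judgments))
```

The inclusive rule at stage 1 (either coder suffices) errs on the side of reading too many papers in full, while the strict rule at stage 2 (all 13 criteria) keeps the final selection narrow.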

Inclusion Criteria

The following inclusion criteria were used to select the relevant papers. These criteria were based on a prior review study on differentiated instruction in primary education ( Deunk et al., 2018 ) and the best evidence studies by Slavin and colleagues ( Slavin and Cheung, 2005 ; Slavin et al., 2008 , 2009 ; Slavin, 2013 ; Cheung et al., 2017 ).

1. Within-class differentiated instruction: The study is about the effect of within-class differentiated instruction, as defined in our study (see section Working Definition of Differentiated Instruction).

2. Practicality : The differentiated instruction approach is practical for teachers ( Janssen et al., 2015 ). Teachers must be able to apply this intervention themselves in a regular classroom. In addition, the intervention is time- and cost-effective, meaning that it should not take excessive training or coaching nor use of external teachers in the classroom to implement the approach. Interventions in which ICT applications are used to support the teachers' instruction and can be controlled by the teacher (e.g., in blended learning environments in which teachers make use of on-line tools or PowerPoint) could be included. However, studies on the effects of fully computerized adaptive programs (e.g., with adaptive feedback or intelligent tutors) or differentiation approaches for which an external teacher (or tutor) is needed (such as pullout interventions) were excluded.

3. Study type: Students in a differentiated instruction intervention condition are compared to those in a control condition in which students are taught using standard practice (“business as usual”), or to an alternative intervention (compare Slavin et al., 2008 , 2009 ; Slavin, 2013 ; Cheung et al., 2017 ; Deunk et al., 2018 ). The design could be truly randomized or quasi-experimental or matched (the control condition could be a group of other students in a between-group design, or students could be their own control group in a within-groups design) 2 . Additionally, large-scale survey designs in which within-class differentiated instruction is retrospectively linked to academic outcomes were eligible for inclusion (compare Deunk et al., 2018 ). Surveys have increasingly been used in reviews of effectiveness, although one must keep in mind that no finding from a survey is definitive ( Petticrew and Roberts, 2006 ).

4. Quantitative empirical study : The study contains quantitative empirical data of at least 15 students per experimental group (compare Slavin et al., 2008 , 2009 ; Slavin, 2013 ; Cheung et al., 2017 ; Deunk et al., 2018 ). Other studies such as qualitative studies, case studies with fewer than 15 students, or theoretical or descriptive studies were excluded.

5. Secondary education: The study was executed in secondary education, for example in middle schools, high schools, vocational schools, sixth-form schools, or comparable levels of education for students from an age of about 11 or 12 years onwards. In some contexts, secondary schools could include grades as low as five, but they usually start with sixth or seventh grade (compare Slavin, 1990a ).

6. Mainstream education : The study was performed in a mainstream school setting (in a regular school, during school hours). Studies that were performed in non-school settings (e.g., in a laboratory or the workplace) or in an alternate school setting (e.g., an on-line course, a summer school, a special needs school) were excluded.

7. Academic achievement : Academic achievement of students is reported as a quantitative dependent variable, such as mathematics skills, language comprehension, or knowledge of history.

8. Language : The paper is written in English or Dutch (all authors master these languages), but the actual studies could be performed in any country.

Additional inclusion criteria used in the full-text review:

9. Differentiated instruction purpose: The study is about differentiated instruction with the aim of addressing cognitive differences (e.g., readiness, achievement level, intelligence) or differences in motivation / interest or learning profiles ( Tomlinson et al., 2003 ). Studies in which adaptations were made based on other factors such as culture (“culturally responsive teaching”) or physical or mental disabilities are beyond the scope of this review.

10. Implementation : The intervention is (at least partly) implemented. If this was not specifically reported, implementation was assumed.

11. Outcome measurement: The dependent variables/outcome measures include quantitative measures of achievement. Experimenter-made measures were accepted if they were comprehensive and fair to both groups; no treatment-inherent measures were included ( Slavin and Madden, 2011 ).

12. Effect sizes : The paper provides enough information to calculate or extract effect sizes about the effectiveness of the differentiated instruction approach.

13. Comparability : Pretest information is provided (unless random assignment of at least 30 units was used and there were no indications of initial inequality). Studies with pretest differences of more than 50% of a standard deviation were excluded because, even with analyses of covariance, large pretest differences cannot be adequately adjusted for ( Slavin et al., 2009 ; Slavin, 2013 ; Cheung et al., 2017 ; compare Deunk et al., 2018 ).
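The comparability rule in criterion 13 can be illustrated with a short check. This is a minimal sketch under one assumption that the text does not settle: the pretest difference is standardized here by the pooled standard deviation of the two groups. The function names and the example values are hypothetical.

```python
import math


def pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl):
    """Pooled standard deviation of two independent groups."""
    numerator = (n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    return math.sqrt(numerator / (n_exp + n_ctrl - 2))


def pretest_comparable(mean_exp, sd_exp, n_exp,
                       mean_ctrl, sd_ctrl, n_ctrl, threshold=0.5):
    """Return True if the absolute pretest difference is at most
    `threshold` standard deviations (0.5 = 50% of an SD).
    Assumption: the pooled SD is used as the standardizer."""
    gap = abs(mean_exp - mean_ctrl) / pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl)
    return gap <= threshold


# Illustrative values only, not taken from any reviewed study.
print(pretest_comparable(50, 10, 20, 54, 10, 20))  # gap of 0.4 SD
print(pretest_comparable(50, 10, 20, 56, 10, 20))  # gap of 0.6 SD
```

With these made-up numbers, the first comparison would pass the criterion and the second would lead to exclusion.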

Data Extraction

After the final selection of papers based on the criteria above, relevant information was extracted from the papers and coded by two independent reviewers in a pre-designed Excel sheet (see Appendix B ). Discrepancies between the extractions of both reviewers were discussed until consensus was reached. Missing information regarding the methodology or results was requested from the authors by e-mail (although only a few responses were received). The content coding was used (in addition to the full texts) to inform the literature synthesis and to extract data for the calculation of effect sizes.

Data Analysis

We transformed all outcomes on student achievement from the selected papers to Cohen's d , which is the standardized mean difference between groups ( Petticrew and Roberts, 2006 ; Borenstein et al., 2009 ). To do so, the program Comprehensive Meta-Analysis (CMA) version 2 was used ( Borenstein et al., 2009 ). Effect sizes were calculated using a random effects model since we have no reason to assume that the studies are “identical” in the sense that the true effect size is exactly the same in all studies ( Borenstein et al., 2010 ). Methods of calculating effects using different types of data are described in Borenstein et al. (2009) and Lyons (2003) . When outcomes were reported in multiple formats in the paper, we chose the means and standard deviations to arrive at transparent and comparable outcomes. The effects were standardized using post-score standard deviations for measures where this was needed. For some outcome formats, CMA requires the user to insert a pre-post correlation. Since none of the selected papers provided this number, we assumed a correlation of 0.80 in the analyses, since it is reasonable to assume such a pre-post correlation in studies in secondary education ( Swanson and Lussier, 2001 ; Cole et al., 2011 ). This correlation does not affect the Cohen's d statistic but does affect its variance component. For the papers in which multiple outcome measures were reported, we used the mean of the different measures. In case only subgroup means (of subgroups within classes or schools) were reported, we combined the outcomes of the subgroups with study as the unit of analysis to calculate a combined effect ( Borenstein et al., 2009 ). For one study in which the intervention was executed in separate schools differing in implementation and findings, we have included the schools in the analyses separately (using schools in which the intervention took place as the unit of analysis).
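For readers who want to see the arithmetic behind these steps, the following sketch computes Cohen's d from post-test means and standard deviations and pools several effects under a random effects model. It is not the CMA implementation: the DerSimonian-Laird estimator used for the between-study variance is an assumption (it is one common choice), the function names are hypothetical, and the example numbers are made up.

```python
import math


def cohens_d(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference for two independent groups,
    using the pooled post-test SD. Returns (d, variance of d)."""
    pooled = math.sqrt(((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2)
                       / (n_exp + n_ctrl - 2))
    d = (mean_exp - mean_ctrl) / pooled
    variance = (n_exp + n_ctrl) / (n_exp * n_ctrl) + d ** 2 / (2 * (n_exp + n_ctrl))
    return d, variance


def random_effects_pool(effects):
    """Pool a list of (d, variance) pairs with a random effects model.
    Assumption: DerSimonian-Laird estimate of the between-study variance.
    Returns (pooled d, standard error, tau squared)."""
    weights = [1 / v for _, v in effects]
    fixed = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - fixed) ** 2 for w, (d, _) in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for _, v in effects]
    pooled = sum(w * d for w, (d, _) in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1 / sum(re_weights)), tau2


# Illustrative values only, not taken from any reviewed study.
study_a = cohens_d(55, 10, 50, 50, 10, 50)
study_b = cohens_d(62, 12, 40, 59, 12, 40)
print(random_effects_pool([study_a, study_b]))
```

When the studies disagree more than sampling error would predict, tau squared becomes positive and the random effects weights move closer to equal weighting than in a fixed effect model.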

Search Results

Our search led to 1,365 hits from the online databases ERIC, PsycINFO and Web of Science and 34 cross-referenced papers. Excluding duplicates, 1,029 papers were reviewed. See Appendix C for a flow-chart of the selection process. In total, 14 papers met the eligibility criteria for inclusion. Papers reporting on the same project and outcomes were taken together as one study. The papers by Altintas and Özdemir (2015a , b) report on the same project. The same applies to two other papers as well ( Vogt and Rogalla, 2009 ; Bruhwiler and Blatchford, 2011 ). Thus, in the end, 12 unique studies were included in our review and meta-analysis leading to 15 effects in total (since for one study the four different schools in which the intervention was executed were taken as the unit of analysis).

Study Characteristics

In Table 2 , the characteristics and individual effects of the studies included in our review are summarized. The selection of studies includes eight quasi-experimental studies in which classes were randomly allocated to a control or experimental condition ( Mastropieri et al., 2006 ; Richards and Omdal, 2007 ; Huber et al., 2009 ; Vogt and Rogalla, 2009 ; Little et al., 2014 ; Altintas and Özdemir, 2015a , b ; Bal, 2016 ; Bhagat et al., 2016 ), three studies in which schools were randomly allocated to conditions ( Wambugu and Changeiywo, 2008 ; Mitee and Obaitan, 2015 ; Bikić et al., 2016 ), and one survey study ( Smit and Humpert, 2012 ). These studies covered a wide range of academic subjects, including science, mathematics and reading. In terms of the number of participating students, six studies were small-scale studies ( N < 250) and six were large-scale studies ( N > 250). However, note that all experiments had nested designs. Only the studies of Little et al. (2014) and Vogt and Rogalla (2009) have at least 15 cases in each experimental condition at the level of randomization. Four studies were performed in the United States of America, five in Europe, one in Taiwan, and two in Africa. All studies were performed in secondary education, but the Vogt and Rogalla study represents a combined sample of primary and secondary education students.


Table 2 . Summary of contents of the selected papers and the effects of the individual studies on student achievement.

Literature Synthesis

To further reflect on the findings from the selected studies with respect to our research questions, we give a more detailed description of the study designs, implementations and findings here.

Studies on Generic Approaches to Differentiated Instruction

Although adaptive teaching does not necessarily include differentiated instruction, we found two quasi-experimental studies on adaptive teaching that (to some extent) matched our definition of differentiated instruction. In the large-scale study by Vogt and Rogalla (2009) , teachers were trained in adaptive teaching competency to improve their teaching and, in turn, to maximize students' learning. In the project “Adaptive Teaching Competency,” which was also reported on in the paper of Bruhwiler and Blatchford (2011) , adaptive teaching was characterized as including: sufficient subject knowledge, taking the diverse pre-conditions and learning processes of students into account, using various effective teaching methods for the whole group, differentiating for students' varying learning needs, supporting students in the regulation of learning processes, and using effective classroom management. In the project, teachers learned to focus both on adaptive planning prior to the lesson and on making adaptations during the lesson. Teachers of 27 primary school classes and 23 secondary school classes with 623 students were recruited to learn more about adaptive teaching. They participated in a 2-day workshop, received several coaching sessions in the classroom and used the adaptive teaching framework in their classes for eight science lessons. After the intervention, it was measured, among other things, whether teachers differentiated to meet students' diverse skills and interests. Teachers' competency in planning adaptive lessons had significantly increased, but their “Adaptive Implementation” did not change much. Unfortunately, in the coaching sessions, teachers often did not discuss issues of adapting to the diversity of students' skills and their pre-existing knowledge. The results of students in the experimental classes were compared to those of 299 control students. 
The authors reported that the secondary students in the experimental group outperformed their counterparts in control classrooms on a science achievement test after the intervention. However, since we only had access to the means of the combined sample in primary and secondary education we used the combined sample results. Our calculation based on these means shows a small non-significant intervention effect of d = +0.133 (see Table 2 ). The authors argue that more coaching may be needed to foster the implementation of adaptive teaching in the classroom, although it would decrease the cost-effectiveness of the approach.

In the study by Huber et al. (2009) , teachers learned about adaptive teaching in a workshop, and were asked to incorporate it into their lessons. The intervention was the Prevention through Alternative Learning Styles (PALS) program aimed at prevention of alcohol, tobacco, and other drug (ATOD) abuse. Prevention of ATOD abuse is rather commonplace in secondary schools. For instance, in the US, students typically take part in prevention programs more than once in their school career ( Kumar et al., 2013 ) and European schools are also encouraged to take action in promoting students' health ( World Health Organization, 2011 ). Teachers attended a 1-day workshop about adaptive teaching by means of: modifying time, increasing or decreasing the number of items to be learned or completed, increasing the level of support, changing the input or the way the material is presented, changing the output, adapting the amount of active participation, changing to alternate goals and expectations, adapting the level of difficulty for each individual, and providing different instruction and materials. In addition, teachers learned about alternative learning styles and disabilities. PALS materials were developed by the research team to match students' specific needs and related abilities. In a quasi-experimental study, four grade 6–8 teachers taught the 10 PALS intervention lessons to their classes and PALS team members taught another 24 classes. School officials suggested a convenience comparison group receiving the traditional prevention program. Compared to the control group, the PALS program had a large significant effect of d = +1.374 on students' knowledge of the effects of ATOD (see Table 2 ). These results were replicated in a second, within-group repeated measures design. 
Although the findings seem promising, more information is needed about how the approach was implemented; in the paper, it is unclear how teachers applied the information from the training in their instruction. Moreover, replication of the findings in a study in which teachers teach all project lessons may also help clarify whether the effects of the intervention were affected by the fact that project staff taught most lessons in the experimental condition.

We selected only two studies using a generic approach to differentiated instruction, and the studies described above differ considerably regarding their intervention, school subject, and findings. This makes it hard to estimate the overall effectiveness of generic approaches. The study of Huber et al. seems promising, but unfortunately, the study of Vogt and Rogalla did not lead to positive achievement effects for students across the primary and secondary school group. More studies are needed to gain insight into how teachers could effectively and efficiently be supported or coached to master the multifaceted approach of differentiated instruction.

Studies on Differentiated Instruction Using Homogeneous Clustering

A number of selected studies use a macro-adaptive approach to differentiated instruction ( Richards and Omdal, 2007 ; Altintas and Özdemir, 2015a , b ; Bal, 2016 ; Bikić et al., 2016 ). Of these studies, the study of Richards and Omdal (2007) has the most robust design. In this study, first year students were randomized over 14 classes and then classes were randomly assigned to conditions. Within the experimental condition, the science content for ability groups was adapted to students' learning needs by means of tiering. To study the effectiveness of the approach, 194 students were randomly assigned to classes in which the teachers used tiered content, while 194 other students were in the control group that worked with the midrange curriculum for 4 weeks. Each teacher was assigned at least one treatment and one control class. After a pretest, students in the experimental condition were assigned to three ability groups: a low background knowledge group (around the lowest scoring 10 percent of all students), a midrange group (about 80 percent), and a high background group (the highest scoring 10 percent). One of the researchers produced the instructional materials for the study. To develop the differentiated materials, first core instructional materials were developed that were aimed at the midrange group. Next, the content was differentiated for the low and high background students. Adaptations were made to the depth of content, the degree of teacher dependence and structuring, the number of steps, the skills, time on task, the product, and the available resources. Students were asked to work together within their tiers. There was an overall small significant effect of the intervention of d = +0.284 in favor of the tiering condition (see Table 2 ). Closer analyses of subgroup results (see Table 2 ) show that this is particularly due to a large effect for the low background learners of d = +1.057. 
For high-range learners, differences between the control condition and the experimental condition are near to zero ( d = +0.077), although this may be partly due to a ceiling effect on the test. The authors conclude that curriculum differentiation through tiered assignments can be an effective way to address the needs of low achieving students. They recommend, however, that it should be accompanied by professional support and that teachers who design the tiers should have substantial subject matter knowledge and experience with learners with different needs.
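The allocation of students to tiers in the Richards and Omdal study (roughly the lowest 10 percent, the middle 80 percent, and the highest 10 percent on the pretest) can be sketched as a simple ranking procedure. The exact cut-off rules used by the authors are not reported here, so the rounding, the handling of ties, and the function name below are assumptions.

```python
def assign_tiers(pretest_scores, low_share=0.10, high_share=0.10):
    """Assign students to 'low', 'midrange', or 'high' background tiers
    based on their rank on the pretest.

    pretest_scores: dict mapping a student id to a pretest score.
    Assumption: ties are broken by input order and group sizes are rounded.
    """
    ranked = sorted(pretest_scores, key=pretest_scores.get)  # lowest score first
    n = len(ranked)
    n_low = round(n * low_share)
    n_high = round(n * high_share)
    tiers = {}
    for position, student in enumerate(ranked):
        if position < n_low:
            tiers[student] = "low"
        elif position >= n - n_high:
            tiers[student] = "high"
        else:
            tiers[student] = "midrange"
    return tiers


# Illustrative class of 20 students with made-up scores 0..19.
scores = {f"s{i}": i for i in range(20)}
tiers = assign_tiers(scores)
print(sorted(s for s, t in tiers.items() if t == "low"))   # ['s0', 's1']
print(sorted(s for s, t in tiers.items() if t == "high"))  # ['s18', 's19']
```

A design like this makes the midrange tier the default, which matches the description that the core materials were written for that group first and then adapted for the two smaller tiers.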

In the study by Bikić et al. (2016) , the effectiveness of differentiated instruction of geometry content within a problem-based learning approach is studied. In the quasi-experiment, the authors compare an approach in which students solved mathematics problems on three levels differing in complexity using problem-based learning to a control condition. The study design is not described in detail, but since the authors state “students of the experimental group and control group were not the students from the same school” it seems that schools were allocated to an experimental or control condition to study the effectiveness of the approach. Within the experimental condition, 88 secondary school students were assigned to three groups (low-, average-, or high-achievers) based on an initial test, and then worked on adapted levels of geometry problems for 16 lessons before completing a final test. An example of the differentiated materials in the paper shows that the three ability groups all received a different task (which was a variation of the same task differing in complexity). Unfortunately, it is not described exactly how the students processed the content. In the control condition, 77 other students were taught in the usual, traditional manner. Students in the ability grouping condition outperformed the control students with a moderate positive effect of d = +0.539 (see Table 2 ). Subgroup analyses indicate that the approach was most effective for average ability students; students in the high achieving group did not outperform high achieving students in the control group. Do note, however, that the high achieving groups were small (12 experimental vs. 14 control students); hence, these results should be interpreted with caution. More research would be needed to clarify to what extent the differentiated content improved the effectiveness of the problem-based learning approach.

A different grouping approach is one based on preferred learning styles. In the study of Bal (2016) , grade 6 students completed an algebra pre-test and filled out a learning style inventory (kinesthetic, visual, affective learning styles). Algebra learning materials and activities were adapted for two tiers (low performing and high performing students) and were also adapted to the different learning styles of students in the experimental group. Despite the fact that there are reasons not to use learning styles as a distinction between students (see e.g., Kirschner et al., 2018 ), the authors did find large positive effects of the tiering approach after 4 weeks of teaching ( d = +1.085, see Table 2 ). Do note, however, that ANCOVA results were used to calculate the effects, which may lead to some positive bias in this estimate. Based on information from student interviews presented in the paper, it seems that students experienced success in learning and enjoyed the materials and activities developed for the experimental condition. It is unclear, however, how the materials and activities were made more appropriate for students' readiness (and learning style) and how they differed from the approach in the control condition that used traditional teaching. In that sense, it is difficult to judge what caused these positive findings. In another study on mathematics by Altintas and Özdemir (2015a , b) , teachers assessed students' preferred learning modalities by administering a multiple intelligences inventory. The data obtained from the inventory were used to determine the students' project topics, to select the teachers' teaching strategies, and to determine the relevant factors for motivating students. The effectiveness of the approach, which was originally designed for gifted students, was evaluated in a sample of 5th to 7th grade students in Turkey. 
After pretesting, one class of students was allocated to the experimental condition and one class of the same grade formed the control group. The authors report a very large effect of the intervention after six practices lasting 7 weeks each when compared to classes working with the Purdue model for both grade 6 and grade 7 students ( d = +4.504 across subgroups, see Table 2 ). However, it is difficult to discern what exactly caused this finding. Little information was provided about how exactly the teachers planned and executed the lessons and how students' activities and objectives were matched to their dominant intelligences, nor was there much information about possible confounding factors. In addition, since the researcher who developed the multiple intelligences theory admits that the theory is no longer up to date ( Gardner, 2016 ), one could question whether learning preferences could be better determined based on another distinction.

In summary, from the studies we found on the effectiveness of approaches to differentiated instruction using homogeneous clustering, we could infer that overall small to medium sized effects (and in some cases also large effects) of the approach on student achievement can be achieved in science and mathematics subjects. The study of Altintas and Özdemir shows a very large effect of this approach and the study of Bal also shows large effects. However, before we can corroborate these findings, more information would be needed. When we look at the operationalizations of differentiated instruction in the two larger studies, we see that teachers used variations of learning tasks that were designed to better match the learning needs of different ability groups. Differential effects for student outcomes are somewhat variable; the results are most pronounced for the low achieving group in the study by Richards and Omdal (2007) , and for the low and average achieving group in the study of Bikić et al. (2016) . In both studies, effectiveness for the high achieving group seemed negligible.

Studies on Mastery Learning

In two included studies, mastery learning was used to boost student achievement in physics and mathematics. The quasi-experimental studies reporting on mastery learning approaches in secondary education used randomization of schools to conditions and were both performed in African schools ( Wambugu and Changeiywo, 2008 ; Mitee and Obaitan, 2015 ). In the papers, the authors describe similar characteristics of mastery learning in their theoretical framework, such as specifying learning goals, breaking down the curriculum into small units, formative assessment, using corrective instruction for students who did not reach mastery, and retesting. This process continues until virtually all the students master the taught material ( Mitee and Obaitan, 2015 ), which emphasizes its aim of convergent differentiation. Mitee and Obaitan report a large effect of the mastery learning approach of d = +1.461 based on an experiment in which about 400 students from four schools were allocated to a mastery learning or a control condition (see Table 2 ). Wambugu and Changeiywo randomly divided four classes from four schools over the mastery learning and the control condition. Comparing the results on the physics achievement test of the two experimental classes and two control classes, they found a large effect of mastery learning ( d = +1.322 based on the findings of an ANOVA, see Table 2 ). However, do note that pretests were only available for two out of four classes (one control and one experimental).

Unfortunately, the information on the mastery learning approach in the lessons is rather limited in both papers. Therefore, it is difficult to judge how such large achievement gains can be reached by implementing mastery learning in secondary education. Nevertheless, we can extract a number of recommendations. First, both studies use corrective instruction for helping students gain mastery. Second, in both studies the authors refer to some type of collaborative learning in the corrective instruction phase. Lastly, Wambugu and Changeiywo note that the time needed to develop the learning objectives, formative tests, and corrective activities is considerable, so teachers may want to work together in teacher teams to achieve these goals. More high-quality research is needed to replicate these findings and to gain insight into how teachers can apply this approach in practice.
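The cycle of formative assessment, corrective instruction, and retesting that characterizes mastery learning can be pictured as a loop that repeats until virtually all students reach the mastery criterion. The sketch below is a toy model only: the mastery cut-off, the target share, and especially the fixed score gain per corrective round are hypothetical assumptions and do not come from the reviewed studies.

```python
def mastery_cycle(scores, mastery_cutoff=80, target_share=0.95,
                  corrective_gain=10, max_rounds=10):
    """Toy model of one unit taught with mastery learning.

    scores: dict mapping a student id to a formative test score (0-100).
    After each formative test, only students below the cut-off receive
    corrective instruction and are retested. The retest result is
    simulated by adding a fixed gain, which is a strong simplification.
    Returns (final scores, number of corrective rounds).
    """
    scores = dict(scores)
    rounds = 0
    while rounds < max_rounds:
        non_masters = [s for s, score in scores.items() if score < mastery_cutoff]
        share_mastered = 1 - len(non_masters) / len(scores)
        if share_mastered >= target_share:
            break  # virtually all students master the unit: move on
        for student in non_masters:
            scores[student] = min(100, scores[student] + corrective_gain)
        rounds += 1
    return scores, rounds


# Illustrative class with made-up scores.
final, rounds = mastery_cycle({"a": 90, "b": 50, "c": 85, "d": 70})
print(final, rounds)
```

The loop makes the convergent aim visible: students who already master the unit are left untouched, while extra instructional effort goes to those below the criterion.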

Studies on Individualized Differentiated Instruction

The large-scale quasi-experimental study on differentiated reading instruction in middle schools by Little et al. (2014) used individualized adaptations to address students' learning needs. They used a program called the Schoolwide Enrichment Model-Reading Framework (SEM-R) to support students' reading adaptively. The SEM-R approach consists of three phases: (1) short read-alouds by the teacher (“Book Hooks”) and brief discussions about books, (2) students read independently in self-selected, challenging books while the teacher organizes individualized 5- to 7-min conferences with each student once every 1 to 2 weeks, (3) interest-based and more project-oriented activities. Professional development of teachers included workshops as well as classroom support from project staff. The focus of the intervention was on phases 1 and 2. Teachers were expected to implement SEM-R on a daily basis for about 40 to 45 min per day or 3 h per week. In a cluster-randomized design executed in four middle schools with 2,150 students, the effectiveness of the approach was compared to that of traditional teaching. The effects of the approach varied considerably across the different schools. The authors reported that, for the reading fluency outcome, SEM-R students significantly outperformed their control counterparts in two out of four schools. The standardized mean differences ranged from about −0.1 to +0.3 between the schools (see Table 2 ). The authors conclude that the intervention was at least as effective as traditional instruction. However, the wide range of implementations and effects on student outcomes between classes and schools illustrates the difficulty of implementing intensive forms of individualization in practice.

In the survey study of Smit and Humpert (2012) , the authors assessed which teaching practices teachers used to differentiate their teaching. In this sub-study of the project “Schools in Alpine Regions,” teachers from 8 primary schools and 14 secondary schools in the rural Alpine region of Switzerland participated. Teachers responded to a teacher questionnaire about differentiated instruction. They mainly reported making adaptations at the individual level by, for instance, providing students with individual tasks (tiered assignments), adapting the number of tasks, or providing more time to work on tasks. Teachers often used “learning plans” as well as tasks in which students could take individual learning trajectories varying the content or learning rate. Flexible grouping was less common and alternative assessments were very rare. Peer tutoring occurred frequently, and tiered assignments were very common. On average, 38% of teachers' weekly lessons were differentiated. The authors conclude that teachers in their sample, on average, did not execute very elaborate differentiated instruction. Moreover, no significant relation between differentiated instruction and student achievement was found for either a standardized language test ( d = −0.092) or a standardized mathematics test ( d = −0.085, see Table 2 ). Following the survey study, an intervention study was executed with 10 of the schools that were included in the survey study. In this study (which was not included in our selection since it was not published in an academic journal), teachers participated in workshops and team meetings and logged their learning experiences in portfolios. Teachers barely progressed in their differentiated instruction during the 2.5-year project ( Smit et al., 2011 ). 
Nevertheless, a high pedagogical team culture in schools was found to have a positive influence on teachers' differentiated instruction ( Smit et al., 2011 ; Smit and Humpert, 2012 ), and as such may be one of the keys to achieving improvement.

Overall, it seems that it is rather difficult to boost the achievement of the whole class by means of individualized approaches. However, as Little et al. (2014) suggest, individualization may be used as an approach to increase students' engagement with the learning content. A drawback of the approach may be that the requirements for organizing and monitoring learning activities by the teacher in individualized approaches could leave less time for high quality pedagogical interaction. Possibly, future research on individualization supported by digital technology may open up more possibilities for this approach to have high impact on student achievement ( Education Endowment Foundation, n.d. ).

Studies on Differentiated Instruction Using Heterogeneous Clustering

One of the included studies used differentiated instruction within mixed-ability learning settings. In the study by Mastropieri et al. (2006) , grade eight students worked on science assignments in groups of two or three. Peer-mediated differentiated instruction and tiering were used to adapt the content to students' learning needs within the groups. The authors developed three tiers of each assignment varying in complexity. Within the peer groups, students could work on activities at their own appropriate level and continue to the next level once proficiency was obtained. All lower ability level students, including students with learning disabilities, were required to begin with the lowest tier. In the experiment, 13 classes with a total of 216 students were assigned to the peer-mediated differentiated content condition or a teacher-led control condition. The researchers divided the classes in such a way that each teacher taught at least one experimental and one control classroom. After about 12 weeks, a small positive effect was found in favor of the peer-mediated condition with tiered content on both the unit test and the high stakes end of year test (respectively d = +0.466 and d = +0.306, see Table 2 ). The overall effect of d = +0.386 is comparable to that of the tiering intervention of Richards and Omdal (2007) discussed earlier. The effect is slightly higher, but this may also partly be affected by the use of adjusted means. In any case, more research is needed to disentangle the effects of the peer-learning and the differentiated content.
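The overall effect for this study is consistent with the rule from the Data Analysis section that the mean is used when a paper reports multiple outcome measures. The short check below assumes a simple unweighted mean of the two reported effects; the variable names are illustrative.

```python
from statistics import mean

# Effects reported for Mastropieri et al. (2006) in the text above.
unit_test_d = 0.466
end_of_year_test_d = 0.306

# Assumption: a simple unweighted mean across the two outcome measures.
overall_d = round(mean([unit_test_d, end_of_year_test_d]), 3)
print(overall_d)  # 0.386
```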

Studies on Differentiated Instruction in Flipped Classrooms

In flipped classroom instruction, content dissemination (lecture) is moved outside of the classroom, typically by letting students watch instructional videos before the lesson. This opens up more time for active learning inside the classroom (Leo and Puzio, 2016). This format implies differentiation of learning time and pace before the lesson, since students may rewind, pause, or watch the videos multiple times according to their learning needs. However, whether the activities during the lesson encompass our operationalization of differentiated instruction (see Table 1) varies. From a recent review on flipping the classroom (Akçayir and Akçayir, 2018), we found one study in secondary education in which remediation in the classroom was mentioned as being part of the intervention. Bhagat et al. (2016) report on a quasi-experiment in which 41 high school students were assigned to a classroom using flipping-the-classroom and 41 students were in the control condition. The experimental group underwent “flipped” lessons on trigonometry for 6 weeks, while the control group followed similar lessons using the conventional learning method. Students in the flipped condition watched videos of 15–20 min before the lesson. During the lesson, students discussed problems collaboratively and, in the meantime, students who needed remediation were provided with extra instruction. After the intervention, students from the flipped classrooms outperformed their counterparts on a mathematics test and were more motivated. The authors report a large effect of the intervention on students' mathematics achievement based on analysis of covariance. However, the combined effect across the subgroup mean differences is modest (d = 0.376, see Table 2). On average, experimental students of all abilities performed better, except for high achievers who did not significantly outperform the control group.
These differential effects should be interpreted with caution because of the limited number of students in the subgroups. A strength of this study is that it gives some insight into the benefits of differentiated instruction embedded in an innovative approach to teaching. Yet, the authors did not specify clearly what the remediation and collaborative learning in the classroom consisted of, and the effects of the different elements of the intervention cannot be disentangled. More research would be needed to clarify the role and effectiveness of differentiated instruction in flipped settings.

Contextual and Personal Variables

As we discussed in the theoretical framework, many variables may influence teachers' implementation of differentiated instruction. We hoped to find evidence for this assumption in our selection of papers. However, in general, little information was provided about contextual and personal factors such as school, class, or teacher characteristics.

In our sample of studies, differentiated instruction was mostly applied to teaching mathematics and science. Additionally, there were also papers on literacy and social sciences. No clear differences in effectiveness could be observed between the subjects. Students varied in background characteristics across the studies. In the study by Little et al. (2014) , for instance, about 48 to 77 percent of students were from low SES. In the study by Mastropieri et al. (2006) , many ethnicities were represented. In the studies by Huber et al. (2009) , students were mostly European-American. Student ages varied from about 11 to 17 years old (see Table 2 ). Teacher characteristics were rarely reported. In the study by Mastropieri et al. (2006) , relatively inexperienced teachers participated with a mean of about 3 years in their current position, and in the studies by Vogt and Rogalla (2009) and Smit and Humpert (2012) , years of teaching experience varied considerably, with an average of about 15 to 17 years.

The only variable that is rather consistent across the studies is that teachers in the included studies relied considerably on external sources of information or support to help them implement differentiated instruction within their classrooms. In most of the selected studies, the research team developed materials for students, and teachers were instructed or coached in implementing the interventions (see Table 2 ). Although we aimed to select practical interventions, little information is provided about whether teachers were able to successfully execute the differentiated instruction practices independently in the long run.

Overall Effects of Differentiated Instruction

Ideally, combining our narrative reflection on the included papers with a meta-analysis of the findings would give us an answer as to how effective within-class differentiated instruction in secondary education may be. Unfortunately, however, the number of papers that remained after applying our selection criteria is limited and the studies are heterogeneous in nature, so meta-analyses of results should be interpreted with caution. Nevertheless, to inform readers, we did add a forest plot with an overview of the average effect size of each individual study to the appendix (see Appendix D). In Table 2 the effects and intermediate calculations for individual studies are described. A summary effect across all studies is also reported (d = +0.741; 95% CI = 0.397–1.085; Q = 507.701; df = 14; p < 0.01). The p-value of the Q statistic was significant, which may indicate heterogeneity of the papers, meaning that the true effects of the interventions may vary. Noticeably, the largest studies in our sample show small positive effects of differentiated instruction. In contrast, the relatively small studies reported large effects, and the other studies mostly show moderate effects of the approach. A cumulative analysis (see Appendix D) illustrates that the small study by Altintas and Özdemir (2015a,b) considerably shifts the point estimate of the effect size in the positive direction. Excluding this outlier, the summary effect of differentiated instruction is d = +0.509 (95% CI = 0.215–0.803; see Appendix D). A funnel plot was made to check for publication bias (see Appendix E). Using Duval and Tweedie's Trim and Fill method (Duval and Tweedie, 2000), no adjusted values were estimated. This indicates that there is no evidence of publication bias. These analyses give some information about the range of effects that can be achieved with differentiated instruction interventions.
However, unquestionably, more information is needed before drawing a more definitive conclusion about the overall and relative effects of different approaches to differentiated instruction in secondary schools.
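
For readers less familiar with the quantities reported above, the following sketch shows how a random-effects summary effect, its 95% confidence interval, and Cochran's Q can be computed from study-level effect sizes. It assumes the DerSimonian-Laird estimator; this section does not state which model or software was actually used, and the function name and the input values are hypothetical illustrations, not the data from Table 2.

```python
import math


def random_effects_summary(effects, variances):
    """Summarize study effects with a DerSimonian-Laird random-effects model.

    effects   -- one effect size (e.g., Cohen's d) per study
    variances -- the sampling variance of each effect size
    Note: the choice of estimator is an assumption made for illustration.
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    df = k - 1
    # Between-study variance tau^2, truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance
    w_re = [1.0 / (v + tau2) for v in variances]
    summary = sum(wi * d for wi, d in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "summary": summary,
        "ci_low": summary - 1.96 * se,
        "ci_high": summary + 1.96 * se,
        "q": q,
        "df": df,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical effect sizes and variances for three studies
    result = random_effects_summary([0.15, 0.40, 1.20], [0.01, 0.04, 0.09])
    for name, value in result.items():
        print(name, round(value, 3))
```

A Q value that is large relative to its degrees of freedom, as reported above, yields a positive between-study variance, which widens the confidence interval and gives small studies relatively more weight than a fixed-effect model would.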

Suggestions for Reporting on Differentiated Instruction Interventions

One of the issues we encountered when performing this review was that interventions and research methodologies were often described rather briefly. In addition, relevant context information was frequently missing. This is problematic, not only from a scientific point of view, but also for judging the transferability of the findings to practice. Therefore, we encourage researchers to diligently report on the methods and analytical techniques they used and to be specific about the outcomes that led to their conclusions (see e.g., Hancock and Mueller, 2010). Beyond this general suggestion, we would like to provide a number of specific recommendations for reporting on differentiated instruction interventions (see Appendix F).

Conclusion and Discussion

The most important conclusion from our systematic review of the literature is that there are too few high-quality studies on the effectiveness of differentiated instruction in secondary education. Only 12 studies from 14 papers were selected after applying strict selection criteria to a large amount of literature on the topic. As expected, we found papers on various operationalizations of differentiated instruction such as homogeneous grouping, differentiated instruction in peer-learning, and individualization. However, even within the most well-known approaches such as ability grouping, the empirical evidence was limited. High-quality teacher-led differentiated instruction studies in secondary education are scarce, although the literature on ICT-applications for differentiated instruction seems to be on the rise. This paucity has not changed much since our search, although there are some interesting recent endeavors for teacher professionalization in differentiated instruction (Brink and Bartz, 2017; Schipper et al., 2017, 2018; Valiandes and Neophytou, 2018) and there have been some recent small-scale studies including aspects of differentiated instruction (Sezer, 2017; Adeniji et al., 2018). The paucity is remarkable given the great interest in the topic of differentiated instruction in the literature as well as in policy and practice. Apparently, the premises of differentiated instruction seem substantial enough for schools and policy makers to move towards implementation before a solid research base has been established. On the one hand, this seems defensible; differentiated instruction matches the ambitions of educationists to be more student-oriented and to improve equity among students.
In addition, there is prior research showing benefits of approaches like ability grouping and mastery learning for K-12 students' achievement (Guskey and Pigott, 1988; Kulik et al., 1990; Kulik, 1992; Lou et al., 1996; Hattie, 2009; Steenbergen-Hu et al., 2016). Furthermore, the ideas behind differentiated instruction are in line with approaches which have repeatedly been linked to better learning, such as having students work on an appropriate level of moderate challenge according to their “zone of proximal development” and matching learning tasks to students' abilities and interests to create “flow” (Tomlinson et al., 2003). On the other hand, more research on different operationalizations of differentiated instruction is needed to help teachers and policy makers determine which approaches are helpful for students with different characteristics and to gain insight into how these could be implemented successfully. From prior research in primary education, we know that it is likely that not all approaches have comparable effects, and that effects for low-, average-, and high-ability students may vary (Deunk et al., 2018). Our current review shows that there is much work to be done in order to further clarify which approaches work and why within the context of secondary education.

Having said that, the studies that we did find do give us some directions about the expectations we may have about the effectiveness of differentiated instruction in secondary education. Most well-designed studies in our sample reported small to medium-sized positive effects of differentiated instruction on student achievement. This finding is comparable to the moderate effects found in most differentiated instruction reviews (e.g., Kulik, 1992 ; Lou et al., 1996 ; Steenbergen-Hu et al., 2016 ) and other studies on educational interventions ( Sipe and Curlette, 1996 ). The overall effect in our study is a bit higher than in prior reviews, possibly due to the inclusion of various approaches to differentiated instruction, including mastery learning and more holistic approaches. Although we cannot give a conclusive answer about the effectiveness of differentiated instruction in secondary education, most of the included studies do illustrate the possibility of improving student achievement by means of differentiated instruction.

Moreover, the selected papers give insight into the many different ways that differentiated instruction can be operationalized and studied in secondary education. For instance, a number of studies used generic training of teachers in principles of differentiated instruction. Based on the findings, we would suggest that more research is needed to study how teachers can adequately be guided to implement such holistic approaches into their daily teaching (compare practicality theory by Janssen et al., 2015). Alternatively, in four of the selected studies homogeneous clustering by means of tiering and ability grouping was used as a structure for differentiated instruction. For the subgroups, learning content was adapted to better fit the needs of the students (Richards and Omdal, 2007; Altintas and Özdemir, 2015a,b; Bal, 2016; Bikić et al., 2016). Medium to large positive effects of such an approach were reported, indicating that this may be one of the ways teachers may approach differentiated instruction. This finding is comparable to findings on ability grouping in the meta-analyses by Steenbergen-Hu et al. (2016) and Lou et al. (1996). The effects were somewhat larger compared to those in the studies in primary education discussed by Deunk et al. (2018) and Slavin (1990a). One possible explanation might be that some of the studies mentioned in those previous reviews may have included grouping without any instructional adaptations, which was excluded from the current review. Also, in our selected papers on homogeneous clustering, researcher-developed outcome measures were used. Researcher-developed measures have previously been associated with larger effects than standardized measures (Slavin, 1987; Lou et al., 1996). Turning to another approach, two studies were reviewed on the effectiveness of mastery learning. The authors reported large effects of mastery learning on student achievement.
However, since the research methods were not thoroughly described in the papers, we cannot say much about the quality of the intervention or its implementation. Two other studies focused on individualization. Overall, small and non-significant effects of this approach were found. It could be that teachers grapple with the organizational requirements of individualized instruction (Education Endowment Foundation, n.d.). Additionally, a study was found that successfully embedded differentiated instruction in a peer-learning setting by means of tiered content matching students' learning needs (Mastropieri et al., 2006). Lastly, one of the studies embedded remediation and collaboration in a flipped-classroom format, illustrating how differentiated instruction can be applied within different approaches to teaching (Bhagat et al., 2016).

Unfortunately, authors reported on differential effects for subgroups of students within classes in only three studies. This makes it difficult to judge which differentiated instruction approach is most suitable for whom. In the studies that did report effects for subgroups (Richards and Omdal, 2007; Bhagat et al., 2016; Bikić et al., 2016), the interventions were shown to be most beneficial for low-achieving (and, in the case of Bikić et al., also average-achieving) subgroups of students, even though the learning content was adapted to better match the needs of other students too. However, it remains unclear whether this was caused by the differentiated instruction, by the fact that the teachers directed more attention toward low performing students, or by the fact that the outcome measures did not match the adapted content. In addition, the subgroups were relatively small, limiting the statistical power of the analyses. Therefore, more empirical evidence is needed about the implementation and relative effects of differentiated instruction to further inform the “differentiation-dilemma” of how to best divide time over students with different needs (Denessen, 2017).

Regarding the contextual and personal variables across studies, students' age, the school subjects, and the teaching experience of teachers varied. The fact that positive results have been replicated in several settings with different populations gives a first indication that the approach may be transferable across different contexts (Petticrew and Roberts, 2006). One consistent finding across the studies is that teachers relied on external support to implement within-class differentiated instruction during the interventions. This is to be expected, since prior reviews found that implementing differentiated instruction is quite complex for teachers and that they may need considerable guidance to get it right (Tomlinson et al., 2003; Subban, 2006; Van Casteren et al., 2017). Previous studies show that teachers receiving more professional development in differentiated instruction perceive higher efficacy and adapt their teaching to students more often (Dixon et al., 2014; Suprayogi et al., 2017).

The contribution of the current review to existing knowledge of the effects of differentiated instruction on students' achievement in secondary education is as follows: First, it provides an overview of theoretical concepts and operationalizations of differentiated instruction in the classroom. Next, it shows that a systematic review of the literature leads to a limited body of evidence regarding the effectiveness of within-class differentiated instruction in secondary education. This overview of the state of the art within this theme may inform further research initiatives. Additionally, the study addresses some contextual and personal factors that may affect teachers' differentiated instruction.

Limitations

The most salient drawback of the review is the limited number of studies that were included. On the one hand, it is unfortunate that the limited number of selected papers makes it difficult to come to definitive conclusions about the effectiveness of within-class differentiated instruction. On the other hand, the importance of using systematic reviews to identify research gaps to inform further development of the field should not be underestimated ( Petticrew and Roberts, 2006 ). Defining consistent criteria for the selection of the best evidence available—as we have done in this study—may limit the number of selected studies but does help to ensure that the studies that are selected are highly informative ( Slavin, 1995 ). The limited number of studies we found is just about comparable to the number of within-class approaches that were selected in a recent review of between-class and within-class differentiated instruction in primary education ( Deunk et al., 2018 ). We only included studies in which student achievement was reported as an outcome measure. In future research, adding other types of outcomes and other types of study designs could add to the breadth of the research base.

Another limitation has to do with the quality of the selected papers and consequently with our approach to the analyses. First, the fact that we did not locate any truly randomized designs necessitates caution in interpreting the findings. Potential biases are likely to be greater for non-randomized studies compared to randomized trials (Higgins and Green, 2011). Second, the number of participants at the level of randomization (often the classroom level) was mostly low. Furthermore, it was sometimes difficult to determine the quality of the studies due to a lack of information in the papers. We tried to gain insight into the differentiated instruction interventions, but often essential information was omitted. Also, the conversion to Cohen's d could not always be done using an identical approach across the different studies. Most studies reported pre- and/or post-scores on achievement tests that we could use to calculate the effects in a rather straightforward manner, but in a few cases we had to estimate effects based on other types of information (for instance adjusted means or analyses of variance), which may complicate comparability across studies. Another drawback is that authors sometimes provided the outcomes of subgroups (for instance classes or ability groups within classes), sometimes only outcomes of the experimental conditions, or sometimes both. In the case of differentiated teaching, researchers should clearly explain their aims regarding which students they want to support (convergent or divergent). And if the aims differ per subgroup, they should ideally report these separate effects too. To inform future research on the topic, we have suggested some reporting guidelines in the Appendix that may help to clarify the content of future approaches to differentiated instruction and how they were studied.
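
To illustrate why the route taken to Cohen's d matters for comparability, the sketch below contrasts two common conversions: one from group means and standard deviations, and one from an F-value for a comparison of two groups. These are textbook formulas shown as an assumption about the kind of conversion meant above; the exact formulas applied to each study are not given in this section, and the function names and example numbers are hypothetical.

```python
import math


def cohens_d_from_means(mean_exp, mean_ctrl, sd_exp, sd_ctrl, n_exp, n_ctrl):
    """Cohen's d as the mean difference divided by the pooled standard deviation."""
    pooled_var = (
        (n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    ) / (n_exp + n_ctrl - 2)
    return (mean_exp - mean_ctrl) / math.sqrt(pooled_var)


def cohens_d_from_f(f_value, n_exp, n_ctrl):
    """Absolute Cohen's d from an F-value comparing exactly two groups.

    Relies on F = t^2 for a two-group one-way analysis of variance; the sign
    has to be taken from the direction of the group means. F-values from
    covariate-adjusted analyses generally need a further correction that is
    not shown here.
    """
    return math.sqrt(f_value * (n_exp + n_ctrl) / (n_exp * n_ctrl))


if __name__ == "__main__":
    # Hypothetical post-test scores for two groups of 41 students each
    print(cohens_d_from_means(75.0, 70.0, 10.0, 10.0, 41, 41))
    # Hypothetical F-value for two groups of 50 students each
    print(cohens_d_from_f(4.0, 50, 50))
```

Because the second route recovers only the magnitude of the effect and depends on how the F-value was obtained, effects estimated this way are less directly comparable to those computed from raw means.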

A final limitation, inherent to a topic that is so multifaceted, is that the choices we have made in how we defined within-class differentiated instruction have influenced our selection of the literature and, thus, should be considered when interpreting the findings. The existing literature is marked by different ways of defining and operationalizing differentiated instruction (Suprayogi et al., 2017; Deunk et al., 2018). As such, our review may differ from the operationalizations of other authors. In addition, other ways to adapt teaching to students' learning needs are certainly also interesting for teachers who want to better align teaching to students' needs. For example, the use of scaffolding techniques, in which instruction is broken up into chunks and instruction in each chunk is provided contingent on students' level of understanding, is a promising instructional technique (Van de Pol et al., 2010, 2015). In addition, formative assessment is a helpful starting point for differentiated instruction or other types of adaptive teaching (Kingston and Nash, 2011). Furthermore, as discussed in the theoretical framework, differentiated instruction is a broad construct that adds up as a sum of its parts including lesson planning, differentiated instruction, evaluation and general high-quality teaching behaviors. We could not include all these factors in the working definition used to select and synthesize the studies. Therefore, readers should keep in mind that in order to understand differentiated instruction comprehensively and apply it in practice, there is more to it than just executing a differentiated lesson. A thoughtful approach using different steps starting from planning to evaluation including high quality teaching behaviors is key.

Recommendations for Research and Practice

We would like to urge researchers to further study the impact and implementation of differentiated instruction. First, reviews and meta-analyses combining quantitative and qualitative information on the effects of different approaches to differentiated instruction for different outcomes may add further to the current knowledge base (Dixon-Woods et al., 2005). When more quantitative studies are located, this opens up more statistical possibilities for gaining insight into differential effects and predictive characteristics of different student outcomes (Lou et al., 1996; Moeyaert et al., 2016; Deunk et al., 2018). Qualitative studies, in turn, may help us understand how teachers differentiate and how their subjective experiences in the classroom influence their differentiated instruction (Civitillo et al., 2016). In addition, authors may want to add studies on affective student outcomes as well. For example, students may have better attitudes and motivation in differentiated classes in which teaching better matches their learning needs (Kulik and Kulik, 1982; Lou et al., 1996; Maulana et al., 2017; Van Casteren et al., 2017).

Second, future studies on the development and evaluation of differentiated instruction interventions could add to the knowledge base about how to reach differentiated instruction's potential in practice. In order to support teachers, specific coaching on the job by experienced peers or external coaches or other types of professionalization may help to develop awareness and implementation of differentiated instruction (Latz et al., 2009; Smit and Humpert, 2012; Parsons et al., 2018; Valiandes and Neophytou, 2018). Teachers should learn to reflect upon the decisions they make when adapting their teaching (Parsons et al., 2018). Moreover, teachers need team support and sufficient time to develop their differentiated instruction (Stollman, 2018). Research shows that teachers themselves are quite enthusiastic about bottom-up professionalization approaches like peer-coaching or professional learning communities (Van Casteren et al., 2017). Whatever approach one chooses, there are some characteristics which may facilitate the effectiveness of professionalization, including a focus on both content and pedagogical knowledge, sufficient duration of the intervention, initial training and follow-up sessions, facilitation of collaboration and communication with colleagues and experts, constant on-site support and help during the implementation, and the development of teachers' personal skills for reflection and self-evaluation (Valiandes and Neophytou, 2018). In addition, teacher educators themselves should be mindful of teacher differences too by providing differentiated professionalization (Stollman, 2018). In this review, we did not include studies on the effectiveness of adaptive ICT applications on students' progress. However, ICT can play a significant role in the creation of student-centered learning environments when used as more than a simple add-on to regular teaching (Smeets and Mooij, 2001; Deunk et al., 2018).
Some recent studies on adaptive or personalized ICT programs, digital pen technologies, and blended learning show that such interventions can support differentiated instruction and have positive effects on student achievement ( Walkington, 2013 ; Chen et al., 2016 ; Van Halem et al., 2017 ; Ghysels and Haelermans, 2018 ), although more research is needed to assess for whom and for which type of outcomes these approaches are beneficial ( Van Klaveren et al., 2017 ). In the studies in this review, fixed outcome measures were used to assess students' learning. Possibly, adaptive testing will provide more room for assessing differentiated growth trajectories in future studies ( Martin and Lazendic, 2018 ).

Lastly, when aiming to gain further insight into the effectiveness of differentiated instruction, authors may want to reflect on how differentiated instruction is operationalized and measured. In prior research, teacher questionnaires were often used to assess teachers' differentiated instruction practices (Roy et al., 2013; Prast et al., 2015). In addition, classroom observations of differentiated instruction or adaptive teaching behavior have been used (Cassady et al., 2004; Van Tassel-Baska et al., 2006; Van de Grift, 2007). Alternatively, in our selection of papers, we found some interesting ways to determine how teachers differentiate, for example, using vignette or video tests (Vogt and Rogalla, 2009; Bruhwiler and Blatchford, 2011) or teacher logs or observations (Little et al., 2014). Enriching measures of teacher behavior with information about the match of the behavior with students' needs may be another step forward (Van Geel et al., 2019). We would like to encourage authors to further develop, evaluate, and apply measures for differentiated instruction that can be used to gain insight into how differentiated instruction is linked to various student outcomes.

Data Availability Statement

The datasets generated for this study are available on request to the corresponding author.

Author Contributions

AS-J set up the methods of the paper, analyzed the theoretical backgrounds and is responsible for the concept of the article, and together with co-authors, extracted data, performed the analyses, and wrote the paper. AM coordinated the selection of studies, worked on data selection and extraction, and contributed to writing the paper. MH-L and RM designed the overarching project, acquired funding for the execution, and contributed to the conceptualization of differentiated instruction and the review process.

Funding

This work was supported by the Dutch scientific funding agency (NRO) under Grant number 405-15-732.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We want to thank Bernie Helms for his contribution to the practical work needed to execute this study. Additionally, we greatly value the consultations regarding the analyses with our colleagues Dr. Hester de Boer and Prof. Dr. Roel Bosker from GION Educational Sciences.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.02366/full#supplementary-material

Footnotes

1. We did not include search terms specifically referring to heterogeneous approaches in the search string. Although heterogeneous grouping may include differentiation, adaptiveness is often not the focus of these studies.

2. Quasi-experimental studies in which experimental and control groups are well matched, and covariates that correlate strongly with pretests are used to adjust outcomes, can be a valuable source of information usable for meta-analyses (Slavin et al., 2008; Slavin and Smith, 2009), although the results of (especially small-scale) quasi-experimental studies should be evaluated with caution (Cheung and Slavin, 2016).

3. References included in the systematic review are marked with an asterisk.

References

Adeniji, S. M., Ameen, S. K., Dambatta, B. U., and Orilonise, R. (2018). Effect of mastery learning approach on senior school students' academic performance and retention in circle geometry. Int. J. Instruct. 11, 951–962. doi: 10.12973/iji.2018.11460a

Akçayir, G., and Akçayir, M. (2018). The flipped classroom: a review of its advantages and challenges. Comput. Educ. 126, 334–345. doi: 10.1016/j.compedu.2018.07.021

Altemueller, L., and Lindquist, C. (2017). Flipped classroom instruction for inclusive learning. Br. J. Spec. Educ. 44, 341–358. doi: 10.1111/1467-8578.12177

* Altintas, E., and Özdemir, A. S. (2015a). The effect of the developed differentiation approach on the achievements of the students. Eurasian J. Educ. Res. 61, 199–216. doi: 10.14689/ejer.2015.61.11

* Altintas, E., and Özdemir, A. S. (2015b). Evaluating a newly developed differentiation approach in terms of student achievement and teachers' opinions. Educ. Sci. Theor. Pract. 15, 1103–1118. doi: 10.12738/estp.2015.4.2540

* Bal, A. P. (2016). The effect of the differentiated teaching approach in the algebraic learning field on students' academic achievements. Eurasian J. Educ. Res. 63, 185–204. doi: 10.14689/ejer.2016.63.11

Best Evidence Encyclopedia (n.d.). Review Methods. Criteria for Inclusion in the Best Evidence Encyclopedia . Available online at: http://www.bestevidence.org/methods/criteria.htm

* Bhagat, K. K., Chang, C., and Chang, C. (2016). The impact of the flipped classroom on mathematics concept learning in high school. J. Educ. Technol. Soc. 19, 134–142. Available online at: https://psycnet.apa.org/record/2016-35586-003

* Bikić, N., Maričić, S. M., and Pikula, M. (2016). The effects of differentiation of content in problem-solving in learning geometry in secondary school. EURASIA J. Math. Sci. Technol. Educ. 12, 2783–2795. doi: 10.12973/eurasia.2016.02304a

Blatchford, P., Bassett, P., and Brown, P. (2011). Examining the effect of class size on classroom engagement and teacher-pupil interaction: differences in relation to pupil prior attainment and primary vs. secondary schools. Learn. Instruct. 21, 715–730. doi: 10.1016/j.learninstruc.2011.04.001

Borenstein, M., Hedges, L. V., Higgins, J. P. T., and Rothstein, H. R. (2009). Introduction to Meta-Analysis . Chichester: John Wiley and Sons. doi: 10.1002/9780470743386

Borenstein, M., Hedges, L. V., Higgins, J. P. T., and Rothstein, H. R. (2010). A basic introduction to fixed-effect and random-effects models for meta-analysis. Res. Synth. Methods 1, 97–111. doi: 10.1002/jrsm.12

Bosker, R. J. (2005). De Grenzen van Gedifferentiëerd Onderwijs. Groningen: Rijksuniversiteit Groningen. Available online at: http://www.rug.nl/research/portal/files/14812458/bosker.pdf

Bray, B., and McClaskey, K. (2013). Personalization vs. Differentiation vs. Individualization. (No. version 3). Available online at: http://www.personalizelearning.com/2012/04/explaining-chart.html

Brink, M., and Bartz, D. E. (2017). Effective use of formative assessment by high school teachers. Pract. Assess. Res. Eval. 22, 1–10. Available online at: https://pareonline.net/getvn.asp?v=22&n=8

* Bruhwiler, C., and Blatchford, P. (2011). Effects of class size and adaptive teaching competency on classroom processes and academic outcome. Learn. Instruct. 21, 95–108. doi: 10.1016/j.learninstruc.2009.11.004

Cassady, J. C., Neumeister, K. L. S., Adams, C. M., Cross, T. L., Dixon, F. A., and Pierce, R. L. (2004). The differentiated classroom observation scale. Roeper Rev. 26, 139–146. doi: 10.1080/02783190409554259

Cavanagh, S. (2014). What is personalised learning? Educators seek clarity. Education Week. Available online at: https://www.edweek.org/ew/articles/2014/10/22/09pl-overview.h34.html

Chen, C., Tan, C., and Lo, B. (2016). Facilitating English-language learners' oral reading fluency with digital pen technology. Interact. Learn. Environ. 24, 96–118. doi: 10.1080/10494820.2013.817442

Cheung, A. C. K., and Slavin, R. E. (2016). How methodological features affect effect sizes in education. Educ. Res. 45, 283–292. doi: 10.3102/0013189X16656615

Cheung, A. C. K., Slavin, R. E., Kim, E., and Lake, C. (2017). Effective secondary science programs: a best-evidence synthesis. J. Res. Sci. Teach. 54, 58–81. doi: 10.1002/tea.21338

Civitillo, S., Denessen, E., and Molenaar, I. (2016). How to see the classroom through the eyes of a teacher: consistency between perceptions on diversity and differentiation practices. J. Res. Spec. Educ. Needs 16, 587–591. doi: 10.1111/1471-3802.12190

Clarke, D., and Hollingsworth, H. (2002). Elaborating a model of teacher professional growth. Teach. Teach. Educ. 18, 947–967. doi: 10.1016/S0742-051X(02)00053-7

Cole, R., Haimson, J., Perez-Johnson, I., and May, H. (2011). Variability in Pretest-Posttest Correlation Coefficients by Student Achievement Level. (NCEE Reference Report 2011-4033). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Corno, L. (2008). On teaching adaptively. Educ. Psychol. 43, 161–173. doi: 10.1080/00461520802178466

Coubergs, C., Struyven, K., Engels, N., Cools, W., and De Martelaer, K. (2013). Binnenklas-Differentiatie. Leerkansen Voor Alle Leerlingen. Leuven: Uitgeverij Acco.

De Jager, T. (2013). Guidelines to assist the implementation of differentiated learning activities in South African secondary schools. Int. J. Inclus. Educ. 17, 80–94. doi: 10.1080/13603116.2011.580465

De Neve, D., and Devos, G. (2016). The role of environmental factors in beginning teachers' professional learning related to differentiated instruction. Sch. Effect. Sch. Improv. 27, 557–579. doi: 10.1080/09243453.2015.1122637

Denessen, E. J. P. G. (2017). Verantwoord Omgaan met Verschillen: Social-Culturele Achtergronden en Differentiatie in Het Onderwijs. [Soundly Dealing with Differences: Socialcultural Background and Differentiation in Education]. (Inaugural lecture). Leiden: Leiden University. Available online at: https://openaccess.leidenuniv.nl/handle/1887/51574

Denessen, E. J. P. G., and Douglas, A. S. (2015). “Teacher expectations and within-classroom differentiation,” in Routledge International Handbook of Social Psychology of the Classroom, eds C. M. Rubie-Davies, J. M. Stephens, and P. Watson (London: Routledge; Taylor and Francis Group), 296–303.

Deunk, M. I., Smale-Jacobse, A. E., de Boer, H., Doolaard, S., and Bosker, R. J. (2018). Effective differentiation practices: a systematic review and meta-analysis of studies on the cognitive effects of differentiation practices in primary education. Educ. Res. Rev. 24, 31–54. doi: 10.1016/j.edurev.2018.02.002

Dixon, F. A., Yssel, N., McConnell, J. M., and Hardin, T. (2014). Differentiated instruction, professional development, and teacher efficacy. J. Educ. Gifted 37, 111–127. doi: 10.1177/0162353214529042

Dixon-Woods, M., Agarwal, S., Jones, D., Young, B., and Sutton, A. (2005). Synthesising qualitative and quantitative evidence: a review of possible methods. J. Health Serv. Res. Policy 10, 45–53. doi: 10.1177/135581960501000110

Duval, S., and Tweedie, R. (2000). Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56, 455–463. doi: 10.1111/j.0006-341X.2000.00455.x

Education Endowment Foundation (n.d.). Teaching and Learning Toolkit: An Accessible Summary of the International Evidence on Teaching 5-16 Year-Olds. Available online at: https://educationendowmentfoundation.org.uk/evidence-summaries/teaching-learning-toolkit/

Edwards, P., Clarke, M., DiGuiseppi, C., Pratap, S., Roberts, I., and Wentz, R. (2002). Identification of randomized controlled trials in systematic reviews: accuracy and reliability of screening records. Stat. Med. 21, 1635–1640. doi: 10.1002/sim.1190

Gardner, H. (2016). “Multiple intelligences: prelude, theory, and aftermath,” in Scientists Making a Difference, eds R. J. Sternberg, S. T. Fiske, and D. J. Foss (New York, NY: Cambridge University Press). doi: 10.1017/CBO9781316422250

Ghysels, J., and Haelermans, C. (2018). New evidence on the effect of computerized individualized practice and instruction on language skills. J. Comput. Assist. Learn. 34, 440–449. doi: 10.1111/jcal.12248

Guskey, T. R., and Pigott, T. D. (1988). Research on group-based mastery learning programs: a meta-analysis. J. Educ. Res. 81, 197–216. doi: 10.1080/00220671.1988.10885824

Hall, E. F. (1992). Assessment for differentiation. Br. J. Spec. Educ. 19, 20–23. doi: 10.1111/j.1467-8578.1992.tb00397.x

Hancock, G. R., and Mueller, R. O. (2010). The Reviewer's Guide to Quantitative Methods in the Social Sciences. New York, NY: Routledge.

Hattie, J. (2009). Visible Learning. A Synthesis of Over 800 Meta-Analyses Relating to Achievement. Oxon: Routledge.

Hertberg-Davis, H., and Brighton, C. M. (2006). Support and sabotage: principals' influence on middle school teachers' responses to differentiation. J. Secondary Gifted Educ. 17, 90–102. doi: 10.4219/jsge-2006-685

Higgins, J. P. T., and Green, S. (eds.). (2011). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration. Available online at: www.handbook.cochrane.org

* Huber, M. J., Workman, J., Ford, J. A., Moore, D., and Mayer, T. (2009). Evaluating the prevention through alternative learning styles program. J. Drug Educ. 39, 239–259. doi: 10.2190/DE.39.3.b

Imants, J., and Van Veen, K. (2010). “Teacher learning as workplace learning,” in International Encyclopedia of Education, 3rd Edn, eds P. Peterson, E. Baker, and B. McGaw (Oxford: Elsevier), 569–574. doi: 10.1016/B978-0-08-044894-7.00657-6

Janssen, F., Westbroek, H., and Doyle, W. (2015). Practicality studies: how to move from what works in principle to what works in practice. J. Learn. Sci. 24, 176–186. doi: 10.1080/10508406.2014.954751

Keuning, T., Van Geel, M., Frèrejean, J., Van Merriënboer, J., Dolmans, D., and Visscher, A. (2017). Differentiëren bij rekenen: Een cognitieve taakanalyse van het denken en handelen van basisschoolleerkrachten [Differentiating in mathematics: a cognitive task analysis of primary school teachers' reflections and practices]. Pedagog. Stud. 94, 160–181. Available online at: http://pedagogischestudien.nl/download?type=document&identifier=640319

Kiley, D. (2011). Differentiated Instruction in the Secondary Classroom: Analysis of the Level of Implementation and Factors that Influence Practice (Partial Fulfillment of the Requirements for the Degree of Doctor of Education). Kalamazoo: Western Michigan University.

Kingston, N., and Nash, B. (2011). Formative assessment: a meta-analysis and a call for research. Educ. Meas. Issues Pract. 30, 28–37. doi: 10.1111/j.1745-3992.2011.00220.x

Kirschner, P. A., Claessens, L., and Raaijmakers, S. (2018). Op de Schouders van Reuzen. Inspirerende Inzichten uit de Cognitieve Psychologie voor Leerkrachten. [On the Shoulders of Giants. Inspiring Insights from Cognitive Psychology for Teachers]. Meppel: Drukkerij Ten Brink Uitgevers.

Kulik, C. C., and Kulik, J. A. (1982). Effects of ability grouping on secondary school students: a meta-analysis of evaluation findings. Am. Educ. Res. J. 19, 415–428. doi: 10.3102/00028312019003415

Kulik, C. C., Kulik, J. A., and Bangert-Drowns, R. L. (1990). Effectiveness of mastery learning programs: a meta-analysis. Rev. Educ. Res. 60, 265–299. doi: 10.3102/00346543060002265

Kulik, J. A. (1992). An Analysis of the Research on Ability Grouping: Historical and Contemporary Perspectives. Research-Based Decision Making Series. National Research Center on the Gifted and Talented. Available online at: http://search.ebscohost.com.proxy-ub.rug.nl/login.aspx?direct=true&db=eric&AN=ED350777&site=ehost-live&scope=site

Kulik, J. A., and Fletcher, J. D. (2016). Effectiveness of intelligent tutoring systems: a meta-analytic review. Rev. Educ. Res. 86, 42–78. doi: 10.3102/0034654315581420

Kumar, R., O'Malley, P. M., Johnston, L. D., and Laetz, V. B. (2013). Alcohol, tobacco, and other drug use prevention programs in U.S. schools: a descriptive summary. Prev. Sci. 14, 581–592. doi: 10.1007/s11121-012-0340-z

Kyriakides, L., Creemers, B., and Charalambous, E. (2018). Equity and Quality Dimensions in Educational Effectiveness. Dordrecht: Springer International Publishing. doi: 10.1007/978-3-319-72066-1

Latz, A. O., Speirs Neumeister, K. L., Adams, C. M., and Pierce, R. L. (2009). Peer coaching to improve classroom differentiation: perspectives from project CLUE. Roeper Rev. 31, 27–39. doi: 10.1080/02783190802527356

Leo, J., and Puzio, K. (2016). Flipped instruction in a high school science classroom. J. Sci. Educ. Technol. 25, 775–781. doi: 10.1007/s10956-016-9634-4

* Little, C. A., McCoach, D. B., and Reis, S. M. (2014). Effects of differentiated reading instruction on student achievement in middle school. J. Adv. Acad. 25, 384–402. doi: 10.1177/1932202X14549250

Lou, Y., Abrami, P. C., Spence, J. C., Poulsen, C., Chambers, B., and d'Apollonia, S. (1996). Within-class grouping: a meta-analysis. Rev. Educ. Res. 66, 423–458. doi: 10.3102/00346543066004423

Lyons, L. C. (2003). Meta-Analysis: Methods of Accumulating Results Across Research Domains. Available online at: http://www.lyonsmorris.com/lyons/metaAnalysis/index.cfm

Ma, W., Adesope, O. O., Nesbit, J. C., and Liu, Q. (2014). Intelligent tutoring systems and learning outcomes: a meta-analysis. J. Educ. Psychol. 106, 901–918. doi: 10.1037/a0037123

Martin, A. J., and Lazendic, G. (2018). Computer-adaptive testing: implications for students' achievement, motivation, engagement, and subjective test experience. J. Educ. Psychol. 110, 27–45. doi: 10.1037/edu0000205

* Mastropieri, M. A., Scruggs, T. E., Norland, J. J., Berkeley, S., McDuffie, K., Tornquist, E. H., et al. (2006). Differentiated curriculum enhancement in inclusive middle school science: effects on classroom and high-stakes tests. J. Spec. Educ. 40, 130–137. doi: 10.1177/00224669060400030101

Maulana, R., Helms-Lorenz, M., and Van de Grift, W. J. C. M. (2015). Development and evaluation of a questionnaire measuring pre-service teachers' teaching behaviour: a Rasch modelling approach. Sch. Effect. Sch. Improv. 26, 169–194. doi: 10.1080/09243453.2014.939198

Maulana, R., Helms-Lorenz, M., and Van de Grift, W. J. C. M. (2017). Validating a model of effective teaching behaviour of pre-service teachers. Teach. Teach. Theor. Pract. 23, 471–493. doi: 10.1080/13540602.2016.1211102

McQuarrie, L., McRae, P., and Stack-Cutler, H. (2008). Differentiated Instruction Provincial Research Review. Edmonton, AB: Alberta Initiative for School Improvement.

Mevarech, Z. R., and Kramarski, B. (1997). IMPROVE: a multidimensional method for teaching mathematics in heterogeneous classrooms. Am. Educ. Res. J. 34, 365–394. doi: 10.3102/00028312034002365

Mills, M., Monk, S., Keddie, A., Renshaw, P., Christie, P., Geelan, D., et al. (2014). Differentiated learning: from policy to classroom. Oxford Rev. Educ. 40, 331–348. doi: 10.1080/03054985.2014.911725

* Mitee, T. L., and Obaitan, G. N. (2015). Effect of mastery learning on senior secondary school students' cognitive learning outcome in quantitative chemistry. J. Educ. Pract. 6, 34–38. Available online at: https://files.eric.ed.gov/fulltext/EJ1083639.pdf

Moeyaert, M., Ugille, M., Beretvas, N., Ferron, J., Bunuan, R., and Van den Noortgate, W. (2016). Methods for dealing with multiple outcomes in meta-analysis: a comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. Int. J. Soc. Res. Methodol. 20, 559–572. doi: 10.1080/13645579.2016.1252189

Nokes-Malach, T., Richey, J., and Gadgil, S. (2015). When is it better to learn together? Insights from research on collaborative learning. Educ. Psychol. Rev. 27, 645–656. doi: 10.1007/s10648-015-9312-8

Oakes, J. (2008). Keeping track: structuring equality and inequality in an era of accountability. Teach. College Rec. 110, 700–712. Available online at: https://www.tcrecord.org/Content.asp?ContentId=14610

OECD (2012). Equity and Quality in Education. Supporting Disadvantaged Students and Schools. Paris: OECD Publishing. doi: 10.1787/9789264130852-en

OECD (2018). The Resilience of Students with an Immigrant Background. Factors that Shape Well-being. Paris: OECD Publishing. doi: 10.1787/9789264292093-en

Parsons, S. A., Dodman, S. L., and Cohen Burrowbridge, S. (2013). Broadening the view of differentiated instruction differentiation shouldn't end with planning but should continue as teachers adapt their instruction during lessons. Kappan 95, 38–42. doi: 10.1177/003172171309500107

Parsons, S. A., Vaughn, M., Scales, R. Q., Gallagher, M. A., Parsons, A. W., Davis, S. G., et al. (2018). Teachers' instructional adaptations: a research synthesis. Rev. Educ. Res. 88, 205–242. doi: 10.3102/0034654317743198

Petticrew, M., and Roberts, H. (2006). Systematic Reviews in the Social Sciences. A Practical Guide. Malden, MA: Blackwell Publishing. doi: 10.1002/9780470754887

Pierce, R., and Adams, C. (2005). Using tiered lessons in mathematics. Math. Teach. Middle Sch. 11, 144–149.

Prast, E. J., Van de Weijer-Bergsma, E., Kroesbergen, E. H., and Van Luit, J. E. H. (2015). Readiness-based differentiation in primary school mathematics: expert recommendations and teacher self-assessment. Frontline Learn. Res. 3, 90–116. doi: 10.14786/flr.v3i2.163

* Richards, M. R. E., and Omdal, S. N. (2007). Effects of tiered instruction on academic achievement in a secondary science course. J. Adv. Acad. 18, 424–453. doi: 10.4219/jaa-2007-499

Rock, M. L., Gregg, M., Ellis, E., and Gable, R. A. (2008). REACH: a framework for differentiating classroom instruction. Prev. Sch. Fail. 52, 31–47. doi: 10.3200/PSFL.52.2.31-47

Roy, A., Guay, F., and Valois, P. (2013). Teaching to address diverse learning needs: development and validation of a differentiated instruction scale. Int. J. Inclus. Educ. 17, 1186–1204. doi: 10.1080/13603116.2012.743604

Scheerens, J. (2016). “Meta-analyses of school and instructional effectiveness,” in Educational Effectiveness and Ineffectiveness, ed J. Scheerens (Dordrecht: Springer Science + Business Media), 175–223. doi: 10.1007/978-94-017-7459-8_8

Schipper, T., Goei, S. L., de Vries, S., and van Veen, K. (2017). Professional growth in adaptive teaching competence as a result of lesson study. Teach. Teach. Educ. 68, 289–303. doi: 10.1016/j.tate.2017.09.015

Schipper, T., Goei, S. L., de Vries, S., and van Veen, K. (2018). Developing teachers' self-efficacy and adaptive teaching behaviour through lesson study. Int. J. Educ. Res. 88, 109–120. doi: 10.1016/j.ijer.2018.01.011

Schleicher, A. (2016). Teaching Excellence Through Professional Learning and Policy Reform: Lessons from Around the World. Paris: International Summit on the Teaching Profession; OECD Publishing. doi: 10.1787/9789264252059-en

Schofield, J. W. (2010). International evidence on ability grouping with curriculum differentiation and the achievement gap in secondary schools. Teach. College Rec. 112, 1492–1528. Available online at: https://www.tcrecord.org/Content.asp?ContentId=15684

Schütz, G., Ursprung, H., and Wößmann, L. (2008). Education policy and equality of opportunity. Kyklos 61, 279–308.

Sezer, B. (2017). The effectiveness of a technology-enhanced flipped science classroom. J. Educ. Comput. Res. 55, 471–494. doi: 10.1177/0735633116671325

Shute, V. J., and Rahimi, S. (2017). Review of computer-based assessment for learning in elementary and secondary education. J. Comput. Assist. Learn. 33, 1–19. doi: 10.1111/jcal.12172

Sipe, T. A., and Curlette, W. L. (1996). A meta-synthesis of factors related to educational achievement: a methodological approach to summarizing and synthesizing meta-analyses. Int. J. Educ. Res. 25, 583–698. doi: 10.1016/S0883-0355(96)80001-2

Slavin, R., and Smith, D. (2009). The relationship between sample sizes and effect sizes in systematic reviews in education. Educ. Eval. Policy Anal. 31, 500–506. doi: 10.3102/0162373709352369

Slavin, R. E. (1986). Best-evidence synthesis: an alternative to meta-analytic and traditional reviews. Educ. Res. 15, 5–11. doi: 10.3102/0013189X015009005

Slavin, R. E. (1987). Mastery learning reconsidered. Rev. Educ. Res. 57, 175–214. doi: 10.3102/00346543057002175

Slavin, R. E. (1990a). Achievement effects of ability grouping in secondary schools: a best-evidence synthesis. Rev. Educ. Res. 60, 471–499. doi: 10.3102/00346543060003471

Slavin, R. E. (1990b). Mastery learning re-reconsidered. Rev. Educ. Res. 60, 300–302. doi: 10.3102/00346543060002300

Slavin, R. E. (1995). Best evidence synthesis. An intelligent alternative to meta-analysis. J. Clin. Epidemiol. 48, 9–18. doi: 10.1016/0895-4356(94)00097-A

Slavin, R. E. (2013). Effective programmes in reading and mathematics: lessons from the Best Evidence Encyclopaedia. Sch. Effect. Sch. Improv. 24, 383–391. doi: 10.1080/09243453.2013.797913

Slavin, R. E., and Cheung, A. (2005). A synthesis of research on language of reading instruction for English language learners. Rev. Educ. Res. 75, 247–284. doi: 10.3102/00346543075002247

Slavin, R. E., Cheung, A., Groff, C., and Lake, C. (2008). Effective reading programs for middle and high schools: a best-evidence synthesis. Read. Res. Q. 43, 290–322. doi: 10.1598/RRQ.43.3.4

Slavin, R. E., Lake, C., and Groff, C. (2009). Effective programs in middle and high school mathematics: a best-evidence synthesis. Rev. Educ. Res. 79, 839–911. doi: 10.3102/0034654308330968

Slavin, R. E., and Madden, N. A. (2011). Measures inherent to treatments in program effectiveness reviews. J. Res. Educ. Effect. 4, 370–380. doi: 10.1080/19345747.2011.558986

Smeets, E., and Mooij, T. (2001). Pupil-centred learning, ICT, and teacher behaviour: observations in educational practice. Br. J. Educ. Technol. 32, 403. doi: 10.1111/1467-8535.00210

Smets, W., and Struyven, K. (2018). Realist review of literature on catering for different instructional needs with preteaching and extended instruction. Educ. Sci. 8, 113. doi: 10.3390/educsci8030113

* Smit, R., and Humpert, W. (2012). Differentiated instruction in small schools. Teach. Teach. Educ. 28, 1152–1162. doi: 10.1016/j.tate.2012.07.003

Smit, R., Humpert, W., Obertüfer-Gahler, R., Engeli, E., and Breuer-Brodmüller, M. (2011). “Differenzierung als Chance für kleine schulen - empirische befunde im längsschnitt,” in Schule im Alpinen Raum, eds R. Müller, A. Keller, U. Kerle, A. Raggl, and E. Steiner (Innsbruck: Studienverlag), 435–488.

Steenbergen-Hu, S., Makel, M. C., and Olszewski-Kubilius, P. (2016). What one hundred years of research says about the effects of ability grouping and acceleration on K−12 students' academic achievement: findings of two second-order meta-analyses. Rev. Educ. Res. 86, 849–899. doi: 10.3102/0034654316675417

Stollman, S. H. M. (2018). Differentiated Instruction in Practice: A Teacher Perspective. Leiden: ICLON, Leiden University Graduate School of Teaching.

Subban, P. (2006). Differentiated instruction: a research basis. Int. Educ. J. 7, 935–947. Available online at: http://ehlt.flinders.edu.au/education/iej/articles/v7n7/Subban/BEGIN.HTM

Suprayogi, M. N., Valcke, M., and Godwin, R. (2017). Teachers and their implementation of differentiated instruction in the classroom. Teach. Teach. Educ. 67, 291–301. doi: 10.1016/j.tate.2017.06.020

Swanson, H. L., and Lussier, C. M. (2001). A selective synthesis of the experimental literature on dynamic assessment. Rev. Educ. Res. 71, 321–363. doi: 10.3102/00346543071002321

Tieso, C. L. (2003). Ability grouping is not just tracking anymore. Roeper Rev. 26, 29–36. doi: 10.1080/02783190309554236

Tomlinson, C. (2015). Teaching for excellence in academically diverse classrooms. Society 52, 203–209. doi: 10.1007/s12115-015-9888-0

Tomlinson, C. A. (1995). Deciding to differentiate instruction in middle school: one school's journey. Gifted Child Q. 39, 77–87. doi: 10.1177/001698629503900204

Tomlinson, C. A. (1999). Mapping a route toward differentiated instruction. Pers. Learn. 57, 12–16.

Tomlinson, C. A. (2014). The Differentiated Classroom. Responding to the Needs of All Learners, 2nd Edn. Alexandria, VA: ASCD.

Tomlinson, C. A., Brighton, C., Hertberg, H., Callahan, C. M., Moon, T. R., Brimijoin, K., et al. (2003). Differentiating instruction in response to student readiness, interest, and learning profile in academically diverse classrooms: a review of literature. J. Educ. Gifted 27, 119–145. doi: 10.1177/016235320302700203

UNESCO (2017). A Guide for Ensuring Inclusion and Equity in Education. Paris: United Nations Educational, Scientific and Cultural Organization. Available online at: https://unesdoc.unesco.org/ark:/48223/pf0000248254

Valiande, S., and Koutselini, M. I. (2009). “Application and evaluation of differentiation instruction in mixed ability classrooms,” Paper presented at the 4th Hellenic Observatory PhD Symposium (London: LSE), 25–26.

Valiandes, S., and Neophytou, L. (2018). Teachers' professional development for differentiated instruction in mixed-ability classrooms: investigating the impact of a development program on teachers' professional learning and on students' achievement. Teach. Dev. 22, 123–138. doi: 10.1080/13664530.2017.1338196

Van Casteren, W., Bendig-Jacobs, J., Wartenbergh-Cras, F., Van Essen, M., and Kurver, B. (2017). Differentiëren en Differentiatievaardigheden in Het Voortgezet Onderwijs. Nijmegen: ResearchNed.

Van de Grift, W. J. C. M. (2007). Quality of teaching in four European countries: A review of the literature and application of an assessment instrument. Educ. Res. 49, 127–152. doi: 10.1080/00131880701369651

Van de Grift, W. J. C. M., Helms-Lorenz, M., and Maulana, R. (2014). Teaching skills of student teachers: calibration of an evaluation instrument and its value in predicting student academic engagement. Stud. Educ. Eval. 43, 150–159. doi: 10.1016/j.stueduc.2014.09.003

Van de Pol, J., Volman, M., and Beishuizen, J. (2010). Scaffolding in Teacher–Student interaction: a decade of research. Educ. Psychol. Rev. 22, 271–296. doi: 10.1007/s10648-010-9127-6

Van de Pol, J., Volman, M., Oort, F., and Beishuizen, J. (2015). The effects of scaffolding in the classroom: support contingency and student independent working time in relation to student achievement, task effort and appreciation of support. Instruct. Sci. 43, 615–641. doi: 10.1007/s11251-015-9351-z

Van der Kleij, F., Feskens, R. C. W., and Eggen, T. J. H. M. (2015). Effects of feedback in a computer-based learning environment on students' learning outcomes. Rev. Educ. Res. 85, 475–511. doi: 10.3102/0034654314564881

Van der Lans, R. M., Van de Grift, W. J. C. M., and van Veen, K. (2017). Individual differences in teacher development: an exploration of the applicability of a stage model to assess individual teachers. Learn. Individ. Diff. 58, 46–55. doi: 10.1016/j.lindif.2017.07.007

Van der Lans, R. M., Van de Grift, W. J. C. M., and van Veen, K. (2018). Developing an instrument for teacher feedback: using the Rasch model to explore teachers' development of effective teaching strategies and behaviors. J. Exp. Educ. 86, 247–264. doi: 10.1080/00220973.2016.1268086

Van Geel, M., Keuning, T., Frèrejean, J., Dolmans, D., Van Merriënboer, J., and Visscher, A. J. (2019). Capturing the complexity of differentiated instruction. Sch. Effect. Sch. Improv. 30, 51–67. doi: 10.1080/09243453.2018.1539013

Van Halem, N., Van Klaveren, C. P. B. J., and Cornelisz, I. (2017). Oefent een leerling meer door niveaudifferentiatie? Het effect van data-gestuurde differentiatie op leerinspanning en de rol van eerder behaalde cijfers. [Does a learner practice more because of readiness-based differentiation? The effect of data-driven differentiation on learning effort and the role of prior grades]. Pedagog. Stud. 94, 182–195. Available online at: http://pedagogischestudien.nl/download?type=document&identifier=640298

Van Klaveren, C., Vonk, S., and Cornelisz, I. (2017). The effect of adaptive versus static practicing on student learning - evidence from a randomized field experiment. Econ. Educ. Rev. 58, 175–187. doi: 10.1016/j.econedurev.2017.04.003

Van Tassel-Baska, J., Quek, C., and Feng, A. X. (2006). The development and use of a structured teacher observation scale to assess differentiated best practice. Roeper Rev. 29, 84–92. doi: 10.1080/02783190709554391

* Vogt, F., and Rogalla, M. (2009). Developing adaptive teaching competency through coaching. Teach. Teach. Educ. 25, 1051–1060. doi: 10.1016/j.tate.2009.04.002

Walkington, C. A. (2013). Using adaptive learning technologies to personalize instruction to student interests: the impact of relevant contexts on performance and learning outcomes. J. Educ. Psychol. 105, 932–945. doi: 10.1037/a0031882

* Wambugu, P. W., and Changeiywo, J. M. (2008). Effects of mastery learning approach on secondary school students' physics achievement. EURASIA J. Math. Sci. Technol. Educ. 4, 293–302. doi: 10.12973/ejmste/75352

Wang, M. C., Haertel, G. D., and Walberg, H. J. (1990). What influences learning? A content analysis of review literature. J. Educ. Res. 84, 30–43. doi: 10.1080/00220671.1990.10885988

Wilkinson, S. D., and Penney, D. (2014). The effects of setting on classroom teaching and student learning in mainstream mathematics, English and science lessons: a critical review of the literature in England. Educ. Rev. 66, 411–427. doi: 10.1080/00131911.2013.787971

World Health Organization (2011). European Action Plan to Reduce the Harmful Use of Alcohol 2012–2020. Copenhagen: World Health Organization Regional Office for Europe. Available online at: https://www.stap.nl/en/home/european-alcohol-policy.html

Keywords: review, differentiation, differentiated instruction, adaptive teaching, ability grouping, secondary education, student performance, effectiveness

Citation: Smale-Jacobse AE, Meijer A, Helms-Lorenz M and Maulana R (2019) Differentiated Instruction in Secondary Education: A Systematic Review of Research Evidence. Front. Psychol. 10:2366. doi: 10.3389/fpsyg.2019.02366

Received: 14 May 2019; Accepted: 04 October 2019; Published: 22 November 2019.

Reviewed by:

Copyright © 2019 Smale-Jacobse, Meijer, Helms-Lorenz and Maulana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Annemieke E. Smale-Jacobse, a.e.smale-jacobse@rug.nl

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Differentiated Instruction in Secondary Education: A Systematic Review of Research Evidence

Affiliation.

  • 1 Department of Teacher Education, University of Groningen, Groningen, Netherlands.
  • PMID: 31824362
  • PMCID: PMC6883934
  • DOI: 10.3389/fpsyg.2019.02366

Publication types

  • Systematic Review

A systematic literature review of empirical research on ChatGPT in education

  • Open access
  • Published: 26 May 2024
  • Volume 3, article number 60 (2024)

  • Yazid Albadarin (ORCID: orcid.org/0009-0005-8068-8902) 1,
  • Mohammed Saqr 1,
  • Nicolas Pope 1 &
  • Markku Tukiainen 1

Over the last four decades, studies have investigated the incorporation of Artificial Intelligence (AI) into education. A recent prominent AI-powered technology that has impacted the education sector is ChatGPT. This article provides a systematic review of 14 empirical studies incorporating ChatGPT into various educational settings, published in 2022 and before the 10th of April 2023—the date of conducting the search process. It carefully followed the essential steps outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) guidelines, as well as Okoli’s (Okoli in Commun Assoc Inf Syst, 2015) steps for conducting a rigorous and transparent systematic review. In this review, we aimed to explore how students and teachers have utilized ChatGPT in various educational settings, as well as the primary findings of those studies. By employing Creswell’s (Creswell in Educational research: planning, conducting, and evaluating quantitative and qualitative research [Ebook], Pearson Education, London, 2015) coding techniques for data extraction and interpretation, we sought to gain insight into their initial attempts at ChatGPT incorporation into education. This approach also enabled us to extract insights and considerations that can facilitate its effective and responsible use in future educational contexts. The results of this review show that learners have utilized ChatGPT as a virtual intelligent assistant, where it offered instant feedback, on-demand answers, and explanations of complex topics. Additionally, learners have used it to enhance their writing and language skills by generating ideas, composing essays, summarizing, translating, paraphrasing texts, or checking grammar. Moreover, learners turned to it as an aiding tool to facilitate their directed and personalized learning by assisting in understanding concepts and homework, providing structured learning plans, and clarifying assignments and tasks. 
However, the results of specific studies (n = 3, 21.4%) show that overuse of ChatGPT may negatively impact innovative capacities and collaborative learning competencies among learners. Educators, on the other hand, have utilized ChatGPT to create lesson plans, generate quizzes, and provide additional resources, which helped them enhance their productivity and efficiency and promote different teaching methodologies. Despite these benefits, the majority of the reviewed studies recommend the importance of conducting structured training, support, and clear guidelines for both learners and educators to mitigate the drawbacks. This includes developing critical evaluation skills to assess the accuracy and relevance of information provided by ChatGPT, as well as strategies for integrating human interaction and collaboration into learning activities that involve AI tools. Furthermore, they also recommend ongoing research and proactive dialogue with policymakers, stakeholders, and educational practitioners to refine and enhance the use of AI in learning environments. This review could serve as an insightful resource for practitioners who seek to integrate ChatGPT into education and stimulate further research in the field.

1 Introduction

Educational technology, a rapidly evolving field, plays a crucial role in reshaping the landscape of teaching and learning [ 82 ]. One of the most transformative technological innovations of our era that has influenced the field of education is Artificial Intelligence (AI) [ 50 ]. Over the last four decades, AI in education (AIEd) has gained remarkable attention for its potential to make significant advancements in learning, instructional methods, and administrative tasks within educational settings [ 11 ]. In particular, a large language model (LLM), a type of AI algorithm that applies artificial neural networks (ANNs) and uses massively large data sets to understand, summarize, generate, and predict new content that is difficult to distinguish from human creations [ 79 ], has opened up novel possibilities for enhancing various aspects of education, from content creation to personalized instruction [ 35 ]. Chatbots that leverage the capabilities of LLMs to understand and generate human-like responses have also presented the capacity to enhance student learning and educational outcomes by engaging students, offering timely support, and fostering interactive learning experiences [ 46 ].

The ongoing and remarkable technological advancements in chatbots have made their use more convenient, increasingly natural and effortless, and have expanded their potential for deployment across various domains [ 70 ]. One prominent example of chatbot applications is the Chat Generative Pre-Trained Transformer, known as ChatGPT, which was introduced by OpenAI, a leading AI research lab, on November 30th, 2022. ChatGPT employs deep learning techniques to generate human-like text. It is built on the transformer architecture, a neural network architecture based on the self-attention mechanism, which allows it to weigh the relevant parts of the input, grasp the context of the text being processed, and retain information from earlier inputs, thereby enabling it to produce more natural-sounding and coherent output. Additionally, the unsupervised generative pre-training and the fine-tuning methods allow ChatGPT to generate more relevant and accurate text for specific tasks [ 31 , 62 ]. Furthermore, reinforcement learning from human feedback (RLHF), a machine learning approach that combines reinforcement learning techniques with human-provided feedback, has helped improve ChatGPT’s model by aligning its responses more closely with what human raters prefer.
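To make the self-attention idea concrete, the following is a minimal sketch of scaled dot-product attention in plain Python. It is a toy illustration only and not ChatGPT's actual implementation: the function names are hypothetical, and the queries, keys, and values are passed in directly, whereas a real transformer derives them from learned projections of token embeddings.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over small lists of vectors.

    Assumption: queries, keys, and values are given directly; in a real
    transformer they come from learned linear projections of the tokens.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two toy "tokens" represented by 2-dimensional vectors.
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
for row in attended:
    print([round(x, 3) for x in row])
```

In this toy case each token attends most strongly to itself, because its query matches its own key best, while still mixing in some information from the other token.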

This cutting-edge natural language processing (NLP) tool is widely recognized as one of today's most advanced LLM-based chatbots [ 70 ], allowing users to ask questions and receive detailed, coherent, systematic, personalized, convincing, and informative human-like responses [ 55 ], even within complex and ambiguous contexts [ 63 , 77 ]. ChatGPT is considered the fastest-growing technology in history: in just three months following its public launch, it amassed an estimated 120 million monthly active users [ 16 ] with an estimated 13 million daily queries [ 49 ], surpassing all other applications [ 64 ]. This remarkable growth can be attributed to the unique features and user-friendly interface that ChatGPT offers. Its intuitive design allows users to interact seamlessly with the technology, making it accessible to a diverse range of individuals, regardless of their technical expertise [ 78 ]. Additionally, its exceptional performance, which results from a combination of advanced algorithms, continuous enhancements, and extensive training on a diverse dataset that includes various text sources such as books, articles, websites, and online forums [ 63 ], has contributed to a more engaging and satisfying user experience [ 62 ]. These factors collectively explain its remarkable global growth and set it apart from alternatives like Bard, Bing Chat, ERNIE, and others.

In this context, several studies have explored the technological advancements of chatbots. One noteworthy recent research effort, conducted by Schöbel et al. [ 70 ], stands out for its analysis of more than 5,000 studies on communication agents. This study offered a comprehensive overview of the historical progression and future prospects of communication agents, including ChatGPT. Moreover, other studies have focused on making comparisons, particularly between ChatGPT and alternative chatbots like Bard, Bing Chat, ERNIE, LaMDA, BlenderBot, and various others. For example, O’Leary [ 53 ] compared two chatbots, LaMDA and BlenderBot, with ChatGPT and revealed that ChatGPT outperformed both. This superiority arises from ChatGPT’s capacity to handle a wider range of questions and generate slightly varied perspectives within specific contexts. Similarly, ChatGPT exhibited an impressive ability to formulate interpretable responses that were easily understood when compared with Google's featured snippets [ 34 ]. Additionally, ChatGPT was compared to other LLM-based chatbots, including Bard and BERT, as well as ERNIE. The findings indicated that ChatGPT exhibited strong performance in the given tasks, often outperforming the other models [ 59 ].

Furthermore, in the education context, a comprehensive study systematically compared a range of the most promising chatbots, including Bard, Bing Chat, ChatGPT, and ERNIE, across a multidisciplinary test that required higher-order thinking. The study revealed that ChatGPT achieved the highest score, surpassing Bing Chat and Bard [ 64 ]. Similarly, a comparative analysis was conducted to compare ChatGPT with Bard in answering a set of 30 mathematical questions and logic problems, grouped into two question sets. Set (A) is unavailable online, while Set (B) is available online. The results revealed ChatGPT's superiority in Set (A) over Bard. Nevertheless, Bard's advantage emerged in Set (B) due to its capacity to access the internet directly and retrieve answers, a capability that ChatGPT does not possess [ 57 ]. Overall, across these varied assessments, ChatGPT generally performed strongly compared to the alternatives in the ever-evolving field of chatbot technology.

The widespread adoption of chatbots, especially ChatGPT, by millions of students and educators has sparked extensive discussions regarding its incorporation into the education sector [ 64 ]. Accordingly, many scholars have contributed to the discourse, expressing both optimism and pessimism regarding the incorporation of ChatGPT into education. For example, ChatGPT has been highlighted for its capabilities in enriching the learning and teaching experience through its ability to support different learning approaches, including adaptive learning, personalized learning, and self-directed learning [ 58 , 60 , 91 ], deliver summative and formative feedback to students and provide real-time responses to questions, increase the accessibility of information [ 22 , 40 , 43 ], foster students’ performance, engagement and motivation [ 14 , 44 , 58 ], and enhance teaching practices [ 17 , 18 , 64 , 74 ].

On the other hand, concerns have also been raised regarding its potential negative effects on learning and teaching. These include the dissemination of false information and references [ 12 , 23 , 61 , 85 ], biased reinforcement [ 47 , 50 ], compromised academic integrity [ 18 , 40 , 66 , 74 ], and the potential decline in students' skills [ 43 , 61 , 64 , 74 ]. As a result, ChatGPT has been banned in multiple countries, including Russia, China, Venezuela, Belarus, and Iran, as well as in various educational institutions in India, Italy, Western Australia, France, and the United States [ 52 , 90 ].

Clearly, the advent of chatbots, especially ChatGPT, has provoked significant controversy due to their potential impact on learning and teaching. This indicates the necessity for further exploration to gain a deeper understanding of this technology and carefully evaluate its potential benefits, limitations, challenges, and threats to education [ 79 ]. Therefore, conducting a systematic literature review will provide valuable insights into the potential prospects and obstacles linked to its incorporation into education. This systematic literature review will primarily focus on ChatGPT, driven by the key factors outlined above.

However, the existing literature lacks a systematic literature review of empirical studies. Thus, this systematic literature review aims to address this gap by synthesizing the existing empirical studies conducted on chatbots, particularly ChatGPT, in the field of education, highlighting how ChatGPT has been utilized in educational settings, and identifying any existing gaps. This review may be particularly useful for researchers in the field and educators who are contemplating the integration of ChatGPT or any chatbot into education. The following research questions will guide this study:

What are students' and teachers' initial attempts at utilizing ChatGPT in education?

What are the main findings derived from empirical studies that have incorporated ChatGPT into learning and teaching?

2 Methodology

To conduct this study, the authors followed the essential steps of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) and Okoli’s [ 54 ] steps for conducting a systematic review. These included identifying the study’s purpose, drafting a protocol, applying a practical screening process, searching the literature, extracting relevant data, evaluating the quality of the included studies, synthesizing the studies, and ultimately writing the review. The subsequent section provides an extensive explanation of how these steps were carried out in this study.

2.1 Identify the purpose

Given the widespread adoption of ChatGPT by students and teachers for various educational purposes, often without a thorough understanding of responsible and effective use or a clear recognition of its potential impact on learning and teaching, the authors recognized the need for further exploration of ChatGPT's impact on education in this early stage. Therefore, they have chosen to conduct a systematic literature review of existing empirical studies that incorporate ChatGPT into educational settings. Despite the limited number of empirical studies due to the novelty of the topic, their goal is to gain a deeper understanding of this technology and proactively evaluate its potential benefits, limitations, challenges, and threats to education. This effort could help to understand initial reactions and attempts at incorporating ChatGPT into education and bring out insights and considerations that can inform the future development of education.

2.2 Draft the protocol

The next step is formulating the protocol. This protocol serves to outline the study process in a rigorous and transparent manner, mitigating researcher bias in study selection and data extraction [ 88 ]. The protocol will include the following steps: generating the research question, predefining a literature search strategy, identifying search locations, establishing selection criteria, assessing the studies, developing a data extraction strategy, and creating a timeline.

2.3 Apply practical screen

The screening step aims to accurately filter the articles resulting from the searching step and select the empirical studies that have incorporated ChatGPT into educational contexts, which will guide us in answering the research questions and achieving the objectives of this study. To ensure the rigorous execution of this step, our inclusion and exclusion criteria were determined based on the authors' experience and informed by previous successful systematic reviews [ 21 ]. Table 1 summarizes the inclusion and exclusion criteria for study selection.

2.4 Literature search

We conducted a thorough literature search to identify articles that explored, examined, and addressed the use of ChatGPT in educational contexts. We utilized two research databases: Dimensions.ai, which provides access to a large number of research publications, and Lens.org, which offers access to over 300 million articles, patents, and other research outputs from diverse sources. Additionally, we included three databases, Scopus, Web of Science (WoS), and ERIC, which contain relevant research on the topic that addresses our research questions. To browse and identify relevant articles, we used the following search formula: ("ChatGPT" AND "Education"), which included the Boolean operator "AND" to get more specific results. In the Scopus and ERIC databases, the subject area was narrowed to the "ChatGPT" and "Education" keywords, and in the WoS database it was limited to the "Education" category. The search was conducted between the 3rd and 10th of April 2023 and resulted in 276 articles from all selected databases (111 articles from Dimensions.ai, 65 from Scopus, 28 from Web of Science, 14 from ERIC, and 58 from Lens.org).

These articles were imported into the Rayyan web-based system for analysis. The duplicates were identified automatically by the system. Subsequently, the first author manually reviewed the duplicated articles, ensured that they had the same content, and then removed them, leaving us with 135 unique articles. Afterward, the titles, abstracts, and keywords of the first 40 manuscripts were scanned and reviewed by the first author and were discussed with the second and third authors to resolve any disagreements. Subsequently, the first author proceeded with the filtering process for all articles and carefully applied the inclusion and exclusion criteria as presented in Table 1. Articles that met any one of the exclusion criteria were eliminated, resulting in 26 articles. Afterward, the authors met to carefully scan and discuss them. The authors agreed to eliminate any empirical studies solely focused on checking ChatGPT capabilities, as these studies do not guide us in addressing the research questions and achieving the study's objectives. This resulted in 14 articles eligible for analysis.
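The counts reported in this subsection can be tallied with a short script. The sketch below uses only the figures stated in the text; the variable and function names are illustrative and not part of the original review workflow.

```python
# Records retrieved per database, as reported in the text.
records_per_database = {
    "Dimensions.ai": 111,
    "Scopus": 65,
    "Web of Science": 28,
    "ERIC": 14,
    "Lens.org": 58,
}

total_retrieved = sum(records_per_database.values())

# Records remaining after each screening stage, as reported in the text.
stages = [
    ("retrieved", total_retrieved),
    ("after duplicate removal", 135),
    ("after inclusion/exclusion criteria", 26),
    ("after dropping capability-only studies", 14),
]


def removed_per_stage(stages):
    """Return (stage name, number of records removed on entering that stage)."""
    return [
        (name, previous - count)
        for (_, previous), (name, count) in zip(stages, stages[1:])
    ]


print(f"total retrieved: {total_retrieved}")
for name, removed in removed_per_stage(stages):
    print(f"{name}: {removed} removed")
```

A tally like this makes it easy to confirm that the per-database counts add up to the reported total and to see how many records each screening stage removed.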

2.5 Quality appraisal

The examination and evaluation of the quality of the extracted articles is a vital step [ 9 ]. Therefore, the extracted articles were carefully evaluated for quality using Fink’s [ 24 ] standards, which emphasize the necessity for detailed descriptions of methodology, results, conclusions, strengths, and limitations. The process began with a thorough assessment of each study's design, data collection, and analysis methods to ensure their appropriateness and comprehensive execution. The clarity, consistency, and logical progression from data to results and conclusions were also critically examined. Potential biases and recognized limitations within the studies were also scrutinized. Ultimately, two articles were excluded for failing to meet Fink’s criteria, particularly in providing sufficient detail on methodology, results, conclusions, strengths, or limitations. The review process is illustrated in Fig.  1 .

figure 1

The study selection process

2.6 Data extraction

The next step is data extraction, the process of capturing the key information and categories from the included studies. To improve efficiency, reduce variation among authors, and minimize errors in data analysis, the coding categories were constructed using Creswell's [ 15 ] coding techniques for data extraction and interpretation. The coding process involves three sequential steps. The initial stage encompasses open coding, where the researcher examines the data, generates codes to describe and categorize it, and gains a deeper understanding without preconceived ideas. Following open coding is axial coding, where the interrelationships between codes from open coding are analyzed to establish more comprehensive categories or themes. The process concludes with selective coding, refining and integrating categories or themes to identify core concepts emerging from the data. The first coder performed the coding process, then engaged in discussions with the second and third authors to finalize the coding categories for the first five articles. The first coder then proceeded to code all studies and engaged again in discussions with the other authors to ensure the finalization of the coding process. After a comprehensive analysis and capturing of the key information from the included studies, the data extraction and interpretation process yielded several themes. These themes have been categorized and are presented in Table 2. It is important to note that open coding results were removed from Table 2 for aesthetic reasons, as they included many generic aspects, such as words, short phrases, or sentences mentioned in the studies.

2.7 Synthesize studies

In this stage, we gathered, discussed, and analyzed the key findings that emerged from the selected studies. The synthesis stage is considered a transition from an author-centric to a concept-centric focus, enabling us to map all the provided information to achieve the most effective evaluation of the data [ 87 ]. Initially, the authors extracted data that included general information about the selected studies, including the author(s)' names, study titles, years of publication, educational levels, research methodologies, sample sizes, participants, main aims or objectives, raw data sources, and analysis methods. Following that, all key information and significant results from the selected studies were compiled using Creswell’s [ 15 ] coding techniques for data extraction and interpretation to identify core concepts and themes emerging from the data, focusing on those that directly contributed to our research questions and objectives, such as the initial utilization of ChatGPT in learning and teaching, learners' and educators' familiarity with ChatGPT, and the main findings of each study. Finally, the data related to each selected study were extracted into an Excel spreadsheet for data processing. The Excel spreadsheet was reviewed by the authors over a series of discussions to finalize this process and prepare it for further analysis. Afterward, the final results were analyzed and presented in various types of charts and graphs. Table 4 presents the extracted data from the selected studies, with each study labeled with a capital 'S' followed by a number.
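As an illustration of how such an extraction sheet might be structured, the sketch below defines one record per study with the fields listed above and writes the records to CSV, which Excel can open. The class name, field names, and the placeholder record are assumptions made for illustration; they are not the authors' actual spreadsheet layout or data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class StudyRecord:
    """One row of a hypothetical extraction sheet (field names are assumed)."""
    study_id: str
    authors: str
    title: str
    year: int
    educational_level: str
    methodology: str
    sample_size: int
    participants: str
    aims: str
    data_sources: str
    analysis_methods: str


def to_csv(records):
    """Serialize the records to CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(StudyRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Placeholder values only; this is not data from any reviewed study.
example = StudyRecord(
    study_id="S0",
    authors="(placeholder)",
    title="(placeholder)",
    year=2023,
    educational_level="(placeholder)",
    methodology="(placeholder)",
    sample_size=0,
    participants="(placeholder)",
    aims="(placeholder)",
    data_sources="(placeholder)",
    analysis_methods="(placeholder)",
)

print(to_csv([example]))
```

Keeping one typed record per study makes it harder to leave a field blank by accident and gives every coder the same column set to fill in.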

3 Results

This section consists of two main parts. The first part provides a descriptive analysis of the data compiled from the reviewed studies. The second part presents the answers to the research questions and the main findings of these studies.

3.1 Part 1: descriptive analysis

This section will provide a descriptive analysis of the reviewed studies, including educational levels and fields, participants distribution, country contribution, research methodologies, study sample size, study population, publication year, list of journals, familiarity with ChatGPT, source of data, and the main aims and objectives of the studies. Table 4 presents a comprehensive overview of the extracted data from the selected studies.

3.1.1 The number of the reviewed studies and publication years

The total number of the reviewed studies was 14. All studies were empirical studies and published in different journals focusing on Education and Technology. One study was published in 2022 [S1], while the remaining were published in 2023 [S2]-[S14]. Table 3 illustrates the year of publication, the names of the journals, and the number of reviewed studies published in each journal.

3.1.2 Educational levels and fields

The majority of the reviewed studies (11) were conducted in higher education institutions [S1]-[S10] and [S13]. Two studies did not specify the educational level of the population [S12] and [S14], while one study focused on elementary education [S11]. The reviewed studies also covered various fields of education. Three studies focused on Arts and Humanities Education [S8], [S11], and [S14], specifically English Education. Two studies focused on Engineering Education, with one in Computer Engineering [S2] and the other in Construction Education [S3]. Two studies focused on Mathematics Education [S5] and [S12]. One study focused on Social Science Education [S13]. One study focused on Early Education [S4]. One study focused on Journalism Education [S9]. Finally, three studies did not specify the field of education [S1], [S6], and [S7]. Figure 2 represents the educational levels in the reviewed studies, while Fig. 3 represents the context of the reviewed studies.

figure 2

Educational levels in the reviewed studies

figure 3

Context of the reviewed studies

3.1.3 Participants distribution and countries contribution

The reviewed studies have been conducted across different geographic regions, providing a diverse representation of the studies. The majority of the studies, 10 in total, [S1]-[S3], [S5]-[S9], [S11], and [S14], primarily focused on participants from single countries such as Pakistan, the United Arab Emirates, China, Indonesia, Poland, Saudi Arabia, South Korea, Spain, Tajikistan, and the United States. In contrast, four studies, [S4], [S10], [S12], and [S13], involved participants from multiple countries, including China and the United States [S4], China, the United Kingdom, and the United States [S10], the United Arab Emirates, Oman, Saudi Arabia, and Jordan [S12], and Turkey, Sweden, Canada, and Australia [S13]. Figures 4 and 5 illustrate the distribution of participants, whether from single or multiple countries, and the contribution of each country in the reviewed studies, respectively.

figure 4

The reviewed studies conducted in single or multiple countries

figure 5

The Contribution of each country in the studies

3.1.4 Study population and sample size

Four study populations were included: university students, university teachers, university teachers and students, and elementary school teachers. Six studies involved university students [S2], [S3], [S5] and [S6]-[S8]. Three studies focused on university teachers [S1], [S4], and [S6], while one study specifically targeted elementary school teachers [S11]. Additionally, four studies included both university teachers and students [S10] and [S12]-[S14], and among them, study [S13] specifically included postgraduate students. In terms of the sample size of the reviewed studies, nine studies included a small sample size of fewer than 50 participants [S1], [S3], [S6], [S8], and [S10]-[S13]. Three studies had 50–100 participants [S2], [S9], and [S14]. Only one study had more than 100 participants [S7]. It is worth mentioning that study [S4] adopted a mixed methods approach, including 10 participants for qualitative analysis and 110 participants for quantitative analysis.

3.1.5 Participants’ familiarity with using ChatGPT

The reviewed studies recruited a diverse range of participants with varying levels of familiarity with ChatGPT. Five studies [S2], [S4], [S6], [S8], and [S12] involved participants already familiar with ChatGPT, while eight studies [S1], [S3], [S5], [S7], [S9], [S10], [S13] and [S14] included individuals with differing levels of familiarity. Notably, one study [S11] had participants who were entirely unfamiliar with ChatGPT. It is important to note that four studies [S3], [S5], [S9], and [S11] provided training or guidance to their participants before conducting their studies, while ten studies [S1], [S2], [S4], [S6]-[S8], [S10], and [S12]-[S14] did not provide training due to the participants' existing familiarity with ChatGPT.
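Groupings like these can be checked mechanically to confirm that every study falls into exactly one category. The sketch below parses the study labels as written above; `expand_ids` and `is_partition` are hypothetical helper names introduced only for this illustration.

```python
import re


def expand_ids(spec):
    """Expand a label string such as '[S6]-[S8], [S10]' into {6, 7, 8, 10}."""
    ids = set()
    # Each match is either a range '[Sa]-[Sb]' or a single label '[Sa]'.
    for start, end in re.findall(r"\[S(\d+)\](?:-\[S(\d+)\])?", spec):
        first = int(start)
        last = int(end) if end else first
        ids.update(range(first, last + 1))
    return ids


def is_partition(groups, universe):
    """True if the groups are disjoint and together cover the universe."""
    union = set().union(*groups)
    total = sum(len(group) for group in groups)
    return union == universe and total == len(universe)


ALL_STUDIES = set(range(1, 15))  # S1 to S14

# Study labels copied from the paragraph above.
familiarity = [
    expand_ids("[S2], [S4], [S6], [S8], [S12]"),              # already familiar
    expand_ids("[S1], [S3], [S5], [S7], [S9], [S10], [S13], [S14]"),  # mixed
    expand_ids("[S11]"),                                      # unfamiliar
]
training = [
    expand_ids("[S3], [S5], [S9], [S11]"),                    # training provided
    expand_ids("[S1], [S2], [S4], [S6]-[S8], [S10], [S12]-[S14]"),  # no training
]

print("familiarity groups partition S1-S14:", is_partition(familiarity, ALL_STUDIES))
print("training groups partition S1-S14:", is_partition(training, ALL_STUDIES))
```

The same check can be applied to any other grouping in this descriptive analysis to spot a study that is listed twice or left out.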

3.1.6 Research methodology approaches and source(s) of data

The reviewed studies adopted various research methodology approaches. Seven studies adopted qualitative research methodology [S1], [S4], [S6], [S8], [S10], [S11], and [S12], while three studies adopted quantitative research methodology [S3], [S7], and [S14], and four studies employed mixed methods, combining the strengths of qualitative and quantitative methods [S2], [S5], [S9], and [S13].

In terms of the source(s) of data, the reviewed studies obtained their data from various sources, such as interviews, questionnaires, and pre- and post-tests. Six studies relied on interviews as their primary source of data collection [S1], [S4], [S6], [S10], [S11], and [S12], four studies relied on questionnaires [S2], [S7], [S13], and [S14], two studies combined the use of pre- and post-tests and questionnaires for data collection [S3] and [S9], while two studies combined the use of questionnaires and interviews to obtain the data [S5] and [S8]. It is important to note that six of the reviewed studies were quasi-experimental [S3], [S5], [S8], [S9], [S12], and [S14], while the remaining ones were experimental studies [S1], [S2], [S4], [S6], [S7], [S10], [S11], and [S13]. Figures 6 and 7 illustrate the research methodologies and the source(s) of data used in the reviewed studies, respectively.
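The two classifications in this subsection can also be cross-tabulated to see which data sources accompany which methodology. The sketch below uses the study numbers as listed in the two paragraphs above; the cross-tabulation itself is derived here for illustration and is not reported in the reviewed article, and the helper name `invert` is an assumption.

```python
from collections import Counter

# Study numbers as listed in the text (S1 to S14).
methodology = {
    "qualitative": [1, 4, 6, 8, 10, 11, 12],
    "quantitative": [3, 7, 14],
    "mixed methods": [2, 5, 9, 13],
}
data_source = {
    "interviews": [1, 4, 6, 10, 11, 12],
    "questionnaires": [2, 7, 13, 14],
    "pre/post-tests + questionnaires": [3, 9],
    "questionnaires + interviews": [5, 8],
}


def invert(groups):
    """Map each study number to the label of the single group listing it."""
    label_of = {}
    for label, studies in groups.items():
        for study in studies:
            if study in label_of:
                raise ValueError(f"S{study} appears in two groups")
            label_of[study] = label
    return label_of


method_of = invert(methodology)
source_of = invert(data_source)

# Count studies for each (methodology, data source) combination.
crosstab = Counter((method_of[s], source_of[s]) for s in sorted(method_of))

for (method, source), count in sorted(crosstab.items()):
    print(f"{method:13s} | {source:32s} | {count}")
```

Derived this way, the qualitative studies that relied on interviews form the largest cell, with six of the 14 studies.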

figure 6

Research methodologies in the reviewed studies

figure 7

Source of data in the reviewed studies

3.1.7 The aim and objectives of the studies

The reviewed studies encompassed a diverse set of aims, with several of them incorporating multiple primary objectives. Six studies [S3], [S6], [S7], [S8], [S11], and [S12] examined the integration of ChatGPT in educational contexts, and four studies [S4], [S5], [S13], and [S14] investigated the various implications of its use in education, while three studies [S2], [S9], and [S10] aimed to explore both its integration and implications in education. Additionally, seven studies explicitly explored attitudes and perceptions of students [S2] and [S3], educators [S1] and [S6], or both [S10], [S12], and [S13] regarding the utilization of ChatGPT in educational settings.

3.2 Part 2: research questions and main findings of the reviewed studies

This part will present the answers to the research questions and the main findings of the reviewed studies, classified into two main categories (learning and teaching) according to AI Education classification by [ 36 ]. Figure  8 summarizes the main findings of the reviewed studies in a visually informative diagram. Table 4 provides a detailed list of the key information extracted from the selected studies that led to generating these themes.

figure 8

The main findings in the reviewed studies

4 Students' initial attempts at utilizing ChatGPT in learning and main findings from students' perspective

4.1 Virtual intelligent assistant

Nine studies demonstrated that ChatGPT has been utilized by students as an intelligent assistant to enhance and support their learning. Students employed it for various purposes, such as answering on-demand questions [S2]-[S5], [S8], [S10], and [S12], providing valuable information and learning resources [S2]-[S6] and [S8], as well as receiving immediate feedback [S2], [S4], [S9], [S10], and [S12]. In this regard, students generally were confident in the accuracy of ChatGPT's responses, considering them relevant, reliable, and detailed [S3], [S4], [S5], and [S8]. However, some students indicated the need for improvement, as they found that answers are not always accurate [S2], and that it may provide misleading information or responses that do not align with their expectations [S6] and [S10]. Students also observed that the accuracy of ChatGPT depends on several factors, including the quality and specificity of the user's input, the complexity of the question or topic, and the scope and relevance of its training data [S12]. Overall, many students felt that ChatGPT's answers were not always accurate, and most believed that working with it requires good background knowledge.

4.2 Writing and language proficiency assistant

Six of the reviewed studies highlighted that ChatGPT has been utilized by students as a valuable tool to improve their academic writing skills and language proficiency. Among these studies, three mainly focused on English education, demonstrating that students showed sufficient mastery in using ChatGPT for generating ideas, summarizing, paraphrasing texts, and composing essays [S8], [S11], and [S14]. Furthermore, ChatGPT supported their writing by making students active investigators rather than passive knowledge recipients, which facilitated the development of their writing skills [S11] and [S14]. Similarly, ChatGPT allowed students to generate unique ideas and perspectives, leading to deeper analysis and reflection on their journalism writing [S9]. In terms of language proficiency, ChatGPT allowed participants to translate content into their home languages, making it more accessible and relevant to their context [S4]. It also enabled them to request changes in linguistic tones or flavors [S8]. Moreover, participants used it to check grammar or as a dictionary [S11].

4.3 Valuable resource for learning approaches

Five studies demonstrated that students used ChatGPT as a valuable complementary resource for self-directed learning. It provided learning resources and guidance on diverse educational topics and created a supportive home learning environment [S2] and [S4]. Moreover, it offered step-by-step guidance to grasp concepts at their own pace and enhance their understanding [S5], streamlined task and project completion carried out independently [S7], provided comprehensive and easy-to-understand explanations on various subjects [S10], and assisted in studying geometry operations, thereby empowering them to explore geometry operations at their own pace [S12]. Three studies showed that students used ChatGPT as a valuable learning resource for personalized learning. It delivered age-appropriate conversations and tailored teaching based on a child's interests [S4], acted as a personalized learning assistant, adapted to their needs and pace, which assisted them in understanding mathematical concepts [S12], and enabled personalized learning experiences in social sciences by adapting to students' needs and learning styles [S13]. On the other hand, it is important to note that, according to one study [S5], students suggested that using ChatGPT may negatively affect collaborative learning competencies between students.

4.4 Enhancing students' competencies

Six of the reviewed studies have shown that ChatGPT is a valuable tool for improving a wide range of skills among students. Two studies have provided evidence that ChatGPT led to improvements in students' critical thinking, reasoning skills, and hazard recognition competencies by engaging them in interactive conversations or activities and providing responses related to their disciplines in journalism [S5] and construction education [S9]. Furthermore, two studies focused on mathematical education have shown the positive impact of ChatGPT on students' problem-solving abilities, both in working through problem-solving questions [S12] and in enhancing students' understanding of the problem-solving process [S5]. Lastly, one study indicated that ChatGPT effectively contributed to the enhancement of conversational social skills [S4].

4.5 Supporting students' academic success

Seven of the reviewed studies highlighted that students found ChatGPT to be beneficial for learning as it enhanced learning efficiency and improved the learning experience. It has been observed to improve students' efficiency in computer engineering studies by providing well-structured responses and good explanations [S2]. Additionally, students found it extremely useful for hazard reporting [S3], and it also enhanced their efficiency and capabilities in solving mathematics problems [S5] and [S12]. Furthermore, by finding information, generating ideas, translating texts, and providing alternative questions, ChatGPT aided students in deepening their understanding of various subjects [S6]. It contributed to an increase in students' overall productivity [S7] and improved efficiency in composing written tasks [S8]. Regarding learning experiences, ChatGPT was instrumental in assisting students in identifying hazards that they might have otherwise overlooked [S3]. It also improved students' learning experiences in solving mathematics problems and developing abilities [S5] and [S12]. Moreover, it increased students' successful completion of important tasks in their studies [S7], particularly writing tasks of average difficulty [S8]. Additionally, ChatGPT increased the chances of educational success by providing students with baseline knowledge on various topics [S10].

5 Teachers' initial attempts at utilizing ChatGPT in teaching and main findings from teachers' perspective

5.1 Valuable resource for teaching

The reviewed studies showed that teachers have employed ChatGPT to recommend, modify, and generate diverse, creative, organized, and engaging educational content, teaching materials, and testing resources more rapidly [S4], [S6], [S10] and [S11]. Additionally, teachers experienced increased productivity as ChatGPT facilitated quick and accurate responses to questions, fact-checking, and information searches [S1]. It also proved valuable in constructing new knowledge [S6] and providing timely answers to students' questions in classrooms [S11]. Moreover, ChatGPT enhanced teachers' efficiency by generating new ideas for activities and preplanning activities for their students [S4] and [S6], including serving as a partner in interactive language games [S11].

5.2 Improving productivity and efficiency

The reviewed studies showed that participants' productivity and work efficiency have been significantly enhanced by using ChatGPT, as it enabled them to allocate more time to other tasks and reduce their overall workloads [S6], [S10], [S11], [S13], and [S14]. However, three studies, [S1], [S4], and [S11], indicated a negative perception and attitude among teachers toward using ChatGPT. This negativity stemmed from a lack of necessary skills to use it effectively [S1], a limited familiarity with it [S4], and occasional inaccuracies in the content provided by it [S10].

5.3 Catalyzing new teaching methodologies

Five of the reviewed studies highlighted that educators saw a need to redefine their teaching profession with the assistance of ChatGPT [S11], develop new effective learning strategies [S4], and adapt teaching strategies and methodologies to ensure the development of essential skills for future engineers [S5]. They also emphasized the importance of adopting new educational philosophies and approaches that can evolve with the introduction of ChatGPT into the classroom [S12]. Furthermore, updating curricula to focus on improving human-specific features, such as emotional intelligence, creativity, and philosophical perspectives [S13], was found to be essential.

5.4 Effective utilization of ChatGPT in teaching

According to the reviewed studies, effective utilization of ChatGPT in education requires providing teachers with well-structured training, support, and adequate background on how to use ChatGPT responsibly [S1], [S3], [S11], and [S12]. Establishing clear rules and regulations regarding its usage is essential to ensure it positively impacts the teaching and learning processes, including students' skills [S1], [S4], [S5], [S8], [S9], and [S11]-[S14]. Moreover, conducting further research and engaging in discussions with policymakers and stakeholders are crucial for the successful integration of ChatGPT in education and for maximizing the benefits for both educators and students [S1], [S6]-[S10], and [S12]-[S14].

6 Discussion

The purpose of this review is to systematically examine empirical studies that have explored the utilization of ChatGPT, one of today's most advanced LLM-based chatbots, in education. The findings of the reviewed studies showed several ways in which ChatGPT has been utilized in different learning and teaching practices, and provided insights and considerations that can facilitate its effective and responsible use in future educational contexts. The results of the reviewed studies came from diverse fields of education, which helped us avoid a biased review that is limited to a specific field. Similarly, the reviewed studies have been conducted across different geographic regions. This variety in geographic representation enriched the findings of this review.

In response to RQ1, "What are students' and teachers' initial attempts at utilizing ChatGPT in education?", the findings from this review provide comprehensive insights. Chatbots, including ChatGPT, play a crucial role in supporting student learning, enhancing their learning experiences, and facilitating diverse learning approaches [ 42 , 43 ]. This review found that ChatGPT has been instrumental in enhancing students' learning experiences by serving as a virtual intelligent assistant, providing immediate feedback and on-demand answers, and engaging in educational conversations. Additionally, students have benefited from ChatGPT's ability to generate ideas, compose essays, and perform tasks like summarizing, translating, paraphrasing texts, or checking grammar, thereby enhancing their writing and language competencies. Furthermore, students have turned to ChatGPT for assistance in understanding concepts and homework, obtaining structured learning plans, and clarifying assignments and tasks, which fosters a supportive home learning environment and allows them to take responsibility for their own learning and cultivate the skills and approaches that such an environment requires [ 26 , 27 , 28 ]. This finding aligns with the studies of Saqr et al. [ 68 , 69 ], who highlighted that, when students actively engage in their own learning process, it yields additional advantages, such as heightened motivation, enhanced achievement, and the cultivation of enthusiasm, turning them into advocates for their own learning.

Moreover, students have utilized ChatGPT for tailored teaching and step-by-step guidance on diverse educational topics, streamlining task and project completion, and generating and recommending educational content. This personalization enhances the learning environment, leading to increased academic success. This finding aligns with other recent studies [ 26 , 27 , 28 , 60 , 66 ], which revealed that ChatGPT has the potential to offer personalized learning experiences and support an effective learning process by providing students with customized feedback and explanations tailored to their needs and abilities. Ultimately, this fosters students' performance, engagement, and motivation, leading to increased academic success [ 14 , 44 , 58 ]. This ultimate outcome is in line with the findings of Saqr et al. [ 68 , 69 ], which emphasized that learning strategies are important catalysts of students' learning, as students who utilize effective learning strategies are more likely to have better academic achievement.

Teachers, too, have capitalized on ChatGPT's capabilities to enhance productivity and efficiency, using it for creating lesson plans, generating quizzes, providing additional resources, generating and preplanning new ideas for activities, and aiding in answering students’ questions. This adoption of technology introduces new opportunities to support teaching and learning practices, enhancing teacher productivity. This finding aligns with those of Day [ 17 ], De Castro [ 18 ], and Su and Yang [ 74 ] as well as with those of Valtonen et al. [ 82 ], who revealed that emerging technological advancements have opened up novel opportunities and means to support teaching and learning practices, and enhance teachers’ productivity.

In response to RQ2, "What are the main findings derived from empirical studies that have incorporated ChatGPT into learning and teaching?", the findings from this review provide profound insights and raise significant concerns. Starting with the insights, chatbots, including ChatGPT, have demonstrated the potential to reshape and revolutionize education, creating novel opportunities for enhancing the learning process and outcomes [ 83 ], facilitating different learning approaches, and offering a range of pedagogical benefits [ 19 , 43 , 72 ]. In this context, this review found that ChatGPT could open avenues for educators to adopt or develop new effective learning and teaching strategies that can evolve with the introduction of ChatGPT into the classroom. Nonetheless, research has yet to establish a clear understanding of the potential impact of generative machine learning models within diverse educational settings [ 83 ]. Teachers therefore need to attain a high level of proficiency in incorporating chatbots, such as ChatGPT, into their classrooms to create inventive, well-structured, and captivating learning strategies. In the same vein, the review also found that teachers without the requisite skills to utilize ChatGPT realized that it did not contribute positively to their work and could potentially have adverse effects [ 37 ]. This concern could lead to inequity of access to the benefits of chatbots, including ChatGPT, as individuals who lack the necessary expertise may not be able to harness their full potential, resulting in disparities in educational outcomes and opportunities. Therefore, immediate action is needed to address these potential issues. A potential solution is offering training, support, and competency development for teachers to ensure that all of them can leverage chatbots, including ChatGPT, effectively and equitably in their educational practices [ 5 , 28 , 80 ], which could enhance accessibility and inclusivity, and potentially result in innovative outcomes [ 82 , 83 ].

Additionally, chatbots, including ChatGPT, have the potential to significantly impact students' thinking abilities, including retention, reasoning, and analysis skills [ 19 , 45 ], and to foster innovation and creativity capabilities [ 83 ]. This review found that ChatGPT could contribute to improving a wide range of skills among students. However, it also found that frequent use of ChatGPT may result in a decrease in innovative capacities, collaborative skills, cognitive capacities, and students' motivation to attend classes, and could lead to reduced higher-order thinking skills among students [ 22 , 29 ]. Therefore, immediate action is needed to carefully examine the long-term impact of chatbots, such as ChatGPT, on learning outcomes, as well as to explore their incorporation into educational settings as a supportive tool without compromising students' cognitive development and critical thinking abilities. In the same vein, the review also found that it is challenging to draw a consistent conclusion regarding the potential of ChatGPT to aid self-directed learning. This finding aligns with the recent study of Baskara [ 8 ]. Therefore, further research is needed to explore the potential of ChatGPT for self-directed learning. One potential solution involves utilizing learning analytics as a novel approach to examine various aspects of students' learning and support them in their individual endeavors [ 32 ]. This approach can bridge this gap by facilitating an in-depth analysis of how learners engage with ChatGPT, identifying trends in self-directed learning behavior, and assessing its influence on their outcomes.

Turning to the significant concerns, a fundamental challenge with LLM-based chatbots, including ChatGPT, is the accuracy and quality of the provided information and responses, as they may present false information as truth, a phenomenon often referred to as "hallucination" [ 3 , 49 ]. In this context, this review found that the provided information was not entirely satisfactory. Consequently, the utilization of chatbots presents potential concerns, such as generating and providing inaccurate or misleading information, especially for students who utilize them to support their learning. This finding aligns with other findings [ 6 , 30 , 35 , 40 ], which revealed that incorporating chatbots, such as ChatGPT, into education presents challenges related to accuracy and reliability, both because of their training on a large corpus of data, which may contain inaccuracies, and because of the way users formulate their questions to ChatGPT. Therefore, immediate action is needed to address these potential issues. One possible solution is to equip students with the necessary skills and competencies, which include a background understanding of how to use it effectively and the ability to assess and evaluate the information it generates, as the accuracy and the quality of the provided information depend on the input, its complexity, the topic, and the relevance of its training data [ 28 , 49 , 86 ]. However, it is also essential to examine how learners can be educated about how these models operate, the data used in their training, and how to recognize their limitations, challenges, and issues [ 79 ].

Furthermore, chatbots present a substantial challenge concerning academic integrity [ 20 , 56 ] and copyright violations [ 83 ], which are significant concerns in education. The review found that the potential misuse of ChatGPT might foster cheating, facilitate plagiarism, and threaten academic integrity. This issue is also affirmed by the research conducted by Basic et al. [ 7 ], who presented evidence that students who utilized ChatGPT in their writing assignments had more plagiarism cases than those who did not. These findings align with the conclusions drawn by Cotton et al. [ 13 ], Hisan and Amri [ 33 ] and Sullivan et al. [ 75 ], who revealed that the integration of chatbots such as ChatGPT into education poses a significant challenge to the preservation of academic integrity. Moreover, chatbots, including ChatGPT, have increased the difficulty of identifying plagiarism [ 47 , 67 , 76 ]. The findings from previous studies [ 1 , 84 ] indicate that AI-generated text often went undetected by plagiarism software, such as Turnitin. Turnitin and other similar detection tools, such as ZeroGPT, GPTZero, and Copyleaks, have since evolved, incorporating enhanced techniques to detect AI-generated text. However, false positives remain possible, and different studies have found that these tools are not yet ready to identify AI-generated text accurately and reliably [ 10 , 51 ]; novel detection methods may need to be created and implemented for AI-generated text detection [ 4 ]. This potential issue could lead to another concern, which is the difficulty of accurately evaluating student performance when students use the assistance of chatbots such as ChatGPT in their assignments. Consequently, most LLM-driven chatbots present a substantial challenge to traditional assessments [ 64 ]. The findings from previous studies indicate the importance of rethinking, improving, and redesigning innovative assessment methods in the era of chatbots [ 14 , 20 , 64 , 75 ]. These methods should prioritize the process of evaluating students' ability to apply knowledge to complex cases and demonstrate comprehension, rather than solely focusing on the final product for assessment. Therefore, immediate action is needed to address these potential issues. One possible solution would be the development of clear guidelines, regulatory policies, and pedagogical guidance. These measures would help regulate the proper and ethical utilization of chatbots, such as ChatGPT, and must be established before their introduction to students [ 35 , 38 , 39 , 41 , 89 ].

In summary, our review has delved into the utilization of ChatGPT, a prominent example of chatbots, in education, addressing the question of how ChatGPT has been utilized in education. However, there remain significant gaps, which necessitate further research to shed light on this area.

7 Conclusions

This systematic review has shed light on the varied initial attempts at incorporating ChatGPT into education by both learners and educators, while also offering insights and considerations that can facilitate its effective and responsible use in future educational contexts. From the analysis of 14 selected studies, the review revealed the double-edged impact of ChatGPT in educational settings. On the positive side, ChatGPT significantly aided the learning process in various ways. Learners have used it as a virtual intelligent assistant, benefiting from its ability to provide immediate feedback, on-demand answers, and easy access to educational resources. Additionally, it was clear that learners have used it to enhance their writing and language skills, engaging in practices such as generating ideas, composing essays, and performing tasks like summarizing, translating, paraphrasing texts, or checking grammar. Importantly, other learners have utilized it to support and facilitate their self-directed and personalized learning on a broad range of educational topics, with ChatGPT assisting in understanding concepts and homework, providing structured learning plans, and clarifying assignments and tasks. Educators, on the other hand, found ChatGPT beneficial for enhancing productivity and efficiency. They used it for creating lesson plans, generating quizzes, providing additional resources, and answering learners' questions, which saved time and allowed for more dynamic and engaging teaching strategies and methodologies.

However, the review also pointed out negative impacts. The results revealed that overuse of ChatGPT could decrease innovative capacities and collaborative learning among learners. Specifically, relying too much on ChatGPT for quick answers can inhibit learners' critical thinking and problem-solving skills. Learners might not engage deeply with the material or consider multiple solutions to a problem. This tendency was particularly evident in group projects, where learners preferred consulting ChatGPT individually for solutions over brainstorming and collaborating with peers, which negatively affected their teamwork abilities. On a broader level, integrating ChatGPT into education has also raised several concerns, including the potential for providing inaccurate or misleading information, issues of inequity in access, challenges related to academic integrity, and the possibility of misusing the technology.

Accordingly, this review emphasizes the urgency of developing clear rules, policies, and regulations to ensure ChatGPT's effective and responsible use in educational settings, alongside other chatbots, by both learners and educators. This requires providing well-structured training to educate them on responsible usage and understanding its limitations, along with offering sufficient background information. Moreover, it highlights the importance of rethinking, improving, and redesigning innovative teaching and assessment methods in the era of ChatGPT. Furthermore, conducting further research and engaging in discussions with policymakers and stakeholders are essential steps to maximize the benefits for both educators and learners and ensure academic integrity.

It is important to acknowledge that this review has certain limitations. Firstly, the limited number of reviewed studies can be attributed to several reasons, including the novelty of the technology, as new technologies often face initial skepticism and cautious adoption; the lack of clear guidelines or best practices for leveraging this technology for educational purposes; and institutional or governmental policies affecting the utilization of this technology in educational contexts. These factors, in turn, have affected the number of studies available for review. Secondly, the reviewed studies utilized the original version of ChatGPT, based on GPT-3 or GPT-3.5, which implies that new studies utilizing the updated version, GPT-4, may lead to different findings. Therefore, conducting follow-up systematic reviews is essential once more empirical studies on ChatGPT are published. Additionally, long-term studies are necessary to thoroughly examine and assess the impact of ChatGPT on various educational practices.

Despite these limitations, this systematic review has highlighted the transformative potential of ChatGPT in education, revealing its diverse utilization by learners and educators alike. It has summarized the benefits of incorporating ChatGPT into education, as well as the most critical concerns and challenges that must be addressed to facilitate its effective and responsible use in future educational contexts. This review could serve as an insightful resource for practitioners who seek to integrate ChatGPT into education and stimulate further research in the field.

Data availability

The data supporting our findings are available upon request.

Abbreviations

AI: Artificial intelligence

AIEd: AI in education

LLM: Large language model

ANN: Artificial neural networks

ChatGPT: Chat Generative Pre-Trained Transformer

RNN: Recurrent neural networks

LSTM: Long short-term memory

RLHF: Reinforcement learning from human feedback

NLP: Natural language processing

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

AlAfnan MA, Dishari S, Jovic M, Lomidze K. ChatGPT as an educational tool: opportunities, challenges, and recommendations for communication, business writing, and composition courses. J Artif Intell Technol. 2023. https://doi.org/10.37965/jait.2023.0184 .

Ali JKM, Shamsan MAA, Hezam TA, Mohammed AAQ. Impact of ChatGPT on learning motivation. J Engl Stud Arabia Felix. 2023;2(1):41–9. https://doi.org/10.56540/jesaf.v2i1.51 .

Alkaissi H, McFarlane SI. Artificial hallucinations in ChatGPT: implications in scientific writing. Cureus. 2023. https://doi.org/10.7759/cureus.35179 .

Anderson N, Belavý DL, Perle SM, Hendricks S, Hespanhol L, Verhagen E, Memon AR. AI did not write this manuscript, or did it? Can we trick the AI text detector into generated texts? The potential future of ChatGPT and AI in sports & exercise medicine manuscript generation. BMJ Open Sport Exerc Med. 2023;9(1): e001568. https://doi.org/10.1136/bmjsem-2023-001568 .

Ausat AMA, Massang B, Efendi M, Nofirman N, Riady Y. Can chat GPT replace the role of the teacher in the classroom: a fundamental analysis. J Educ. 2023;5(4):16100–6.

Baidoo-Anu D, Ansah L. Education in the Era of generative artificial intelligence (AI): understanding the potential benefits of ChatGPT in promoting teaching and learning. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4337484 .

Basic Z, Banovac A, Kruzic I, Jerkovic I. Better by you, better than me, chatgpt3 as writing assistance in students essays. 2023. arXiv preprint arXiv:2302.04536.

Baskara FR. The promises and pitfalls of using chat GPT for self-determined learning in higher education: an argumentative review. Prosiding Seminar Nasional Fakultas Tarbiyah dan Ilmu Keguruan IAIM Sinjai. 2023;2:95–101. https://doi.org/10.47435/sentikjar.v2i0.1825 .

Behera RK, Bala PK, Dhir A. The emerging role of cognitive computing in healthcare: a systematic literature review. Int J Med Inform. 2019;129:154–66. https://doi.org/10.1016/j.ijmedinf.2019.04.024 .

Chaka C. Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: the case of five AI content detection tools. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.2.12 .

Chiu TKF, Xia Q, Zhou X, Chai CS, Cheng M. Systematic literature review on opportunities, challenges, and future research recommendations of artificial intelligence in education. Comput Educ Artif Intell. 2023;4:100118. https://doi.org/10.1016/j.caeai.2022.100118 .

Choi EPH, Lee JJ, Ho M, Kwok JYY, Lok KYW. Chatting or cheating? The impacts of ChatGPT and other artificial intelligence language models on nurse education. Nurse Educ Today. 2023;125:105796. https://doi.org/10.1016/j.nedt.2023.105796 .

Cotton D, Cotton PA, Shipway JR. Chatting and cheating: ensuring academic integrity in the era of ChatGPT. Innov Educ Teach Int. 2023. https://doi.org/10.1080/14703297.2023.2190148 .

Crawford J, Cowling M, Allen K. Leadership is needed for ethical ChatGPT: Character, assessment, and learning using artificial intelligence (AI). J Univ Teach Learn Pract. 2023. https://doi.org/10.53761/1.20.3.02 .

Creswell JW. Educational research: planning, conducting, and evaluating quantitative and qualitative research [Ebook]. 4th ed. London: Pearson Education; 2015.

Curry D. ChatGPT Revenue and Usage Statistics (2023)—Business of Apps. 2023. https://www.businessofapps.com/data/chatgpt-statistics/

Day T. A preliminary investigation of fake peer-reviewed citations and references generated by ChatGPT. Prof Geogr. 2023. https://doi.org/10.1080/00330124.2023.2190373 .

De Castro CA. A Discussion about the Impact of ChatGPT in education: benefits and concerns. J Bus Theor Pract. 2023;11(2):p28. https://doi.org/10.22158/jbtp.v11n2p28 .

Deng X, Yu Z. A meta-analysis and systematic review of the effect of Chatbot technology use in sustainable education. Sustainability. 2023;15(4):2940. https://doi.org/10.3390/su15042940 .

Eke DO. ChatGPT and the rise of generative AI: threat to academic integrity? J Responsib Technol. 2023;13:100060. https://doi.org/10.1016/j.jrt.2023.100060 .

Elmoazen R, Saqr M, Tedre M, Hirsto L. A systematic literature review of empirical research on epistemic network analysis in education. IEEE Access. 2022;10:17330–48. https://doi.org/10.1109/access.2022.3149812 .

Farrokhnia M, Banihashem SK, Noroozi O, Wals AEJ. A SWOT analysis of ChatGPT: implications for educational practice and research. Innov Educ Teach Int. 2023. https://doi.org/10.1080/14703297.2023.2195846 .

Fergus S, Botha M, Ostovar M. Evaluating academic answers generated using ChatGPT. J Chem Educ. 2023;100(4):1672–5. https://doi.org/10.1021/acs.jchemed.3c00087 .

Fink A. Conducting research literature reviews: from the internet to paper. SAGE Publications, Incorporated; 2010.

Firaina R, Sulisworo D. Exploring the usage of ChatGPT in higher education: frequency and impact on productivity. Buletin Edukasi Indonesia (BEI). 2023;2(01):39–46. https://doi.org/10.56741/bei.v2i01.310 .

Firat M. How chat GPT can transform autodidactic experiences and open education. Department of Distance Education, Open Education Faculty, Anadolu Unive. 2023. https://orcid.org/0000-0001-8707-5918

Firat M. What ChatGPT means for universities: perceptions of scholars and students. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.22 .

Fuchs K. Exploring the opportunities and challenges of NLP models in higher education: is Chat GPT a blessing or a curse? Front Educ. 2023. https://doi.org/10.3389/feduc.2023.1166682 .

García-Peñalvo FJ. La percepción de la inteligencia artificial en contextos educativos tras el lanzamiento de ChatGPT: disrupción o pánico. Educ Knowl Soc. 2023;24: e31279. https://doi.org/10.14201/eks.31279 .

Gilson A, Safranek CW, Huang T, Socrates V, Chi L, Taylor A, Chartash D. How does ChatGPT perform on the United States medical Licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9: e45312. https://doi.org/10.2196/45312 .

Hashana AJ, Brundha P, Ayoobkhan MUA, Fazila S. Deep Learning in ChatGPT—A Survey. In: 2023 7th international conference on trends in electronics and informatics (ICOEI). IEEE; 2023. p. 1001–1005. https://doi.org/10.1109/icoei56765.2023.10125852

Hirsto L, Saqr M, López-Pernas S, Valtonen T. A systematic narrative review of learning analytics research in K-12 and schools. Proceedings. 2022. https://ceur-ws.org/Vol-3383/FLAIEC22_paper_9536.pdf

Hisan UK, Amri MM. ChatGPT and medical education: a double-edged sword. J Pedag Educ Sci. 2023;2(01):71–89. https://doi.org/10.13140/RG.2.2.31280.23043/1 .

Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023. https://doi.org/10.1093/jncics/pkad010 .

Househ M, AlSaad R, Alhuwail D, Ahmed A, Healy MG, Latifi S, Sheikh J. Large Language models in medical education: opportunities, challenges, and future directions. JMIR Med Educ. 2023;9: e48291. https://doi.org/10.2196/48291 .

Ilkka T. The impact of artificial intelligence on learning, teaching, and education. Minist de Educ. 2018. https://doi.org/10.2760/12297 .

Iqbal N, Ahmed H, Azhar KA. Exploring teachers’ attitudes towards using CHATGPT. Glob J Manag Adm Sci. 2022;3(4):97–111. https://doi.org/10.46568/gjmas.v3i4.163 .

Irfan M, Murray L, Ali S. Integration of Artificial intelligence in academia: a case study of critical teaching and learning in Higher education. Glob Soc Sci Rev. 2023;8(1):352–64. https://doi.org/10.31703/gssr.2023(viii-i).32 .

Jeon JH, Lee S. Large language models in education: a focus on the complementary relationship between human teachers and ChatGPT. Educ Inf Technol. 2023. https://doi.org/10.1007/s10639-023-11834-1 .

Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT—Reshaping medical education and clinical management. Pak J Med Sci. 2023. https://doi.org/10.12669/pjms.39.2.7653 .

King MR. A conversation on artificial intelligence, Chatbots, and plagiarism in higher education. Cell Mol Bioeng. 2023;16(1):1–2. https://doi.org/10.1007/s12195-022-00754-8 .

Kooli C. Chatbots in education and research: a critical examination of ethical implications and solutions. Sustainability. 2023;15(7):5614. https://doi.org/10.3390/su15075614 .

Kuhail MA, Alturki N, Alramlawi S, Alhejori K. Interacting with educational chatbots: a systematic review. Educ Inf Technol. 2022;28(1):973–1018. https://doi.org/10.1007/s10639-022-11177-3 .

Lee H. The rise of ChatGPT: exploring its potential in medical education. Anat Sci Educ. 2023. https://doi.org/10.1002/ase.2270 .

Li L, Subbareddy R, Raghavendra CG. AI intelligence Chatbot to improve students learning in the higher education platform. J Interconnect Netw. 2022. https://doi.org/10.1142/s0219265921430325 .

Limna P. A Review of Artificial Intelligence (AI) in Education during the Digital Era. 2022. https://ssrn.com/abstract=4160798

Lo CK. What is the impact of ChatGPT on education? A rapid review of the literature. Educ Sci. 2023;13(4):410. https://doi.org/10.3390/educsci13040410 .

Luo W, He H, Liu J, Berson IR, Berson MJ, Zhou Y, Li H. Aladdin’s genie or pandora’s box For early childhood education? Experts chat on the roles, challenges, and developments of ChatGPT. Early Educ Dev. 2023. https://doi.org/10.1080/10409289.2023.2214181 .

Meyer JG, Urbanowicz RJ, Martin P, O’Connor K, Li R, Peng P, Moore JH. ChatGPT and large language models in academia: opportunities and challenges. Biodata Min. 2023. https://doi.org/10.1186/s13040-023-00339-9 .

Mhlanga D. Open AI in education, the responsible and ethical use of ChatGPT towards lifelong learning. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4354422 .

Neumann M, Rauschenberger M, Schön EM. “We need to talk about ChatGPT”: the future of AI and higher education. 2023. https://doi.org/10.1109/seeng59157.2023.00010 .

Nolan B. Here are the schools and colleges that have banned the use of ChatGPT over plagiarism and misinformation fears. Business Insider. 2023. https://www.businessinsider.com

O’Leary DE. An analysis of three chatbots: BlenderBot, ChatGPT and LaMDA. Int J Intell Syst Account, Financ Manag. 2023;30(1):41–54. https://doi.org/10.1002/isaf.1531 .

Okoli C. A guide to conducting a standalone systematic literature review. Commun Assoc Inf Syst. 2015. https://doi.org/10.17705/1cais.03743 .

OpenAI. (2023). https://openai.com/blog/chatgpt

Perkins M. Academic integrity considerations of AI large language models in the post-pandemic era: ChatGPT and beyond. J Univ Teach Learn Pract. 2023. https://doi.org/10.53761/1.20.02.07 .

Plevris V, Papazafeiropoulos G, Rios AJ. Chatbots put to the test in math and logic problems: a preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard. arXiv (Cornell University). 2023. https://doi.org/10.48550/arxiv.2305.18618

Rahman MM, Watanobe Y. ChatGPT for education and research: opportunities, threats, and strategies. Appl Sci. 2023;13(9):5783. https://doi.org/10.3390/app13095783 .

Ram B, Verma P. Artificial intelligence AI-based Chatbot study of ChatGPT, google AI bard and baidu AI. World J Adv Eng Technol Sci. 2023;8(1):258–61. https://doi.org/10.30574/wjaets.2023.8.1.0045 .

Rasul T, Nair S, Kalendra D, Robin M, de Oliveira Santini F, Ladeira WJ, Heathcote L. The role of ChatGPT in higher education: benefits, challenges, and future research directions. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.29 .

Ratnam M, Sharm B, Tomer A. ChatGPT: educational artificial intelligence. Int J Adv Trends Comput Sci Eng. 2023;12(2):84–91. https://doi.org/10.30534/ijatcse/2023/091222023 .

Ray PP. ChatGPT: a comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet Things Cyber-Phys Syst. 2023;3:121–54. https://doi.org/10.1016/j.iotcps.2023.04.003 .

Roumeliotis KI, Tselikas ND. ChatGPT and Open-AI models: a preliminary review. Future Internet. 2023;15(6):192. https://doi.org/10.3390/fi15060192 .

Rudolph J, Tan S, Tan S. War of the chatbots: Bard, Bing Chat, ChatGPT, Ernie and beyond. The new AI gold rush and its impact on higher education. J Appl Learn Teach. 2023. https://doi.org/10.37074/jalt.2023.6.1.23 .

Ruiz LMS, Moll-López S, Nuñez-Pérez A, Moraño J, Vega-Fleitas E. ChatGPT challenges blended learning methodologies in engineering education: a case study in mathematics. Appl Sci. 2023;13(10):6039. https://doi.org/10.3390/app13106039 .

Sallam M, Salim NA, Barakat M, Al-Tammemi AB. ChatGPT applications in medical, dental, pharmacy, and public health education: a descriptive study highlighting the advantages and limitations. Narra J. 2023;3(1): e103. https://doi.org/10.52225/narra.v3i1.103 .

Salvagno M, Taccone FS, Gerli AG. Can artificial intelligence help for scientific writing? Crit Care. 2023. https://doi.org/10.1186/s13054-023-04380-2 .

Saqr M, López-Pernas S, Helske S, Hrastinski S. The longitudinal association between engagement and achievement varies by time, students’ profiles, and achievement state: a full program study. Comput Educ. 2023;199:104787. https://doi.org/10.1016/j.compedu.2023.104787 .

Saqr M, Matcha W, Uzir N, Jovanović J, Gašević D, López-Pernas S. Transferring effective learning strategies across learning contexts matters: a study in problem-based learning. Australas J Educ Technol. 2023;39(3):9.

Schöbel S, Schmitt A, Benner D, Saqr M, Janson A, Leimeister JM. Charting the evolution and future of conversational agents: a research agenda along five waves and new frontiers. Inf Syst Front. 2023. https://doi.org/10.1007/s10796-023-10375-9 .

Shoufan A. Exploring students’ perceptions of CHATGPT: thematic analysis and follow-up survey. IEEE Access. 2023. https://doi.org/10.1109/access.2023.3268224 .

Sonderegger S, Seufert S. Chatbot-mediated learning: conceptual framework for the design of chatbot use cases in education. St. Gallen: Institute for Educational Management and Technologies, University of St. Gallen; 2022. https://doi.org/10.5220/0010999200003182 .

Strzelecki A. To use or not to use ChatGPT in higher education? A study of students’ acceptance and use of technology. Interact Learn Environ. 2023. https://doi.org/10.1080/10494820.2023.2209881 .

Su J, Yang W. Unlocking the power of ChatGPT: a framework for applying generative AI in education. ECNU Rev Educ. 2023. https://doi.org/10.1177/20965311231168423 .

Sullivan M, Kelly A, McLaughlan P. ChatGPT in higher education: considerations for academic integrity and student learning. J Appl Learn Teach. 2023;6(1):1–10. https://doi.org/10.37074/jalt.2023.6.1.17 .

Szabo A. ChatGPT is a breakthrough in science and education but fails a test in sports and exercise psychology. Balt J Sport Health Sci. 2023;1(128):25–40. https://doi.org/10.33607/bjshs.v127i4.1233 .

Taecharungroj V. “What can ChatGPT do?” analyzing early reactions to the innovative AI chatbot on Twitter. Big Data Cognit Comput. 2023;7(1):35. https://doi.org/10.3390/bdcc7010035 .

Tam S, Said RB. User preferences for ChatGPT-powered conversational interfaces versus traditional methods. Biomed Eng Soc. 2023. https://doi.org/10.58496/mjcsc/2023/004 .

Tedre M, Kahila J, Vartiainen H. Exploration on how co-designing with AI facilitates critical evaluation of ethics of AI in craft education. In: Langran E, Christensen P, Sanson J, editors. Proceedings of Society for Information Technology and Teacher Education International Conference. 2023. pp. 2289–2296.

Tlili A, Shehata B, Adarkwah MA, Bozkurt A, Hickey DT, Huang R, Agyemang B. What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education. Smart Learn Environ. 2023. https://doi.org/10.1186/s40561-023-00237-x .

Uddin SMJ, Albert A, Ovid A, Alsharef A. Leveraging CHATGPT to aid construction hazard recognition and support safety education and training. Sustainability. 2023;15(9):7121. https://doi.org/10.3390/su15097121 .

Valtonen T, López-Pernas S, Saqr M, Vartiainen H, Sointu E, Tedre M. The nature and building blocks of educational technology research. Comput Hum Behav. 2022;128:107123. https://doi.org/10.1016/j.chb.2021.107123 .

Vartiainen H, Tedre M. Using artificial intelligence in craft education: crafting with text-to-image generative models. Digit Creat. 2023;34(1):1–21. https://doi.org/10.1080/14626268.2023.2174557 .

Ventayen RJM. OpenAI ChatGPT generated results: similarity index of artificial intelligence-based contents. Soc Sci Res Netw. 2023. https://doi.org/10.2139/ssrn.4332664 .

Wagner MW, Ertl-Wagner BB. Accuracy of information and references using ChatGPT-3 for retrieval of clinical radiological information. Can Assoc Radiol J. 2023. https://doi.org/10.1177/08465371231171125 .

Wardat Y, Tashtoush MA, AlAli R, Jarrah AM. ChatGPT: a revolutionary tool for teaching and learning mathematics. Eurasia J Math, Sci Technol Educ. 2023;19(7):em2286. https://doi.org/10.29333/ejmste/13272 .

Webster J, Watson RT. Analyzing the past to prepare for the future: writing a literature review. Manag Inf Syst Quart. 2002;26(2):3.

Xiao Y, Watson ME. Guidance on conducting a systematic literature review. J Plan Educ Res. 2017;39(1):93–112. https://doi.org/10.1177/0739456x17723971 .

Yan D. Impact of ChatGPT on learners in a L2 writing practicum: an exploratory investigation. Educ Inf Technol. 2023. https://doi.org/10.1007/s10639-023-11742-4 .

Yu H. Reflection on whether Chat GPT should be banned by academia from the perspective of education and teaching. Front Psychol. 2023;14:1181712. https://doi.org/10.3389/fpsyg.2023.1181712 .

Zhu C, Sun M, Luo J, Li T, Wang M. How to harness the potential of ChatGPT in education? Knowl Manag ELearn. 2023;15(2):133–52. https://doi.org/10.34105/j.kmel.2023.15.008 .

The paper is co-funded by the Academy of Finland (Suomen Akatemia) Research Council for Natural Sciences and Engineering for the project Towards precision education: Idiographic learning analytics (TOPEILA), Decision Number 350560.

Author information

Authors and affiliations.

School of Computing, University of Eastern Finland, 80100, Joensuu, Finland

Yazid Albadarin, Mohammed Saqr, Nicolas Pope & Markku Tukiainen

Contributions

YA contributed to the literature search, data analysis, discussion, and conclusion, and to the writing, editing, and finalization of the manuscript. MS contributed to the study's design, conceptualization, acquisition of funding, project administration, allocation of resources, supervision, validation, literature search, and analysis of results, and to writing, revising, and approving the final manuscript. NP contributed to the results and discussion, provided supervision, and contributed to writing, revising, and approving the final manuscript. MT contributed to the study's conceptualization, resource management, and supervision, and to writing, revising, and approving the manuscript.

Corresponding author

Correspondence to Yazid Albadarin.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Table  4

The data presented in Table 4 were synthesized by first identifying relevant studies through a search of databases (ERIC, Scopus, Web of Knowledge, Dimensions.ai, and lens.org) using the keywords "ChatGPT" and "education". Inclusion/exclusion criteria were then applied, and data were extracted using Creswell's [15] coding techniques to capture key information and identify common themes across the included studies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Albadarin, Y., Saqr, M., Pope, N. et al. A systematic literature review of empirical research on ChatGPT in education. Discov Educ 3, 60 (2024). https://doi.org/10.1007/s44217-024-00138-2

Received: 22 October 2023

Accepted: 10 May 2024

Published: 26 May 2024

DOI: https://doi.org/10.1007/s44217-024-00138-2

  • Large language models
  • Educational technology
  • Systematic review

INNOVATIONS in pharmacy

Vol. 15 No. 2 (2024)

Copyright (c) 2024 Patrick Gallegos, Muhammad Salaar Riaz, Michael Peeters

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License .

Copyright of content published in INNOVATIONS in pharmacy  belongs to the author(s).

Leadership and Followership in Health Professions: A Systematic Review

Patrick Gallegos

Cleveland Clinic Akron General

Muhammad Salaar Riaz

Nassau University Medical Center

Michael Peeters

University of Toledo

DOI: https://doi.org/10.24926/iip.v15i2.5987

Keywords: Leadership, Followership, Health Professions

Objective: Discussion of leadership, including leadership development programs, is common. However, followership as a component of leadership seems to be discussed less frequently. With a focus on leadership and followership, this investigation reviewed the health-professions education literature and characterized leadership-followership within health-professions education.

Methods: Using PubMed, ERIC, and Google Scholar, two investigators independently and systematically searched the health-professions education literature for articles related to leadership and followership. Articles were categorized by type, application, profession, and leadership and followership qualities.

Results: Eighty-one articles were included. More than half (48/81, 59%) were theoretical, 27% (22/81) empirical, 7% (6/81) commentaries, and 6% (5/81) letters-to-the-editor. Empirical studies did not share outcomes that could be meaningfully combined quantitatively by meta-analysis; however, the vast majority (96%) of theoretical articles discussed a healthcare-related application of leadership and followership (e.g., improving patient care, improving communication, improving organizational efficiency). Thus, a qualitative review was completed. Of the 81 articles, 57% (n=46) involved multiple professions, while 43% (n=35) focused on a specific profession [Nursing (n=16), Medicine (n=7), Others (n=5), Surgery (n=3), Pharmacy (n=2), Veterinary Medicine (n=2)]. While most articles (75%) discussed leadership qualities (with top qualities of effective communication, being visionary, and delegating tasks), fewer (57%) discussed followership qualities (with top qualities of being responsible, committed, and supportive). Of note, some qualities overlapped in both leadership and followership (with top qualities of effective communication, being supportive, and providing/receiving feedback).

Conclusions: Leadership-followership was described in the education literature of many health professions. However, Pharmacy and Veterinary Medicine had substantially fewer articles published on this topic. Notably, followership did not receive nearly as much attention as leadership. Leadership has a dynamic and complex interaction with followership, highlighting that an effective leader must know how to be an effective follower and vice versa. To improve leadership within healthcare teamwork, education should focus on both leadership and followership.

Author Biographies

Muhammad Salaar Riaz, Nassau University Medical Center

Internal Medicine Resident

Michael Peeters, University of Toledo

Director of Interprofessional Education

The copyright of these individual works published by the University of Minnesota Libraries Publishing remains with the original creator or editorial team. For uses beyond those covered by law or the Creative Commons license, permission to reuse should be sought directly from the copyright owner listed on each article.

COMMENTS

  1. PDF Literature Review: Differentiation in Education Chantel Bushie Abstract

    The purpose of this literature review is to explore the nature of differentiated instruction in education. Through the duration of the graduate course Interpreting Educational Research, I ... differentiated instruction in education is discussed, and a suggestion for further research is noted. Education is an integral component in my life. As a ...

  2. Differentiation in education: a configurative review

    The same goes for differentiation. This review finds that the concept of differentiation is used in different ways and within different educational contexts, and this supports the results of other recent literature reviews on differentiation (Bondi et al., 2019; Graham et al., 2021). At the same time, our endeavour to give an ...

  3. [PDF] Literature Review: Differentiation in Education ...

    The purpose of this literature review is to explore the nature of differentiated instruction in education. Through the duration of the graduate course Interpreting Educational Research, I extensively researched the topic of differentiated instruction. My belief is that differentiated instruction is an expected approach to teaching and learning, because teachers offer choice of authentic and ...

  4. How Does Changing "One-Size-Fits-All" to Differentiated Instruction

    This rigorous literature review analyzed how 28 U.S.-based research studies conducted between 2001 and 2015 have defined, described, and measured changes in teaching practices related to implementation of Differentiated Instruction (DI) in P-12 classrooms.

  5. Meeting the Needs and Potentials of High-Ability, High-Performing, and

    In the field of education, differentiation has broadly been defined as an educator's strategic use of different approaches that accommodate student diversity in ways that will maximize learning, ... They differ from a literature review in that they follow a set of guidelines that make the process more systematic, transparent, and reproducible

  6. Differentiated instruction in secondary education: A systematic review

    Differentiated instruction is a pedagogical-didactical approach that provides teachers with a starting point for meeting students' diverse learning needs. Although differentiated instruction has gained a lot of attention in practice and research, not much is known about the status of the empirical evidence and its benefits for enhancing student achievement in secondary education. The current ...

  7. Five different ways of conceptualizing differentiation in education

    Introduction. Differentiation is an ambiguous term in education. On the one hand, it is associated with pedagogical and didactic coping strategies when confronted with diverse classrooms and mixed-ability environments (cf. e.g. Tomlinson, 2010). Differentiation, in this strand of research, is an expected ingredient in the educator's set of skills and competences.

  8. Differentiated Instruction in Secondary Education: A Systematic Review

    Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic achievement is evaluated and summarized. ... The bulk of studies in secondary education focus on differentiation of students between classes by means of ...

  9. Differentiating Instruction in Response to Student Readiness, Interest

    This review of literature examines a need for "differentiated" or academically responsive instruction. It provides support in theory and research for differentiating instruction based on a model of addressing student readiness, interest, and learning profile for a broad range of learners in mixed-ability classroom settings.

  10. Utilization of differentiated instruction in K-12 classrooms: a

    Differentiated instruction (DI) is a beneficial approach to addressing students' diverse learning needs, abilities, and interests to ensure that each student has the opportunity to make academic progress. To answer the question of how teachers utilize DI in K-12 classrooms, this systematic review was based on 61 empirical studies on DI published between 2000 and 2022. It examined the current ...

  11. Differentiated teaching practices of Australian mainstream classroom

    A grey literature review via Google Scholar was also completed, using the same combinations of words used for the database literature search. ... Some teachers used differentiation, special education, and learning support interchangeably, while others indicated mixed feelings about differentiation due to a perceived alignment to high academic ...

  12. Barriers in Differentiated Instruction: a Systematic Review of The

    Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic ...

  13. PDF Personalised and Differentiated Learning: a systematic literature review

    Personalisation and differentiation are rooted within inclusive education philosophy that argues that diversity is to be found in any group of learners, and therefore, educators should adjust their instruction accordingly (Lindner, Alnahdi, Wahl, & Schwab, 2019). The same holds true for adult education.

  14. Differentiation in education: a configurative review

    Conclusion. Differentiation is a multi-faceted and contextual concept that is not easy to constrain or study. This configurative review has aimed at investigating how the concept of differentiation has been conceptualized in recent empirical and literature reviews.

  15. Addressing the Needs of Diverse Learners Through Differentiated

    philosophy and practice of differentiated instruction as a way to meet the needs of diverse learners in heterogeneous classrooms. The literature review ends with a summary of the literature on change processes in education, recommendations for further research in the field, and a proposal for this study. Impact of Inclusion History of inclusion

  16. Differentiated Instruction in Secondary Education: A Systematic Review

    Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic ...

  17. Differentiated Instruction in Secondary Education: A Systematic Review

    Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic achievement is evaluated and summarized. After a rigorous search and selection process, only 14 papers about 12 unique empirical studies on the topic were ...

  18. Differentiation and individualisation in inclusive education: a

    This systematic literature review could focus on only a small amount of literature on inclusive teaching practices in self-proclaimed inclusive education settings. Asserting that the educational setting was within the scope of inclusion by studies and authors is one of the major limitations within the narrative synthesis.

  19. Do we have to rethink inclusive pedagogies for secondary schools? A

    This systematic literature review 'speaks to' and builds on two previous literature reviews: one on inclusive practices (Finkelstein et al., 2021) and one on individualisation and differentiation (Lindner & Schwab, 2020). These reviews focus largely on mapping instructional and organisational practices that are conducted 'in the name ...

  20. (PDF) Assessing the Effectiveness of Differentiated Instruction

    Then, by means of a systematic review of the literature from 2006 to 2016, empirical evidence on the effects of within-class differentiated instruction for secondary school students' academic ...

  21. A systematic literature review of empirical research on ChatGPT in

    Over the last four decades, studies have investigated the incorporation of Artificial Intelligence (AI) into education. A recent prominent AI-powered technology that has impacted the education sector is ChatGPT. This article provides a systematic review of 14 empirical studies incorporating ChatGPT into various educational settings, published in 2022 and before the 10th of April 2023—the ...

  22. PDF Educational strategies that can reduce child labour in India: A

    Sudeshna Maitra for their support in literature review and analysis. We are further grateful to the tremendous support and technical assistance of UNICEF India colleagues in New Delhi, Lucknow and Patna offices, and to all UNICEF colleagues who provided review and valuable feedback on earlier drafts of this working paper.

  23. Impact of the Newspaper in Education Program and Parental Mediation on

    The need for media literacy education is being increasingly emphasized in the current media environment, where large amounts of information are being easily and widely dissipated owing to the development of digital technology (Bai, 2014; Mun & Lee, 2015).Scholars in the education and communication fields are attempting to find systematic and effective ways to introduce media education to ...

  24. Leadership and Followership in Health Professions: A Systematic Review

    Abstract. Objective: Leadership discussion, including leadership development programs, is common. However, discussion of followership as a component of leadership seems less frequently discussed. With a focus on leadership and followership, this investigation reviewed the health-professions education literature and characterized leadership-followership within health-professions education.

  25. (PDF) Differentiation in education: a configurative review

    Differentiation in education: a configurative review. Ingunn Eikeland and Stein Erik Ohna. Department of Education and Sports Science, University of Stavanger, Stavanger, Norway. ABSTRACT ...

  26. Texas Education Agency Unveils Newly Developed Texas Open Education

    AUSTIN, TX - May 29, 2024 - The Texas Education Agency (TEA) today announced the availability of the Texas Open Education Resources (OER) textbooks, to begin a public feedback process. House Bill (HB) 1605 (88th Regular Session) directed the TEA to develop a set of state-owned instructional materials, Texas OER textbooks, to support ...