• Open access
  • Published: 17 February 2022

Effectiveness of problem-based learning methodology in undergraduate medical education: a scoping review

  • Joan Carles Trullàs   ORCID: orcid.org/0000-0002-7380-3475 1 , 2 , 3 ,
  • Carles Blay   ORCID: orcid.org/0000-0003-3962-5887 1 , 4 ,
  • Elisabet Sarri   ORCID: orcid.org/0000-0002-2435-399X 3 &
  • Ramon Pujol   ORCID: orcid.org/0000-0003-2527-385X 1  

BMC Medical Education volume 22, Article number: 104 (2022)


Problem-based learning (PBL) is a pedagogical approach that shifts the leading role from the teacher to the student (student-centered) and is based on self-directed learning. Although PBL has been adopted in undergraduate and postgraduate medical education, the effectiveness of the method is still under discussion. The authors’ purpose was to appraise available international evidence concerning the effectiveness and usefulness of PBL methodology in undergraduate medical teaching programs.

The authors applied the Arksey and O’Malley framework to undertake a scoping review. The search was carried out in February 2021 in PubMed and Web of Science including all publications in English and Spanish with no limits on publication date, study design or country of origin.

The literature search identified one hundred and twenty-four publications eligible for this review. Although this review included many studies, their design was heterogeneous and only a few used a methodology providing a high level of scientific evidence (randomized design and/or systematic reviews with meta-analysis). Furthermore, most were single-center experiences with small sample sizes and there were no large multi-center studies. PBL methodology obtained a high level of satisfaction, especially among students. It was more effective than more traditional (or lecture-based) methods at improving social and communication skills, problem-solving and self-learning skills. Knowledge retention and academic performance were no worse (and in many studies were better) than with traditional methods. PBL was not universally widespread, probably because it requires greater human resources and continuous training for its implementation.

PBL is an effective and satisfactory methodology for medical education. It is likely that through PBL medical students will not only acquire knowledge but also other competencies that are needed in medical professionalism.


There has always been enormous interest in identifying the best learning methods. In the mid-twentieth century, US educator Edgar Dale proposed which actions would lead to deeper learning than others and published the well-known (and at the same time controversial) “Cone of Experience”, also known as “Dale’s Cone”. At the apex of the cone are oral representations (verbal descriptions, written descriptions, etc.) and at the base is direct experience (based on a person carrying out the activity that they aim to learn), which represents the greatest depth of learning. In other words, each level of the cone corresponds to various learning methods. At the base are the most effective, participative methods (what we do and what we say) and at the apex are the least effective, abstract methods (what we read and what we hear) [ 1 ]. In 1990, the physician and medical educator George Miller proposed a pyramid framework to assess clinical competence. At the lowest level of the pyramid is knowledge (knows), followed by competence (knows how), performance (shows how) and finally action (does) [ 2 ]. Both Miller’s pyramid and Dale’s cone propose a very efficient way of training and, at the same time, of evaluation. Miller suggested that the learning curve passes through various levels, from the acquisition of theoretical knowledge to knowing how to put this knowledge into practice and demonstrate it. Dale stated that to remember a high percentage of the acquired knowledge, a theatrical representation should be carried out or real experiences should be simulated. It is difficult to situate methodologies such as problem-based learning (PBL), case-based learning (CBL) and team-based learning (TBL) in the context of these learning frameworks.

In the last 50 years, various university education models have emerged and have attempted to reconcile teaching with learning, according to the principle that students should lead their own learning process. Perhaps one of the most successful models is PBL that came out of the English-speaking environment. There are many descriptions of PBL in the literature, but in practice there is great variability in what people understand by this methodology. The original conception of PBL as an educational strategy in medicine was initiated at McMaster University (Canada) in 1969, leaving aside the traditional methodology (which is often based on lectures) and introducing student-centered learning. The new formulation of medical education proposed by McMaster did not separate the basic sciences from the clinical sciences, and partially abandoned theoretical classes, which were taught after the presentation of the problem. In its original version, PBL is a methodology in which the starting point is a problem or a problematic situation. The situation enables students to develop a hypothesis and identify learning needs so that they can better understand the problem and meet the established learning objectives [ 3 , 4 ]. PBL is taught using small groups (usually around 8–10 students) with a tutor. The aim of the group sessions is to identify a problem or scenario, define the key concepts identified, brainstorm ideas and discuss key learning objectives, research these and share this information with each other at subsequent sessions. Tutors are used to guide students, so they stay on track with the learning objectives of the task. Contemporary medical education also employs other small group learning methods including CBL and TBL. Characteristics common to the pedagogy of both CBL and TBL include the use of an authentic clinical case, active small-group learning, activation of existing knowledge and application of newly acquired knowledge. 
In CBL students are encouraged to engage in peer learning and apply new knowledge to these authentic clinical problems under the guidance of a facilitator. CBL encourages a structured and critical approach to clinical problem-solving, and, in contrast to PBL, is designed to allow the facilitator to correct and redirect students [ 5 ]. On the other hand, TBL offers a student-centered, instructional approach for large classes of students who are divided into small teams of typically five to seven students to solve clinically relevant problems. The overall similarities between PBL and TBL relate to the use of professionally relevant problems and small group learning, while the main difference relates to one teacher facilitating interactions between multiple self-managed teams in TBL, whereas each small group in PBL is facilitated by one teacher. Further differences are related to mandatory pre-reading assignments in TBL, testing of prior knowledge in TBL and activating prior knowledge in PBL, teacher-initiated clarifying of concepts that students struggled with in TBL versus students-generated issues that need further study in PBL, inter-team discussions in TBL and structured feedback and problems with related questions in TBL [ 6 ].

In the present study we have focused on PBL methodology, and, as attractive as the method may seem, we should consider whether it is really useful and effective as a learning method. Although PBL has been adopted in undergraduate and postgraduate medical education, the effectiveness (in terms of academic performance and/or skill improvement) of the method is still under discussion. This is due partly to the methodological difficulty in comparing PBL with traditional curricula based on lectures. To our knowledge, there is no systematic scoping review in the literature that has analyzed these aspects.

The main motivation for carrying out this research and writing this article was scientific but also professional interest. Because this methodology was already underway in our young Faculty of Medicine, we believed that reviewing its state of the art could tell us whether we were on the right track and whether we should implement changes in the training of future doctors.

The primary goal of this study was to appraise available international evidence concerning the effectiveness and usefulness of PBL methodology in undergraduate medical teaching programs. As the intention was to synthesize the scattered evidence available, we chose to conduct a scoping review. A scoping study tends to address broader topics where many different study designs might be applicable. Scoping studies may be particularly relevant to disciplines, such as medical education, in which the paucity of randomized controlled trials makes it difficult for researchers to undertake systematic reviews [ 7 , 8 ]. Even though the scoping review methodology is not widely used in medical education, it is well established for synthesizing heterogeneous research evidence [ 9 ].

The specific aims were: 1) to determine the effectiveness of PBL in academic performance (learning and retention of knowledge) in medical education; 2) to determine the effectiveness of PBL in other skills (social and communication skills, problem solving or self-learning) in medical education; 3) to determine the level of satisfaction perceived by medical students when they are taught with the PBL methodology (and/or by tutors when they teach with it).

This review was guided by Arksey and O’Malley’s methodological framework for conducting scoping reviews. The five main stages of the framework are: (1) identifying the research question; (2) ascertaining relevant studies; (3) determining study selection; (4) charting the data; and (5) collating, summarizing and reporting the results [ 7 ]. We reported our process according to the PRISMA Extension for Scoping Reviews [ 10 ].

Stage 1: Identifying the research question

With the goals of the study established, the four members of the research team established the research questions. The primary research question was “What is the effectiveness of PBL methodology for learning in undergraduate medicine?” and the secondary question “What is the perception and satisfaction of medical students and tutors in relation to PBL methodology?”.

Stage 2: Identifying relevant studies

After the research questions and a search strategy were defined, the searches were conducted in PubMed and Web of Science using the MeSH terms “problem-based learning” and “Medicine” (the Boolean operator “AND” was applied to the search terms). No limits were set on language, publication date, study design or country of origin. The search was carried out on 14th February 2021. Citations were uploaded to the reference manager software Mendeley Desktop (version 1.19.8) for title and abstract screening, and data characterization.
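The Boolean combination described above can be sketched in a few lines of Python. This is only an illustration: the exact query strings the authors entered are not given in the text, and the helper `build_query` and the optional `[MeSH Terms]` field tag (PubMed syntax) are assumptions made for this example.

```python
def build_query(terms, operator="AND", field_tag=None):
    """Join quoted search terms with a Boolean operator.

    `field_tag` is optional; "MeSH Terms" is PubMed's tag for MeSH headings.
    Whether the authors used a field tag is an assumption, not stated in the text.
    """
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported Boolean operator: {operator}")
    pieces = []
    for term in terms:
        piece = f'"{term}"'
        if field_tag:
            piece += f"[{field_tag}]"
        pieces.append(piece)
    return f" {operator} ".join(pieces)


# The two search terms reported in the text, combined with AND.
search_terms = ["problem-based learning", "Medicine"]
plain_query = build_query(search_terms)
pubmed_query = build_query(search_terms, field_tag="MeSH Terms")

print(plain_query)
print(pubmed_query)
```

Keeping the terms in a list makes it easy to rerun the same strategy against a second database or to document the query alongside the search date.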

Stage 3: Study selection

The search strategy in our scoping study generated a total of 2399 references. The literature search and the screening of titles, abstracts and full texts for suitability were performed by one author (JCT) based on predetermined inclusion criteria. The inclusion criteria were: 1) PBL methodology was the major research topic; 2) participants were undergraduate medical students or tutors; 3) the main outcome was academic performance (learning and knowledge retention); 4) the secondary outcomes were one of the following: social and communication skills, problem solving or self-learning and/or student/tutor satisfaction; 5) all types of studies were included, including descriptive papers, qualitative, quantitative and mixed-methods studies, perspectives, opinion and commentary pieces and editorials. Exclusion criteria were studies including other types of participants such as postgraduate medical students, residents and students of other non-medical health disciplines such as pharmacy, veterinary medicine, dentistry or nursing. Studies published in languages other than Spanish and English were also excluded. In situations in which uncertainty arose, all authors (CB, ES, RP) discussed the publication together to reach a final consensus. The outcomes of the search results and screening are presented in Fig. 1. One hundred and twenty-four articles met the inclusion criteria and were included in the final analysis.
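As a rough illustration of how such criteria act as a filter, the following Python sketch applies a simplified subset of them (topic, population and language) to a handful of made-up records. The `Record` fields, the example records and the helper `is_eligible` are hypothetical; the actual screening was done manually by the authors, and the outcome-related criteria are omitted here for brevity.

```python
from dataclasses import dataclass

# Simplified encoding of the stated criteria (assumed labels, not the authors' coding scheme).
INCLUDED_LANGUAGES = {"English", "Spanish"}
INCLUDED_POPULATIONS = {"undergraduate medical students", "tutors"}


@dataclass
class Record:
    title: str
    language: str
    population: str
    pbl_is_main_topic: bool


def is_eligible(record: Record) -> bool:
    """Return True when a record meets the simplified inclusion criteria."""
    return (
        record.pbl_is_main_topic
        and record.language in INCLUDED_LANGUAGES
        and record.population in INCLUDED_POPULATIONS
    )


# Hypothetical records, invented purely to exercise the filter.
records = [
    Record("PBL in a cardiology course", "English", "undergraduate medical students", True),
    Record("PBL for nursing students", "English", "nursing students", True),
    Record("PBL in anatomy teaching", "German", "undergraduate medical students", True),
]
eligible = [r for r in records if is_eligible(r)]

# Counts reported in the text: 2399 references identified, 124 included.
identified, included = 2399, 124
inclusion_rate = included / identified
print(f"{len(eligible)} of {len(records)} example records eligible")
print(f"Reported inclusion rate: {inclusion_rate:.1%}")
```

The final two lines use the figures reported in the text, which correspond to roughly 5% of the identified references being retained.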

Fig. 1. Study flow PRISMA diagram. Details the review process through the different stages of the review; includes the number of records identified, included and excluded

Stage 4: Charting the data

A data extraction table was developed by the research team. Data extracted from each of the 124 publications included general publication details (year, author, and country), sample size, study population, design/methodology, main and secondary outcomes and relevant results and/or conclusions. We compiled all data into a single spreadsheet in Microsoft Excel for coding and analysis. The characteristics and the study subject of the 124 articles included in this review are summarized in Tables 1 and 2. The detailed contents of the Microsoft Excel file are also available in Additional file 1.

Stage 5: Collating, summarizing and reporting the results

As indicated in the search strategy (Fig. 1), this review resulted in the inclusion of 124 publications. Publication years of the final sample ranged from 1990 to 2020; 51 publications (41%) were from the years 2010–2020, and the years with the most publications were 2001, 2009 and 2015. Countries from six continents were represented in this review. Most of the publications were from Asia (especially China and Saudi Arabia) and North America, followed by Europe, and few studies were from Africa, Oceania and South America. The country with the most publications was the United States of America ( n = 27). The most frequent designs of the selected studies were surveys or questionnaires ( n = 45) and comparative studies ( n = 48, of which only 16 were randomized) with traditional or lecture-based learning methodologies (in two studies the comparison was with simulation), and the most frequently measured outcomes were academic performance followed by student satisfaction (48 studies measured more than one outcome). The few studies with the highest level of scientific evidence (systematic reviews with meta-analysis and randomized studies) were conducted mostly in Asian countries (Tables 1 and 2 ). The study subject was specified in 81 publications, showing high variability but at the same time broad representation of almost all disciplines of medical studies.

The sample size was available in 99 publications and the median [range] number of participants was 132 [14–2061]. By study population, there were more participants in the student-focused studies (median 134, range 16–2061) than in the tutor-focused studies (median 53, range 14–494).

Finally, after reviewing in detail the measured outcomes (main and secondary) according to the study design (Table 2 and Additional file 1 ) we present a narrative overview and a synthesis of the main findings.

Main outcome: academic performance (learning and knowledge retention)

Seventy-one of the 124 publications had learning and/or knowledge retention as a measured outcome; most of them ( n = 45) were comparative studies with traditional or lecture-based learning and 16 were randomized. These studies varied in their methodology, were performed in different geographic zones, and normally analyzed the experience of just one education center. Most studies ( n = 49) reported superiority of PBL in learning and knowledge acquisition [ 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 ] but there was no difference between traditional and PBL curricula in another 19 studies [ 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 ]. Only three studies reported that PBL was less effective [ 79 , 80 , 81 ]; two of them were randomized (one favoring simulation-based learning [ 80 ] and the other favoring lectures [ 81 ]) and the remaining study was based on tutors’ opinion rather than real academic performance [ 79 ]. It is noteworthy that the four systematic reviews and meta-analyses included in this scoping review, all carried out in China, found that PBL was more effective than lecture-based learning in improving knowledge and other skills (clinical, problem-solving, self-learning and collaborative) [ 40 , 51 , 53 , 58 ]. Another relevant example of the superiority of the PBL method over the traditional method is the experience reported by Hoffman et al. from the University of Missouri-Columbia. The authors analyzed the impact of implementing the PBL methodology in their Faculty of Medicine and revealed an improvement in the academic results that lasted for over a decade [ 31 ].
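The tally in this paragraph can be checked with a short calculation. The sketch below uses only the counts stated above (49, 19 and 3 out of 71); the percentages are derived here for illustration and are not reported by the authors.

```python
# Counts of studies by reported direction of effect on academic performance.
outcomes = {
    "PBL superior": 49,
    "no difference": 19,
    "PBL less effective": 3,
}

total = sum(outcomes.values())

# Share of each category among the 71 studies, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in outcomes.items()}

print(f"Total studies measuring academic performance: {total}")
for label, pct in shares.items():
    print(f"{label}: {outcomes[label]} ({pct}%)")
```

The three categories sum to the 71 publications that measured this outcome, with roughly two thirds reporting superiority of PBL.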

Secondary outcomes

Social and communication skills

We found five studies in this scoping review that focused on these outcomes and all of them described that a curriculum centered on PBL seems to instill more confidence in social and communication skills among students. Students perceived PBL positively for teamwork, communication skills and interpersonal relations [ 44 , 45 , 67 , 75 , 82 ].

Student satisfaction

Sixty publications analyzed student satisfaction with PBL methodology. The most frequent designs were surveys or questionnaires (30 studies) followed by comparative studies with traditional or lecture-based methodology (19 studies, 7 of which were randomized). Almost all the studies (51) showed that PBL is generally well received [ 11 , 13 , 18 , 19 , 20 , 21 , 22 , 26 , 29 , 34 , 37 , 39 , 41 , 42 , 46 , 50 , 56 , 58 , 63 , 64 , 66 , 78 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 ] but in 9 studies the overall satisfaction scores for the PBL program were neutral [ 76 , 111 , 112 , 113 , 114 , 115 , 116 ] or negative [ 117 , 118 ]. Some factors that have been identified as key components for PBL to be successful include a small group size, the use of realistic case scenarios and good management of group dynamics. Despite a mostly positive assessment of the PBL methodology by the students, there were some negative aspects that could be criticized or improved. These include unclear communication of the learning methodology, objectives and assessment method; poor management and organization of the sessions; tutors having little experience of the method; and a lack of standardization in the implementation of the method by the tutors.

Tutor satisfaction

Only 15 publications analyzed the satisfaction of tutors, most of them surveys or questionnaires [ 85 , 88 , 92 , 98 , 108 , 110 , 119 ]. In comparison with the satisfaction of the students, the results here were more neutral [ 112 , 113 , 115 , 120 , 121 ] and even unfavorable to the PBL methodology in two publications [ 117 , 122 ]. PBL teaching was favored by tutors when their institutions trained them in the method and when there was administrative support and adequate infrastructure and coordination [ 123 ]. In some experiences, the PBL modules created an unacceptable toll of anxiety, unhappiness and strained relations.

Other skills (problem solving and self-learning)

The effectiveness of the PBL methodology has also been explored for other outcomes such as the ability to solve problems and to engage in self-directed learning. All the studies that addressed these outcomes showed that PBL is more effective than lecture-based learning in problem-solving and self-learning skills [ 18 , 24 , 40 , 48 , 67 , 75 , 93 , 104 , 124 ]. One single study found poor accuracy of students’ self-assessment when compared with their actual performance [ 125 ]. In addition, there are studies that support PBL methodology for integration between basic and clinical sciences [ 126 ].

Finally, other publications have reported the experience of some faculties in the implementation of the PBL methodology. Different experiences have demonstrated that it is both possible and feasible to shift from a traditional curriculum to a PBL program, while recognizing that PBL methodology is complex to plan and structure, needs substantial human and material resources and requires an immense effort from teachers [ 28 , 31 , 94 , 127 , 128 , 129 , 130 , 131 , 132 , 133 ]. In addition, and despite its cost implications, a PBL curriculum can be successfully implemented in resource-constrained settings [ 134 , 135 ].

We conducted this scoping review to explore the effectiveness of, and satisfaction with, PBL methodology for teaching in undergraduate medicine and, to our knowledge, it is the only study of its kind (systematic scoping review) that has been carried out in recent years. Similarly, Vernon et al. conducted a meta-analysis of articles published between 1970 and 1992 and their results generally supported the superiority of the PBL approach over more traditional methods of medical education [ 136 ]. PBL methodology is implemented in medical studies on six continents but there is more experience (or at least more publications) from Asian countries and North America. Despite its apparent difficulties in implementation, a PBL curriculum can be successfully implemented in resource-constrained settings [ 134 , 135 ]. Although it is true that the few studies with the highest level of scientific evidence (randomized studies and meta-analyses) were carried out mainly in Asian countries (and some in North America and Europe), there were no significant differences in the main results according to geographical origin.

In this scoping review we have included a large number of publications that, despite their heterogeneity, tend to show favorable results for the usefulness of the PBL methodology in teaching and learning medicine. The results tend to be especially favorable to PBL methodology when it is compared with traditional or lecture-based teaching methods, but the picture is less clear when it is compared with simulation. Two studies compared PBL with simulation for the acquisition of specific clinical skills: one found no difference [ 71 ] and the other found simulation to be superior [ 80 ]. It seems important to highlight that the four meta-analyses included in this review, which included a high number of participants, show results that are clearly favorable to the PBL methodology in terms of knowledge, clinical skills, problem-solving, self-learning and satisfaction [ 40 , 51 , 53 , 58 ].

Regarding the level of satisfaction described in the surveys or questionnaires, the overall satisfaction rate was higher in the PBL students when compared with traditional learning students. Students work in small groups, allowing and promoting teamwork and facilitating social and communication skills. As sessions are more attractive and dynamic than traditional classes, this could lead to a greater degree of motivation for learning.

These satisfaction results are not so favorable when tutors are asked, and this may be due to different reasons: first, some studies are from the 1990s, when the methodology was not yet fully implemented; second, the number of tutors included in these studies is low; and third, and perhaps most importantly, the complaints are not usually due to the methodology itself, but rather to a lack of administrative support and/or work overload. PBL methodology requires more human and material resources. Lecturers who lack experience in guided self-learning need more training. Some teachers may not feel comfortable with the method and therefore do not apply it correctly.

However effective and/or attractive the PBL methodology may seem, some (not many) authors are clear detractors and have published opinion articles fiercely criticizing this methodology. Some of the arguments against it are as follows: clinical problem solving is the wrong task for preclinical medical students, self-directed learning interpreted as self-teaching is not appropriate in undergraduate medical education, relegation to the role of facilitators is a misuse of the faculty, small-group experience is inherently variable and sometimes dysfunctional, etc. [ 137 ].

In light of the results found in our study, we believe that PBL is an adequate methodology for the training of future doctors, and these results reinforce the idea that PBL should have an important weight in the curriculum of our medical school. It is likely that, through PBL training, the doctors of the future will not only have great knowledge but may also acquire greater capacity for communication, problem solving and self-learning, all of which are characteristics that are required in medical professionalism. In this regard, Koh et al. analyzed the effect that PBL during medical school had on physician competencies after graduation, finding a positive effect mainly in social and cognitive dimensions [ 138 ].

Despite its defects and limitations, we must not abandon this methodology; rather, PBL should perhaps evolve, adapt, and improve to enhance its strengths and address its weaknesses. It is likely that the new generations, trained in schools using new technologies and methodologies far from lectures, will feel more comfortable (either as students or as tutors) with methodologies more like PBL (small groups and work focused on problems or projects). It would be interesting to examine the implementation of technologies and even social media into PBL sessions, an issue that has been poorly explored [ 139 ].

Limitations

Scoping reviews are not without limitations. Our review includes 124 articles from the 2399 initially identified and, despite our efforts to be as comprehensive as possible, we may have missed some (probably few) articles. Even though this review includes many studies, their design is very heterogeneous, and only a few include a large sample size and a methodology providing a high level of scientific evidence. Furthermore, most are single-center experiences and there are no large multi-center studies. Finally, the frequency of the PBL sessions (from once or twice a year to the whole curriculum) was not considered, in part because most of the reviewed studies did not specify this information. This factor could affect the effectiveness of PBL and the perceptions of students and tutors about PBL. However, the adoption of a scoping review methodology was effective in terms of summarizing the research findings and identifying limitations in the studies’ methodologies and findings, and provided a more rigorous vision of the international state of the art.

Conclusions

This systematic scoping review provides a broad overview of the efficacy of PBL methodology in undergraduate medicine teaching from different countries and institutions. PBL is not a new teaching method given that it has already been 50 years since it was implemented in medicine courses. It is a method that shifts the leading role from teachers to students and is based on guided self-learning. If it is applied properly, the degree of satisfaction is high, especially for students. PBL is more effective than traditional methods (based mainly on lectures) at improving social and communication skills, problem-solving and self-learning skills, and has no worse results (and in many studies better results) in relation to academic performance. Despite that, its use is not universally widespread, probably because it requires greater human resources and continuous training for its implementation. In any case, more comparative and randomized studies and/or other systematic reviews and meta-analyses are required to determine which educational strategies could be most suitable for the training of future doctors.

Abbreviations

PBL: Problem-based learning

CBL: Case-based learning

TBL: Team-based learning

References:

Dale E. Methods for analyzing the content of motion pictures. J Educ Sociol. 1932;6:244–50.


Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65(9 Suppl):S63–7. https://doi.org/10.1097/00001888-199009000-00045 .


Bodagh N, Bloomfield J, Birch P, Ricketts W. Problem-based learning: a review. Br J Hosp Med (Lond). 2017;78:C167–70. https://doi.org/10.12968/hmed.2017.78.11.C167 .

Branda LA. El abc del ABP: Lo esencial del aprendizaje basado en problemas. In: Fundación Dr. Esteve, Cuadernos de la fundación Dr. Antonio Esteve nº27: El aprendizaje basado en problemas en sus textos, pp.1–16. 2013. Barcelona.

Burgess A, Matar E, Roberts C, et al. Scaffolding medical student knowledge and skills: team-based learning (TBL) and case-based learning (CBL). BMC Med Educ. 2021;21:238. https://doi.org/10.1186/s12909-021-02638-3 .

Dolmans D, Michaelsen L, van Merriënboer J, van der Vleuten C. Should we choose between problem-based learning and team-based learning? No, combine the best of both worlds! Med Teach. 2015;37:354–9. https://doi.org/10.3109/0142159X.2014.948828 .

Arksey H, O’Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8:19–32. https://doi.org/10.1080/1364557032000119616 .

Levac D, Colquhoun H, O’Brien KK. Scoping studies: advancing the methodology. Implement Sci. 2010;5:69. https://doi.org/10.1186/1748-5908-5-69 .

Pham MT, Rajić A, Greig JD, Sargeant JM, Papadopoulos A, McEwen SA. A scoping review of scoping reviews: advancing the approach and enhancing the consistency. Res Synth Methods. 2014;5:371–85. https://doi.org/10.1002/jrsm.1123 .

Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169:467–73. https://doi.org/10.7326/M18-0850 .

Sokas RK, Diserens D, Johnston MA. Integrating occupational-health into the internal medicine clerkship using problem-based learning. Clin Res. 1990;38:A735.

Richards BF, Ober KP, Cariaga-Lo L, et al. Ratings of students’ performances in a third-year internal medicine clerkship: a comparison between problem-based and lecture-based curricula. Acad Med. 1996;71:187–9. https://doi.org/10.1097/00001888-199602000-00028 .

Gresham CL, Philp JR. Problem-based learning in clinical medicine. Teach Learn Med. 1996;8:111–5. https://doi.org/10.1080/10401339609539776 .

Hill J, Rolfe IE, Pearson SA, Heathcote A. Do junior doctors feel they are prepared for hospital practice? A study of graduates from traditional and non-traditional medical schools. Med Educ. 1998;32:19–24. https://doi.org/10.1046/j.1365-2923.1998.00152.x .

Blake RL, Parkison L. Faculty evaluation of the clinical performances of students in a problem-based learning curriculum. Teach Learn Med. 1998;10:69–73. https://doi.org/10.1207/S15328015TLM1002_3 .

Hmelo CE. Problem-based learning: effects on the early acquisition of cognitive skill in medicine. J Learn Sci. 1998;7:173–208. https://doi.org/10.1207/s15327809jls0702_2 .

Finch PN. The effect of problem-based learning on the academic performance of students studying podiatric medicine in Ontario. Med Educ. 1999;33:411–7.

Casassus P, Hivon R, Gagnayre R, d’Ivernois JF. An initial experiment in haematology instruction using the problem-based learning method in third-year medical training in France. Hematol Cell Ther. 1999;41:137–44. https://doi.org/10.1007/s00282-999-0137-0 .

Purdy RA, Benstead TJ, Holmes DB, Kaufman DM. Using problem-based learning in neurosciences education for medical students. Can J Neurol Sci. 1999;26:211–6. https://doi.org/10.1017/S0317167100000287 .

Farrell TA, Albanese MA, Pomrehn PRJ. Problem-based learning in ophthalmology: a pilot program for curricular renewal. Arch Ophthalmol. 1999;117:1223–6. https://doi.org/10.1001/archopht.117.9.1223 .

Curtis JA, Indyk D, Taylor B. Successful use of problem-based learning in a third-year pediatric clerkship. Ambul Pediatr. 2001;1:132–5. https://doi.org/10.1367/1539-4409(2001)001%3c0132:suopbl%3e2.0.co;2 .

Trevena LJ, Clarke RM. Self-directed learning in population health. a clinically relevant approach for medical students. Am J Prev Med. 2002;22:59–65. https://doi.org/10.1016/s0749-3797(01)00395-6 .

Astin J, Jenkins T, Moore L. Medical students’ perspective on the teaching of medical statistics in the undergraduate medical curriculum. Stat Med. 2002;21:1003–7. https://doi.org/10.1002/sim.1132 .

Whitfield CR, Manger EA, Zwicker J, Lehman EB. Differences between students in problem-based and lecture-based curricula measured by clerkship performance ratings at the beginning of the third year. Teach Learn Med. 2002;14:211–7. https://doi.org/10.1207/S15328015TLM1404\_2 .

McParland M, Noble LM, Livingston G. The effectiveness of problem-based learning compared to traditional teaching in undergraduate psychiatry. Med Educ. 2004;38:859–67. https://doi.org/10.1111/j.1365-2929.2004.01818.x .

Casey PM, Magrane D, Lesnick TG. Improved performance and student satisfaction after implementation of a problem-based preclinical obstetrics and gynecology curriculum. Am J Obstet Gynecol. 2005;193:1874–8. https://doi.org/10.1016/j.ajog.2005.07.061 .

Gurpinar E, Musal B, Aksakoglu G, Ucku R. Comparison of knowledge scores of medical students in problem-based learning and traditional curriculum on public health topics. BMC Med Educ. 2005;5:7. https://doi.org/10.1186/1472-6920-5-7 .

Tamblyn R, Abrahamowicz M, Dauphinee D, et al. Effect of a community oriented problem based learning curriculum on quality of primary care delivered by graduates: historical cohort comparison study. BMJ. 2005;331:1002. https://doi.org/10.1136/bmj.38636.582546.7C .

Abu-Hijleh MF, Chakravarty M, Al-Shboul Q, Kassab S, Hamdy H. Integrating applied anatomy in surgical clerkship in a problem-based learning curriculum. Surg Radiol Anat. 2005;27:152–7. https://doi.org/10.1007/s00276-004-0293-4 .

Distlehorst LH, Dawson E, Robbs RS, Barrows HS. Problem-based learning outcomes: the glass half-full. Acad Med. 2005;80:294–9. https://doi.org/10.1097/00001888-200503000-00020 .

Hoffman K, Hosokawa M, Blake R Jr, Headrick L, Johnson G. Problem-based learning outcomes: ten years of experience at the University of Missouri-Columbia school of medicine. Acad Med. 2006;81:617–25. https://doi.org/10.1097/01.ACM.0000232411.97399.c6 .

Kong J, Li X, Wang Y, Sun W, Zhang J. Effect of digital problem-based learning cases on student learning outcomes in ophthalmology courses. Arch Ophthalmol. 2009;127:1211–4. https://doi.org/10.1001/archophthalmol.2009.110 .

Tsou KI, Cho SL, Lin CS, et al. Short-term outcomes of a near-full PBL curriculum in a new Taiwan medical school. Kaohsiung J Med Sci. 2009;25:282–93. https://doi.org/10.1016/S1607-551X(09)70075-0 .

Wang J, Zhang W, Qin L, et al. Problem-based learning in regional anatomy education at Peking University. Anat Sci Educ. 2010;3:121–6. https://doi.org/10.1002/ase.151 .

Abou-Elhamd KA, Rashad UM, Al-Sultan AI. Applying problem-based learning to otolaryngology teaching. J Laryngol Otol. 2011;125:117–20. https://doi.org/10.1017/S0022215110001702 .

Urrutia Aguilar ME, Hamui-Sutton A, Castaneda Figueiras S, van der Goes TI, Guevara-Guzman R. Impact of problem-based learning on the cognitive processes of medical students. Gac Med Mex. 2011;147:385–93.

Tian J-H, Yang K-H, Liu A-P. Problem-based learning in evidence-based medicine courses at Lanzhou University. Med Teach. 2012;34:341. https://doi.org/10.3109/0142159X.2011.531169 .

Hoover CR, Wong CC, Azzam A. From primary care to public health: using problem-based Learning and the ecological model to teach public health to first year medical students. J Community Health. 2012;37:647–52. https://doi.org/10.1007/s10900-011-9495-y .

Li J, Li QL, Li J, et al. Comparison of three problem-based learning conditions (real patients, digital and paper) with lecture-based learning in a dermatology course: a prospective randomized study from China. Med Teach. 2013;35:e963–70. https://doi.org/10.3109/0142159X.2012.719651 .

Ding X, Zhao L, Chu H, et al. Assessing the effectiveness of problem-based learning of preventive medicine education in China. Sci Rep. 2014;4:5126. https://doi.org/10.1038/srep05126 .

Meo SA. Undergraduate medical student’s perceptions on traditional and problem based curricula: pilot study. J Pak Med Assoc. 2014;64:775–9.

Khoshnevisasl P, Sadeghzadeh M, Mazloomzadeh S, Hashemi Feshareki R, Ahmadiafshar A. Comparison of problem-based learning with lecture-based learning. Iran Red Crescent Med J. 2014;16: e5186. https://doi.org/10.5812/ircmj.5186 .

Al-Drees AA, Khalil MS, Irshad M, Abdulghani HM. Students’ perception towards the problem based learning tutorial session in a system-based hybrid curriculum. Saudi Med J. 2015;36:341–8. https://doi.org/10.15537/smj.2015.3.10216 .

Al-Shaikh G, Al Mussaed EM, Altamimi TN, Elmorshedy H, Syed S, Habib F. Perception of medical students regarding problem based learning. Kuwait Med J. 2015;47:133–8.

Hande S, Mohammed CA, Komattil R. Acquisition of knowledge, generic skills and attitudes through problem-based learning: student perspectives in a hybrid curriculum. J Taibah Univ Medical Sci. 2015;10:21–5. https://doi.org/10.1016/j.jtumed.2014.01.008 .

González Mirasol E, Gómez García MT, Lobo Abascal P, Moreno Selva R, Fuentes Rozalén AM, González MG. Analysis of perception of training in graduates of the faculty of medicine at Universidad de Castilla-Mancha. Eval Program Plann. 2015;52:169–75. https://doi.org/10.1016/j.evalprogplan.2015.06.001 .

Yanamadala M, Kaprielian VS, O’Connor Grochowski C, Reed T, Heflin MT. A problem-based learning curriculum in geriatrics for medical students. Gerontol Geriatr Educ. 2018;39:122–31. https://doi.org/10.1080/02701960.2016.1152268 .

Balendran K, John L. Comparison of learning outcomes in problem based learning and lecture based learning in teaching forensic medicine. J Evol Med Dent Sci. 2017;6:89–92. https://doi.org/10.14260/jemds/2017/22 .

Chang H-C, Wang N-Y, Ko W-R, Yu Y-T, Lin L-Y, Tsai H-F. The effectiveness of clinical problem-based learning model of medico-jurisprudence education on general law knowledge for obstetrics/gynecological interns. Taiwan J Obstet Gynecol. 2017;56:325–30. https://doi.org/10.1016/j.tjog.2017.04.011 .

Eltony SA, El-Sayed NH, El-Araby SE-S, Kassab SE. Implementation and evaluation of a patient safety course in a problem-based learning program. Educ Heal. 2017;30:44–9. https://doi.org/10.4103/1357-6283.210512 .

Zhang S, Xu J, Wang H, Zhang D, Zhang Q, Zou L. Effects of problem-based learning in Chinese radiology education: a systematic review and meta-analysis. Medicine (Baltimore). 2018;97: e0069. https://doi.org/10.1097/MD.0000000000010069 .

Hincapie Parra DA, Ramos Monobe A, Chrino-Barcelo V. Problem based learning as an active learning strategy and its impact on academic performance and critical thinking of medical students. Rev Complut Educ. 2018;29:665–81. https://doi.org/10.5209/RCED.53581 .

Ma Y, Lu X. The effectiveness of problem-based learning in pediatric medical education in China: a meta-analysis of randomized controlled trials. Medicine (Baltimore). 2019;98: e14052. https://doi.org/10.1097/MD.0000000000014052 .

Berger C, Brinkrolf P, Ertmer C, et al. Combination of problem-based learning with high-fidelity simulation in CPR training improves short and long-term CPR skills: a randomised single blinded trial. BMC Med Educ. 2019;19:180. https://doi.org/10.1186/s12909-019-1626-7 .

Aboonq M, Alquliti A, Abdulmonem I, Alpuq N, Jalali K, Arabi S. Students’ approaches to learning and perception of learning environment: a comparison between traditional and problem-based learning medical curricula. Indo Am J Pharm Sci. 2019;6:3610–9. https://doi.org/10.5281/zenodo.2562660 .

Li X, Xie F, Li X, et al. Development, application, and evaluation of a problem-based learning method in clinical laboratory education. Clin Chim ACTA. 2020;510:681–4. https://doi.org/10.1016/j.cca.2020.08.037 .

Zhao W, He L, Deng W, Zhu J, Su A, Zhang Y. The effectiveness of the combined problem-based learning (PBL) and case-based learning (CBL) teaching method in the clinical practical teaching of thyroid disease. BMC Med Educ. 2020;20:381. https://doi.org/10.1186/s12909-020-02306 .

Liu C-X, Ouyang W-W, Wang X-W, Chen D, Jiang Z-L. Comparing hybrid problem-based and lecture learning (PBL plus LBL) with LBL pedagogy on clinical curriculum learning for medical students in China: a meta-analysis of randomized controlled trials. Medicine (Baltimore). 2020;99:e19687. https://doi.org/10.1097/MD.0000000000019687 .

Margolius SW, Papp KK, Altose MD, Wilson-Delfosse AL. Students perceive skills learned in pre-clerkship PBL valuable in core clinical rotations. Med Teach. 2020;42:902–8. https://doi.org/10.1080/0142159X.2020.1762031 .

Schwartz RW, Donnelly MB, Nash PP, Young B. Developing students cognitive skills in a problem-based surgery clerkship. Acad Med. 1992;67:694–6. https://doi.org/10.1097/00001888-199210000-00016 .

Mennin SP, Friedman M, Skipper B, Kalishman S, Snyder J. Performances on the NBME-I, NBME-II, and NBME-III by medical-students in the problem-based learning and conventional tracks at the university-of-new-mexico. Acad Med. 1993;68:616–24. https://doi.org/10.1097/00001888-199308000-00012 .

Kaufman DM, Mann KV. Comparing achievement on the medical council of Canada qualifying examination part I of students in conventional and problem-based learning curricula. Acad Med. 1998;73:1211–3. https://doi.org/10.1097/00001888-199811000-00022 .

Kaufman DM, Mann KV. Achievement of students in a conventional and Problem-Based Learning (PBL) curriculum. Adv Heal Sci Educ. 1999;4:245–60. https://doi.org/10.1023/A:1009829831978 .

Antepohl W, Herzig S. Problem-based learning versus lecture-based learning in a course of basic pharmacology: a controlled, randomized study. Med Educ. 1999;33:106–13. https://doi.org/10.1046/j.1365-2923.1999.00289.x .

Dyke P, Jamrozik K, Plant AJ. A randomized trial of a problem-based learning approach for teaching epidemiology. Acad Med. 2001;76:373–9. https://doi.org/10.1097/00001888-200104000-00016 .

Brewer DW. Endocrine PBL in the year 2000. Adv Physiol Educ. 2001;25:249–55. https://doi.org/10.1152/advances.2001.25.4.249 .

Seneviratne RD, Samarasekera DD, Karunathilake IM, Ponnamperuma GG. Students’ perception of problem-based learning in the medical curriculum of the faculty of medicine, University of Colombo. Ann Acad Med Singapore. 2001;30:379–81.

Alleyne T, Shirley A, Bennett C, et al. Problem-based compared with traditional methods at the faculty of medical sciences, University of the West Indies: a model study. Med Teach. 2002;24:273–9. https://doi.org/10.1080/01421590220125286 .

Norman GR, Wenghofer E, Klass D. Predicting doctor performance outcomes of curriculum interventions: problem-based learning and continuing competence. Med Educ. 2008;42:794–9. https://doi.org/10.1111/j.1365-2923.2008.03131.x .

Cohen-Schotanus J, Muijtjens AMM, Schoenrock-Adema J, Geertsma J, van der Vleuten CPM. Effects of conventional and problem-based learning on clinical and general competencies and career development. Med Educ. 2008;42:256–65. https://doi.org/10.1111/j.1365-2923.2007.02959.x .

Wenk M, Waurick R, Schotes D, et al. Simulation-based medical education is no better than problem-based discussions and induces misjudgment in self-assessment. Adv Health Sci Educ Theory Pract. 2009;14:159–71. https://doi.org/10.1007/s10459-008-9098-2 .

Collard A, Gelaes S, Vanbelle S, et al. Reasoning versus knowledge retention and ascertainment throughout a problem-based learning curriculum. Med Educ. 2009;43:854–65. https://doi.org/10.1111/j.1365-2923.2009.03410.x .

Nouns Z, Schauber S, Witt C, Kingreen H, Schuettpelz-Brauns K. Development of knowledge in basic sciences: a comparison of two medical curricula. Med Educ. 2012;46:1206–14. https://doi.org/10.1111/medu.12047 .

Saloojee S, van Wyk J. The impact of a problem-based learning curriculum on the psychiatric knowledge and skills of final-year students at the Nelson R Mandela school of medicine. South African J Psychiatry. 2012;18:116.

Mughal AM, Shaikh SH. Assessment of collaborative problem solving skills in undergraduate medical students at Ziauddin college of medicine. Karachi Pakistan J Med Sci. 2018;34:185–9. https://doi.org/10.12669/pjms.341.13485 .

Hu X, Zhang H, Song Y, et al. Implementation of flipped classroom combined with problem-based learning: an approach to promote learning about hyperthyroidism in the endocrinology internship. BMC Med Educ. 2019;19:290. https://doi.org/10.1186/s12909-019-1714-8 .

Thompson KL, Gendreau JL, Strickling JE, Young HE. Cadaveric dissection in relation to problem-based learning case sequencing: a report of medical student musculoskeletal examination performances and self-confidence. Anat Sci Educ. 2019;12:619–26. https://doi.org/10.1002/ase.1891 .

Chang G, Cook D, Maguire T, Skakun E, Yakimets WW, Warnock GL. Problem-based learning: its role in undergraduate surgical education. Can J Surg. 1995;38:13–21.

Vernon DTA, Hosokawa MC. Faculty attitudes and opinions about problem-based learning. Acad Med. 1996;71:1233–8. https://doi.org/10.1097/00001888-199611000-00020 .

Steadman RH, Coates WC, Huang YM, et al. Simulation-based training is superior to problem-based learning for the acquisition of critical assessment and management skills. Crit Care Med. 2006;34:151–7. https://doi.org/10.1097/01.CCM.0000190619.42013.94 .

Johnston JM, Schooling CM, Leung GM. A randomised-controlled trial of two educational modes for undergraduate evidence-based medicine learning in Asia. BMC Med Educ. 2009;9:63. https://doi.org/10.1186/1472-6920-9-63 .

Suleman W, Iqbal R, Alsultan A, Baig SM. Perception of 4(th) year medical students about problem based learning. Pakistan J Med Sci. 2010;26:871–4.

Blosser A, Jones B. Problem-based learning in a surgery clerkship. Med Teach. 1991;13:289–93. https://doi.org/10.3109/01421599109089907 .

Usherwood T, Joesbury H, Hannay D. Student-directed problem-based learning in general-practice and public-health medicine. Med Educ. 1991;25:421–9. https://doi.org/10.1111/j.1365-2923.1991.tb00090.x .

Bernstein P, Tipping J, Bercovitz K, Skinner HA. Shifting students and faculty to a PBL curriculum - attitudes changed and lessons learned. Acad Med. 1995;70:245–7. https://doi.org/10.1097/00001888-199503000-00019 .

Kaufman DM, Mann KV. Comparing students’ attitudes in problem-based and conventional curricula. Acad Med. 1996;71:1096–9. https://doi.org/10.1097/00001888-199610000-00018 .

Kalaian HA, Mullan PB. Exploratory factor analysis of students’ ratings of a problem-based learning curriculum. Acad Med. 1996;71:390–2. https://doi.org/10.1097/00001888-199604000-00019 .

Vincelette J, Lalande R, Delorme P, Goudreau J, Lalonde V, Jean P. A pilot course as a model for implementing a PBL curriculum. Acad Med. 1997;72:698–701. https://doi.org/10.1097/00001888-199708000-00015 .

Ghosh S, Dawka V. Combination of didactic lecture with problem-based learning sessions in physiology teaching in a developing medical college in Nepal. Adv Physiol Educ. 2000;24:8–12.

Walters MR. Problem-based learning within endocrine physiology lectures. Adv Physiol Educ. 2001;25:225–7. https://doi.org/10.1152/advances.2001.25.4.225 .

Leung GM, Lam TH, Hedley AJ. Problem-based public health learning - from the classroom to the community. Med Educ. 2001;35:1071–2.

Khoo HE, Chhem RK, Gwee MCE, Balasubramaniam P. Introduction of problem-based learning in a traditional medical curriculum in Singapore - students’ and tutors’ perspectives. Ann Acad Med Singapore. 2001;30:371–4.

Villamor MCA. Problem-based learning (PBL) as an approach in the teaching of biochemistry of the endocrine system at the Angeles University College of Medicine. Ann Acad Med Singapore. 2001;30:382–6.

Chang C-H, Yang C-Y, See L-C, Lui P-W. High satisfaction with problem-based learning for anesthesia. Chang Gung Med J. 2004;27:654–62.

McLean M. A comparison of students who chose a traditional or a problem-based learning curriculum after failing year 2 in the traditional curriculum: a unique case study at the Nelson R. Mandela school of medicine. Teach Learn Med. 2004;16:301–3. https://doi.org/10.1207/s15328015tlm1603\_15 .

Lucas M, García Guasch R, Moret E, Llasera R, Melero A. Canet J [Problem-based learning in an undergraduate medical school course on anesthesiology, recovery care, and pain management]. Rev Esp Anestesiol Reanim. 2006;53:419–25.

Burgun A, Darmoni S, Le Duff F, Weber J. Problem-based learning in medical informatics for undergraduate medical students: an experiment in two medical schools. Int J Med Inform. 2006;75:396–402. https://doi.org/10.1016/j.ijmedinf.2005.07.014 .

Gurpinar E, Senol Y, Aktekin MR. Evaluation of problem based learning by tutors and students in a medical faculty of Turkey. Kuwait Med J. 2009;41:123–7.

Elzubeir MA. Teaching of the renal system in an integrated, problem-based curriculum. Saudi J Kidney Dis Transpl. 2012;23:93–8.

Sulaiman N, Hamdy H. Problem-based learning: where are we now? Guide supplement 36.3–practical application. Med Teach. 2013;35:160–2. https://doi.org/10.3109/0142159X.2012.737965 .

Albarrak AI, Mohammed R, Abalhassan MF, Almutairi NK. Academic satisfaction among traditional and problem based learning medical students a comparative study. Saudi Med J. 2013;34:1179–88.

Nosair E, Mirghani Z, Mostafa RM. Measuring students’ perceptions of educational environment in the PBL program of Sharjah Medical College. J Med Educ Curric Dev. 2015;2:71–9. https://doi.org/10.4137/JMECDECDECD.S29926 .

Tshitenge ST, Ndhlovu CE, Ogundipe R. Evaluation of problem-based learning curriculum implementation in a clerkship rotation of a newly established African medical training institution: lessons from the University of Botswana. Pan Afr Med J. 2017;27:13. https://doi.org/10.11604/pamj.2017.27.13.10623 .

Yadav RL, Piryani RM, Deo GP, Shah DK, Yadav LK, Islam MN. Attitude and perception of undergraduate medical students toward the problem-based learning in Chitwan Medical College. Nepal Adv Med Educ Pract. 2018;9:317–22. https://doi.org/10.2147/AMEP.S160814 .

Asad MR, Tadvi N, Amir KM, Afzal K, Irfan A, Hussain SA. Medical student’s feedback towards problem based learning and interactive lectures as a teaching and learning method in an outcome-based curriculum. Int J Med Res & Heal Sci. 2019;8:78–84. https://doi.org/10.33844/ijol.2019.60392 .

Mpalanyi M, Nalweyiso ID, Mubuuke AG. Perceptions of radiography students toward problem-based learning almost two decades after its introduction at Makerere University. Uganda J Med imaging Radiat Sci. 2020;51:639–44. https://doi.org/10.1016/j.jmir.2020.06.009 .

Korkmaz NS, Ozcelik S. Evaluation of the opinions of the first, second and third term medical students about problem based learning sessions in Bezmialem Vakif University. Bezmialem Sci. 2020;8:144–9. https://doi.org/10.14235/bas.galenos.2019.3471 .

McGrew MC, Skipper B, Palley T, Kaufman A. Student and faculty perceptions of problem-based learning on a family medicine clerkship. Fam Med. 1999;31:171–6.

Kelly AM. A problem-based learning resource in emergency medicine for medical students. J Accid Emerg Med. 2000;17:320–3. https://doi.org/10.1136/emj.17.5.320 .

Bui-Mansfield LT, Chew FS. Radiologists as clinical tutors in a problem-based medical school curriculum. Acad Radiol. 2001;8:657–63. https://doi.org/10.1016/S1076-6332(03)80693-1 .

Macallan DC, Kent A, Holmes SC, Farmer EA, McCrorie P. A model of clinical problem-based learning for clinical attachments in medicine. Med Educ. 2009;43:799–807. https://doi.org/10.1111/j.1365-2923.2009.03406.x .

Grisham JW, Martiniuk ALC, Negin J, Wright EP. Problem-based learning (PBL) and public health: an initial exploration of perceptions of PBL in Vietnam. Asia-Pacific J public Heal. 2015;27:NP2019-27. https://doi.org/10.1177/1010539512436875 .

Khan IA, Al-Swailmi FK. Perceptions of faculty and students regarding Problem Based Learning: a mixed methods study. J Pak Med Assoc. 2015;65:1334–8.

Alduraywish AA, Mohager MO, Alenezi MJ, Nail AM, Aljafari AS. Evaluation of students’ experience with Problem-based Learning (PBL) applied at the College of Medicine, Al-Jouf University. Saudi Arabia J Pak Med Assoc. 2017;67:1870–3.

Yoo DM, Cho AR, Kim S. Satisfaction with and suitability of the problem-based learning program at the Catholic University of Korea College of Medicine. J Educ Eval Health Prof. 2019;16:20. https://doi.org/10.3352/jeehp.2019.16.20 .

Aldayel AA, Alali AO, Altuwaim AA, et al. Problem-based learning: medical students’ perception toward their educational environment at Al-Imam Mohammad Ibn Saud Islamic University. Adv Med Educ Pract. 2019;10:95–104. https://doi.org/10.2147/AMEP.S189062 .

DeLowerntal E. An evaluation of a module in problem-based learning. Int J Educ Dev. 1996;16:303–7. https://doi.org/10.1016/0738-0593(96)00001-6 .

Tufts MA, Higgins-Opitz SB. What makes the learning of physiology in a PBL medical curriculum challenging? Student perceptions. Adv Physiol Educ. 2009;33:187–95. https://doi.org/10.1152/advan.90214.2008 .

Aboonq M. Perception of the faculty regarding problem-based learning as an educational approach in Northwestern Saudi Arabia. Saudi Med J. 2015;36:1329–35. https://doi.org/10.15537/smj.2015.11.12263 .

Subramaniam RM, Scally P, Gibson R. Problem-based learning and medical student radiology teaching. Australas Radiol. 2004;48:335–8. https://doi.org/10.1111/j.0004-8461.2004.01317.x .

Chang BJ. Problem-based learning in medical school: a student’s perspective. Ann Med Surg. 2016;12:88–9. https://doi.org/10.1016/j.amsu.2016.11.011 .

Griffith CD, Blue AV, Mainous AG, DeSimone PA. Housestaff attitudes toward a problem-based clerkship. Med Teach. 1996;18:133–4. https://doi.org/10.3109/01421599609034147 .

Navarro HN, Zamora SJ. The opinion of teachers about tutorial problem based learning. Rev Med Chil. 2014;142:989–97. https://doi.org/10.4067/S0034-98872014000800006 .

Demiroren M, Turan S, Oztuna D. Medical students’ self-efficacy in problem-based learning and its relationship with self-regulated learning. Med Educ Online. 2016;21:30049. https://doi.org/10.3402/meo.v21.30049 .

Tousignant M, DesMarchais JE. Accuracy of student self-assessment ability compared to their own performance in a problem-based learning medical program: a correlation study. Adv Heal Sci Educ. 2002;7:19–27. https://doi.org/10.1023/A:1014516206120 .

Brynhildsen J, Dahle LO, Behrbohm Fallsberg M, Rundquist I, Hammar M. Attitudes among students and teachers on vertical integration between clinical medicine and basic science within a problem-based undergraduate medical curriculum. Med Teach. 2002;24:286–8. https://doi.org/10.1080/01421590220134105 .

Desmarchais JE. A student-centered, problem-based curriculum - 5 years experience. Can Med Assoc J. 1993;148:1567–72.

Doig K, Werner E. The marriage of a traditional lecture-based curriculum and problem-based learning: are the offspring vigorous? Med Teach. 2000;22:173–8.

Kemahli S. Hematology education in a problem-based curriculum. Hematology. 2005;10(Suppl 1):161–3. https://doi.org/10.1080/10245330512331390267 .

Grkovic I. Transition of the medical curriculum from classical to integrated: problem-based approach and Australian way of keeping academia in medicine. Croat Med J. 2005;46:16–20.

Bosch-Barrera J, Briceno Garcia HC, Capella D, et al. Teaching bioethics to students of medicine with Problem-Based Learning (PBL). Cuad Bioet. 2015;26:303–9.

Lin Y-C, Huang Y-S, Lai C-S, Yen J-H, Tsai W-C. Problem-based learning curriculum in medical education at Kaohsiung Medical University. Kaohsiung J Med Sci. 2009;25:264–9. https://doi.org/10.1016/S1607-551X(09)70072-5 .

Salinas Sánchez AS, Hernández Millán I, Virseda Rodríguez JA, et al. Problem-based learning in urology training the faculty of medicine of the Universidad de Castilla-La Mancha model. Actas Urol Esp. 2005;29:8–15. https://doi.org/10.1016/s0210-4806(05)73193-4 .

Amoako-Sakyi D, Amonoo-Kuofi H. Problem-based learning in resource-poor settings: lessons from a medical school in Ghana. BMC Med Educ. 2015;15:221. https://doi.org/10.1186/s12909-015-0501-4 .

Carrera LI, Tellez TE, D’Ottavio AE. Implementing a problem-based learning curriculum in an Argentinean medical school: implications for developing countries. Acad Med. 2003;78:798–801. https://doi.org/10.1097/00001888-200308000-00010 .

Vernon DT, Blake RL. Does problem-based learning work? A meta-analysis of evaluative research. Acad Med. 1993;68:550–63. https://doi.org/10.1097/00001888-199307000-00015 .

Shanley PF. Viewpoint: leaving the “empty glass” of problem-based learning behind: new assumptions and a revised model for case study in preclinical medical education. Acad Med. 2007;82:479–85. https://doi.org/10.1097/ACM.0b013e31803eac4c .

Koh GC, Khoo HE, Wong ML, Koh D. The effects of problem-based learning during medical school on physician competency: a systematic review. CMAJ. 2008;178:34–41. https://doi.org/10.1503/cmaj.070565 .

Awan ZA, Awan AA, Alshawwa L, Tekian A, Park YS. Assisting the integration of social media in problem-based learning sessions in the faculty of medicine at King Abdulaziz University. Med Teach. 2018;40:S37–42. https://doi.org/10.1080/0142159X.2018.1465179 .

Acknowledgements

Not applicable

Funding

No funding was received for conducting this study.

Author information

Authors and affiliations.

Medical Education Cathedra, School of Medicine, University of Vic-Central University of Catalonia, Vic, Barcelona, Spain

Joan Carles Trullàs, Carles Blay & Ramon Pujol

Internal Medicine Service, Hospital de Olot i Comarcal de La Garrotxa, Olot, Girona, Spain

Joan Carles Trullàs

The Tissue Repair and Regeneration Laboratory (TR2Lab), University of Vic-Central University of Catalonia, Vic, Barcelona, Spain

Joan Carles Trullàs & Elisabet Sarri

Catalan Institute of Health (ICS) – Catalunya Central, Barcelona, Spain

Carles Blay

Contributions

JCT had the idea for the article, performed the literature search and data analysis and drafted the first version of the manuscript. CB, ES and RP contributed to the data analysis and suggested revisions to the manuscript. All authors read and approved the final manuscript.

Ethics declarations

Availability of data and materials.

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate

Not applicable for a literature review.

Consent for publication

Competing interests.

All authors declare that they have no conflict of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Characteristics of the 124 included studies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Trullàs, J.C., Blay, C., Sarri, E. et al. Effectiveness of problem-based learning methodology in undergraduate medical education: a scoping review. BMC Med Educ 22, 104 (2022). https://doi.org/10.1186/s12909-022-03154-8

Received: 03 October 2021

Accepted: 02 February 2022

Published: 17 February 2022

DOI: https://doi.org/10.1186/s12909-022-03154-8

  • Systematic review

BMC Medical Education

ISSN: 1472-6920

Open Access

Peer-reviewed

Research Article

The effectiveness of problem based learning in improving critical thinking, problem-solving and self-directed learning in first-year medical students: A meta-analysis

  • Ida Bagus Amertha Putra Manuaba

Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

¶ ‡ IBAPM is the sole first author of this work.

Affiliations: College of Medicine, Taipei Medical University, Taipei, Taiwan; Medical and Health Education Development, Faculty of Medicine, Udayana University, Bali, Indonesia

  • Yi -No

Roles: Conceptualization, Formal analysis, Methodology, Supervision, Validation, Writing – review & editing

Affiliation: Department of Education and Humanities in Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan

  • Chien-Chih Wu

Roles: Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

* E-mail: [email protected]

Affiliations: Department of Education and Humanities in Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Department of Urology, Taipei Medical University Hospital, Taipei, Taiwan

PLOS

  • Published: November 22, 2022
  • https://doi.org/10.1371/journal.pone.0277339

9 May 2024: Manuaba IBAP, -No Y, Wu CC (2024) Correction: The effectiveness of problem based learning in improving critical thinking, problem-solving and self-directed learning in first-year medical students: A meta-analysis. PLOS ONE 19(5): e0303724. https://doi.org/10.1371/journal.pone.0303724

The adaptation process for first-year medical students is an important problem because it significantly affects educational activities. A previous study showed that 63% of students had difficulties adapting to the learning process in their first year at medical school. Students therefore need the most suitable learning style to support the educational process, such as problem-based learning (PBL). This method can improve critical thinking skills, problem-solving and self-directed learning. Although PBL has been adopted in medical education, its effectiveness in first-year medical students is not yet clear. The purpose of this meta-analysis is to verify whether the PBL approach has a positive effect in improving knowledge, problem-solving and self-directed learning in first-year medical students compared with the conventional method.

We searched the PubMed, ScienceDirect, Cochrane, and Google Scholar databases up to June 5, 2021. Search terms included problem-based learning, effectiveness, effectivity, and medical student. We excluded studies with final-year medical student populations. All analyses in our study were carried out using Review Manager version 5.3 (RevMan Cochrane, London, UK).

Seven eligible studies (622 participants) were included. The pooled analysis demonstrated no significant difference between PBL and the conventional learning method in critical thinking/knowledge assessment (p = 0.29), problem-solving (p = 0.47), and self-directed learning (p = 0.34).

The present study concluded that the PBL approach in first-year medical students appeared to be no more effective than the conventional teaching method in improving critical thinking/knowledge, problem-solving, and self-directed learning.

Citation: Manuaba IBAP, -No Y, Wu C-C (2022) The effectiveness of problem based learning in improving critical thinking, problem-solving and self-directed learning in first-year medical students: A meta-analysis. PLoS ONE 17(11): e0277339. https://doi.org/10.1371/journal.pone.0277339

Editor: Huijuan Cao, Beijing University of Chinese Medicine, CHINA

Received: July 14, 2021; Accepted: October 25, 2022; Published: November 22, 2022

Copyright: © 2022 Manuaba et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The adaptation process for first-year medical students is an important problem because it is one of the factors that significantly affect educational outcomes [ 1 ]. Struggling can occur at any time, but first-year students are particularly susceptible as they adapt to new learning methods at university [ 2 ]. A study on the adaptation process of first-year medical students involving 200 participants showed that 63% of students had problems adapting to the learning process [ 3 ]. Consequently, students need to know the most suitable learning style to support the educational process. In addition, the appropriate learning approach can also help the adaptation process of first-year medical students and maximize their study outcomes. Therefore, educational institutions need to ensure that applied learning methods improve the learning atmosphere for first-year medical students [ 4 ].

Problem-based learning (PBL) encourages students to identify their knowledge and skills to achieve specific goals [ 5 ]. Many studies have evaluated the effectiveness of PBL in the medical curriculum and found that PBL can improve understanding, team performance, learning motivation, student satisfaction, and critical thinking [ 5 , 6 ]. The PBL method not only helps students to understand material in depth, but also encourages independent learning because students have to formulate their own learning goals after understanding PBL scenarios, solve problems using the literature and the internet, compare scenarios with theories from various sources, and actively participate in group discussions [ 7 ]. PBL has three main learning objectives, namely (1) to apply deep content learning, (2) to apply problem analysis skills and develop solutions to solve problems, and (3) to apply self-directed learning as an approach to adapt learning styles [ 8 ]. Therefore, this teaching model has been highly praised in medical education courses in the past two decades [ 9 ]. In conventional lecture methods, students are passively exposed to the material and are less likely to learn or apply concepts actively. In PBL, by contrast, students learn actively through case-based peer-to-peer teaching, which stimulates them to draw on lecture materials and independent learning to solve cases under the guidance of a facilitator. Compared with the conventional teaching model, the PBL approach aims to promote the integration of learned knowledge rather than simply implanting knowledge and skills [ 8 ], and it has been designed to emphasize active participation, problem-solving, and critical thinking skills [ 6 ].

Several reports have shown the effectiveness of PBL for first-year medical students. One study found that PBL supported by concept mapping improved the final score compared with a PBL-only group; the average score improved significantly (10.07±3.49 versus 5.97±2.09, p<0.001) [ 10 ]. Another study compared the final score between the PBL method and the conventional method accompanied by a workshop for first-year medical students. The final results were also statistically significant (8.25±0.79 versus 5.46±0.96, p<0.01) [ 11 ]. However, due to the limitations of these studies, the effect of PBL for first-year medical students has yet to be established. Also, no meta-analysis has evaluated this topic to date. Therefore, we conducted a systematic review and meta-analysis to verify whether the PBL approach has a positive effect in improving knowledge/critical thinking, problem-solving and self-directed learning in first-year medical students compared with the conventional method.

Study design

A meta-analysis was performed from March to June 2021 to assess the effectiveness of PBL in improving knowledge/critical thinking, problem-solving and self-directed learning in first-year medical students. Potentially relevant papers were identified and collected from PubMed, Cochrane, ScienceDirect, and Google Scholar, and the mean difference and 95% confidence interval (95%CI) were calculated using random-effect and fixed-effect models. We used meta-analysis protocols as a guide in our present study [ 12 ].

Search strategy

We conducted a systematic search in PubMed, Cochrane, ScienceDirect and Google Scholar up to June 5, 2021. The search strategy conformed to Medical Subject Headings (MeSH) and used a combination of the following keywords: (Problem-based Learning [MeSH Major Topic]) AND (effectiveness OR effectivity AND medical student AND first-year). Language constraints were applied in our search. When several reports presented the same results, we used only the most recent analysis with the larger sample size. We also scanned the reference lists of eligible studies and searched "Articles linked" for further candidate papers. Two independent reviewers (I.B.A.P.M, Y.N) identified potentially relevant records. Disagreements between the two reviewers were settled by discussion and/or consultation with the senior investigator (C.C.W) for a third opinion.

Eligibility criteria and data extraction

The inclusion criteria for this study were: (1) research subjects were medical students in the first year (first or second semester), (2) the study evaluated the knowledge/critical thinking, problem-solving and self-directed learning of the students, and (3) the study provided sufficient data for calculation of the mean difference and 95%CI, p-value, and study heterogeneity. The exclusion criteria were: (1) studies with insufficient data, (2) sample size less than 50, (3) intervention duration less than one year, and (4) reviews, letters to the editor, and comment articles. Two authors (I.B.A.P.M, Y.N) independently screened the title, abstract, and full text of the collected articles, extracted the data, and recorded them in a Google spreadsheet. The following information was extracted from each included article: (1) first author’s name and year of publication, (2) age of the participants, (3) intervention and control methods, (4) sample sizes of the case and control groups, (5) country of study, (6) study program, (7) duration of PBL intervention, and (8) scores of the PBL and control groups. Data extraction was carried out by two independent authors to prevent human error. Any disagreement was resolved by discussion.

Quality assessment

Two independent authors (I.B.A.P.M, Y.N) assessed the quality of the studies to ensure each sample’s validity and prevent the possible exaggeration of each study. The authors used major and minor criteria to assess the risk of bias. There were four major and four minor criteria. The authors assigned 2 points to each major criterion and 1 point to each minor criterion, for a maximum total score of 12 points. An article scoring 9–12 points was rated as “low risk of bias”, an article scoring 6–8 points as “medium risk of bias”, and an article scoring < 5 points as “high risk of bias”. When the two authors disagreed, a discussion was held. If the conflict was not settled, the two authors discussed it with the third author (C.C.W).
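The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the individual criteria are not listed in the text and are therefore passed in as plain true/false flags, and a score of exactly 5 is reported as unclassified because the stated bands (9–12, 6–8, < 5) do not cover it.

```python
from typing import Sequence

MAJOR_POINTS = 2  # points per major criterion met, as stated in the text
MINOR_POINTS = 1  # points per minor criterion met, as stated in the text


def quality_score(major_met: Sequence[bool], minor_met: Sequence[bool]) -> int:
    """Total quality score for one article (0 to 12 points)."""
    if len(major_met) != 4 or len(minor_met) != 4:
        raise ValueError("expected four major and four minor criteria")
    return MAJOR_POINTS * sum(major_met) + MINOR_POINTS * sum(minor_met)


def risk_of_bias(score: int) -> str:
    """Map a total score to the risk-of-bias band described in the text."""
    if not 0 <= score <= 12:
        raise ValueError("score must be between 0 and 12")
    if score >= 9:
        return "low"
    if score >= 6:
        return "medium"
    if score < 5:
        return "high"
    # Assumption: exactly 5 points falls between the stated bands.
    return "unclassified"


# Hypothetical article: all major criteria met, one minor criterion met.
example_score = quality_score([True, True, True, True], [True, False, False, False])
print(example_score, risk_of_bias(example_score))
```

Keeping the score and the band as separate functions makes it easy for two reviewers to compare raw scores before agreeing on a band.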

Statistical analysis

The methodological quality of the individual trials in each article was assessed for risk of bias before inclusion in the meta-analysis. The Z-test was used to assess the effectiveness of the learning method for each outcome (self-directed learning, critical thinking/knowledge, and problem-solving) and for the sub-group analyses. Forest plots displayed the group measurements and effect estimates. Heterogeneity was reported using several parameters: Chi², Tau², and I². Comprehensive Meta-Analysis (CMA, New Jersey, US) version 2.1 was first used to select the effect model. If the p-value for heterogeneity was less than 0.10, the random-effect model was used; in contrast, a fixed-effect model was used if the p-value was > 0.10. The analyses in our study were carried out using Review Manager version 5.3 (RevMan Cochrane, London, UK) and Comprehensive Meta-Analysis (CMA, New Jersey, US) version 2.1.
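As a rough illustration of the quantities named above, the following Python sketch computes an inverse-variance fixed-effect pooled mean difference, Cochran's Q (the Chi² statistic), I², and the two-sided Z-test p-value, and then applies the 0.10 rule for choosing the model. It is not the RevMan or CMA implementation. The function names and input values are hypothetical, the heterogeneity p-value is taken as an input rather than computed, and treating a p-value of exactly 0.10 as "fixed" is an assumption because the text does not cover that case.

```python
import math
from typing import Dict, Sequence


def pool_fixed_effect(effects: Sequence[float], std_errors: Sequence[float]) -> Dict[str, float]:
    """Inverse-variance fixed-effect pooling with Cochran's Q, I² and a Z-test."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² = max(0, (Q - df) / Q), expressed as a percentage.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return {"pooled": pooled, "se": pooled_se, "q": q,
            "i_squared": i_squared, "z": z, "p_value": p_value}


def choose_model(p_heterogeneity: float, threshold: float = 0.10) -> str:
    """Random-effect model if heterogeneity p < threshold, otherwise fixed-effect."""
    return "random" if p_heterogeneity < threshold else "fixed"


# Made-up mean differences and standard errors for two hypothetical studies.
result = pool_fixed_effect([0.0, 2.0], [1.0, 1.0])
print(result)
print(choose_model(0.05), choose_model(0.34))
```

When the random-effect model is chosen, the weights would additionally include a between-study variance term (Tau²), which this sketch does not estimate.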

Literature searching

This systematic review and meta-analysis extracted articles from four databases: PubMed, Cochrane, ScienceDirect, and Google Scholar. We identified 5536 articles. Eleven records were removed before screening due to duplication. In the first screening step, 5407 articles were excluded because their titles and abstracts did not match. Thus, 120 articles proceeded to the next screening. Of these 120 articles, the full text was not available for 39. The remaining 81 articles were assessed for eligibility according to the inclusion and exclusion criteria and risk of bias. Articles were excluded for the following reasons: no information about intervention duration (n = 16), small sample size (<50 samples) (n = 17), inappropriate study method (n = 12), and insufficient data (n = 29). Finally, seven articles were included in this review ( Fig 1 ).
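The selection counts can be reconciled with a short arithmetic check. The sketch below uses only the numbers reported in this paragraph, and the variable names are illustrative. Subtracting the stated removals from the 5536 identified records gives 118 rather than the reported 120, so the later steps start from the reported figure; from there the counts reconcile to the seven included articles.

```python
identified = 5536
duplicates_removed = 11
excluded_on_title_abstract = 5407
reported_after_screening = 120  # figure stated in the text
full_text_unavailable = 39

# Exclusion reasons at the eligibility stage, as reported.
eligibility_exclusions = {
    "no information about intervention duration": 16,
    "sample size below 50": 17,
    "inappropriate study method": 12,
    "insufficient data": 29,
}

# Direct subtraction from the identified records.
computed_after_screening = identified - duplicates_removed - excluded_on_title_abstract

# Later steps follow the reported figure of 120 articles.
assessed_for_eligibility = reported_after_screening - full_text_unavailable
included = assessed_for_eligibility - sum(eligibility_exclusions.values())

print("computed after screening:", computed_after_screening)
print("assessed for eligibility:", assessed_for_eligibility)
print("included:", included)
```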


From: Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group (2009). Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med 6(7): e1000097. doi: 10.1371/journal.pmed.1000097. For more information, visit www.prisma-statement.org.

https://doi.org/10.1371/journal.pone.0277339.g001

Baseline characteristics of the included studies and quality assessment

All of the studies were published within the last 20 years, and most were conducted in Asia. The sample sizes of the seven studies ranged from 56 to 131 participants, and the pooled sample was classified into two groups (PBL vs. conventional learning methods). The participants were first-year students in medicine (three articles), dentistry (one article), nursing (two articles), and midwifery (one article). The length of the intervention varied from several months to one year. No specific gender was evaluated ( Table 1 ). The included studies used various study designs. According to our assessment, two studies had a low risk of bias (score range 9–12 points), and the remaining articles had a medium risk of bias (score range 6–8 points).


https://doi.org/10.1371/journal.pone.0277339.t001

The effectiveness comparison of PBL and conventional learning method

The critical thinking/knowledge evaluation.

In the included studies, the conventional learning method consisted of the conventional method (two articles), lecture-based learning (LBL) (two articles), a tutorial learning group (one article), and theory-based discussion (one article). Three studies showed a higher pre-test score in the PBL group, and three studies showed a higher pre-test score in the conventional group; overall, the mean pre-test scores of the groups did not differ much. Post-test scores improved in each group after the intervention. Post-test scores in the PBL groups were mostly higher than in the conventional groups, except in the study by Choi et al., which is in line with those authors’ conclusion of no significant finding. In addition, the study by Lohman et al. also found that different teaching methods did not significantly influence students’ knowledge ( Table 2 ).


This table provides pre/post-test results in each group and the authors’ interpretation of their findings.

https://doi.org/10.1371/journal.pone.0277339.t002

The problem-solving evaluation.

Two articles investigated problem-solving in addition to the critical thinking or knowledge aspect. There was not much difference in the average pre-test and post-test results. Lohman et al. found a significant association between the learning method and the problem-solving aspect, whereas Choi et al. did not find a significant difference between the learning methods in the problem-solving aspect ( Table 3 ).


https://doi.org/10.1371/journal.pone.0277339.t003

The self-directed learning evaluation.

Three articles evaluated self-directed learning. In the Lohman et al. study, the course instructor assessed the students and scored self-directed learning. The higher the score obtained, the better the level of self-directed learning. No significant difference between learning methods in enhancing self-directed learning was found in any of the included studies ( Table 4 ).


https://doi.org/10.1371/journal.pone.0277339.t004

Meta-analysis assessment

Our meta-analysis assessed three outcome groups: critical thinking/knowledge, problem-solving, and self-directed learning.

Critical thinking/knowledge assessment.

Six articles evaluated critical thinking/knowledge in the conventional and PBL groups. This analysis used a random-effect model because the p-value for heterogeneity was <0.10. The heterogeneity of these articles was evaluated using the I² parameter. According to the RevMan analysis, I² was 93%, which falls within the 75% to 100% band (considerable heterogeneity). The pooled estimate favored PBL for developing critical thinking, but the difference between PBL and conventional learning methods was not statistically significant (p = 0.29) ( Fig 2 ). This section also included a sub-group analysis of the critical thinking aspect according to intervention duration and region ( Fig 3 ).


https://doi.org/10.1371/journal.pone.0277339.g002


(A) Fixed effect models. (B) Random effect models.

https://doi.org/10.1371/journal.pone.0277339.g003

Moreover, the critical thinking studies were regrouped according to the duration of the intervention (≤ 6 months vs. > 6 months) and region (Asian vs. Western countries). Subgroup analysis was performed using random-effect and fixed-effect models. The analysis by intervention duration found no significant difference between PBL and conventional learning methods. The subgroups of studies with an intervention duration of more than six months and of studies from Western countries showed low heterogeneity (I² = 0%, might not be important). However, high heterogeneity was found in the subgroups of studies with an intervention duration of less than six months (I² = 96%) and of studies from Asian countries (I² = 95%). We found no statistically significant difference between PBL and conventional learning methods in any subgroup, even though the overall effect for each subgroup (the diamond in the forest plot) leaned toward PBL ( Fig 3 ).
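The verbal labels attached to I² here ("might not be important", "considerable heterogeneity") match the rough interpretation bands given in the Cochrane Handbook. The sketch below encodes those bands on the assumption that this is the guide the authors followed. The bands overlap by design, so the hypothetical helper returns every label that applies to a given value.

```python
from typing import List

# Rough interpretation bands for I² (in percent); the bands overlap on purpose.
I2_BANDS = [
    (0, 40, "might not be important"),
    (30, 60, "may represent moderate heterogeneity"),
    (50, 90, "may represent substantial heterogeneity"),
    (75, 100, "considerable heterogeneity"),
]


def interpret_i_squared(i2_percent: float) -> List[str]:
    """Return all band labels whose range contains the given I² value."""
    if not 0 <= i2_percent <= 100:
        raise ValueError("I² must be between 0 and 100 percent")
    return [label for low, high, label in I2_BANDS if low <= i2_percent <= high]


# I² values reported for the subgroups in the text.
for value in (0, 95, 96):
    print(value, interpret_i_squared(value))
```

A value such as 86% sits in two bands at once, which is why the handbook treats these thresholds as a guide rather than strict cut-offs.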

Problem-solving.

We found two studies that examined the problem-solving aspect of PBL and conventional learning methods. The two studies showed considerable heterogeneity (I² = 86%). The overall results were analyzed using a random-effect model. The pooled effect leaned toward conventional teaching for enhancing problem-solving skills, but it was not statistically significant ( Fig 4 ).


https://doi.org/10.1371/journal.pone.0277339.g004

Self-directed learning.

Self-directed learning was evaluated using a fixed-effect model. The I² parameter showed no heterogeneity (I² = 0%, might not be important). The overall effect leaned toward the conventional method for enhancing self-directed learning; however, the difference was not statistically significant (p = 0.34) ( Fig 5 ).


https://doi.org/10.1371/journal.pone.0277339.g005

Problem-based learning is a learning method developed as an alternative to the conventional learning methods that have been used in various disciplines, one of which is health science. It emphasizes the active participation of students in analyzing and solving a given problem, both in group and individual settings, so that it can improve students’ skills in analyzing and solving problems [ 5 , 6 ].

Various studies have examined the effectiveness of PBL in the teaching and learning process [ 13 , 16 , 18 ]. Several factors may influence the implementation of PBL, such as the students’ year of study, the material taught, and the field of knowledge pursued by students. For the critical thinking/knowledge aspect, we found no significant difference between the conventional learning method group and the PBL group (p = 0.29). This finding likely reflects the lack of association between PBL and improved critical thinking/knowledge in the majority of the studies. Three of the six studies analyzed showed non-significant results, and only Tripathi’s (2015) [ 18 ] result was in line with our hypothesis. Choi et al. stated that their non-significant finding (p = 0.7) was due to an intervention too short to produce any meaningful effects [ 17 ]. However, intervention duration might not be a decisive factor in PBL effectiveness: Tripathi’s study [ 18 ] had the shortest intervention duration but still found significant results. Moreover, research conducted by Li et al. on critical thinking showed a significant difference between the experimental and control groups (p < 0.001) [ 20 ]. Tseng et al. also reported a significant difference in critical thinking scores between the experimental and control groups, with the experimental group scoring higher (p < 0.0001) [ 21 ].

The characteristics of the research sample can also affect PBL results. In this meta-analysis, we analyzed data from medical students in their first year. First-year students often experience obstacles in adapting to lecture methods that differ from high school teaching methods [ 1 ]. This problem is influenced by various factors, one of which is the difference in lecture methods between institutions. Adaptation to new environments and habits is also a challenge for first-year medical students. Adaptation to learning methods is a process of mental and behavioral response to a demand from the individual or to a formal task related to academic work. Therefore, students familiar with the teacher-centered method tend to have difficulty applying the student-centered PBL method in higher education. They also tend to experience challenges in absorbing the study materials, which affects the teaching and learning process in the first semester of lectures for medical students [ 22 ]. The factors explained above could also affect problem-solving and self-directed learning.

Besides critical thinking/knowledge, the other aspects of PBL evaluated were problem-solving and self-directed learning. We found that PBL is not superior to conventional learning in enhancing problem-solving (p = 0.47). This might be due to the limited number of studies that assessed this issue and were included in this study. The problem-solving aspect was analyzed in only two studies, and they had different results. Choi et al. [ 17 ] had a larger total sample, and the study also had a higher weight in the analysis (56.9%) than Lohman et al. [ 13 ]. Therefore, the pooled result tends to follow Choi et al. (non-significant finding) [ 17 ], in addition to the factors explained above.

As with the problem-solving aspect, PBL also failed to show any superiority over the conventional learning method in increasing self-directed learning. Two studies of this aspect, Lohman et al. (2002) [ 13 ] and Choi et al. [ 17 ], showed non-significant results. However, different findings were reported by Hayashi et al. (2013) [ 16 ]. According to the baseline characteristics of the studies, Hayashi’s study had a longer intervention duration than the studies by Lohman et al. (2002) [ 13 ] and Choi et al. [ 17 ]. This might have affected the results because the study subjects were exposed to the intervention for much longer, so the desired effect was seen [ 16 ]. The PBL learning system, which focuses on increasing the active participation of students, is expected to improve those aspects compared with the conventional approach. Research by Tseng et al. (2011) involving 120 nursing students (51 in the experimental group, 69 in the control group) showed a significant difference in self-directed learning scores, with the experimental group having a higher mean value than the control group (p < 0.0001) [ 21 ]. Three aspects of PBL were evaluated in this meta-analysis, and none showed a significant difference. Unfortunately, the specific factors that might have affected the results were not mentioned or explained in detail in each study.

The problem-based learning method has been used widely, and further investigation of this learning method is needed. The strength of this study was that our meta-analysis evaluated specific outcomes of PBL, namely critical thinking/knowledge assessment, problem-solving, and self-directed learning. Several studies discuss the PBL effect on general learning outcomes and specific backgrounds [ 9 , 14 , 18 , 19 , 23 ]. Our meta-analysis not only provided pre-test and post-test scores in each group but also explained the outcome in each study. Furthermore, we noted high levels of heterogeneity across studies in this meta-analysis. Several factors may cause this heterogeneity. First, the samples came from different countries with different backgrounds. Second, the instrument used to evaluate PBL progression differed in each study. Third, the duration of the intervention also varied, bringing different outcomes. All of these factors may contribute to the heterogeneity in our meta-analysis. Subgroup analysis was conducted to minimize the heterogeneity. This method reduced the heterogeneity only for the critical thinking/knowledge aspect, specifically when the duration of intervention was more than six months and in the Western country sub-group. Meanwhile, heterogeneity was not reduced when the duration of intervention was less than six months or in the Asian countries sub-group. This might be due to the factors pointed out above. Unfortunately, we could not run further subgroup analyses due to the limited number of studies on this topic.

Additionally, we believe that further primary studies are needed to evaluate the effectiveness of PBL. A multicenter approach is suggested as the most appropriate method to identify the cumulative effect and the differences between geographic areas or races. Moreover, researchers could also compare educational centers and examine the impact of culture and local technological progress on the implementation of PBL, because studies on these topics are rare. Psychological aspects also need to be discussed because first-year medical students may still rely on learning methods from high school, potentially affecting PBL.

In conclusion, according to our analysis, PBL is not superior to conventional learning in improving critical thinking/knowledge, problem-solving and self-directed learning in first-year medical students. In addition, our meta-analysis had several limitations: it evaluated learning outcomes only in the first year, and no studies with a multiyear approach were found. We could not equate the instruments used in PBL and did not evaluate outcomes specifically by study program. We also could not assess the socio-demographic factors that might contribute to students’ learning process, particularly their social culture. Therefore, a multicenter approach is suggested as the most appropriate method to identify the cumulative effect and the differences between geographic areas or races.

Supporting information

https://doi.org/10.1371/journal.pone.0277339.s001

  • 13. Lohman MC, Finkelstein M. to foster problem-solving skill. 2002;121–7.

ORIGINAL RESEARCH article

A bibliometric analysis of the landscape of problem-based learning research (1981–2021).

Fan Zhang

  • 1 Department of Nephrology, Longhua Hospital Shanghai University of Traditional Chinese Medicine, Shanghai, China
  • 2 Department of Anorectal, Longhua Hospital Shanghai University of Traditional Chinese Medicine, Shanghai, China
  • 3 Department of Cardiology, Longhua Hospital Shanghai University of Traditional Chinese Medicine, Shanghai, China
  • 4 Department of Nursing, Longhua Hospital Shanghai University of Traditional Chinese Medicine, Shanghai, China

Background: Problem-Based Learning (PBL) is an instructional method of hands-on, active learning centered on investigating and resolving messy, real-world problems. This study aims to systematically analyze the current status and hotspots of PBL research and provide insights for research in the field.

Methods: Problem-based learning-related publications were retrieved from the Web of Science Core Collection using “Problem-Based Learning”. Annual publications, countries, institutions, authors, journals, references, and keywords in the field were visually analyzed using R, VOSviewer, and Microsoft Excel 2019.

Results: A total of 2,790 articles and reviews were analyzed, with a steady increase in publications in the field of PBL. Overall, the United States was the major contributor to the study of PBL. Van Der Vleuten CPM was the key researcher in this field. Moreover, most of the publications were published in Medical Education. Keyword analysis showed that current research hotspots focus on the extensions of the PBL teaching mode, the application of the PBL teaching method, and the reform of PBL.

Conclusion: Research on PBL is flourishing. Cooperation and exchange between countries and institutions should be strengthened in the future. These findings will provide a better understanding of the state of PBL research and inform future research ideas.

Introduction

Problem-based learning (PBL) is a pedagogy that has received widespread attention in recent years ( Albanese and Mitchell, 1993 ). It emphasizes setting learning in complex problem situations, allowing learners to solve authentic problems collaboratively and to understand the scientific knowledge underlying the problems ( Wood, 2003 ; Dolmans et al., 2005 ). In addition to teaching course content, PBL can promote the development of critical thinking skills, problem-solving abilities, and communication skills ( van der Vleuten and Schuwirth, 2019 ). It can also foster working in groups, finding and evaluating research materials, and life-long learning ( Compton et al., 2020 ).

As a broad approach, PBL originated in medical education in the 1960s at the medical school at McMaster University in Canada ( Jones, 2006 ) and has since been promoted and modified in more than 60 medical schools ( Servant-Miklos et al., 2019 ). PBL was most used in the first 2 years of medical courses, replacing traditional teaching methods in anatomy, pharmacology, and physiology ( Devine et al., 2020 ). Today, PBL is widely used in business, dentistry, health sciences, law, engineering, and education ( Huth et al., 2021 ; Kühner et al., 2021 ; Michalsky and Cohen, 2021 ).

Bibliometric analysis refers to the qualitative and quantitative evaluation of specific research areas using mathematical and statistical methods to understand the knowledge structure and explore development trends ( Bornmann and Leydesdorff, 2014 ). In recent years, bibliometric research has received extensive attention as a way to provide a comprehensive overview of the published literature and identify research frontiers and future research trends ( Liu et al., 2021 ; Shawahna, 2021 ; Yu et al., 2021 ).

Previously published bibliometric studies on PBL have been limited to highly cited articles ( Azer, 2017 ). To understand the research trends of PBL teaching, this study therefore aims to analyze international scientific publications on PBL teaching using both quantitative and qualitative bibliometric analysis. This work will provide new perspectives and references for future PBL research.

Materials and Methods

Data Sources

Publications about PBL were retrieved from the Web of Science Core Collection database. The database covers over 21,000 peer-reviewed, high-quality academic journals, including open access journals published in over 250 medical, social science and humanities disciplines worldwide, and is widely used for bibliometric analysis.

Moreover, the database provides access to the authors (country), affiliation, keywords, and references cited for each publication, which is necessary for this study.

Search Strategy

The search strategy was TS = “Problem-Based Learning” from inception to 27 October 2021. No language restrictions were applied. A total of 3,339 publications were retrieved, and after excluding meeting abstracts, editorials, letters, and corrections, 2,790 publications were included, of which 156 were reviews and 2,634 were articles.
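The counts in this paragraph can be checked with a few lines of Python. The sketch below uses only the figures stated above, and the variable names are illustrative. It confirms that reviews plus articles add up to the included total and derives the number of excluded records, which the text does not state directly.

```python
retrieved = 3339
reviews = 156
articles = 2634

# Included publications are the reviews and articles together.
included = reviews + articles

# Records dropped as meeting abstracts, editorials, letters, or corrections.
excluded = retrieved - included

print("included:", included)
print("excluded:", excluded)
```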

Data Analysis

All downloaded documents were imported into R (version 4.1.1), VOSviewer (version 1.6.15), and Microsoft Excel 2019.

The bibliometrix R package is an open-source tool for quantitative research in scientometrics and bibliometrics ( Aria, 2017 ). VOSviewer is a software tool for constructing and visualizing bibliometric networks, including countries, journals, and authors based on citation, co-citation, or co-authorship relations. VOSviewer also offers text mining functionality that can be used to construct and visualize co-occurrence networks of important terms extracted from a body of scientific literature ( van Eck and Waltman, 2010 ). Scientific knowledge mapping makes it possible to intuitively understand the research hotspots and development process of each field in the knowledge system and to predict the development trend of each field ( Chen, 2004 ).

Trends in Global Publication

Based on the number of annual publications, this period was preliminarily divided into three phases ( Figure 1 ): the first phase is the initial period (1981–1990), with an average of two publications per year; the second phase, from 1991 to 2009, was considered as the development period, with an average of annual publications of 70; and the third phase, from 2010 to present, was known as the stable period when the annual number in this period was at a relatively stable state, and 120 publications were published annually.

Figure 1. Annual number of publications in the field of PBL research.

Distribution of Countries/Regions

A world map based on the number of publications from each country is shown in Figure 2A. A total of 87 countries/regions have published in the field. The United States contributed the most publications (801, 28.71% of all publications), followed by the United Kingdom (267, 9.57%), Canada (249, 8.92%), Australia (201, 7.20%), and the Netherlands (159, 5.70%) (Figure 2B). Publications from the United States (21,139 citations) were the most cited, with the United Kingdom (6,402 citations), the Netherlands (6,002 citations), Canada (5,263 citations), and Australia (3,580 citations) ranking second through fifth, respectively (Figure 2C).
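The percentage shares quoted above follow directly from the publication counts and the 2,790 included publications. A minimal sketch of that calculation is given below; the variable and function names are illustrative assumptions, not taken from the original analysis.

```python
TOTAL_PUBLICATIONS = 2790  # included publications reported in the search strategy

# Publication counts for the five most productive countries, as reported in the text.
publications_by_country = {
    "United States": 801,
    "United Kingdom": 267,
    "Canada": 249,
    "Australia": 201,
    "Netherlands": 159,
}


def percentage_share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {
    country: percentage_share(count, TOTAL_PUBLICATIONS)
    for country, count in publications_by_country.items()
}

for country, share in shares.items():
    print(f"{country}: {share:.2f}%")
```

Because a publication with authors from several countries counts once for each of them, shares computed this way can sum to more than 100% across all countries.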

Figure 2. Countries contributing to PBL research. (A) World map of the number of publications by country. (B) Top ten countries with the largest number of publications. (C) Total citations of publications from different countries.

The co-authorship analysis found a total of 56 countries/regions with at least five publications published in this field. The five countries with the highest total link strength were the United States (total link strength = 150), the United Kingdom (total link strength = 132), the Netherlands (total link strength = 93), Canada (total link strength = 79) and Australia (total link strength = 65). The network of cooperative relationships between countries is shown in Figure 2A .
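To make the notion of total link strength concrete, the sketch below computes it for a toy data set. It assumes full counting, in which the link strength between two countries equals the number of publications they co-authored and a country's total link strength is the sum of its link strengths. The function name and the sample records are hypothetical; this is an illustration of the idea, not VOSviewer's own code.

```python
from collections import Counter
from itertools import combinations


def total_link_strength(publications):
    """Sum, for each country, the strengths of its co-authorship links.

    `publications` is an iterable of country lists, one list per publication.
    """
    link_strength = Counter()
    for countries in publications:
        # Each unordered pair of distinct countries on a paper adds 1 to their link.
        for a, b in combinations(sorted(set(countries)), 2):
            link_strength[(a, b)] += 1

    totals = Counter()
    for (a, b), strength in link_strength.items():
        totals[a] += strength
        totals[b] += strength
    return totals


# Toy records (made up for illustration only).
sample = [
    ["US", "UK"],
    ["US", "UK", "NL"],
    ["NL"],            # single-country paper: creates no links
    ["US", "CA"],
]
print(total_link_strength(sample))
```

Single-country publications add nothing to the totals, which is why a prolific country can still show a low total link strength.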

Distribution of Institutions

A total of 1,973 institutions have published papers in this field. Among them, Maastricht University contributed the most (95 records), followed by McMaster University (66 records), Harvard University (47 records), the University of Pennsylvania (43 records), and the University of Manchester (42 records) (Figure 3A).

Figure 3. Institutions contributing to PBL research. (A) Top ten institutions with the largest number of publications. (B) Network map of co-authorship between institutions with more than five publications.

We analyzed co-authorship relationships between 187 institutions with at least five publications. After excluding 24 unconnected items, Figure 3B shows the collaborations of the remaining 163 institutions. The five institutions with the highest total link strength were Maastricht University (total link strength = 34), Erasmus University Rotterdam (total link strength = 25), Harvard University (total link strength = 25), the University of Sydney (total link strength = 18), and Johns Hopkins University (total link strength = 14).

Analysis of Journals and Research Areas

The 2,790 papers were published in 608 journals. Table 1 lists the top ten most popular journals for publishing papers on PBL. Medical Education published the most articles (n = 235), followed by Medical Teacher (n = 194), International Journal of Engineering Education (n = 127), Advances in Health Sciences Education (n = 111), and Academic Medicine (n = 100).

Table 1. The top ten popular journals and cited journals.

We analyzed a total of 141 journals that were co-cited at least 50 times (Figure 4). Table 1 lists the top ten co-cited journals. Of these, Medical Education has the most citations (4,757 citations), followed by Academic Medicine (4,482 citations), Medical Teacher (2,252 citations), Journal of Dental Education (894 citations), and Advances in Health Sciences Education (817 citations).

Figure 4. Network map of journals that were co-cited at least 50 times.

The included publications were categorized into 108 research areas. The most representative research areas were Educational Research (1,573 records), Health Care Sciences (774 records), Engineering (359 records), General Internal Medicine (230 records), and Nursing (212 records) (Table 2).

Table 2. The top ten representative research areas.

Analysis of Authors

In terms of the number of publications, Van Der Vleuten CPM was the most prolific author (n = 43), followed by Dolmans DHJM (n = 40), Schmidt HG (n = 32), Azer SA (n = 24), and Scherpbier AJJA (n = 21) (Figure 5A). In terms of influence, Schmidt HG has the largest number of citations in this field (1,074), followed by Dolmans DHJM (561), Van Der Vleuten CPM (540), Norman GR (445), and Mitchell S (423) (Figure 5B). Van Der Vleuten CPM had the highest h-index (27), followed by Schmidt HG (22), Dolmans DHJM (22), Scherpbier AJJA (16), and Wolfhagen IHAP (13) (Figure 5C).
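For readers unfamiliar with the h-index reported above: an author has an h-index of h when h of their publications have each been cited at least h times. A minimal sketch of the calculation follows; the citation counts in the example are made up for illustration and are not data from this study.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have at least `rank` citations
        else:
            break         # later papers have even fewer citations
    return h


# Hypothetical citation counts for one author's five papers.
print(h_index([10, 8, 5, 4, 3]))
```

This explains why the most cited author and the author with the highest h-index can differ: a few very highly cited papers raise total citations but not the h-index.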

Figure 5. Analysis of authors. (A) The number of author publications. (B) Total citations from different authors in the field of PBL. (C) h -index for authors. (D) Network map of co-authorship between authors with more than three publications.

We further analyzed a total of 212 authors who had co-authored more than three publications. After removing authors not connected to each other, the network shows the collaboration of 29 authors (Figure 5D). The five authors with the highest total link strength were Van Der Vleuten CPM (total link strength = 66), Dolmans DHJM (52), Wolfhagen IHAP (40), Scherpbier AJJA (33), and Schmidt HG (32).

Citation and Co-citation Analysis

The citation analysis showed that 243 documents had at least 50 citations (Figure 6A). Table 3 lists the ten most cited documents. In addition, we analyzed the 32 references that were co-cited more than 50 times (Figure 6B). Table 4 lists the ten most co-cited references.

Figure 6. Citations analysis. (A) Network map of citation analysis of documents with more than 50 citations. (B) Network map of co-citations analysis of references with more than 50 citations.

Table 3. Top ten most cited publications in this field.

Table 4. Top ten most co-cited references in this field.

Co-occurrence Analysis of Keywords

We analyzed a total of 86 keywords that occurred more than five times (Figure 7A). The colors in the overlay visualization in Figure 7B indicate the average publication year of the publications containing each keyword. Most keywords had an average publication year after 2012 and appear in greener or yellower colors. The density visualization shows the same keywords mapped by frequency of appearance (Figure 7C).
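The two quantities behind the keyword maps, occurrence count and average publication year, can be sketched as follows. This is an illustrative approximation under stated assumptions: the record format (a year plus a list of keywords), the lower-casing step, and the function name `keyword_stats` are hypothetical and are not the procedure VOSviewer applies internally.

```python
from collections import defaultdict


def keyword_stats(records, min_occurrences=5):
    """Return {keyword: (occurrences, average_year)} for frequent keywords.

    `records` is an iterable of (publication_year, keywords) pairs. Only keywords
    occurring MORE than `min_occurrences` times are kept, mirroring the
    "more than five times" threshold in the text.
    """
    years_by_keyword = defaultdict(list)
    for year, keywords in records:
        # Normalize case and whitespace; count a keyword once per publication.
        for keyword in {k.strip().lower() for k in keywords}:
            years_by_keyword[keyword].append(year)

    return {
        keyword: (len(years), sum(years) / len(years))
        for keyword, years in years_by_keyword.items()
        if len(years) > min_occurrences
    }


# Made-up records for illustration only.
sample_records = [(year, ["Problem-Based Learning", "nursing"] if year < 2012
                   else ["problem-based learning"]) for year in range(2010, 2016)]
print(keyword_stats(sample_records))
```

In the overlay map, the average year determines a keyword's color, while the occurrence count determines whether it appears at all.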

Figure 7. Co-occurrence analysis of keywords. (A) Network visualization. (B) Overlay visualization. (C) Density visualization.

This study analyzed the bibliometric properties of 2,790 publications included in a citation index of PBL studies conducted over the past 40 years. The trend in annual publications showed stable growth over this period. The bibliometric results give researchers, policymakers, and teaching staff valuable insights and references grounded in objective data.

A quantitative and visual analysis of the distribution of countries/regions and institutions shows that the United States and the United Kingdom are the leading countries where PBL research is being conducted. As shown in Figure 3A , there is a greater density and breadth of collaboration between the various countries. Research teams in the United States mainly collaborated with the United Kingdom, Canada, China, Australia, and Europe. In addition, although each institution has its own collaborative network, the breadth and intensity of the collaboration are not ideal. Collaboration mainly centers on Maastricht University, McMaster University, and Harvard University, the three institutions with the largest number of publications. The intricacies of the mapping illustrate two things: first, the close cooperation among institutions that have contributed to the results of PBL research, and second, the continuous development of PBL in the teaching of different disciplines.

Problem-based learning is a problem-oriented teaching method (Savery and Duffy, 1997). It is a teaching model in which students collect information independently around problems, identify and solve problems, and develop independent learning and innovation abilities (Domingo-Osle et al., 2021). Most studies of PBL were published in influential education-related journals such as Medical Education and Medical Teacher. Regarding co-cited journals, most studies came from high-impact journals. These journals are equally focused on education and influence the direction of research in the field. As shown in Table 2, in addition to educational research, PBL teaching has now been extended to clinical medicine, engineering, and computer science. This result is similar to the findings of another study, in which Azer found that highly cited articles in the field of PBL were distributed among journals in dental and medical education, general medicine, and teaching psychology (Azer, 2017).

In the past decade, the focus of teaching and learning, including medical education, has gradually shifted to developing students’ problem-solving, critical thinking, and self-directed learning skills (Merisier et al., 2018). PBL is being adopted and valued by an increasing number of universities and hospitals as a teaching model that fits well with constructivist learning theory and medical teaching principles (Al-Azri and Ratnapalan, 2014). This trend is corroborated by the journals publishing PBL research and the citation results presented in Tables 1–4. Within medical education, dentistry stands out as one of the disciplines in which PBL is most widely implemented. Branches of dentistry such as prosthodontics and orthodontics lend themselves to PBL and overlap closely with fields such as materials science, clinical medicine, pathology, and physiology (Ferro et al., 2019). Dentistry teaching therefore requires students to be proficient in dentistry-related courses and, more importantly, to apply and integrate them. As Azer noted, the bibliometric analysis of PBL has implications for dental teaching and research (Azer, 2017).

The most prolific authors in PBL studies were not always the most cited. The most prolific author was Van Der Vleuten CPM, while the most cited author was Schmidt HG. In terms of the number of citations, “Problem-based learning: a review of literature on its outcomes and implementation issues” (Albanese and Mitchell, 1993) was the most influential article, consistent with Azer’s bibliometric results (Azer, 2017), suggesting that this article is a classic citation in the field of PBL. The study compared the effects of PBL teaching and traditional teaching through meta-analysis, thus pointing out the advantages and disadvantages of PBL teaching. However, as shown in Figure 5D, the co-authors can be roughly divided into five clusters, and the density of collaboration between authors is low, which may be related to the interdisciplinary application of PBL.

The keyword analysis suggests that current research focuses on three orientations. (1) Extensions of the PBL teaching mode, such as case-based learning, flipped classroom, and team-based learning. Pedagogical research around PBL is gradually extending to different teaching modes in order to promote active learning and the development of education. (2) Application of PBL methods to clinical medicine, especially nursing. Questioning is an effective way to promote critical thinking, which is urgently needed in modern nursing to raise the overall quality of nurses; hence, its proper use is essential in fostering the development of clinical reasoning. (3) The reform of PBL, reflected in keywords such as thinking, challenge, and decision making. Today, problem-oriented teaching models often involve computer-based programs. Regardless of the technique used, the core of the approach remains the same: real-world problems. Reflections on PBL reform could be a future direction for research in this field.

It is worth noting that the bibliometric analysis also provides new ideas for teaching research. First, PBL is a student-centered teaching model that has been widely used in various disciplines. However, teaching characteristics differ between disciplines, and long-term quantitative assessments of its effectiveness are scarce. Second, problem-based learning is an advanced teaching method, but classical, traditional teaching methods should not be rejected wholesale for the sake of reform; both can coexist and complement each other’s strengths (Payne, 2004). From the teachers’ perspective, planning the important and difficult points of learning and developing targeted discussion outlines to motivate students is undoubtedly the key to PBL research. Third, questions are the core of PBL, and all learning activities revolve around questions. However, the purpose of PBL is to accomplish the course objectives, such as developing students’ knowledge base and various abilities. Setting up the curriculum and designing the questions according to the learning objectives are the key issues in PBL.

There are several inevitable limitations in this study. First, bibliometric data change with time, and different conclusions may be drawn at a later date. Second, bibliometric analysis is only an auxiliary tool, and the results may differ from real-world research conditions. Third, the literature search was limited to the Web of Science Core Collection database, which might have introduced selection bias. Fourth, we limited the search term for the study topic to “Problem-Based Learning,” so some relevant articles that use other terms, such as “PBL,” may have been missed.

The current study provides an overview of research findings and valuable insights into PBL worldwide. Research on PBL has continued to increase over the past few decades. The most productive country is the United States, participating in nearly 30% of publications, and the leading institution is Maastricht University. The most attractive journal in terms of PBL is Medical Education. In addition, collaborative research initiatives need to be established between institutions in developing countries and those in developed countries.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

FZ: conceptualization. FZ and HW: methodology and writing-original draft preparation. FZ and YB: software and data curation. HCZ: writing-review and editing. All authors have read and agreed to the published version of the manuscript.

This study was supported by the Teaching Department of Longhua Hospital, Shanghai University of Traditional Chinese Medicine (Teaching 542).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We are grateful for the support and assistance of Wu Xiaoli from the Teaching Department of Longhua Hospital, Shanghai University of Traditional Chinese Medicine.

Al-Azri, H., and Ratnapalan, S. (2014). Problem-based learning in continuing medical education: review of randomized controlled trials. Can. Fam. Physician 60, 157–165.

Albanese, M. A., and Mitchell, S. (1993). Problem-based learning: a review of literature on its outcomes and implementation issues. Acad. Med. 68, 52–81. doi: 10.1097/00001888-199301000-00012

Aria, M. (2017). Bibliometrix: an R-tool for comprehensive science mapping analysis. J. Informetr. 11, 959–975.

Azer, S. A. (2017). Top-Cited articles in problem-based learning: a bibliometric analysis and quality of evidence assessment. J. Dent. Educ. 81, 458–478. doi: 10.21815/JDE.016.011

Barrows, H. S. (1986). A taxonomy of problem-based learning methods. Med. Educ. 20, 481–486. doi: 10.1111/j.1365-2923.1986.tb01386.x

Barrows, H. S., and Tamblyn, R. B. (1980). Problem-Based Learning: An Approach to Medical Education. New York, NY: Springer.

Bornmann, L., and Leydesdorff, L. (2014). Scientometrics in a changing research landscape: bibliometrics has become an integral part of research quality evaluation and has been changing the practice of research. EMBO Rep. 15, 1228–1232. doi: 10.15252/embr.201439608

Chen, C. (2004). Searching for intellectual turning points: progressive knowledge domain visualization. Proc. Natl. Acad. Sci. U.S.A. 101, 5303–5310. doi: 10.1073/pnas.0307513100

Colliver, J. A. (2000). Effectiveness of problem-based learning curricula: research and theory. Acad. Med. 75, 259–266. doi: 10.1097/00001888-200003000-00017

Compton, R. M., Owilli, A. O., Norlin, E. E., and Hubbard Murdoch, N. L. (2020). Does problem-based learning in nursing education empower learning? Nurse Educ. Pract. 44:102752. doi: 10.1016/j.nepr.2020.102752

Devine, O. P., Harborne, A. C., Horsfall, H. L., Joseph, T., Marshall-Andon, T., Samuels, R., et al. (2020). The analysis of teaching of medical schools (AToMS) survey: an analysis of 47,258 timetabled teaching events in 25 UK medical schools relating to timing, duration, teaching formats, teaching content, and problem-based learning. BMC Med. 18:126. doi: 10.1186/s12916-020-01571-4

Dochy, F., Segers, M., Van den Bossche, P., and Gijbels, D. (2003). Effects of problem-based learning: a meta-analysis. Learn. Instr. 13, 533–568.

Dolmans, D. H. J. M., De Grave, W., Wolfhagen, I. H. A. P., and van der Vleuten, C. P. M. (2005). Problem-based learning: future challenges for educational practice and research. Med. Educ. 39, 732–741. doi: 10.1111/j.1365-2929.2005.02205.x

Domingo-Osle, M., La Rosa-Salas, V., Ambrosio, L., Elizondo-Rodriguez, N., and Garcia-Vivar, C. (2021). Educational methods used in cancer training for health sciences students: an integrative review. Nurse Educ. Today 97:104704. doi: 10.1016/j.nedt.2020.104704

Ferro, A. S., Nicholson, K., and Koka, S. (2019). Innovative trends in implant dentistry training and education: a narrative review. J. Clin. Med. 8:1618.

Hmelo-Silver, C. E. (2004). Problem-Based learning: what and how do students learn? Educ. Psychol. Rev. 16, 235–266.

Huth, K. C., von Bronk, L., Kollmuss, M., Lindner, S., Durner, J., Hickel, R., et al. (2021). Special teaching formats during the COVID-19 pandemic-a survey with implications for a crisis-proof education. J. Clin. Med. 10:5099.

Jones, R. W. (2006). Problem-based learning: description, advantages, disadvantages, scenarios and facilitation. Anaesth. Intens. Care 34, 485–488. doi: 10.1177/0310057X0603400417

Kühner, S., Ekblad, S., Larsson, J., and Löfgren, J. (2021). Global surgery for medical students - is it meaningful? A mixed-method study. PLoS One 16:e0257297. doi: 10.1371/journal.pone.0257297

Liu, Y., Li, X., Ma, L., and Wang, Y. (2021). Mapping theme trends and knowledge structures of dignity in nursing: a quantitative and co-word biclustering analysis. J. Adv. Nurs. doi: 10.1111/jan.15097

Merisier, S., Larue, C., and Boyer, L. (2018). How does questioning influence nursing students’ clinical reasoning in problem-based learning? A scoping review. Nurse Educ. Today 65, 108–115.

Michalsky, T., and Cohen, A. (2021). Prompting socially shared regulation of learning and creativity in solving STEM problems. Front. Psychol. 12:722535. doi: 10.3389/fpsyg.2021.722535

Norman, G. R., and Schmidt, H. G. (1992). The psychological basis of problem-based learning: a review of the evidence. Acad. Med. 67, 557–565. doi: 10.1097/00001888-199209000-00002

Norman, G. R., and Schmidt, H. G. (2000). Effectiveness of problem-based learning curricula: theory, practice and paper darts. Med. Educ. 34, 721–728.

Payne, J. D. (2004). Reform of undergraduate medical teaching in the United Kingdom: “problem based learning” v “traditional” is a false debate. BMJ 329:799. doi: 10.1136/bmj.329.7469.799-b

Savery, J., and Duffy, T. (1997). Problem based learning: an instructional model and its constructivist framework. Educ. Technol. 35, 31–38.

Schmidt, H. G. (1983). Problem-based learning: rationale and description. Med. Educ. 17, 11–16. doi: 10.1111/j.1365-2923.1983.tb01086.x

Schmidt, H. G. (1993). Foundations of problem-based learning: some explanatory notes. Med. Educ. 27, 422–432.

Servant-Miklos, V. F. C., Woods, N. N., and Dolmans, D. H. J. M. (2019). Celebrating 50 years of problem-based learning: progress, pitfalls and possibilities. Adv. Health Sci. Educ. 24, 849–851. doi: 10.1007/s10459-019-09947-9

Shawahna, R. (2021). Scoping and bibliometric analysis of promoters of therapeutic inertia in hypertension. Am. J. Manag. Care 27, e386–e394. doi: 10.37765/ajmc.2021.88782

van der Vleuten, C. P. M., and Schuwirth, L. W. T. (2019). Assessment in the context of problem-based learning. Adv. Health Sci. Educ. 24, 903–914. doi: 10.1007/s10459-019-09909-1

van Eck, N. J., and Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84, 523–538. doi: 10.1007/s11192-009-0146-3

Vernon, D. T., and Blake, R. L. (1993). Does problem-based learning work? A meta-analysis of evaluative research. Acad. Med. 68, 550–563. doi: 10.1097/00001888-199307000-00015

Wood, D. F. (2003). Problem based learning. BMJ 326, 328–330.

Wood, D. F. (2008). Problem based learning. BMJ 336:971.

Yu, H., Wang, Q., Wu, W., Zeng, W., and Feng, Y. (2021). Therapeutic effects of melatonin on ocular diseases: knowledge map and perspective. Front. Pharmacol. 12:721869. doi: 10.3389/fphar.2021.721869

Keywords : problem-based learning, bibliometric analysis, education, citation, research

Citation: Zhang F, Wang H, Bai Y and Zhang H (2022) A Bibliometric Analysis of the Landscape of Problem-Based Learning Research (1981–2021). Front. Psychol. 13:828390. doi: 10.3389/fpsyg.2022.828390

Received: 03 December 2021; Accepted: 22 February 2022; Published: 15 March 2022.

Reviewed by:

Copyright © 2022 Zhang, Wang, Bai and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Huachun Zhang, [email protected]

† These authors have contributed equally to this work

The Interdisciplinary Journal of Problem-based Learning (IJPBL) publishes relevant, interesting, and challenging peer-reviewed articles of research, analysis, or promising practice related to all aspects of implementing problem-based learning (PBL) in K–12 and post-secondary classrooms. ISSN 1541-5015.

Announcements

Special issue of the interdisciplinary journal of problem-based learning: designing for equity within problem-based and project-based learning.

It is our hope that this special issue of the IJPBL creates an opportunity for collecting and sharing some of the effective and innovative ideas related to the design and implementation of PBL for equity and social justice. We look forward to helping disseminate collected ideas to other educators so that more students might benefit from improved practices.

Current Issue

Vol. 17 No. 2 (2023): Special issue: Research Methodologies for Studying PBL

Published: 2023-12-31

Special Issue: Research Methodologies for studying PBL

  • Introduction to special issue “Research methodologies for studying problem-based and project-based learning”
  • Bibliometric review methodology and state of the science bibliometric review of research on problem-based learning, 2017-2021
  • Scoping review methodology and its use to review online project-based learning in higher education, 2020-2023
  • Conducting problem-based learning meta-analysis: complexities, implications, and best practices
  • Path analysis: the predictive relationships of problem-based learning processes on preservice teachers’ learning strategies
  • How realist reviews might be helpful to further insights in problem-based learning from theoretical grounding to practical application
  • Design-based research method in PBL/PjBL: a case in nursing education
  • An interactional ethnographic exploration of in-time and over time mentor-student interactions in invention education
  • Visual representations for studying collaborative inquiry

Xun Ge, The University of North Texas Krista Glazewski, North Carolina State University Woei Hung, The University of North Dakota

Associate Editors

Susan Bridges, The University of Hong Kong Stefanie Chye, National Institute of Education 

Interim Associate Editors

Victor Law, The University of New Mexico Kun Huang, The University of Kentucky Nada Dabbagh, George Mason University Heather Leary, Brigham Young University Nachamma Sockalingam, Singapore University of Technology and Design

Editorial Assistant

Ceyhun Muftuoglu, The University of Oklahoma

Effectiveness of problem-based learning methodology in undergraduate medical education: a scoping review

Joan Carles Trullàs

1 Medical Education Cathedra, School of Medicine, University of Vic-Central University of Catalonia, Vic, Barcelona, Spain

2 Internal Medicine Service, Hospital de Olot i Comarcal de La Garrotxa, Olot, Girona, Spain

3 The Tissue Repair and Regeneration Laboratory (TR2Lab), University of Vic-Central University of Catalonia, Vic, Barcelona, Spain

Carles Blay

4 Catalan Institute of Health (ICS) – Catalunya Central, Barcelona, Spain

Elisabet Sarri

Ramon Pujol

Associated Data

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Problem-based learning (PBL) is a pedagogical approach that shifts the central role from the teacher to the student (student-centered) and is based on self-directed learning. Although PBL has been adopted in undergraduate and postgraduate medical education, the effectiveness of the method is still under discussion. The authors’ purpose was to appraise the available international evidence concerning the effectiveness and usefulness of PBL methodology in undergraduate medical teaching programs.

The authors applied the Arksey and O’Malley framework to undertake a scoping review. The search was carried out in February 2021 in PubMed and Web of Science including all publications in English and Spanish with no limits on publication date, study design or country of origin.

The literature search identified one hundred and twenty-four publications eligible for this review. Although this review included many studies, their design was heterogeneous and only a few used a methodology providing high-level scientific evidence (randomized design and/or systematic reviews with meta-analysis). Furthermore, most were single-center experiences with small sample sizes and there were no large multi-center studies. PBL methodology obtained a high level of satisfaction, especially among students. It was more effective than more traditional (or lecture-based) methods at improving social and communication skills, problem-solving and self-learning skills. Knowledge retention and academic performance were no worse (and in many studies were better) than with traditional methods. PBL was not universally widespread, probably because it requires greater human resources and continuous training for its implementation.

PBL is an effective and satisfactory methodology for medical education. It is likely that through PBL medical students will not only acquire knowledge but also other competencies that are needed in medical professionalism.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12909-022-03154-8.

There has always been enormous interest in identifying the best learning methods. In the mid-twentieth century, US educator Edgar Dale proposed which actions would lead to deeper learning than others and published the well-known (and at the same time controversial) “Cone of Experience or Cone of Dale”. At the apex of the cone are oral representations (verbal descriptions, written descriptions, etc.) and at the base is direct experience (based on a person carrying out the activity that they aim to learn), which represents the greatest depth of our learning. In other words, each level of the cone corresponds to various learning methods. At the base are the most effective, participative methods (what we do and what we say) and at the apex are the least effective, abstract methods (what we read and what we hear) [ 1 ]. In 1990, psychologist George Miller proposed a framework pyramid to assess clinical competence. At the lowest level of the pyramid is knowledge (knows), followed by the competence (knows how), execution (shows how) and finally the action (does) [ 2 ]. Both Miller’s pyramid and Dale’s cone propose a very efficient way of training and, at the same time, of evaluation. Miller suggested that the learning curve passes through various levels, from the acquisition of theoretical knowledge to knowing how to put this knowledge into practice and demonstrate it. Dale stated that to remember a high percentage of the acquired knowledge, a theatrical representation should be carried out or real experiences should be simulated. It is difficult to situate methodologies such as problem-based learning (PBL), case-based learning (CBL) and team-based learning (TBL) in the context of these learning frameworks.

In the last 50 years, various university education models have emerged and have attempted to reconcile teaching with learning, according to the principle that students should lead their own learning process. Perhaps one of the most successful of these models is PBL, which emerged in the English-speaking world. There are many descriptions of PBL in the literature, but in practice there is great variability in what people understand by this methodology. The original conception of PBL as an educational strategy in medicine was initiated at McMaster University (Canada) in 1969, leaving aside the traditional methodology (which is often based on lectures) and introducing student-centered learning. The new formulation of medical education proposed by McMaster did not separate the basic sciences from the clinical sciences, and partially abandoned theoretical classes, which were taught after the presentation of the problem. In its original version, PBL is a methodology in which the starting point is a problem or a problematic situation. The situation enables students to develop a hypothesis and identify learning needs so that they can better understand the problem and meet the established learning objectives [ 3 , 4 ]. PBL is taught using small groups (usually around 8–10 students) with a tutor. The aim of the group sessions is to identify a problem or scenario, define the key concepts identified, brainstorm ideas and discuss key learning objectives, research these and share this information with each other at subsequent sessions. Tutors are used to guide students, so they stay on track with the learning objectives of the task. Contemporary medical education also employs other small group learning methods including CBL and TBL. Characteristics common to the pedagogy of both CBL and TBL include the use of an authentic clinical case, active small-group learning, activation of existing knowledge and application of newly acquired knowledge.
In CBL, students are encouraged to engage in peer learning and apply new knowledge to these authentic clinical problems under the guidance of a facilitator. CBL encourages a structured and critical approach to clinical problem-solving, and, in contrast to PBL, is designed to allow the facilitator to correct and redirect students [ 5 ]. On the other hand, TBL offers a student-centered, instructional approach for large classes of students who are divided into small teams of typically five to seven students to solve clinically relevant problems. The overall similarities between PBL and TBL relate to the use of professionally relevant problems and small group learning, while the main difference relates to one teacher facilitating interactions between multiple self-managed teams in TBL, whereas each small group in PBL is facilitated by one teacher. Further differences are related to mandatory pre-reading assignments in TBL, testing of prior knowledge in TBL and activating prior knowledge in PBL, teacher-initiated clarifying of concepts that students struggled with in TBL versus student-generated issues that need further study in PBL, inter-team discussions in TBL and structured feedback and problems with related questions in TBL [ 6 ].

In the present study we have focused on PBL methodology, and, as attractive as the method may seem, we should consider whether it is really useful and effective as a learning method. Although PBL has been adopted in undergraduate and postgraduate medical education, the effectiveness (in terms of academic performance and/or skill improvement) of the method is still under discussion. This is due partly to the methodological difficulty in comparing PBL with traditional curricula based on lectures. To our knowledge, there is no systematic scoping review in the literature that has analyzed these aspects.

The main motivation for carrying out this research and writing this article was scientific but also professional interest. We believe that reviewing the state of the art of this methodology, once it was already underway in our young Faculty of Medicine, could allow us to know whether we were on the right track and whether we should implement changes in the training of future doctors.

The primary goal of this study was to appraise available international evidence concerning the effectiveness and usefulness of PBL methodology in undergraduate medical teaching programs. As the intention was to synthesize the scattered evidence available, we chose to conduct a scoping review. A scoping study tends to address broader topics where many different study designs might be applicable. Scoping studies may be particularly relevant to disciplines, such as medical education, in which the paucity of randomized controlled trials makes it difficult for researchers to undertake systematic reviews [ 7 , 8 ]. Even though the scoping review methodology is not widely used in medical education, it is well established for synthesizing heterogeneous research evidence [ 9 ].

The specific aims were: 1) to determine the effectiveness of PBL in academic performance (learning and retention of knowledge) in medical education; 2) to determine the effectiveness of PBL in other skills (social and communication skills, problem solving or self-learning) in medical education; 3) to determine the level of satisfaction perceived by medical students (and/or tutors) when they are taught with the PBL methodology (or, in the case of tutors, when they teach with it).

This review was guided by Arksey and O’Malley’s methodological framework for conducting scoping reviews. The five main stages of the framework are: (1) identifying the research question; (2) ascertaining relevant studies; (3) determining study selection; (4) charting the data; and (5) collating, summarizing and reporting the results [ 7 ]. We reported our process according to the PRISMA Extension for Scoping Reviews [ 10 ].

Stage 1: Identifying the research question

With the goals of the study established, the four members of the research team established the research questions. The primary research question was “What is the effectiveness of PBL methodology for learning in undergraduate medicine?” and the secondary question “What is the perception and satisfaction of medical students and tutors in relation to PBL methodology?”.

Stage 2: Identifying relevant studies

After the research questions and a search strategy were defined, the searches were conducted in PubMed and Web of Science using the MeSH terms “problem-based learning” and “Medicine” (the Boolean operator “AND” was applied to the search terms). No limits were set on language, publication date, study design or country of origin. The search was carried out on 14th February 2021. Citations were uploaded to the reference manager software Mendeley Desktop (version 1.19.8) for title and abstract screening, and data characterization.
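
As a rough illustration, the Boolean search described above could be assembled as follows. This is a minimal sketch, not the authors' procedure: the `[MeSH Terms]` field tag and the function names are assumptions, since the review only states that the two terms were combined with "AND". The endpoint is the public NCBI E-utilities `esearch` service; the code only builds the URL and sends nothing over the network.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(terms, field="MeSH Terms"):
    """Join search terms with the Boolean operator AND.

    The field tag is an assumption; the review does not state which
    tag (if any) was attached to each term.
    """
    return " AND ".join(f'"{term}"[{field}]' for term in terms)


def build_esearch_url(terms):
    """Return an esearch URL for PubMed; no request is made here."""
    params = {
        "db": "pubmed",
        "term": build_query(terms),
        "rettype": "count",  # ask only for the number of hits
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


query = build_query(["problem-based learning", "Medicine"])
url = build_esearch_url(["problem-based learning", "Medicine"])
print(query)
print(url)
```

Because no date filter is included (the review set no limit on publication date), running such a query today would also return records published after 14 February 2021.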

Stage 3: Study selection

The search strategy in our scoping study generated a total of 2399 references. The literature search and screening of title, abstract and full text for suitability was performed by one author (JCT) based on predetermined inclusion criteria. The inclusion criteria were: 1) PBL methodology was the major research topic; 2) participants were undergraduate medical students or tutors; 3) the main outcome was academic performance (learning and knowledge retention); 4) the secondary outcomes were one of the following: social and communication skills, problem solving or self-learning and/or student/tutor satisfaction; 5) all types of studies were eligible, including descriptive papers, qualitative, quantitative and mixed-methods studies, perspectives, opinion and commentary pieces and editorials. Exclusion criteria were studies including other types of participants, such as postgraduate medical students, residents and students of non-medical health disciplines such as pharmacy, veterinary medicine, dentistry or nursing. Studies published in languages other than Spanish and English were also excluded. In situations in which uncertainty arose, all authors (CB, ES, RP) discussed the publication together to reach a final consensus. The outcomes of the search results and screening are presented in Fig. 1. One hundred and twenty-four articles met the inclusion criteria and were included in the final analysis.
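
The criteria above can be read as a simple filter over the retrieved records. The sketch below is only an illustration under stated assumptions: the `Record` fields and the example records are invented, the outcome-based criteria (3 and 4) are not modeled, and the review itself was screened manually rather than with code. The final lines restate the flow arithmetic reported in the text.

```python
from dataclasses import dataclass

ALLOWED_LANGUAGES = {"English", "Spanish"}
ALLOWED_POPULATIONS = {"undergraduate medical students", "tutors"}


@dataclass
class Record:
    """One citation as it might be charted during screening (hypothetical fields)."""
    title: str
    language: str
    population: str
    pbl_is_main_topic: bool


def is_eligible(record: Record) -> bool:
    """Apply the topic, population and language criteria to one record."""
    return (
        record.pbl_is_main_topic
        and record.population in ALLOWED_POPULATIONS
        and record.language in ALLOWED_LANGUAGES
    )


# Invented records, for illustration only.
records = [
    Record("PBL in a cardiology course", "English", "undergraduate medical students", True),
    Record("PBL for nursing students", "English", "nursing students", True),
    Record("Lecture formats compared", "Spanish", "undergraduate medical students", False),
    Record("Tutor views on PBL", "German", "tutors", True),
]
included = [r.title for r in records if is_eligible(r)]
print(included)

# Flow arithmetic reported in the review: 2399 identified, 124 included.
identified, kept = 2399, 124
not_included = identified - kept
inclusion_rate = round(100 * kept / identified, 1)
print(not_included, inclusion_rate)
```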

Fig. 1 Study flow PRISMA diagram. Details the review process through the different stages of the review; includes the number of records identified, included and excluded

Stage 4: Charting the data

A data extraction table was developed by the research team. Data extracted from each of the 124 publications included general publication details (year, author, and country), sample size, study population, design/methodology, main and secondary outcomes and relevant results and/or conclusions. We compiled all data into a single spreadsheet in Microsoft Excel for coding and analysis. The characteristics and the study subject of the 124 articles included in this review are summarized in Tables 1 and 2. The detailed results in the Microsoft Excel file are also available in Additional file 1.

Table 1 Characteristics of the 124 publications included in the scoping review

a The number of publications of each country appears in parentheses.

b Including: Bahrain, Iran, South Korea, Pakistan, Philippines, Singapore, Sri Lanka, Taiwan and Vietnam.

c Including: Belgium, Georgia, Netherlands and Sweden.

d Forty-eight studies included secondary outcomes: student satisfaction (24), tutor satisfaction (9), knowledge retention (5), social and/or communication skills (5), reasoning (1) and other outcomes (4).

Table 2 Study design according to main and secondary outcomes and continents

a Sample size was available in 99 studies. Results are expressed in median and [range]

Stage 5: Collating, summarizing and reporting the results

As indicated in the search strategy (Fig. 1), this review resulted in the inclusion of 124 publications. Publication years of the final sample ranged from 1990 to 2020; 51 publications (41%) were from the years 2010–2020, and the years with the most publications were 2001, 2009 and 2015. Countries from the six continents were represented in this review. Most of the publications were from Asia (especially China and Saudi Arabia) and North America, followed by Europe, and few studies were from Africa, Oceania and South America. The country with the most publications was the United States of America (n = 27). The most frequent designs of the selected studies were surveys or questionnaires (n = 45) and comparative studies (n = 48, only 16 of which were randomized) with traditional or lecture-based learning methodologies (in two studies the comparison was with simulation), and the most frequently measured outcomes were academic performance followed by student satisfaction (48 studies measured more than one outcome). The few studies with the highest level of scientific evidence (systematic reviews with meta-analysis and randomized studies) were conducted mostly in Asian countries (Tables 1 and 2). The study subject was specified in 81 publications, which showed high variability but also broad representation of almost all disciplines of medical studies.

The sample size was available in 99 publications and the median [range] number of participants was 132 [14–2061]. By study population, there were more participants in the student-focused studies (median 134, range 16–2061) than in the tutor-focused studies (median 53, range 14–494).

Finally, after reviewing in detail the measured outcomes (main and secondary) according to the study design (Table 2 and Additional file 1), we present a narrative overview and a synthesis of the main findings.

Main outcome: academic performance (learning and knowledge retention)

Seventy-one of the 124 publications had learning and/or knowledge retention as a measured outcome; most of them (n = 45) were comparative studies with traditional or lecture-based learning and 16 were randomized. These studies were varied in their methodology, were performed in different geographic zones, and normally analyzed the experience of just one education center. Most studies (n = 49) reported superiority of PBL in learning and knowledge acquisition [ 11 – 59 ] but there was no difference between traditional and PBL curriculums in another 19 studies [ 60 – 78 ]. Only three studies reported that PBL was less effective [ 79 – 81 ]; two of them were randomized (in one case favoring simulation-based learning [ 80 ] and in the other favoring lectures [ 81 ]) and the remaining study was based on tutors’ opinion rather than real academic performance [ 79 ]. It is noteworthy that the four systematic reviews and meta-analyses included in this scoping review, all carried out in China, found that PBL was more effective than lecture-based learning in improving knowledge and other skills (clinical, problem-solving, self-learning and collaborative) [ 40 , 51 , 53 , 58 ]. Another relevant example of the superiority of the PBL method over the traditional method is the experience reported by Hoffman et al. from the University of Missouri-Columbia. The authors analyzed the impact of implementing the PBL methodology in their Faculty of Medicine and revealed an improvement in the academic results that lasted for over a decade [ 31 ].
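
To put the counts above in proportion, the three groups of studies can be expressed as shares of the 71 publications that measured academic performance. The snippet below is a small worked example using only the counts reported in the text; the labels and variable names are illustrative.

```python
# Counts reported in the review for the 71 studies on academic performance.
findings = {
    "PBL superior": 49,
    "no difference": 19,
    "PBL less effective": 3,
}

total = sum(findings.values())
shares = {label: round(100 * n / total, 1) for label, n in findings.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Roughly seven in ten of these studies favored PBL, which is a vote count across heterogeneous designs rather than a pooled effect estimate.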

Secondary outcomes

Social and communication skills

We found five studies in this scoping review that focused on these outcomes and all of them described that a curriculum centered on PBL seems to instill more confidence in social and communication skills among students. Students perceived PBL positively for teamwork, communication skills and interpersonal relations [ 44 , 45 , 67 , 75 , 82 ].

Student satisfaction

Sixty publications analyzed student satisfaction with PBL methodology. The most frequent study designs were surveys or questionnaires (30 studies) followed by comparative studies with traditional or lecture-based methodology (19 studies, 7 of which were randomized). Almost all the studies (51) showed that PBL is generally well-received [ 11 , 13 , 18 – 22 , 26 , 29 , 34 , 37 , 39 , 41 , 42 , 46 , 50 , 56 , 58 , 63 , 64 , 66 , 78 , 82 – 110 ] but in 9 studies the overall satisfaction scores for the PBL program were neutral [ 76 , 111 – 116 ] or negative [ 117 , 118 ]. Some factors that have been identified as key components for PBL to be successful include a small group size, the use of realistic case scenarios and good management of group dynamics. Despite a mostly positive assessment of the PBL methodology by the students, there were some negative aspects that could be criticized or improved. These include unclear communication of the learning methodology, objectives and assessment method; poor management and organization of the sessions; tutors having little experience of the method; and a lack of standardization in the implementation of the method by the tutors.
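
The study counts in this paragraph can be cross-checked against the bracketed citations by expanding each numeric range. The helper below is a hypothetical sketch written for this purpose (the function name and approach are not from the review); it confirms that the citation lists contain 51, 7 and 2 references, which add up to the 60 publications mentioned.

```python
import re


def expand_citations(text):
    """Expand a citation string such as '11, 13, 18–22' into a list of ints."""
    numbers = []
    # Each match is a single number or a range joined by a hyphen or en dash.
    for start, end in re.findall(r"(\d+)(?:\s*[–-]\s*(\d+))?", text):
        first = int(start)
        last = int(end) if end else first
        numbers.extend(range(first, last + 1))
    return numbers


positive = expand_citations(
    "11, 13, 18–22, 26, 29, 34, 37, 39, 41, 42, 46, 50, 56, 58, "
    "63, 64, 66, 78, 82–110"
)
neutral = expand_citations("76, 111–116")
negative = expand_citations("117, 118")

print(len(positive), len(neutral), len(negative))
print(len(positive) + len(neutral) + len(negative))
```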

Tutor satisfaction

Only 15 publications analyzed the satisfaction of tutors, most of them surveys or questionnaires [ 85 , 88 , 92 , 98 , 108 , 110 , 119 ]. In comparison with the satisfaction of the students, the results here were more neutral [ 112 , 113 , 115 , 120 , 121 ] and even unfavorable to the PBL methodology in two publications [ 117 , 122 ]. PBL teaching was favored by tutors when the institutions trained them in the subject and when there was administrative support and adequate infrastructure and coordination [ 123 ]. In some experiences, the PBL modules created an unacceptable toll of anxiety, unhappiness and strained relations.

Other skills (problem solving and self-learning)

The effectiveness of the PBL methodology has also been explored in other outcomes such as the ability to solve problems and to engage in self-directed learning. All studies have shown that PBL is more effective than lecture-based learning in problem-solving and self-learning skills [ 18 , 24 , 40 , 48 , 67 , 75 , 93 , 104 , 124 ]. One single study found a poor accuracy of the students’ self-assessment when compared to their own performance [ 125 ]. In addition, there are studies that support PBL methodology for integration between basic and clinical sciences [ 126 ].

Finally, other publications have reported the experience of some faculties in the implementation of the PBL methodology. Different experiences have demonstrated that it is both possible and feasible to shift from a traditional curriculum to a PBL program, while recognizing that PBL methodology is complex to plan and structure, needs substantial human and material resources and requires an immense effort from teachers [ 28 , 31 , 94 , 127 – 133 ]. In addition, and despite its cost implication, a PBL curriculum can be successfully implemented in resource-constrained settings [ 134 , 135 ].

We conducted this scoping review to explore the effectiveness of, and satisfaction with, PBL methodology for teaching in undergraduate medicine and, to our knowledge, it is the only study of its kind (systematic scoping review) that has been carried out in recent years. Similarly, Vernon et al. conducted a meta-analysis of articles published between 1970 and 1992 and their results generally supported the superiority of the PBL approach over more traditional methods of medical education [ 136 ]. PBL methodology is implemented in medical studies on the six continents but there is more experience (or at least more publications) from Asian countries and North America. Despite its apparent difficulties in implementation, a PBL curriculum can be successfully implemented in resource-constrained settings [ 134 , 135 ]. Although it is true that the few studies with the highest level of scientific evidence (randomized studies and meta-analyses) were carried out mainly in Asian countries (and some in North America and Europe), there were no significant differences in the main results according to geographical origin.

In this scoping review we have included a large number of publications that, despite their heterogeneity, tend to show favorable results for the usefulness of the PBL methodology in teaching and learning medicine. The results tend to be especially favorable to PBL methodology when it is compared with traditional or lecture-based teaching methods, but when it is compared with simulation the picture is less clear. There are two studies that show neutral [ 71 ] or superior [ 80 ] results for simulation in the acquisition of specific clinical skills. It seems important to highlight that the four meta-analyses included in this review, which included a high number of participants, show results that are clearly favorable to the PBL methodology in terms of knowledge, clinical skills, problem-solving, self-learning and satisfaction [ 40 , 51 , 53 , 58 ].

Regarding the level of satisfaction described in the surveys or questionnaires, the overall satisfaction rate was higher in the PBL students when compared with traditional learning students. Students work in small groups, allowing and promoting teamwork and facilitating social and communication skills. As sessions are more attractive and dynamic than traditional classes, this could lead to a greater degree of motivation for learning.

These satisfaction results are not so favorable when tutors are asked, and this may be due to different reasons: first, some studies are from the 1990s, when the methodology was not yet fully implemented; second, the number of tutors included in these studies is low; and third, and perhaps most importantly, the complaints are not usually due to the methodology itself, but rather due to lack of administrative support and/or work overload. PBL methodology implies more human and material resources. Lecturers who lack experience in guided self-learning require more training. Some teachers may not feel comfortable with the method and therefore do not apply it correctly.

However effective and/or attractive the PBL methodology may seem, some (not many) authors are clearly detractors and have published opinion articles with fierce criticism of this methodology. Some of the arguments against are as follows: clinical problem solving is the wrong task for preclinical medical students, self-directed learning interpreted as self-teaching is not appropriate in undergraduate medical education, relegation to the role of facilitators is a misuse of the faculty, small-group experience is inherently variable and sometimes dysfunctional, etc. [ 137 ].

In light of the results found in our study, we believe that PBL is an adequate methodology for the training of future doctors, and our findings reinforce the idea that PBL should have an important weight in the curriculum of our medical school. It is likely that, by training through PBL, the doctors of the future will not only have great knowledge but may also acquire greater capacity for communication, problem solving and self-learning, all of which are characteristics that are required in medical professionalism. In this regard, Koh et al. analyzed the effect that PBL during medical school had on physician competencies after graduation, finding a positive effect mainly in social and cognitive dimensions [ 138 ].

Despite its defects and limitations, we must not abandon this methodology and, in any case, perhaps PBL should evolve, adapt, and improve to enhance its strengths and address its weaknesses. It is likely that the new generations, trained in schools using new technologies and methodologies far from lectures, will feel more comfortable (either as students or as tutors) with methodologies more like PBL (small groups and work focused on problems or projects). It would be interesting to examine the implementation of technologies and even social media into PBL sessions, an issue that has been poorly explored [ 139 ].

Limitations

Scoping reviews are not without limitations. Our review includes 124 articles from the 2399 initially identified and, despite our efforts to be as comprehensive as possible, we may have missed some (probably few) articles. Even though this review includes many studies, their design is very heterogeneous, and only a few include a large sample size and a methodology that provides a high level of scientific evidence. Furthermore, most are single-center experiences and there are no large multi-center studies. Finally, the frequency of the PBL sessions (from once or twice a year to the whole curriculum) was not considered, in part because most of the reviewed studies did not specify this information. This factor could affect the efficiency of PBL and the perceptions of students and tutors about PBL. However, the adoption of a scoping review methodology was effective in terms of summarizing the research findings, identifying limitations in studies’ methodologies and findings, and providing a more rigorous vision of the international state of the art.

Conclusions

This systematic scoping review provides a broad overview of the efficacy of PBL methodology in undergraduate medicine teaching from different countries and institutions. PBL is not a new teaching method given that it has already been 50 years since it was implemented in medicine courses. It is a method that shifts the leading role from teachers to students and is based on guided self-learning. If it is applied properly, the degree of satisfaction is high, especially for students. PBL is more effective than traditional methods (based mainly on lectures) at improving social and communication skills, problem-solving and self-learning skills, and has no worse results (and in many studies better results) in relation to academic performance. Despite that, its use is not universally widespread, probably because it requires greater human resources and continuous training for its implementation. In any case, more comparative and randomized studies and/or other systematic reviews and meta-analyses are required to determine which educational strategies could be most suitable for the training of future doctors.

Acknowledgements

Not applicable

Authors’ contributions

JCT had the idea for the article, performed the literature search and data analysis and drafted the first version of the manuscript. CB, ES and RP contributed to the data analysis and suggested revisions to the manuscript. All authors read and approved the final manuscript.

No funding was received for conducting this study.

Declarations

Not applicable for a literature review.

All authors declare that they have no conflict of interest.

New Research Makes a Powerful Case for PBL

Two new gold-standard studies provide compelling evidence that project-based learning is an effective strategy for all students—including historically marginalized ones.

When Gil Leal took AP Environmental Science in his junior year of high school, he was surprised by how different it was from his other AP classes. Instead of spending the bulk of the time sitting through lectures, taking notes, and studying abstract texts, his class visited a strawberry farm in the valley nearby.

It wasn’t just for a tour. Leal and his peers were tasked with thinking about the many challenges that modern farms confront, from water shortages to pest infestations and erosion. More surprising to Leal: Students were asked to design their own solutions, incorporating what they had learned about things like soil composition, ecosystem dynamics, and irrigation systems.

Now an environmental science major at UCLA—and a first-generation college student—Leal sees the visit as a pivotal moment that led to his decision to pursue science in college. He had never visited a farm before, and was used to a traditional sit-and-listen learning model.

“In other classes, it was lecture, readings, test,” said Leal, “but in AP Environmental Science we worked on projects with other students, discussed our ideas, considered different perspectives—and I learned so much more this way.”

Leal’s AP class, taught by Brandie Borges, is part of a new generation of classes that transform traditional teacher-led instruction into a more student-centered, project-based approach—requiring students to work together as they tackle complex, real-world problems that emphasize uncertainty, iterative thinking, and innovation. Proponents of project-based learning (PBL) argue that it fosters a sense of purpose in young learners, pushes them to think critically, and prepares them for modern careers that prize skills like collaboration, problem-solving, and creativity.

Critics say that the pedagogy places too much responsibility on novice learners, and ignores the evidence about the effectiveness of direct instruction by teachers. By de-emphasizing knowledge transfer from experts to beginners, the critics suggest, PBL undermines content knowledge and subject fluency.

While project-based learning and direct instruction aren't incompatible, evidence that might settle the deeper controversy over PBL's effectiveness has been sparse. Only a handful of studies over the last decades have established a causal relationship between structured project-based learning and student outcomes—in either direction.

But two major new gold-standard studies, both funded by Lucas Education Research (a sister division of Edutopia) and conducted by researchers from the University of Southern California and Michigan State University, provide compelling evidence that project-based learning is an effective strategy for all students, outperforming traditional curricula not only for high achieving students, but across grade levels and racial and socioeconomic groups.

Reimagining Advanced Placement Courses

The two studies involved over 6,000 students in 114 schools across the nation, with more than 50 percent of students coming from low-income households.

In the AP study, which included Gil Leal’s class along with over 3,600 students in both AP Environmental Science and AP U.S. Government and Politics courses from five districts serving a diverse student body, researchers looked at a broad range of project-based activities in the sciences and humanities.

In one example, students in Amber Graeber’s AP Government class took part in a simulation of an electoral caucus. Meanwhile, instead of simply reading about Supreme Court cases, students in Erin Fisher’s class studied historic cases and then took on real-world roles, arguing the cases in mock court, acting as reporters, and designing campaign ads and stump speeches to make their case.

Researchers found that nearly half of students in project-based classrooms passed their AP tests, outperforming students in traditional classrooms by 8 percentage points. Students from low-income households saw similar gains compared to their wealthier peers, making a strong case that well-structured PBL can be a more equitable approach than teacher-centered ones. Importantly, the improvements in teaching efficacy were both significant and durable: When teachers in the study taught the same curriculum for a second year, PBL students outperformed students in traditional classrooms by 10 percentage points.

The study results nudged at entrenched ideas about how to best teach students from different backgrounds. “There’s a belief among some educators and some policymakers that students from underserved backgrounds… aren’t ready to have student-centered instruction where they’re driving their own learning,” said USC researcher Anna Saavedra, the lead researcher on the AP study. “And so there’s this idea, and the results of this study really challenged that notion.”

Nationally, the researchers concluded, 30 percent of students from low-income households take AP tests, but that number jumped to 38 percent for students in PBL classrooms—there are more low-income students taking AP tests using project-based learning, and more are passing as well.
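
Figures like these are stated in percentage points, which are easy to confuse with relative percent change. As a small worked example using only the two participation rates quoted above (the function names are illustrative, not from the studies), the difference can be expressed both ways:

```python
def percentage_point_change(before_pct, after_pct):
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct, after_pct):
    """Change expressed as a percent of the starting value."""
    return 100 * (after_pct - before_pct) / before_pct


# AP test participation among students from low-income households,
# as quoted in the article: 30 percent nationally, 38 percent in PBL classrooms.
points = percentage_point_change(30, 38)
relative = round(relative_change(30, 38), 1)
print(points)    # 8
print(relative)  # 26.7
```

An 8-point gap on a 30 percent base is therefore a relative increase of roughly 27 percent.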

It may seem counterintuitive that a student-centered approach is effective in an environment that’s so focused on high-stakes testing, but the results suggest otherwise.

“Students felt like the work was more authentic,” said Saavedra, suggesting a possible explanation for the improvements. “There were more connections to their real lives. For example, in the AP Environmental Science course, they were learning about their ecological footprint and thinking: How do my behaviors affect the health of my community and of the larger world?”

Authentic Learning

But project-based learning isn’t just for high school kids. In Billie Freeland’s third-grade class, PBL not only builds students’ interest in science but also helps them make more connections with the world around them, generating a deep understanding of—and appreciation for—science, she says.

“Third-grade students work on the ‘Toy Unit,’” said Freeland. “But don’t let the name fool you.... Third graders learn the concepts of gravity, friction, force, and direction by designing toys from simple objects such as water bottles, straws, and recycled milk cartons. The unit ends with them designing their own toys that use magnetic or electrical force,” she told researchers, while emphasizing that the projects are aligned with Next Generation Science Standards (NGSS).

Freeland’s class was one of dozens involved in the large-scale study examining the effectiveness of PBL in elementary science classes. In the study, researchers from Michigan State University and the University of Michigan studied 2,371 third-grade students in 46 schools who were randomly assigned to a business-as-usual control group or a treatment group. The schools selected for the study were diverse: 62 percent of the schools’ student bodies qualified for free or reduced-price lunch, and 58 percent were students of color.

Like the high school students in the AP study, elementary students in PBL classrooms outperformed their peers, this time by 8 percentage points on a test of science learning. The pattern held across socioeconomic class and across all reading ability levels: In the project-based learning group, all boats rose on the tide—and both struggling readers and highly proficient readers outperformed their counterparts in traditional classrooms.

“The beauty of all of this, which is really quite lovely, is that we have PBL in science, a progression of it, from elementary through high school,” said Barbara Schneider, a professor of education at Michigan State University who worked on the study. “Our findings are consistent all across elementary and secondary school, which is really quite remarkable. And in both cases, we’re looking at substantial increases in science achievement.”

The Takeaway: In two gold-standard, randomized, controlled trials of thousands of students in diverse school systems across the U.S., project-based learning significantly outperformed traditional curricula, raising academic performance across grade levels, socioeconomic subgroups, and reading ability. To learn more about the AP courses and the research, watch the videos Reinventing AP Courses With Rigorous Project-Based Learning and A Project-Based Approach to Teaching Elementary Science.

  • Research article
  • Open access
  • Published: 29 April 2021

Does problem-based learning education improve knowledge, attitude, and perception toward patient safety among nursing students? A randomized controlled trial

  • Hossein Jamshidi 1 ,
  • Masumeh Hemmati Maslakpak 2 &
  • Naser Parizad   ORCID: orcid.org/0000-0001-7393-3010 3  

BMC Nursing volume 20, Article number: 70 (2021)

Patient safety is a top priority for any health care system. Most universities are looking for teaching methods that enhance students’ clinical decision-making capabilities and self-directed learning to ensure safe, high-quality nursing care. Therefore, this study aimed to determine the effect of patient safety education through problem-based learning (PBL) on nursing students’ knowledge, attitude, and perceptions toward patient safety.

This randomized, controlled trial was conducted from September 2019 to January 2020. A total of 78 fourth-year nursing students participated in this study. The participants were randomly assigned to either the intervention group or the control group. In the intervention group, the educational materials were presented to the students using the PBL method during eight sessions of 45–60 min. In each control group, nursing students received eight education sessions covering the same educational content through lectures and discussion. Data were gathered 1 month after the intervention using demographic information and knowledge, attitudes, and perception questionnaires. Data were analyzed in SPSS ver. 22.0 using descriptive (mean and standard deviation) and inferential (chi-square test, independent t-test, paired t-test, and analysis of covariance (ANCOVA)) statistics.

The results indicated that the difference in the mean scores of knowledge, attitudes, and perceptions of the nursing students about patient safety was statistically significant between the two groups after the PBL education (p = 0.001). The mean scores of students’ knowledge, attitude, and perceptions of patient safety increased significantly in the intervention group.

Conclusions

Implementing patient safety education through PBL positively affects knowledge, attitudes, and perceptions of patient safety among nursing students. Thus, the research team recommended the PBL method to be used by nursing professors to improve nursing students’ clinical skills and cognitive abilities to ensure safe patient care.

Trial registration

IRCT20190925044881N1; October 17, 2019.


Patient safety is a priority issue for all health care systems worldwide [1, 2]. Providing safe and error-free care is the ultimate goal of all healthcare systems [3]. Nurses are leading healthcare team members [4], and they have a fundamental responsibility to ensure patient safety [5]. It is estimated that there are 421 million hospital admissions worldwide every year. Meanwhile, approximately 7.42 million cases of adverse events occur during these hospitalizations, making patient harm the 14th leading cause of global deaths [6]. Furthermore, one in every ten patients is harmed while receiving hospital care, and the World Health Organization (WHO) considers patient safety an endemic and epidemic concern [7]. Annually, more than 400,000 premature deaths occur due to preventable adverse events, and the incidence of serious harm is 10 to 20 times higher than the mortality rate [8]. In clinical settings, nursing students sometimes participate directly in unsafe care, errors, adverse events, and poor patient care. For that reason, like other healthcare team members, they should use their knowledge, attitude, and perception of patient safety while caring for the patient [9]. Lack of patient safety knowledge is one of the educational problems among nursing students that leads to unsafe practice [10]. Mansour and Francis (2013) stated that graduate nurses should have sufficient knowledge to identify potential safety risks, and they should have the confidence to protect patients against preventable harm or adverse events [11, 12].

On the other hand, an unsafe attitude is a precursor to adverse events because it shapes and influences behavior, so any change in attitude has a significant effect on people’s safety behavior [13]. Nowadays, it is widely accepted that optimal patient safety development is not possible without a safe attitude in health care facilities [14]. Therefore, nurses’ attitude toward patient safety is very important in promoting a safe environment for patients [15]. Nurses’ perceptions are the foundation of any behavior, lead to actions that affect patient safety, and are vital for all hospitals and healthcare providers [16].

Hence, evaluating nursing and medical students’ knowledge, attitude, and perceptions toward patient safety is necessary because they are future healthcare professionals [17]. Most universities around the world are looking for teaching methods that enhance students’ clinical decision-making capabilities and self-directed learning [18]. In recent decades, the use of new and active student-centered learning methods has been trending strongly in educational systems [19]. PBL is an innovative educational method that focuses on one problem, assigned either by the students or by the teacher [20], and it has been adopted in medical sciences such as nursing, midwifery, dentistry, and medicine in many universities around the world [21]. PBL is a student-centered pedagogy in which students and professors are responsible partners in the learning-teaching process, and teaching is a way to facilitate learning [22]. The purpose of this method in medical education is to acquire basic clinical knowledge, make progress in personal learning skills, learn to deal effectively with challenges at the patient’s bedside, and ultimately improve dynamism and motivation for learning [23].

As members of the healthcare team, nurses play a vital role in improving patient safety, originating from their attitudes, knowledge, and skill in patient safety [15, 24]. Also, the WHO emphasizes teaching patient safety to medical and nursing students, and the ministries of health focus on patient safety programs [25]. PBL has been widely adopted in medical and nursing schools worldwide, and many nursing education experts believe that PBL can bridge the gap between theory and practice [26, 27]. Thus, this study aimed to determine the effect of patient safety education through PBL on nursing students’ knowledge, attitude, and perceptions toward patient safety.

Research design and setting

This randomized, controlled trial was conducted in the Urmia School of Nursing and Midwifery from September 2019 to January 2020. This study was approved by the Review Board of Urmia University of Medical Sciences (IR.UMSU.REC.1398.219) and obtained a registration code from the Iranian Registry of Clinical Trials (IRCT20190925044881N1).

Study participants

In this study, all fourth-year nursing students who met the inclusion criteria constituted the study population. The inclusion criteria consisted of the following: (i) willing to participate in the study, (ii) being a fourth-year nursing student, and (iii) having no involvement in the same educational programs. The exclusion criteria included the following: (i) unwilling to stay in the research, and (ii) having more than two absences from the educational sessions. Based on a previous similar study (the mean and standard deviation of the problem-based learning score was 6 ± 2.14 and 7.76 ± 2.18 in the control and intervention groups, respectively) and considering an effect size (ES) of 0.814, α = 0.05, and power of 90%, the sample size was calculated as 32 for each group [28]. A total of 78 fourth-year nursing students were recruited into the study to allow for a drop-out rate of 10%. Cohen (1992) classified an effect size of 0.80 as large, which provides a known benchmark against which an experiment’s effect-size findings can be compared [29].
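The sample-size figures above can be checked with a short calculation. The sketch below is only an illustration: it assumes the authors used a pooled-standard-deviation effect size (Cohen's d) and the usual normal-approximation formula for a two-sided comparison of two means, neither of which is stated in the article. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using the equal-n pooled SD (assumed formula)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_b - mean_a) / pooled_sd


def n_per_group(effect_size, alpha=0.05, power=0.90):
    """Normal-approximation sample size per group for a two-sided, two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Means and SDs are those quoted from the previous study [28].
d = cohens_d(6.0, 2.14, 7.76, 2.18)
# Effect size, alpha, and power as reported in the text.
n = n_per_group(0.814, alpha=0.05, power=0.90)
print(f"effect size = {d:.3f}, n per group = {n}")
```

Under these assumptions the computed effect size is about 0.815, which agrees with the reported 0.814 up to rounding, and the required sample is 32 per group, matching the text.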

Randomization

The department manager had divided the participants into nine groups based on the internship curriculum. The second researcher randomly allocated the participants into five intervention (n = 43) and four control (n = 35) groups. Simple randomization was used to allocate nursing students to either control or intervention groups. The random allocation was as follows: the first researcher assigned a name to each of the nine groups and placed the groups’ names inside opaque envelopes. The first five groups drawn from the envelopes were considered the intervention groups. The remaining four groups were designated as the control groups.
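The envelope draw described above amounts to shuffling the nine pre-formed groups and taking the first five as intervention groups. The following is a minimal sketch of that idea only; the group labels, the seed, and the function name are hypothetical, and the study itself used physical opaque envelopes rather than software.

```python
import random


def allocate_groups(group_names, n_intervention, seed=None):
    """Shuffle whole groups and split them into intervention and control arms."""
    rng = random.Random(seed)      # seeded generator so the draw can be reproduced
    shuffled = list(group_names)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_intervention], shuffled[n_intervention:]


# Hypothetical labels for the nine internship groups.
groups = [f"Group {i}" for i in range(1, 10)]
intervention, control = allocate_groups(groups, n_intervention=5, seed=2019)
print("Intervention:", intervention)
print("Control:", control)
```

Because whole internship groups are allocated rather than individual students, the arm sizes (43 and 35 in the study) depend on how many students each drawn group contains.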

Outcome measure

A two-part questionnaire was used to collect data: a demographic information questionnaire and a questionnaire on nursing students’ knowledge, attitude, and perceptions toward patient safety. It was adopted from the studies of Leung (2010) [30] and Madigosky et al. (2006) [31]. This questionnaire comprises 26 questions, of which six questions assess students’ knowledge about patient safety (primary outcome), eight questions assess their attitude or tendency towards patient safety (secondary outcome), and 12 questions assess students’ perception of patient safety (secondary outcome). This questionnaire is scored on a 5-point Likert scale. In section 1 (attitude and perception items) of the questionnaire on patient safety, the 5-point Likert scale is scored as follows: 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree, and in section 2 (knowledge items), this scale is scored as follows: 1 = very poor, 2 = poor, 3 = fair, 4 = good, 5 = very good. In a study by Nabilou et al. (2015), this questionnaire was first translated into Persian, and then it was re-translated into English (forward and backward translation method) and reviewed by two faculty members who were proficient in English. Ultimately, the necessary adaptations were made to the questionnaire. The research team and four patient safety experts reviewed the questionnaire and confirmed its validity. The reliability of the questionnaire was also confirmed using the internal consistency method with Cronbach’s alpha of 0.723 [25].
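For readers unfamiliar with the internal consistency method mentioned above, Cronbach's alpha can be computed from item-level responses as sketched below. The response matrix is invented purely for illustration and is not data from the study; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                        # number of items
    items = list(zip(*responses))                # transpose to per-item score lists
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented 5-point Likert responses: three respondents, three items,
# answered with perfect consistency across items.
consistent = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]
alpha_consistent = cronbach_alpha(consistent)
print(f"alpha = {alpha_consistent:.3f}")
```

Perfectly consistent items give an alpha of 1, while values around 0.7, such as the 0.723 reported for this questionnaire, are commonly treated as acceptable.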

Study interventions

The PBL-based education

In each session, the first researcher presented the students in the intervention group with a written scenario about knowledge, attitude, and perception of patient safety. Students had a week to review the scenario. A problem-based learning method was implemented to investigate each scenario in the intervention group. The PBL method’s steps were as follows: In the first step, the instructor asked the students to read the problem scenario and encouraged them to clarify vague concepts. In the second step, the problem was defined by the instructor. In the third step, the students brainstormed and discussed the problem as a group. In the fourth step, students listed the facts, generated hypotheses based on the scenario content, and answered the questions based on the nursing process to achieve educational goals. In the fifth step, they reached a consensus on learning objectives within the group, and the instructor ensured that the goals were complete, comprehensive, and appropriate. In the sixth step, they conducted independent and group study to gather information from the recommended resources using the library and the internet. In the seventh step, the instructor presented and analyzed the solutions based on the hypotheses, goals, and questions, conducted the interdisciplinary discussion, and summarized and evaluated the proposed solutions (Table 1).

In each of the five intervention groups, eight education sessions of 45–60 min were conducted, for a total of 40 PBL sessions in this study. At the beginning of each session, the instructor reviewed the scenario given to the students the previous week. At the end of the session, the instructor presented the following week’s scenario to the students.

Routine education

In the control groups, the researcher delivered routine education covering the same educational content regarding patient safety. The hospital’s routine method was lecture and discussion of the educational content. The students had eight routine sessions in each control group, for a total of 32 sessions across the control groups (see supplementary file).

Data collection procedure

The second researcher held the introductory session at Urmia School of Nursing and Midwifery. He introduced himself to the participants and presented the study process and objectives to them. Participants completed questionnaires after they signed a written informed consent form. The study intervention lasted for 4 months. Then, all the participants filled in the questionnaire on patient safety 1 month after the intervention. After the intervention finished, the second researcher held the PBL educational sessions for the nursing students in the control group.

Data analysis

Collected data were entered into SPSS software version 22.0 (IBM Corp., Armonk, NY, USA) and analyzed using descriptive (mean and standard deviation) and inferential (chi-square test, independent t-test, paired t-test, and analysis of covariance (ANCOVA)) statistics. The CONSORT flow diagram of the study is presented in Fig. 1. The CONSORT 2010 checklist was utilized to ensure quality reporting in the present study [32].
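The two t-tests named above compare different things: the independent t-test compares two separate groups, while the paired t-test compares the same students before and after. The sketch below computes only the test statistics and degrees of freedom with the Python standard library. It is an illustration, not the authors' SPSS procedure; the scores are invented and the function names are hypothetical. Obtaining p-values would additionally require the t distribution, which is not shown here.

```python
import math
from statistics import mean, stdev, variance


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2


def paired_t(before, after):
    """t statistic and degrees of freedom for before/after scores of the same subjects."""
    diffs = [a - b for b, a in zip(before, after)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Invented scores for illustration only.
t_ind, df_ind = independent_t([3, 4, 5], [1, 2, 3])
t_pair, df_pair = paired_t([1, 2, 3, 4], [2, 4, 4, 6])
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.3f}, df = {df_pair}")
```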

Fig. 1 CONSORT flow diagram of the study

Seventy-eight nursing students entered the analysis, with no attrition in this study. The results indicated no statistically significant difference between the two groups in terms of age, gender, semester, marital status, residency, interest in the nursing major, clinical work experience, and grade point average (GPA), indicating that the two groups were homogeneous (Table 2).
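Homogeneity of categorical characteristics such as gender or marital status is typically checked with the chi-square test listed in the data analysis. As a rough illustration of how the statistic is formed from a contingency table, the sketch below uses invented counts that are not taken from Table 2; the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Invented 2 x 2 table: rows are study groups, columns are categories.
chi2, dof = chi_square_statistic([[10, 20], [20, 10]])
print(f"chi-square = {chi2:.3f}, df = {dof}")
```

A small statistic relative to the chi-square distribution with the given degrees of freedom indicates no evidence that the groups differ on that characteristic.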

The results of the paired t-test indicated that the mean score of patient safety knowledge in the control group did not differ significantly before and after the intervention (p = 0.279). However, the mean score of patient safety knowledge in the intervention group increased significantly after the intervention (p = 0.001) (primary outcome). Similarly, the mean score of students’ attitudes toward patient safety did not change significantly in the control group over the intervention period (p = 0.529), whereas it increased significantly in the intervention group after the PBL education (p = 0.016) (secondary outcome). The paired t-test also showed no significant difference in the mean score of students’ perception of patient safety in the control group before and after the intervention period (p = 0.122). In contrast, the mean score of students’ perception of patient safety increased significantly in the intervention group after PBL education (p = 0.037) (secondary outcome) (Table 3).

A significant difference was found in the mean score of patient safety knowledge between the control and intervention groups both before and after the intervention (p = 0.001). No statistically significant difference was revealed in the mean score of students’ attitudes toward patient safety between the two groups before the intervention (p = 0.152). However, the difference was statistically significant between the two groups after the PBL education (p = 0.006). Consequently, PBL positively affected students’ attitudes about patient safety in the intervention group. The independent t-test demonstrated that the difference in the mean score of students’ perception of patient safety was not statistically significant between the two groups before the intervention (p = 0.264). After the intervention, however, the mean score of students’ perception of patient safety was significantly higher in the intervention group than in the control group (p = 0.001). Accordingly, PBL had a positive effect on the mean score of students’ perceptions of patient safety in the intervention group.

Because there was a significant difference in the mean score of patient safety knowledge between the two groups before the intervention (Table 3), we used ANCOVA to ensure that the significant difference in the mean score of patient safety knowledge after the intervention was due to the PBL educational approach and not to the pre-intervention knowledge of the intervention group. After checking Levene’s test to confirm the homogeneity of variance between the two groups, we used ANCOVA and confirmed the effect of PBL on the difference in mean knowledge scores between the two groups after the intervention (F = 40.90, p < 0.05) (Table 4).
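The logic of this adjustment can be illustrated with a one-covariate ANCOVA, in which the post-test score is compared between groups after removing the part explained by the pre-test score. The sketch below is a generic textbook formulation using a common within-group slope. It is not the authors' SPSS model, the scores are invented, and the function names are hypothetical.

```python
from statistics import mean


def _sums(xs, ys):
    """Sums of squares and cross-products of xs and ys around their means."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def ancova_f(groups):
    """F statistic for the group effect on post scores, adjusted for pre scores.

    groups: list of (pre_scores, post_scores) pairs, one pair per group.
    """
    all_pre = [x for pre, _ in groups for x in pre]
    all_post = [y for _, post in groups for y in post]
    sxx_t, sxy_t, syy_t = _sums(all_pre, all_post)   # ignoring group membership
    sxx_w = sxy_w = syy_w = 0.0
    for pre, post in groups:                          # pooled within-group sums
        sxx, sxy, syy = _sums(pre, post)
        sxx_w, sxy_w, syy_w = sxx_w + sxx, sxy_w + sxy, syy_w + syy
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t          # covariate only
    sse_full = syy_w - sxy_w ** 2 / sxx_w             # covariate plus group
    df_between = len(groups) - 1
    df_error = len(all_pre) - len(groups) - 1
    f_stat = ((sse_reduced - sse_full) / df_between) / (sse_full / df_error)
    return f_stat, df_between, df_error


# Invented (pre, post) scores for a control and an intervention group.
control_scores = ([1, 2, 3], [2, 3, 5])
intervention_scores = ([1, 2, 3], [4, 6, 7])
f_stat, df1, df2 = ancova_f([control_scores, intervention_scores])
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

A large F for the group term, as with the reported F = 40.90, indicates that the post-intervention difference remains after baseline scores are taken into account.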

The results showed that the students’ knowledge about patient safety increased significantly after the PBL educational approach. The results of the following studies are consistent with our study results. Meo (2013) showed that the students who were educated through the PBL method acquired significantly higher knowledge and skill compared to the students who were educated through lecture-based learning [33]. A study conducted by Yew and Goh (2016) showed that PBL is an effective teaching and learning approach, especially when evaluated for long-term knowledge retention and applications [34]. PBL is a preferential method for both the long-term retention of course content and the use of clinical skills [35]. It plays an important role in broadening knowledge horizons, improving learning skills, and enriching the teamwork experience. Moreover, the tutors’ role as facilitators and motivators for appropriate activities is one of the main reasons for improving knowledge in PBL sessions [36]. PBL can improve nurses’ education by teaching them how to apply theory to clinical practice and develop their problem-solving skills [37]. It encourages students to be self-directed and promotes their critical thinking, leadership, and teamwork skills [38]. Dring (2019) revealed that PBL prepares students to work together and effectively communicate to provide more patient-focused care [39]. Contrary to our findings, Arpanantikul and Luecha (2010) reported that engaging in collaborative learning is considered a challenge and that the PBL method failed to improve learning processes and knowledge acquisition. They concluded that nursing students in the PBL method discuss non-specific issues, fail to create group ideas, and obtain incomplete and superficial knowledge [40].

Our results revealed that nursing students’ attitudes toward patient safety improved significantly after the intervention. In line with our findings, Terashita et al. (2016) concluded that plain radiography practical training through PBL promoted students’ attitudes toward self-efficacy and increased their self-efficacy through self-directed learning [41]. Furthermore, Park and Choi’s (2015) study showed that PBL plays a considerable role in improving learning attitude, critical thinking disposition, and problem-solving skills in nursing students [42]. PBL improves learning by constructing an understanding of the interrelationship between basic science concepts and medical knowledge [43]. Few studies have investigated the effect of PBL on nursing students’ knowledge and attitude toward patient safety. Liu et al. (2009) reported that the PBL approach is an effective way for nursing students to improve patient safety knowledge and enhance their integrative capacity [44]. Sahota (2020) stated that PBL promotes learners’ knowledge and skills in non-technical subjects, including patient safety, and enhances their ability to cope with the challenges they encounter in clinical environments [45].

Our findings also showed that the perception of patient safety increased significantly in nursing students after PBL education. High student perceptions of patient safety were reported in a similar study [25]. This increase in students’ perceptions of patient safety through the PBL method can be explained by its significant effect on students’ learning, motivation, and experience [46]. Penjvini and Shahsawari (2013) found that students in the PBL group acquired more knowledge, had a higher level of motivation towards learning, and provided better care for patients than students in the lecture group [47]. Kim and Han (2016) showed that education programs that are implemented to strengthen critical thinking, self-efficacy, and problem-solving promote patient safety competence among clinical nurses [48]. Despite the many benefits of the PBL method, it can stress students by creating frustration, anxiety, uncertainty, and fear [49, 50]. PBL is also known as a time-consuming educational method [40, 49].

In general, Liu et al. (2019) concluded that problem-based learning is superior to conventional teaching methods in areas such as interest in learning, teamwork spirit, problem-solving ability, analysis, knowledge attainment and application, and communication skills [51]. Another study reported that problem-based learning enhances active learning and students’ innate motivation, which improves deep learning among students [52]. Khatiban et al. (2019) conducted a study to compare the effect of lecture-based and problem-based learning in ethics education among nursing students. They recommended that problem-based learning be used in other nursing areas since it is an effective tool for developing moral reasoning [53]. A recent systematic review showed the effectiveness of problem-based learning in nursing education and student empowerment, and its authors called for widespread acceptance and use of this method for education in nursing schools [54].

Study limitations

One limitation of our study was that the participants’ mental and emotional state while completing the questionnaires could have influenced the study results. This limitation was beyond the control of the researchers. The short follow-up period was another limitation. Therefore, the authors suggest that future studies use a more extended follow-up period to determine the effect of the PBL educational approach on the persistence of learning over time. Another limitation was that the nursing students in the control and intervention groups were from the same nursing faculty. We suggest that students be recruited from different nursing schools in future studies. The between-group difference in nursing students’ pre-intervention patient safety knowledge was another weakness of this study. The authors tried to control for it statistically using ANCOVA.

Patient safety is of great significance in various areas of nursing, including education and practice. Like other healthcare team members, nursing students have the opportunity to improve the quality of patient safety. Meanwhile, nursing instructors play a vital role in improving the students’ required knowledge, attitude, and perception of patient safety. They can ensure that nursing graduates are well prepared to provide a safe environment and care for patients. Based on this study’s findings, PBL significantly improved students’ knowledge, attitude, and perception toward patient safety compared to conventional teaching methods. Considering the positive outcomes of PBL, including learning improvement, continuous and self-directed learning, concentration on understanding concepts, and innovation, it is recommended that nursing professors apply this teaching method in some courses to promote students’ clinical and cognitive capabilities to ensure safe patient care.

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding authors on request.

Abbreviations

PBL: Problem-based learning

WHO: World Health Organization

SPSS: Statistical Package for the Social Sciences

ANCOVA: Analysis of covariance

CONSORT: Consolidated Standards of Reporting Trials

GPA: Grade Point Average

Brickell TA, McLean C. Emerging issues and challenges for improving patient safety in mental health: a qualitative analysis of expert perspectives. J Patient Safety. 2011;7(1):39–44. https://doi.org/10.1097/PTS.0b013e31820cd78e .


Kanerva A, Lammintakanen J, Kivinen T. Patient safety in psychiatric inpatient care: a literature review. J Psychiatr Ment Health Nurs. 2013;20(6):541–8. https://doi.org/10.1111/j.1365-2850.2012.01949.x .


Bassuni EM, Bayoumi MM. Improvement critical care patient safety: using nursing staff development strategies, at Saudi Arabia. Global J Health Sci. 2015;7(2):335–43. https://doi.org/10.5539/gjhs.v7n2p335 .

Minet C, Potton L, Bonadona A, Hamidfar-Roy R, Somohano CA, Lugosi M, et al. Venous thromboembolism in the ICU: main characteristics, diagnosis and thromboprophylaxis. Crit Care. 2015;19(1):287. https://doi.org/10.1186/s13054-015-1003-9 .

Johnstone M-J, Kanitsaki O. The ethics and practical importance of defining, distinguishing and disclosing nursing errors: a discussion paper. Int J Nurs Stud. 2006;43(3):367–76. https://doi.org/10.1016/j.ijnurstu.2005.04.010 .


World Health Organization. Patient Safety Fact File. 2019. Available at: https://www.who.int/features/factfiles/patient_safety/patient-safety-fact-file.pdf . Accessed 11 Sept 2018.

Abdelhai R, Abdelaziz SB, Ghanem NS. Assessing patient safety culture and factors affecting it among health care providers at Cairo University hospitals. J Am Sci. 2012;8(7):277–85.


James JT. A new, evidence-based estimate of patient harms associated with hospital care. J Patient Safety. 2013;9(3):122–8. https://doi.org/10.1097/PTS.0b013e3182948a69 .

Hughes R. Patient safety and quality: An evidence-based handbook for nurses, vol. 3. Rockville: Agency for Healthcare Research and Quality; 2008.

Usher K, Woods C, Parmenter G, Hutchinson M, Mannix J, Power T, et al. Self-reported confidence in patient safety knowledge among Australian undergraduate nursing students: a multi-site cross-sectional survey study. Int J Nurs Stud. 2017;71:89–96. https://doi.org/10.1016/j.ijnurstu.2017.03.006 .

Mansour M. Examining patient safety education in pre-registration nursing curriculum: qualitative study. J Nurs Educ Pract. 2013;3(12):157–67. https://doi.org/10.5430/jnep.v3n12p157 .

Francis R. Report of the mid Staffordshire NHS Foundation trust public inquiry: executive summary, vol. 947. London: The Stationery Office; 2013.

Sheeran P, Maki A, Montanaro E, Avishai-Yitshak A, Bryan A, Klein WM, et al. The impact of changing attitudes, norms, and self-efficacy on health-related intentions and behavior: a meta-analysis. Health Psychol. 2016;35(11):1178–88. https://doi.org/10.1037/hea0000387 .

Alfaqawi M, Böttcher B, Abuowda Y, Alaloul E, Elnajjar I, Elhout S, et al. Treating patients in a safe environment: a cross-sectional study of patient safety attitudes among doctors in the Gaza strip, Palestine. BMC Health Serv Res. 2020;20(1):1–9. https://doi.org/10.1186/s12913-020-05230-5 .

Brasaite I, Kaunonen M, Suominen T. Healthcare professionals’ knowledge, attitudes and skills regarding patient safety: a systematic literature review. Scand J Caring Sci. 2015;29(1):30–50. https://doi.org/10.1111/scs.12136 .

Mwachofi A, Walston SL, Al-Omar BA. Factors affecting nurses’ perceptions of patient safety. Int J Health Care Q Assurance. 2011;24(4):274–83. https://doi.org/10.1108/09526861111125589 .

Yoshikawa JM, Sousa BEC, Peterlini MAS, Kusahara DM, Pedreira MLG, Avelar AFM. Comprehension of undergraduate students in nursing and medicine on patient safety. Acta Paul Enferm. 2013;26(1):21–9. https://doi.org/10.1590/S0103-21002013000100005 .

Rana S, Ardichvili A, Polesello D. Promoting self-directed learning in a learning organization: tools and practices. Eur J Training Dev. 2016;40(7):470–89. https://doi.org/10.1108/EJTD-10-2015-0076 .

Lupien A, George-Gay B. Fuszard’s innovative teaching strategies in nursing; 2010.

Noordegraaf-Eelens L, Kloeg J, Noordzij G. PBL and sustainable education: addressing the problem of isolation. Adv Health Sci Educ. 2019;24(5):971–9. https://doi.org/10.1007/s10459-019-09927-z .

Azer SA. Introducing a problem-based learning program: 12 tips for success. Med Teach. 2011;33(10):808–13. https://doi.org/10.3109/0142159X.2011.558137 .

Martens SE, Wolfhagen IH, Whittingham JR, Dolmans DHM. Mind the gap: teachers’ conceptions of student-staff partnership and its potential to enhance educational quality. Med Teach. 2020;42(5):529–35. https://doi.org/10.1080/0142159X.2019.1708874 .

Mansoori S, Abedini-baltork M, Lashkari H, Bagheri S. Effectiveness of problem-based learning on student’s academic performance: A quasi-experimental study. Res Med Educ. 2017;9(1):8–1. https://doi.org/10.18869/acadpub.rme.9.1.8 .

Sermeus W, Cullum N, Balzer K, et al. European Academy of Nursing Science 2016 Summer Conference. BMC Nurs. 2016;15:67. https://doi.org/10.1186/s12912-016-0186-y .

Nabilou B, Feizi A, Seyedin H. Patient safety in medical education: students’ perceptions, knowledge and attitudes. PLoS One. 2015;10(8):e0135610. https://doi.org/10.1371/journal.pone.0135610 .


Mahmud A. The integration of theory and practice of paramedic curriculum. Int J Sci Res Publ. 2013;3(7):1–4.

Ajani K, Moez S. Gap between knowledge and practice in nursing. Procedia Soc Behav Sci. 2011;15:3927–31. https://doi.org/10.1016/j.sbspro.2011.04.396 .

Hemmati Maslak Pak M, Orujlu S, Khalkhali H. The effect of problem-based learning training on nursing students’ critical thinking skills. J Med Educ Dev. 2014;9(1):24–33 http://jmed.ssu.ac.ir/article-1-211-en.html .

Cohen J. Statistical power analysis. Curr Dir Psychol Sci. 1992;1(3):98–101. https://doi.org/10.1111/1467-8721.ep10768783 .

Leung GK, Patil NG. Patient safety in the undergraduate curriculum: medical students’ perception. Hong Kong Med J. 2010;16(2):101–5.


Madigosky WS, Headrick LA, Nelson K, Cox KR, Anderson T. Changing and sustaining medical students’ knowledge, skills, and attitudes about patient safety and medical fallibility. Acad Med. 2006;81(1):94–101. https://doi.org/10.1097/00001888-200601000-00022 .

Schulz KF, Altman DG, Moher D, Group C. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMC Med. 2010;8(1):18. https://doi.org/10.1186/1741-7015-8-18 .


Meo SA. Evaluating learning among undergraduate medical students in schools with traditional and problem-based curricula. Adv Physiol Educ. 2013;37(3):249–53. https://doi.org/10.1152/advan.00031.2013 .

Yew EH, Goh K. Problem-based learning: an overview of its process and impact on learning. Health Prof Educ. 2016;2(2):75–9. https://doi.org/10.1016/j.hpe.2016.01.004 .

Prosser M, Sze D. Problem-based learning: student learning experiences and outcomes. Clin Linguist Phon. 2014;28(1–2):131–42. https://doi.org/10.3109/02699206.2013.820351 .

Yadav RL, Piryani RM, Deo GP, Shah DK, Yadav LK, Islam MN. Attitude and perception of undergraduate medical students toward the problem-based learning in Chitwan medical college, Nepal. Adv Med Educ Pract. 2018;9:317–22. https://doi.org/10.2147/AMEP.S160814 .

Shin I-S, Kim J-H. The effect of problem-based learning in nursing education: a meta-analysis. Adv Health Sci Educ. 2013;18(5):1103–20. https://doi.org/10.1007/s10459-012-9436-2 .

Kong L-N, Qin B, Zhou YQ, Mou SY, Gao H-M. The effectiveness of problem-based learning on development of nursing students’ critical thinking: a systematic review and meta-analysis. Int J Nurs Stud. 2014;51(3):458–69. https://doi.org/10.1016/j.ijnurstu.2013.06.009 .

Dring JC. Problem-based learning–experiencing and understanding the prominence during medical school: perspective. Ann Med Surg. 2019;47:27–8. https://doi.org/10.1016/j.amsu.2019.09.004 .

Arpanantikul M, Luecha Y. Problem-based learning: undergraduate Thai nursing students’ perceptions. Pacific Rim Int J Nurs Res. 2010;14(3):262–76.

Terashita T, Tamura N, Kisa K, Kawabata H, Ogasawara K. Problem-based learning for radiological technologists: a comparison of student attitudes toward plain radiography. BMC Med Educ. 2016;16(1):236. https://doi.org/10.1186/s12909-016-0753-7 .

Park S, Choi SH. Effects of problem-based learning on the learning attitudes, critical thinking disposition and problem solving skills of nursing students: Infant Care, doctoral dissertation, Chonnam National University, Gwanju; 2015. https://doi.org/10.14257/astl.2015.103.41 .

Book   Google Scholar  

Chakravarthi S, Haleagrahara N. Implementation of PBL curriculum involving multiple disciplines in undergraduate medical education programme. Int Educ Stud. 2010;3(1):165–9.

Y-l L, L-l C, M-z R, Liu F, Jl Y, Gao CH, et al. The application of problem-based learning approach in patient safety education among nursing students [J]. Chin J Nurs. 2009;10.

Sahota S. Using problem-based learning to improve patient safety in the emergency department. Emergency Nurse. 2020;28(2):3–42. https://doi.org/10.7748/en.2020.e1958 .

Matthew-Maich N, Martin L, Hammond C, Palma A, Pavkovic M, Sheremet D, et al. Nursing students’ perceptions of effective problem-based learning tutors. Nurs Stand. 2016;31(12):48–59. https://doi.org/10.7748/ns.2016.e10318 .

Penjvini S, Shahsawari SS. Comparing problem based learning with lecture based learning on medicine giving skill to newborn in nursing students. J Nurs Educ Pract. 2013;3(9):53–9. https://doi.org/10.5430/jnep.v3n9p53 .

Kim H-S, Han S-J. The survey on the influence of clinical Nurse's critical thinking disposition, problem-solving skill and self-efficacy on patients safety competencies. J Korea Acad Industrial Coop Soc. 2016;17(6):598–608. https://doi.org/10.5762/KAIS.2016.17.6.598 .

Biley F. Creating tension: undergraduate student nurses’ responses to a problem-based learning curriculum. Nurse Educ Today. 1999;19(7):586–91. https://doi.org/10.1054/nedt.1999.0371 .

Klunklin A, Subpaiboongid P, Keitlertnapha P, Viseskul N, Turale S. Thai nursing students’ adaption to problem-based learning: a qualitative study. Nurse Educ Pract. 2011;11(6):370–4. https://doi.org/10.1016/j.nepr.2011.03.011 .

Liu L, Du X, Zhang Z, Zhou J. Effect of problem-based learning in pharmacology education: a meta-analysis. Stud Educ Eval. 2019;60:43–58. https://doi.org/10.1016/j.stueduc.2018.11.004 .

Dolmans DH, Loyens SM, Marcq H, Gijbels D. Deep and surface learning in problem-based learning: a review of the literature. Adv Health Sci Educ. 2016;21(5):1087–112. https://doi.org/10.1007/s10459-015-9645-6 .

Khatiban M, Falahan SN, Amini R, Farahanchi A, Soltanian A. Lecture-based versus problem-based learning in ethics education among nursing students. Nurs Ethics. 2019;26(6):1753–64. https://doi.org/10.1177/0969733018767246 .

Hajibabaee F, Ashrafizadeh H. A Comprehensive Review of Problem-based Learning in the Iranian Nursing Education. Iran J Nurs. 2019;32(118):11–28. https://doi.org/10.29252/ijn.32.118.11 .

Download references

Acknowledgments

This article is derived from a master's thesis in nursing approved by the Research and Ethics Committee of the Urmia University of Medical Sciences. The researchers would like to express their sincere gratitude and appreciation to the Dean of the nursing school, the nursing students, and the honorable Research Vice President of Urmia University of Medical Sciences for making this research possible. They also wish to thank Rosemary Carter for reviewing the manuscript and for writing assistance.

This research received a grant (No. 2476) from Urmia University of Medical Sciences to support the research in terms of study design, collection, analysis, interpretation of data, and the article’s preparation.

Author information

Authors and affiliations.

School of Nursing & Midwifery, Urmia University of Medical Sciences, Urmia, Iran

Hossein Jamshidi

Maternal and Childhood Obesity Research Center, Nursing and Midwifery School, Urmia University of Medical Sciences, Urmia, Iran

Masumeh Hemmati Maslakpak

Patient Safety Research Center, Clinical Research Institute, Nursing and Midwifery School, Urmia University of Medical Sciences, Urmia, Iran

Naser Parizad

Contributions

Design of the study: MHM, HJ, NP; data collection: NP, HJ; analysis and interpretation of data: MHM, HJ, NP; manuscript preparation: NP, HJ, MHM; manuscript revision: NP, HJ. All authors checked and confirmed the final manuscript before submission. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Naser Parizad .

Ethics declarations

Ethics approval and consent to participate.

The participants were informed of the purpose of the study and assured of their privacy and the confidentiality of their personal information. They were told that participation was voluntary and that they could leave the study at any time. They signed the consent form before participating in the study. Moreover, the study was approved by the Review Board of Urmia University of Medical Sciences.

Consent for publication

Not applicable .

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Jamshidi, H., Hemmati Maslakpak, M. & Parizad, N. Does problem-based learning education improve knowledge, attitude, and perception toward patient safety among nursing students? A randomized controlled trial. BMC Nurs 20 , 70 (2021). https://doi.org/10.1186/s12912-021-00588-1

Received : 08 October 2020

Accepted : 20 April 2021

Published : 29 April 2021

DOI : https://doi.org/10.1186/s12912-021-00588-1

  • Patient safety

BMC Nursing

ISSN: 1472-6955

Contextual Design: Developing a Problem-Based Learning Model for the High-Rise Building Design Studio Course

Abstract: This study develops a learning model and evaluates learning outcomes in a high-rise building design studio course, using a contextual design approach based on the Problem Based Learning model. The study applied the principles of contextual inquiry, interpretation, and data consolidation to the assignment, and applied storyboarding and prototyping techniques in the final presentation of the assignment. It concludes that the learning method was accepted by class participants, with a pass rate of 67%, but still requires improvement in form-exploration strategy, the integration of structural and utility systems, and the understanding of comfort and environmental sustainability principles.

Keywords: design studio, contextual design, Problem Based Learning, tall buildings

Creative Commons License

This work is licensed under a  Creative Commons Attribution-ShareAlike 4.0 International License.

A survey on imbalanced learning: latest research, applications and future directions

  • Open access
  • Published: 09 May 2024
  • Volume 57 , article number  137 , ( 2024 )

  • Wuxing Chen 1 , 2 ,
  • Kaixiang Yang 3 ,
  • Zhiwen Yu 3 ,
  • Yifan Shi 4 &
  • C. L. Philip Chen 3  

Imbalanced learning constitutes one of the most formidable challenges within data mining and machine learning. Despite continuous research advancement over the past decades, learning from data with an imbalanced class distribution remains a compelling research area. Imbalanced class distributions commonly constrain the practical utility of machine learning and even deep learning models in tangible applications. Numerous recent studies have made substantial progress in the field of imbalanced learning, deepening our understanding of its nature while concurrently unearthing new challenges. Given the field’s rapid evolution, this paper aims to encapsulate the recent breakthroughs in imbalanced learning by providing an in-depth review of extant strategies to confront this issue. Unlike most surveys that primarily address classification tasks in machine learning, we also delve into techniques addressing regression tasks and facets of deep long-tail learning. Furthermore, we explore real-world applications of imbalanced learning, devising a broad spectrum of research applications from management science to engineering, and lastly, discuss newly-emerging issues and challenges necessitating further exploration in the realm of imbalanced learning.

1 Introduction

In the field of machine learning, it is commonly assumed that the number of samples in each class under study is roughly equal. However, in real-life scenarios, due to practical applications in enterprises or industries, the generated data often exhibits imbalanced distribution. In the case of fault detection, for example, the major class has a large number of samples, while the other class (the class with faults) has only a small number of samples (Ren et al. 2023 ). As depicted in Fig. 1 , this imbalance presents a challenge as traditional learning algorithms tend to favor the more prevalent classes, potentially overlooking the less frequent ones. Nevertheless, from a data mining perspective, minority classes often carry valuable knowledge, making them crucial. Consequently, the objective of imbalanced learning is to develop intelligent systems capable of effectively addressing this bias, thereby enabling learning algorithms to handle imbalanced data more effectively.

figure 1

A binary imbalanced dataset. The decision boundaries learned on such samples tend to be increasingly biased towards the majority class, leading to the neglect of minority class data samples. The dashed line is the decision boundary after the addition of the majority class. Note that the new samples are added arbitrarily, just to show how the added samples affect the decision boundaries

Over the past two decades, imbalanced learning has garnered extensive research and discussion. Numerous methods have been proposed to tackle imbalanced data, encompassing data pre-processing, modification of existing classifiers, and algorithmic parameter tuning. The issue of imbalanced data classification is prevalent in real-world applications, such as fault detection (Fan et al. 2021 ; Kuang et al. 2021 ; Ren et al. 2023 ), fraud detection, medical diagnosis (Hung et al. 2022 ; Fotouhi et al. 2019 ; Behrad and Abadeh 2022 ), and other fields (Yang et al. 2022 ; Haixiang et al. 2017 ; Ding et al. 2021 ). In these application scenarios, datasets typically exhibit a significant class imbalance, where a few classes contain only a limited number of samples, while the majority class is more abundant. This imbalance leads to varying performance of learning algorithms, known as performance bias, on the majority and minority classes. To effectively address this challenge, imbalanced learning has been extensively examined by both academia and industry, resulting in the proposal of various methods and techniques (Chen et al. 2022a , b ; Sun et al. 2022 ; Zhang et al. 2019 ).

The processing of imbalanced data has gained significant importance due to the escalating volume of data originating from intricate, large-scale, and networked systems in domains such as security (Yang et al. 2022 ), finance (Wu and Meng 2016 ) and the Internet (Di Mauro et al. 2021 ). Despite some notable achievements thus far, there remains a lack of systematic research that reviews and discusses recent progress, emerging challenges, and future directions in imbalanced learning. To bridge this gap, our objective is to present a comprehensive survey of recent research on imbalanced learning. Such an endeavor is crucial for sustaining focused and in-depth exploration of imbalanced learning, facilitating the discovery of more effective solutions, and advancing the field of machine learning and data mining.

figure 2

Classification of existing methods

In this paper, we categorize the existing innovative solutions for imbalanced data classification algorithms into five types, as illustrated in Fig. 2 . These types encompass general methods, ensemble learning methods, imbalanced regression and clustering, long-tail learning, and imbalanced data streams. Within these categories, we further delve into more detailed methods, including data-level approaches, algorithm-level techniques, hybrid methods, general ensemble frameworks, boosting, bagging, cost-sensitive ensembles, imbalanced regression, imbalanced clustering, online or ensemble learning, concept drift, incremental learning, class rebalancing, information enhancement, and model improvement. Employing this classification scheme, we conduct a comprehensive review of existing imbalanced learning approaches and outline recent practical application directions.

Table 1 summarizes the differences between recent reviews on imbalanced learning and the present investigation. Distinguishing this survey from previous works, we not only summarize recent solutions to the category imbalance problem in traditional machine learning but also address the long-tailed distribution issue in deep learning, which has gained prominence. Additionally, we extensively elaborate on the imbalance problem within the unsupervised or semi-supervised domain. Moreover, building upon current research, we not only encapsulate emerging solutions in imbalanced learning but also outline new challenges and future research directions with practical potential. The organisational framework of this paper is shown in Fig. 3 .

figure 3

The organizational framework of this paper

We summarize the main contributions of the survey in this paper as follows:

This paper provides a unified and comprehensive review of imbalanced learning and deep imbalanced learning. This paper presents the inaugural comprehensive review and summary of the existing research outcomes in imbalanced learning and deep imbalanced learning. It systematically consolidates a wide range of methods and techniques, thereby facilitating researchers in developing a comprehensive understanding of this field.

A comprehensive survey of long-tail learning and imbalanced machine learning applications. This paper surveys and summarizes the applications of long-tail learning and imbalanced machine learning over the past few years, encompassing diverse fields and real-world application scenarios. It is intended to give scholars a deeper understanding of how imbalanced learning is used across fields and to serve as a resource for further inquiry.

Six new research challenges and directions were identified. Drawing upon the existing research findings, this paper identifies and proposes six novel research challenges and directions within the realm of imbalanced learning. These challenges and directions hold significant potential and research importance, contributing to the advancement and progression of the imbalanced learning field.

The remainder of this paper is organized as follows: Section 2 outlines the current research methodology employed in this study and presents preliminary statistics on imbalanced learning. Section 3 introduces a comprehensive approach for addressing imbalanced equilibrium learning. Section 4 delves into the methodologies associated with imbalanced regression and clustering. Section 5 explores methods specifically pertaining to long tail learning. Section 6 discusses relevant research on imbalanced data streams. In Section 8 , we categorize existing research on imbalanced learning applications into seven fields and provide an overview of the respective research contents within each field. Section 9 compiles and summarizes six future research directions and challenges in imbalanced learning based on the survey and current research trends. Finally, we conclude this paper with a concise summary.

2 Statistics research methodology

2.1 Review methodology

The statistical classification research method proposed in this paper draws upon the works (Kaur et al. 2019 ; Haixiang et al. 2017 ). This work uses a multi-stage thorough review strategy to capture a wide range of methods and real-world application domains related to imbalanced learning.

In the initial stage, we focus on imbalanced, sampling, and skewed data to obtain preliminary results. Subsequently, in the second stage, we conduct a thorough search of existing research literature. For the third stage, we employ a triple keyword search approach. Initially, we use data mining, machine learning, and classification keywords to assess the research status of machine learning technologies. Next, we incorporate long-tail distribution and neural network as keywords to examine the research status of deep long-tail learning. Finally, we employ keywords such as “detection,” “abnormal,” or “medical” to identify practical applications in the literature.

To ensure comprehensive coverage, we conducted searches across seven databases encompassing various domains within the natural and social sciences. These databases include IEEE Xplore, ScienceDirect, Springer, ACM Digital Library, Wiley Interscience, HPCsage, and Taylor & Francis Online.

2.2 Preliminary statistics

Figure 4 illustrates the search framework employed in this study. Initially, we applied the search terms “imbalance” or “unbalance” to filter the initial research works based on these keywords. Subsequently, we employed a trinomial tree search strategy to filter abstracts, conclusions, and full papers, leading to the identification of papers falling into three distinct directions. To ensure comprehensive research coverage, we incorporated the keywords “machine learning,” “long-tail learning,” and “applications” during the trinomial tree screening stage. Following this, we conducted manual screening to eliminate duplicates and obtain the final results, resulting in a total of 2287 papers. This systematic search process ensures an extensive exploration of the field of imbalanced learning, providing a robust support and foundation for this paper.

Figure 5 demonstrates the publication trend of papers in the field of imbalanced learning from January 2013 to July 2023. The number of published papers remained relatively stable until 2017, with a slight decline observed during the 2015-2016 period. However, there has been a remarkable surge in the number of published papers from 2018 onwards. This trend indicates the enduring significance of imbalanced learning as a research topic, with ongoing growth in research hotspots. Furthermore, we compiled a list of the top 20 journals or conferences contributing to paper publications. As depicted in Fig. 6 , prominent publications are found in fields such as computer science, neural computing, management science, and energy applications.

figure 4

Search framework for obtaining papers

figure 5

Trends in imbalanced Learning Dissertation Publications

figure 6

Top 20 journals or conferences that publish papers on imbalanced learning

3 Imbalanced data classification approaches

In this section, we first introduce data-level approaches in imbalanced data classification, then we give various solutions at the algorithmic level as well as hybrid approaches, and finally we introduce more robust ensemble learning methods.

3.1 Data-Level approaches

Data-level approaches to the problem of class imbalance involve techniques and methods that directly modify the training data. These approaches target the challenge of class imbalance by rebalancing the distribution of classes within the training dataset. This rebalancing can be achieved through undersampling, which involves reducing instances from the majority class, or oversampling, which involves increasing instances from the minority class. The objective is to create a more balanced dataset that can effectively train a binary classification model.
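The two rebalancing directions described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names `random_oversample` and `random_undersample` are hypothetical and chosen for illustration, and the choice to balance every class to the largest (or smallest) class size is an assumption, not something the survey prescribes.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))  # sample with replacement
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so that every class has
    as many samples as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):  # sample without replacement
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
```

With eight majority and two minority samples, oversampling yields eight samples per class and undersampling yields two per class. Oversampling keeps all original information at the cost of duplicates, while undersampling discards majority samples.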

The most widely used oversampling methods are random replication of minority-class samples and the synthetic minority over-sampling technique (SMOTE) (Chawla et al. 2002). However, SMOTE is prone to generating duplicate samples, ignoring sample distributions, and introducing false samples; it is also poorly suited to high-dimensional data and sensitive to noise. Many variants therefore improve on SMOTE. Borderline-SMOTE (Han et al. 2005) identifies and synthesizes samples near the decision boundary, ADASYN (He et al. 2008) adapts the generation of synthetic samples based on minority class density, and Safe-Level-SMOTE (Bunkhumpornpat et al. 2009) incorporates a safe-level algorithm to balance class distribution while reducing misclassification risks. Other methods do not rely solely on synthetic sampling techniques. They explore different strategies for the preprocessing stage or the classification algorithm to suit the characteristics and needs of imbalanced datasets. Stefanowski and Wilk (2008) introduced a novel method for selective pre-processing of imbalanced data that combines local over-sampling of the minority class with filtering challenging cases from the majority classes. To enhance the classification performance of SVM on unbalanced datasets, Akbani et al. (2004) offered several methodologies and procedures, such as altering class weights, utilizing alternative kernel functions, and adjusting decision thresholds.
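The core interpolation step of SMOTE can be sketched as follows. This is a simplified illustration, not the reference implementation from Chawla et al.: the function name `smote` and its parameters are hypothetical, base samples are drawn at random rather than cycled through, and neighbours are found by brute-force Euclidean distance.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=10, k=2)
```

Because each synthetic point is a convex combination of two existing minority samples, it always lies between them. This is also why the method can place points in majority-class regions when the minority class is noisy or fragmented, which is the weakness that the variants above try to address.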

It is worth noting that oversampling can introduce redundant or noisy samples, so some oversampling methods incorporate noise handling into the sampling process to obtain a cleaner balanced dataset. Sáez et al. (2015) introduced SMOTE-IPF, an extension of SMOTE that incorporates an iterative ensemble-based noise filter called IPF, effectively addressing the challenges posed by noisy and borderline examples in imbalanced datasets. To handle class imbalance without generating noise, Douzas et al. (2018) introduced a straightforward and efficient oversampling methodology called k-means SMOTE, which combines k-means clustering and the synthetic minority oversampling technique (SMOTE). Sun et al. (2022) introduced a disjuncts-robust oversampling (DROS) method that addresses the challenge of negative oversampling results in the presence of small disjuncts by utilizing light-cone structures to generate new synthetic samples in minority class areas, outperforming existing oversampling methods in handling class imbalance. More recent data sampling methods combine the regional distribution of the data with random walks based on stochastic mapping to synthesize new samples, or even map the data to a higher-dimensional space to generate artificial data points (Sun et al. 2020; Zhang and Li 2014; Douzas and Bacao 2017; Han et al. 2023).

Undersampling is the practice of decreasing the number of examples in the majority class in order to obtain a more balanced class distribution. Undersampling seeks to match the number of instances in the minority class with the number of instances in the majority class by randomly or purposefully deleting samples from the majority class. The most widely used undersampling methods are based on the k-nearest-neighbor (KNN) technique (Wilson 1972; Mani and Zhang 2003). Wilson (1972) studied the asymptotic properties of the nearest neighbor rule using edited data and discussed the influence of data editing on classification accuracy. Mani and Zhang (2003) proposed a k-nearest-neighbor method for handling imbalanced data distributions, using information extraction as a case study. Smith et al. (2014) examined instances that are often misclassified and the factors that make them difficult, in order to better understand machine learning data and to develop training procedures and algorithms.
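A KNN-based editing rule in the spirit of Wilson (1972) can be sketched as below. The sketch makes two assumptions that are not in the survey: editing is applied only to the majority class (a common adaptation for undersampling, whereas the original rule edits all classes), and neighbours are found by brute-force Euclidean distance. The function name `edited_nearest_neighbours` is hypothetical.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, majority_label, k=3):
    """Drop majority-class samples whose k nearest neighbours mostly
    belong to another class; keep all other samples unchanged."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)  # minority samples are never removed here
            continue
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )[:k]
        votes = Counter(y[j] for j in neighbours)
        if votes.most_common(1)[0][0] == yi:
            keep.append(i)  # neighbourhood agrees with the label
    return [X[i] for i in keep], [y[i] for i in keep]


# Four majority points near 0, one majority point inside the minority cluster.
X = [[0.0], [0.1], [0.2], [0.3], [5.05], [5.0], [5.1], [5.2]]
y = [0, 0, 0, 0, 0, 1, 1, 1]
X_clean, y_clean = edited_nearest_neighbours(X, y, majority_label=0, k=3)
```

In this toy dataset the majority sample at 5.05 is surrounded by minority samples and is removed, while the four majority samples near zero are kept. Only majority samples that overlap the minority region are deleted, so the resulting dataset is cleaner but not necessarily balanced.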

In recent years, some researchers have been investigating variants of sampling methods that take into account the density information, spatial information, and intrinsic characteristics between classes of the data and combine them with reinforcement learning, deep learning, and clustering algorithms for data enhancement. For example, Kang et al. (2017) introduced WU-SVM, an improved algorithm that utilizes a weighted undersampling scheme based on space geometry distance to assign different weights to majority samples in subregions, allowing for better retention of the original data distribution information in each learning iteration. Yan et al. (2023) proposed Spatial Distribution-based UnderSampling (SDUS), an undersampling method for imbalanced learning that maintains the underlying distribution characteristics by utilizing sphere neighborhoods and employing sample selection strategies from different perspectives. Fan et al. (2021) proposed a sample selection strategy based on deep reinforcement learning that aims to optimize the sample distribution: a reward function is constructed and the model is trained to select more representative samples, thereby improving diagnostic accuracy. Kang et al. (2016) introduced a new under-sampling scheme that incorporates a noise filter before resampling, resulting in improved performance of four popular under-sampling methods for imbalanced classification. Other researchers have incorporated clustering algorithms into undersampling. To handle class imbalance, Tsai et al. presented an undersampling strategy called cluster-based instance selection (CBIS), which combines clustering analysis with instance selection (Lin et al. 2017). Han et al. (2023) proposed a combination of global and local oversampling methods to distinguish between minority and majority classes by comparing them to the discretisation values at each class level. Instances are synthesised based on the degree of discrete magnitude.

The oversampling method offers several advantages in addressing imbalanced data, including increased representation of the minority class, retention of original information, improved classifier performance, and reduced bias towards the majority class. By artificially increasing the number of instances in the minority class, the dataset becomes more balanced for training, allowing classifiers to better learn the characteristics and patterns of the minority class. However, oversampling can also have drawbacks, such as the potential for overfitting, increased computational complexity, the introduction of noise into the dataset, and limited information gain. These disadvantages can impact the classifier’s generalization ability and performance on unseen data. When using oversampling, it is necessary to take the specific oversampling method and the features of the dataset into account. Table 2 compares recent data-level methods.

Sampling methods have attracted researchers mainly for three reasons: (1) they work independently of the learning algorithm, (2) they lead to a balanced redistribution of data, and (3) they can be easily combined with any learning mechanism. Sampling methods can also be categorised by their complexity. Undersampling typically runs faster, is more robust, and is less prone to overfitting than oversampling.

3.2 Algorithm-level approaches

Algorithm-level approaches concentrate on adapting or developing machine learning algorithms to effectively handle imbalanced datasets. These techniques prioritize enhancing the algorithms’ capability to accurately classify instances from the minority class. By adjusting or designing algorithms to be more responsive to these underrepresented groups, these techniques contribute to improving the overall performance, fairness, and generalizability of machine learning models when confronted with imbalanced data. Among the most prevalent algorithm-level strategies is cost-sensitive learning, wherein the classification performance is enhanced by modifying the algorithm’s objective function. This alteration ensures that the model receives greater emphasis on learning from underrepresented classes (Castro and Braga 2013 ; Yang et al. 2021 ). Table 3 shows the advantages and disadvantages of the main algorithm-level approaches in recent years.

3.2.1 Cost-sensitive learning

Cost-sensitive learning takes into account the different costs associated with different types of misclassification and aims to optimise the model for scenarios where the consequences of errors are uneven. Recently, researchers have combined cost-sensitive learning with other techniques to obtain models with higher classification accuracy and better generalization. For example, Zhang and Hu (2014) proposed cost-free learning (CFL), a method for achieving optimal classification outcomes without depending on cost information, even when class imbalance exists. The CFL approach maximizes the normalized mutual information between targets and decision outputs, enabling binary or multi-class classification with or without abstaining. To enhance the classification performance of support vector machines (SVMs) on imbalanced data sets, Cao et al. (2020) developed the ATEC approach. By assessing classification accuracy and adjusting the error cost in the proper direction, ATEC effectively optimizes the error cost for between-class samples. Addressing both concept drift and class imbalance in streaming data, Lu et al. (2019) proposed an adaptive chunk-based dynamic weighted majority technique. Fan et al. (2017) introduced the Entropy-based Fuzzy Support Vector Machine (EFSVM), which addresses class imbalance by assigning fuzzy memberships to samples based on their class certainty, resulting in improved classification performance compared to other algorithms. Datta and Das (2015, 2019) proposed another idea, in which a multi-objective optimisation framework (Datta et al. 2017) is used to train SVMs to efficiently find the best trade-off between objectives on class-imbalanced data sets. Cao et al. (2021) likewise proposed an adaptive error cost adjustment method for class imbalance learning with SVMs, called ATEC, which has a significant advantage in training time over the traditional grid search strategy because it adjusts the error cost between samples efficiently and automatically. This approach can effectively solve imbalance problems without expensive parameter tuning.
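
To make the basic idea of modifying the objective function concrete, the following sketch shows a cost-weighted binary cross-entropy in NumPy. It is a generic illustration rather than any of the cited methods; the function name `weighted_log_loss` and the cost values are assumptions.

```python
import numpy as np

def weighted_log_loss(y_true, p_pred, cost_fn=5.0, cost_fp=1.0):
    """Binary cross-entropy in which errors on positives are scaled by cost_fn
    and errors on negatives by cost_fp (both values are illustrative)."""
    p = np.clip(p_pred, 1e-12, 1 - 1e-12)  # avoid log(0)
    per_sample = -(cost_fn * y_true * np.log(p)
                   + cost_fp * (1 - y_true) * np.log(1 - p))
    return float(per_sample.mean())

y_true = np.array([1, 0, 0, 0])
p_pred = np.array([0.2, 0.1, 0.1, 0.1])  # the single positive is under-predicted
plain = weighted_log_loss(y_true, p_pred, cost_fn=1.0)   # ordinary log loss
costly = weighted_log_loss(y_true, p_pred, cost_fn=5.0)  # missed positives cost 5x
```

With equal costs the function reduces to the ordinary log loss; raising `cost_fn` makes the under-predicted minority example dominate the objective, which is what pushes an optimizer towards the minority class.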

3.2.2 Weighted shallow neural networks

Shallow neural networks combined with imbalanced learning algorithms have been developed extensively for imbalanced data classification. Because of their shallow structure, these networks train quickly, which is very beneficial for applications that must process imbalanced datasets rapidly and return results immediately. For example, in order to accurately and efficiently identify and categorize power quality disturbances, Sahani and Dash (2019) proposed an FPGA-based online power quality disturbance monitoring system that makes use of the reduced-sample Hilbert-Huang Transform (HHT) and a class-specific weighted Random Vector Functional Link Network (RVFLN). Choudhary and Shukla (2021) proposed an ensemble ELM method that decomposes complex imbalance problems into simpler subproblems to address class bias in classification tasks. By using cost-sensitive classifiers and cluster assessment, the technique successfully addresses factors such as class overlap and the number of probability distributions present in the imbalanced classification problem. Chen et al. (2022b) proposed the double-kernelized weighted broad learning system (DKWBLS), which addresses the challenges of class imbalance and parameter tuning in broad learning systems (BLS). By using a double-kernel mapping strategy, DKWBLS generates robust features without the need to adjust the number of nodes. Additionally, DKWBLS explicitly considers imbalance problems and achieves improved decision boundaries. Chen et al. (2022a) proposed a double kernel-based class-specific broad learning system (DKCSBLS) to solve the multiclass imbalanced learning problem. DKCSBLS adaptively combines the class distribution with class-specific penalty coefficients and uses a dual kernel mapping mechanism to extract more robust features. Yang et al. (2021) addressed the imbalance problem in BLS by proposing a weighted BLS and an adaptive weighted BLS (AWBLS) that consider the prior distribution of the data. They also proposed an incremental weighted ensemble broad learning system (IWEB) to enhance the stability and robustness of AWBLS.

Algorithm-level approaches optimize the loss function associated with the dataset, concentrating on the minority classes to enhance model performance. In contrast to resampling-based methods, these techniques are more computationally efficient and better suited to large data streams. Consequently, given their ability to improve AUC and G-mean in imbalanced scenarios and their favourable runtime, algorithm-level methods may be preferred. Furthermore, they offer flexibility in selecting activation functions and optimization algorithms tailored to specific problem requirements.
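
As a small illustration of why G-mean is reported alongside or instead of accuracy in imbalanced settings, the following sketch computes it as the geometric mean of the true positive rate and the true negative rate. The function name `g_mean` and the toy labels are assumptions made for illustration.

```python
import numpy as np

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR), labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == 0] == 0)
    return float(np.sqrt(tpr * tnr))

# A classifier that always predicts the majority class on a 90/10 split.
y_true = np.array([0] * 90 + [1] * 10)
always_majority = np.zeros(100, dtype=int)
accuracy = float(np.mean(always_majority == y_true))  # looks good
score = g_mean(y_true, always_majority)               # exposes the failure
```

The trivial classifier reaches 90% accuracy but a G-mean of zero, because it never recognizes a minority example.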

These approaches aim to address imbalanced classification problems by adapting existing algorithms or creating new ones specifically designed for imbalanced datasets, emphasizing the minority classes. They are easy to implement because they do not require modifying the dataset or feature space, and they are applicable across a wide range of classification algorithms, potentially enhancing classifier performance on imbalanced datasets. However, they may not fully capture the complexity of imbalanced datasets and can be sensitive to hyperparameter selection. Additionally, they do not directly tackle data scarcity in the minority classes, and their effectiveness may vary with the characteristics of the dataset.

3.3 Hybrid approaches

To address imbalanced learning, hybrid methods incorporate techniques from both the data-level and the algorithm-level. In an effort to improve performance, these techniques try to combine the advantages of both strategies. Table 4 shows the advantages and disadvantages of the main hybrid methods in recent years.

A common hybrid approach is to combine sampling methods with ensemble learning (Sağlam and Cengiz 2022; Abedin et al. 2022; Razavi-Far et al. 2019; Sun et al. 2018; Liang et al. 2022). For example, Sağlam and Cengiz (2022) proposed SMOTEWB (SMOTE with boosting) to solve the class imbalance problem in classification; it combines a new noise detection method with SMOTE. By combining noise detection and augmentation in an ensemble algorithm, SMOTEWB overcomes the challenges associated with random oversampling (ROS) and SMOTE by adjusting the number of neighbours per observation in the SMOTE algorithm. Abedin et al. (2022) presented WSMOTE-ensemble, an ensemble technique for small company credit risk assessment that addresses the issue of severely uneven default and nondefault classes. To build robust and varied synthetic examples, the technique combines WSMOTE and Bagging with sampling composite mixes, thereby eliminating class-skewed limitations. Razavi-Far et al. (2019) presented class-imbalanced learning algorithms that combine oversampling methods with bagging and boosting ensembles, with a particular emphasis on two oversampling strategies based on single and multiple imputation methods. To rebalance the datasets for training ensemble algorithms, these strategies use missing-value estimation to generate synthetic minority class samples that are similar to the actual minority class samples. For the evaluation of imbalanced business credit, Sun et al. (2018) developed the DTE-SBD decision tree ensemble model, which combines SMOTE with the Bagging ensemble learning algorithm using differential sampling rates.

Another hybrid approach combines sampling methods with optimisation algorithms or adaptive domain methods. Pan et al. (2020) proposed Gaussian oversampling and Adaptive SMOTE. Adaptive SMOTE takes the Inner and Danger data from the minority class and combines them to form a new minority class that improves the distributional features of the original data. Gaussian oversampling combines dimensionality reduction with a Gaussian distribution to make the distribution’s tails narrower. Dixit and Mani (2023) proposed SMOTE-TLNN-DEPSO, a filter-based oversampling method that combines SMOTE for synthetic sample generation with the Differential Evolution-based Particle Swarm Optimisation (DEPSO) algorithm for iterative attribute optimisation. Yang et al. (2019) proposed a hybrid optimum ensemble classifier framework that combines density-based undersampling and cost-effective techniques to overcome the drawbacks of traditional imbalanced learning algorithms.

To overcome the cost of parameter optimisation in traditional hybrid methods, Mullick et al. (2018, 2019) combined neural networks and heuristic algorithms with sampling methods or traditional classifiers. For example, the performance of the k-nearest neighbour classifier on imbalanced data sets was improved by adjusting the value of k using neural networks or heuristic learning algorithms. Another example is a three-way adversarial strategy that combines convex generators, multi-class classifier networks and discriminators in order to perform oversampling in deep learning systems.

Hybrid methods can integrate various imbalanced learning techniques, such as combining clustering algorithms with sampling methods and coupling them with cost-sensitive algorithms to enhance model performance. However, this integration can increase model complexity and require more parameter tuning. Therefore, when selecting hybrid methods to address imbalance problems, it is essential to consider their runtime and model complexity.

3.4 Ensemble learning methods

Ensemble learning is a methodology employed for the classification of imbalanced data, which involves amalgamating multiple classifiers or models to enhance the performance of classification tasks on datasets characterized by imbalanced class distributions. Its primary objective is to tackle the challenges presented by imbalanced class distributions by capitalizing on the strengths of different classifiers, thereby enhancing the predictive accuracy for the minority classes (Yang et al. 2021 ). Ensemble methods for imbalanced data classification typically encompass the creation of diverse subsets from the imbalanced dataset through resampling techniques. Individual classifiers are then trained on these subsets, and their predictions are combined using voting or weighted averaging schemes. Ensemble learning efficiently reduces the bias towards the majority class and enhances overall classification performance on imbalanced data sets by integrating different models.

Over the past few years, many scholars have summarised the progress, challenges and potential solutions of ensemble learning for imbalanced data classification. These surveys categorise and discuss mainstream methods, clarify open challenges, and propose research directions. Many of them also explore the possibility of combining ensemble learning with other machine learning techniques (Dong et al. 2020; Yang et al. 2023; Galar et al. 2012). For example, Galar et al. (2012) explored the challenges posed by imbalanced datasets in classifier learning. They reviewed ensemble learning methods for imbalanced data classification, proposed a taxonomy, and made empirical comparisons. They demonstrated that random undersampling combined with either bagging or boosting ensembles works well, emphasizing the superiority of ensemble-based algorithms over stand-alone preprocessing methods. Table 5 summarizes the advantages and disadvantages of some main ensemble learning algorithms over the last three years.

3.4.1 General framework

The general ensemble framework is a generic approach to imbalanced learning that is robust across multiple datasets. The idea is to use the diversity and complementary qualities of various models to improve the ensemble’s overall performance and resilience. This approach aims to overcome the limitations of a single model by aggregating predictions through techniques such as voting, averaging or weighted combinations. Liu et al. (2020) proposed generating strong ensembles for imbalanced classification by self-paced coordination of data hardness during undersampling, to cope with the challenges posed by imbalanced and low-quality datasets. This computationally efficient method takes into account class disproportionality, noise and class overlap to achieve robust performance even with highly skewed distributions and overlapping classes, while being applicable to a variety of existing learning methods. Liu et al. (2020) also presented an ensemble imbalanced learning framework called MESA, which adaptively resamples the training set to create multiple classifiers and form a cascaded ensemble model. MESA learns the sampling strategy directly from the data, going beyond heuristic assumptions to optimize the final metric. Liu et al. (2009) proposed two algorithms, EasyEnsemble and BalanceCascade, to address the information loss caused by undersampling in the class imbalance problem. EasyEnsemble combines the outputs of multiple learners, each trained on a different subset of the majority class, while BalanceCascade sequentially removes the majority class examples that are correctly classified in each step.
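
The EasyEnsemble idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of Liu et al. (2009): a nearest-centroid classifier stands in for the boosted learners used in the original, and all function names, the number of members, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in base learner: one centroid per class (row 0 = class 0, row 1 = class 1)."""
    return np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])

def predict_centroids(centroids, X):
    """Assign each row to the class of its nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def easy_ensemble_fit(X, y, n_members=5, seed=0):
    """Train one learner per balanced subset: all minority rows plus an
    equally sized random draw from the majority class."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    members = []
    for _ in range(n_members):
        sub = rng.choice(majority, size=minority.size, replace=False)
        idx = np.concatenate([sub, minority])
        members.append(fit_centroids(X[idx], y[idx]))
    return members

def easy_ensemble_predict(members, X):
    """Average the members' votes and predict class 1 on a tie or majority."""
    votes = np.mean([predict_centroids(m, X) for m in members], axis=0)
    return (votes >= 0.5).astype(int)

# Synthetic data: 200 majority points near (0, 0), 20 minority points near (5, 5).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (20, 2))])
y = np.array([0] * 200 + [1] * 20)
members = easy_ensemble_fit(X, y)
y_hat = easy_ensemble_predict(members, X)
```

Each member sees a balanced training set, yet across members a much larger share of the majority class is used than in a single undersampled run.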

The general framework of ensemble learning offers several advantages in addressing the classification of imbalanced data. Firstly, it provides a flexible and adaptable approach that can incorporate various ensemble methods, such as bagging, boosting, or cost-sensitive techniques, depending on the specific characteristics of the imbalanced dataset. This versatility allows for customization and optimization based on the specific problem at hand. Additionally, ensemble learning can effectively leverage the diversity of multiple classifiers to improve the overall classification performance and handle the challenges posed by imbalanced data, such as class overlap or rare class identification. It can also provide robustness against noise or outliers in the data.

However, there are also noteworthy potential disadvantages to consider when employing ensemble learning approaches. Firstly, such approaches may necessitate additional computational resources and increased training time in comparison to single classifier methods. This is due to the requirement of training and combining multiple models within the ensemble. Secondly, the performance of ensemble learning is highly dependent on several factors, including the selection of appropriate base classifiers, ensuring their diversity, and employing an effective ensemble combination strategy. These factors demand careful consideration and tuning to achieve optimal results. Furthermore, the adaptability of ensemble learning to specific domains is limited. While generic framework models may perform well in certain datasets or domains, their efficacy may be compromised in others. As a result, precise tweaking and optimization of the ensemble models may be necessary to achieve high performance in a variety of settings.

3.4.2 Boosting

Boosting is an ensemble learning technique that iteratively combines multiple weak classifiers to create a strong ensemble model. The most widely used boosting algorithms are AdaBoost (Freund and Schapire 1997), SMOTEBoost (Chawla et al. 2003) and RUSBoost (Seiffert et al. 2009). Each weak classifier is trained on a reweighted or resampled version of the data that prioritizes the samples misclassified by the preceding models. This iterative process aims to improve the overall performance of the ensemble by focusing on challenging instances during training. The final prediction is obtained by aggregating the predictions of all weak learners, typically using weighted voting or averaging. Boosting effectively improves the overall performance and generalization of the ensemble by emphasizing the challenging instances and reducing the bias in the learning process.
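
The reweighting step can be illustrated with one round of the standard discrete AdaBoost update. This is a minimal sketch: the function name `adaboost_round` and the toy labels are assumptions, and the weighted error is assumed to lie strictly between 0 and 1.

```python
import numpy as np

def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost round for labels in {-1, +1}.
    Returns the weak learner's vote weight and the renormalized sample weights."""
    miss = y_true != y_pred
    err = np.sum(weights[miss]) / np.sum(weights)  # weighted error, assumed in (0, 1)
    alpha = 0.5 * np.log((1 - err) / err)
    # Misclassified samples (y_true * y_pred = -1) are scaled up, the rest down.
    new_w = weights * np.exp(-alpha * y_true * y_pred)
    return float(alpha), new_w / new_w.sum()

y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, 1, -1, -1, -1])  # only the last sample is misclassified
w0 = np.full(5, 0.2)
alpha, w1 = adaboost_round(w0, y_true, y_pred)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is forced to attend to it.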

In the past several years, researchers have devoted much attention to boosting techniques for the class imbalance problem, aiming to improve the classification accuracy of the minority class and the learning performance under class imbalance. For example, Kim et al. (2015) proposed GMBoost, which deals with data imbalance based on geometric means. GMBoost considers the learning of majority and minority classes jointly by using the geometric mean of the two classes in error rate and accuracy calculations. Razavi-Far et al. (2021) proposed a class-imbalanced learning technique that combines oversampling methods with bagging and boosting ensembles. It uses two strategies, based on single and multiple imputation, to create synthetic minority class samples by estimating missing values in the original minority class, improving classification performance on imbalanced data. Wang et al. (2020) proposed the ECUBoost framework, which combines a boosting-based ensemble with a novel entropy- and confidence-based undersampling method to maintain the validity and structural distribution of the majority samples during undersampling. Datta et al. (2020) proposed a boosting method that achieves a trade-off between majority and minority classes without expensive search costs. The method treats the weight assignment of component classifiers as a game of tug-of-war between classes in the margin space and avoids expensive cost-set tuning by implementing efficient class compromises in a two-stage linear programming framework.

In summary, Boosting has the following advantages for the classification of imbalanced data. (1) It can effectively deal with class imbalance by focusing on the minority classes and assigning higher weights to misclassified instances, thereby improving overall classification performance. (2) It combines multiple weak classifiers into a strong ensemble, resulting in better generalisation and robustness. (3) It adaptively adjusts the weights of classifiers during boosting iterations, emphasising difficult instances and reducing bias towards the majority classes.

However, it is important to acknowledge the limitations and potential drawbacks of Boosting. One notable concern is its relatively high computational cost, particularly on large-scale datasets. Boosting is a serial, iterative method: weak classifiers are trained one after another and the model is updated at every iteration, which can significantly increase training time and resource requirements. Furthermore, boosting algorithms are sensitive to noisy or mislabeled data, which can have a detrimental impact on their performance. It is crucial to address these limitations and to take runtime into account when applying Boosting in practical settings.

3.4.3 Bagging

Bagging (Breiman 1996) is an ensemble learning technique that creates multiple subsets of the original dataset by random sampling with replacement. Each subset is used to train a separate base learner, such as a decision tree, using the same learning algorithm. The predictions from all the base learners are then combined by majority voting (for classification) or averaging (for regression) to make the final prediction. By averaging multiple independently trained models, Bagging helps to reduce the variance of predictions, thereby improving generalisation and robustness (Wang and Yao 2009).
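
The two mechanical ingredients of bagging, bootstrap sampling and majority voting, can be sketched in a few lines of NumPy. The helper names `bootstrap_indices` and `majority_vote` and the toy predictions are assumptions for illustration; training the base learners themselves is omitted.

```python
import numpy as np

def bootstrap_indices(n_samples, n_bags, seed=0):
    """Draw n_bags index sets of size n_samples, sampling with replacement."""
    rng = np.random.default_rng(seed)
    return [rng.integers(0, n_samples, size=n_samples) for _ in range(n_bags)]

def majority_vote(predictions):
    """Combine integer class labels of shape (n_bags, n_samples) by majority vote.
    Ties are resolved in favour of the smaller class label."""
    predictions = np.asarray(predictions)
    n_classes = int(predictions.max()) + 1
    # Count votes per class for every sample (column).
    counts = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return counts.argmax(axis=0)

bags = bootstrap_indices(n_samples=10, n_bags=3)
# Predictions of three hypothetical base learners on three test samples.
preds = [[0, 1, 1],
         [0, 0, 1],
         [1, 0, 1]]
final = majority_vote(preds)
```

For imbalanced data, the bootstrap step is typically replaced by a class-balanced draw, as in the undersampling- and oversampling-based bagging variants discussed in this section.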

Researchers have recently concentrated on applying bagging ensemble learning techniques to class imbalance. For example, Bader-El-Den et al. (2018) proposed an ensemble-based approach, called biased random forest, to address the class imbalance problem in machine learning. Rather than oversampling the minority class in the dataset, the technique increases the number of classifiers representing the minority class in the ensemble, by identifying critical areas using the nearest neighbor algorithm and generating additional random trees for them. Błaszczyński and Stefanowski (2015) investigated extensions of bagging ensembles for imbalanced data, compared undersampling and oversampling approaches, and proposed Neighbourhood Balanced Bagging, a method that considers the local characteristics of the minority class distribution. Guo et al. (2022) proposed a dual evolutionary Bagging framework that combines resampling techniques and ensemble learning to solve the class imbalance problem. The framework aims to find the most compact and accurate ensemble structure by integrating different base classifiers. After selecting the best base classifiers and using an internal ensemble model to enhance diversity, a multimodal genetic algorithm finds the optimal combination based on the mean G-mean.

In summary, bagging offers several advantages for the classification of imbalanced data. Firstly, it can obtain a more balanced dataset and improve classification performance through random sampling. In addition, bagging reduces the variance of the model by creating multiple subsets of the original data and training multiple base classifiers independently, which helps to reduce overfitting and improve generalization. Finally, bagging is simple to implement and can be combined with a variety of base classifiers. Bagging requires neither adjusting the weight update formula nor changing the amount of computation in the algorithm, and it achieves good generalization with a simple structure. Therefore, when using ensemble learning to deal with imbalance problems, Bagging may be the better choice if low model complexity, short running time and robustness are important.

Although bagging can improve overall classification performance, it may still struggle to accurately classify minority classes if they are severely under-represented. In addition, bagging may not be effective on datasets with overlapping classes or complex decision boundaries. It is also worth noting that the performance of bagging relies heavily on the choice of the base classifier and the quality of the individual models in the ensemble.

3.4.4 Cost-sensitive ensemble

Cost-sensitive ensemble learning takes into account the imbalance costs associated with different classes and aims to optimise the overall cost of misclassification. Such methods take into account cost factors in the decision making process. By assigning appropriate weights to adjust the decision thresholds of individual classifiers, cost-sensitive ensemble techniques aim to minimise the overall cost of misclassification and to improve the performance of specific classes that bear higher costs.
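
How misclassification costs move a decision threshold can be shown with the standard decision-theoretic rule for a binary problem: predict the positive class when the expected cost of a false negative exceeds that of a false positive. The sketch below assumes zero cost for correct decisions; the function names and cost values are illustrative and do not correspond to a specific method in this survey.

```python
import numpy as np

def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting positive has the lower expected cost.
    Derived from p * cost_fn >= (1 - p) * cost_fp."""
    return cost_fp / (cost_fp + cost_fn)

def cost_sensitive_predict(p_pos, cost_fp=1.0, cost_fn=9.0):
    """Threshold predicted positive-class probabilities at the cost-derived cut-off."""
    return (np.asarray(p_pos) >= cost_threshold(cost_fp, cost_fn)).astype(int)

p_pos = np.array([0.05, 0.2, 0.6])           # e.g. averaged ensemble probabilities
default_pred = (p_pos >= 0.5).astype(int)    # cost-blind threshold
cost_pred = cost_sensitive_predict(p_pos)    # threshold 0.1 when a miss costs 9x
```

With a false negative nine times as expensive as a false positive, the threshold drops from 0.5 to 0.1, so the second sample is now flagged as positive.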

An increasing number of scholars have focused on cost-sensitive ensemble learning. The most widely used method is AdaCost (Fan et al. 1999). AdaCost is a misclassification cost-sensitive boosting method that updates the training distribution based on misclassification costs, aiming to minimize cumulative misclassification costs more effectively than AdaBoost; empirical evaluations demonstrate significant cost reductions without additional computational overhead. Many scholars have since explored cost-sensitive ensembles from various angles. By employing a cascade of simple classifiers trained with a variant of AdaBoost, Viola and Jones (2001) demonstrated a method for fast detection in domains with highly skewed distributions, such as face detection or database retrieval. The approach significantly outperforms traditional AdaBoost in face detection tasks thanks to its high detection rates, extremely low false positive rates, and speed. Ng et al. (2018) introduced an incremental ensemble learning method that addresses concept drift and class imbalance in a streaming data environment; class imbalance is handled by an imbalance inversion bagging method, and the approach is applied to predicting Australia’s electricity price. Akila and Reddy (2018) developed a cost-sensitive Risk Induced Bayesian Inference Bagging model, RIBIB, for detecting credit card fraud. RIBIB uses a bagging architecture that includes a limited bag formation approach, a Risk Induced Bayesian Inference base learner, and a cost-sensitive weighted voting combiner. Zhang et al. (2022) proposed an integrated framework, BPUL-BCSLR, for data-driven mineral prospectivity mapping (MPM) that addresses the challenges of imbalanced geoscience datasets and cost-sensitive classification. The approach integrates Bagging-based positive-unlabeled learning (BPUL) with Bayesian cost-sensitive logistic regression (BCSLR) and was applied to MPM in the Wulong Au district, China.

Cost-sensitive ensemble learning is valuable for imbalanced datasets. It accounts for misclassification costs, vital in real-world scenarios where rare class errors are expensive. By integrating cost factors, it improves decision-making and resource allocation. These methods balance error types, enhancing sensitivity to minority classes, thus improving overall classification and class distribution representation. While cost-sensitive ensemble learning has its advantages, there are certain challenges associated with its implementation. Estimating error costs demands prior knowledge or expert input, introducing subjectivity. Selecting an effective ensemble combination strategy considering cost factors can be complex. It is not easy to optimize cost-based objectives while maintaining variety. As a result, data distribution bias often affects cost-sensitive ensemble approaches, which might result in comparatively good performance but poor resilience.

4 Regression and Semi/unsupervised learning in imbalanced data

In this section, we first describe solutions to the regression problem in imbalanced scenarios. We then outline how semi-supervised and unsupervised learning approach imbalance problems.

4.1 Regression in imbalanced scenarios

There is a notable gap in systematically exploring the imbalanced perspective of machine learning algorithms in the context of the regression problem. The regression problem in an unbalanced scenario arises when predicting continuous values, where the target variables in the dataset exhibit an imbalanced distribution. Traditional regression tasks aim to make accurate predictions for all target values. However, in imbalanced scenarios, certain target values have significantly fewer instances, resulting in skewed datasets (Krawczyk 2016 ; Rezvani and Wang 2023 ; Yang et al. 2021 ). This presents a challenge for regression models as the limited availability of training data may hinder their ability to accurately predict these infrequent target values.

The majority of the research on the imbalanced regression problem has focused on developing assessment metrics (Torgo and Ribeiro 2009) that consider how important observations are, as well as techniques for handling undersampling and outliers in continuous output prediction. For example, Branco et al. (2017) proposed SMOGN, a pre-processing technique tailored to imbalanced regression tasks. SMOGN addresses the performance degradation observed on rare and relevant cases in imbalanced domains. Branco et al. (2019) proposed three methods to address the problems posed by imbalanced distributions in regression tasks: an adaptation of random oversampling, the introduction of Gaussian noise, and a new method called WERCS (WEighted Relevance-based Combination Strategy). To solve the problem of class imbalance in ordinal regression, Zhu et al. (2019) developed SMOR (Synthetic Minority oversampling methodology for imbalanced Ordinal Regression). SMOR takes the class order into account and gives low selection weights to candidate generation directions that could skew the structure of the ordinal sample.

In recent years, a number of researchers have addressed the imbalanced regression problem at the algorithmic level and at the ensemble learning level. For example, Branco et al. (2018) introduced the REsampled BAGGing (REBAGG) algorithm, an ensemble method designed to address imbalanced domains in regression tasks. REBAGG incorporates data pre-processing strategies and utilizes a bagging-based approach. To diagnose obstructive sleep apnea from overnight pulse oximetry, Gutiérrez-Tobal et al. (2021) proposed a least-squares boosting (LSBoost) model for predicting the apnea-hypopnea index (AHI). The model achieves high diagnostic performance in both community-based non-referral and clinical referral cohorts, demonstrating its ability to generalize. Kim et al. (2019) introduced Ensemble Learning Regression (ELQ), a method for predicting river discharge (Q) from hydraulic variables derived from remotely sensed data. The ELQ method combines multiple functions to reduce errors and outperforms traditional single-rating curve methods. Liu et al. (2022) proposed an ensemble learning assisted method for accurately predicting fuel properties based on their molecular structures. By comparing two descriptors, COMES and CM, they obtain an optimized stacking of various base learners that efficiently screens potential high energy density fuels (HEDFs) and accurately predicts their properties. Steininger et al. (2021) proposed DenseWeight, a sample weighting approach based on kernel density estimation, and DenseLoss, a cost-sensitive learning approach for neural network regression. DenseLoss adjusts the influence of each data point on the loss function according to its rarity, leading to improved model performance for rare data points. Ren et al. (2022) proposed a novel loss function, the balanced MSE, specifically designed for the imbalanced regression task. They offer multiple implementations of the balanced MSE, including one that does not require prior knowledge of the training label distribution, to address data imbalance in real-world visual regression.
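Density-based sample weighting of the kind described above can be sketched as follows. This is an illustration in the spirit of DenseWeight and DenseLoss, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the min-max scaling of the density, and all function names are assumptions.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a function that estimates the density of `points` (Gaussian kernel)."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(math.exp(-0.5 * ((y - p) / bandwidth) ** 2) for p in points)

    return density


def density_weights(targets, alpha=1.0, bandwidth=1.0, eps=1e-6):
    """Give larger weights to targets that lie in low-density regions.

    alpha = 0 yields uniform weights; larger alpha emphasises rare targets more.
    """
    density = gaussian_kde(targets, bandwidth)
    dens = [density(y) for y in targets]
    lo, hi = min(dens), max(dens)
    span = (hi - lo) or 1.0
    scaled = [(d - lo) / span for d in dens]            # density mapped to [0, 1]
    raw = [max(1.0 - alpha * s, eps) for s in scaled]   # rare target -> weight near 1
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]                  # weights average to 1


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error in which each sample is scaled by its weight."""
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / len(y_true)
```

Normalising the weights to a mean of one keeps the overall scale of the loss comparable to that of the unweighted MSE.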

In the context of regression problems, many solutions developed for imbalanced classification can be extended, but they currently fall short of addressing the challenges specific to imbalanced regression. Ensemble learning methods hold promise for enhancing the robustness and predictive power of regression models on imbalanced data. By combining the predictions of multiple regression models, it becomes possible to mitigate the errors and biases associated with infrequent groups of target values, thereby improving overall prediction performance. The advantage lies in effectively leveraging the collective strengths of multiple models. However, the diversity of the ensemble members must be controlled carefully to prevent overfitting.

4.2 Semi/unsupervised learning in imbalanced scenarios

Semi/unsupervised learning in imbalanced scenarios involves training machine learning models when labeled data is scarce or imbalanced across classes. Semi-supervised approaches leverage both labeled and unlabeled data to improve model performance, addressing challenges posed by class imbalance. The unsupervised side involves imbalanced clustering problems, in which some clusters contain many more points than others. Traditional clustering methods may have difficulty accurately identifying and representing minority clusters, leading to problems such as poor clustering quality and loss of information about those clusters. In semi-supervised imbalanced learning (Wei et al. 2021; Yang and Xu 2020), the challenge is not only the lack of sufficient labelled samples, but also that the distribution of these labelled samples exhibits class imbalance.

Semi-supervised imbalanced learning (SSIL) is a key problem in dealing with imbalanced data when labelled samples are scarce. For example, Wei et al. (2021) proposed Class-Rebalancing Self-Training, which iteratively retrains the SSIL model and selects pseudo-labelled minority class samples more frequently. Distribution Aligning Refinery of Pseudo-label (DARP) (Kim et al. 2020) refines the pseudo-labels generated by models that are biased towards the majority class, improving the generalisation ability of SSIL under the balanced test criterion. In addition, a scalable SSIL algorithm with an auxiliary balanced classifier (ABC) (Lee et al. 2021) copes with class imbalance by enforcing balance in the auxiliary classifier. Other studies (Yang and Xu 2020; Lee et al. 2021) have argued for the value of imbalanced labels. When more unlabelled data are available, the original labels can be used for semi-supervised learning along with the additional data to reduce label bias and significantly improve the final classifier performance.
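The class-rebalancing step of self-training can be sketched as below. The rule used here, in which the class ranked l by labelled frequency keeps its pseudo-labelled samples at a rate given by the mirrored class count divided by the largest count and raised to a power alpha, is our reading of the class-rebalancing idea and should be treated as an assumption; the function names are hypothetical.

```python
import random


def rebalancing_rates(labeled_counts, alpha=0.5):
    """Map each class to the fraction of its pseudo-labelled samples to keep.

    The most frequent class receives the smallest rate and the rarest class
    receives a rate of 1, so minority pseudo-labels are selected more often.
    """
    order = sorted(labeled_counts, key=labeled_counts.get, reverse=True)
    largest = labeled_counts[order[0]]
    rates = {}
    for rank, cls in enumerate(order):
        mirrored = order[len(order) - 1 - rank]  # class at the opposite rank
        rates[cls] = (labeled_counts[mirrored] / largest) ** alpha
    return rates


def select_pseudo_labels(pseudo_labeled, rates, rng):
    """Keep each (sample, predicted_class) pair with its class-specific probability."""
    return [(x, c) for x, c in pseudo_labeled if rng.random() < rates[c]]
```

The selected pairs would then be added to the labelled set before the model is retrained in the next self-training round.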

To address the imbalance problem in the unsupervised case, specialised algorithms and techniques have been developed to ensure fair and valid clustering results. These methods aim to enhance the representation of minority clusters, take imbalance factors into account, and achieve a fairer clustering distribution. For example, Nguwi and Cho (2010) combined support vector machines and ESOM for variable selection and for clustering the ranked features. Zhang et al. (2019) and Lu et al. (2019) handled imbalanced clusters by integrating interval type-2 fuzzy local metrics. Zhang et al. (2023) proposed a k-means algorithm with adaptive cluster weights, which optimises the trade-off between the cluster weights to solve the imbalanced clustering problem. To mine fused location data, Cai et al. (2022) developed a clustering technique that deals with imbalanced datasets. The OSRCIH method proposed by Wen et al. (2021) combines autonomous learning and spectral rotation clustering to tackle the challenges of imbalanced class distribution and high dimensionality. These methods combine considerations such as variable selection, fuzzy metrics, and local density.

Overall, the methods proposed for clustering and semi-supervised learning of imbalanced data show some promise. Schemes that automatically determine the cluster centres and the number of clusters are well suited to arbitrarily shaped imbalanced data sets. However, unsupervised methods may require careful tuning of parameters and evaluation using specialised metrics. Although semi-supervised learning can significantly improve model performance on imbalanced data, the correlation between the unlabelled data and the original labelled data has a significant impact on its results. Moreover, SSIL does not yet truly integrate dedicated strategies for imbalanced learning, so there is still considerable room for improvement.

5 Deep learning classification problems under long-tail distribution

The long-tailed class distribution refers to a specific pattern observed in datasets where the occurrence of classes follows a long-tailed or power-law distribution. This distribution exhibits a small number of classes with a high frequency of instances, known as the “head,” while the remaining classes have significantly fewer instances, forming the “tail.” Consequently, the tail classes are commonly referred to as the minority classes, whereas the head classes are considered the majority classes. A comprehensive review of current research on deep long-tailed learning and its future directions can be found in the survey by Zhang et al. (2023). This distribution pattern is frequently encountered in various real-world scenarios, including image detection (Zang et al. 2021), visual relation learning (Desai et al. 2021), and few-shot learning (Wang et al. 2020), where certain classes are more prevalent than others. Ghosh et al. (2022) demonstrate through extensive experiments that the class imbalance problem is not eliminated by deep learning. They also review many of the solutions offered so far, which are generally categorised into post-processing, pre-processing, and dedicated algorithms.
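A power-law class profile of this kind can be generated synthetically to make the head and tail concrete. The sketch below is illustrative only: the rule that the class at rank k receives `head_size / k**exponent` samples, and the function names, are assumptions rather than a standard benchmark protocol.

```python
def power_law_class_sizes(num_classes, head_size, exponent=1.0):
    """Class sizes that decay as a power law of the class rank (rank 1 is the head)."""
    return [max(1, round(head_size * rank ** -exponent))
            for rank in range(1, num_classes + 1)]


def imbalance_ratio(sizes):
    """Ratio between the largest and the smallest class size."""
    return max(sizes) / min(sizes)


def head_share(sizes, num_head):
    """Fraction of all samples that belong to the `num_head` largest classes."""
    ordered = sorted(sizes, reverse=True)
    return sum(ordered[:num_head]) / sum(ordered)
```

With ten classes, a head size of 1000 and an exponent of 1, the three largest classes hold more than half of all samples, which is the situation the head and tail terminology describes.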

Long-tail learning and class imbalanced learning are two interrelated yet distinct research areas. Long-tail learning can be viewed as a specialized sub-task of class imbalanced learning. The main distinction is that in long-tail learning the samples of tail classes are typically very scarce, whereas in class imbalanced learning the number of minority class samples is not necessarily small in absolute terms (Zhang et al. 2023; Li et al. 2021). Despite these differences, both research areas are dedicated to addressing the challenges posed by class imbalance and share certain ideas and approaches, such as class rebalancing, when developing advanced solutions. At the same time, Ghosh et al. (2022) explored whether the effect of class imbalance on deep learning models is related to its effect on their shallow learning counterparts, with the aim of determining whether deep learning has fully solved the problem.

5.1 Class rebalancing

Recent research on long-tail distributions can be divided into three categories: class rebalancing, information enhancement, and module improvement (Zhang et al. 2023). Class rebalancing is one of the main approaches to long-tail learning; it aims to address the negative effects of class imbalance by rebalancing the number of training samples. Recent deep long-tail research has used various class-balanced sampling methods, rather than random resampling, for the mini-batch training of deep models. However, these strategies require prior knowledge of the frequency of training samples for different categories, which may not be available in practice (Zhang et al. 2023; Kang et al. 2019).
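A common way to express such sampling strategies is to draw class j with probability proportional to its sample count raised to a power q, so that q = 1 gives ordinary instance-balanced sampling, q = 0 gives class-balanced sampling, and q = 0.5 gives square-root sampling. The sketch below illustrates this family; the function names and the two-step draw (class first, then a sample within the class) are assumptions made for illustration. It also makes explicit that the class counts must be known in advance.

```python
import random


def class_sampling_probs(class_counts, q):
    """Probability of drawing each class: proportional to count ** q."""
    powered = [n ** q for n in class_counts]
    total = sum(powered)
    return [p / total for p in powered]


def draw_batch(samples_by_class, probs, batch_size, rng):
    """Build a mini-batch by first picking a class, then a sample of that class."""
    classes = list(range(len(samples_by_class)))
    batch = []
    for _ in range(batch_size):
        cls = rng.choices(classes, weights=probs, k=1)[0]
        batch.append((rng.choice(samples_by_class[cls]), cls))
    return batch
```

For class counts of 900, 90 and 10, instance-balanced sampling draws the tail class about 1% of the time, whereas class-balanced sampling draws it one third of the time.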

Recent work has proposed that rebalancing an imbalanced classification dataset should essentially rebalance only the classifier and should not alter the distribution of the image features used for feature learning (Zhou et al. 2020). Wang et al. (2019) proposed a unified framework called Dynamic Curriculum Learning (DCL), which adaptively adjusts the sampling strategy and the loss weights in each batch. DCL combines a two-level curriculum scheduler for data distribution and learning importance, resulting in improved generalization and discriminative power. Zhang et al. (2021) proposed FrameStack, a frame-level sampling method that dynamically balances the class distribution during training, thereby improving video recognition performance without compromising overall accuracy.

There are also methods that incorporate meta-learning (Liu et al. 2020; Hospedales et al. 2021). Zang et al. (2021) proposed a method called Feature Augmentation and Sampling Adaptive (FASA) to address the challenge of data scarcity for rare object classes in long-tail instance segmentation. FASA uses an adaptive feature augmentation and sampling strategy to augment the feature space of rare classes, using information from past iterations and adjusting the sampling process to prevent overfitting. Lee et al. (2019) introduced Bayesian Task-Adaptive Meta-Learning (Bayesian TAML), a meta-learning model that tackles the drawbacks of previous techniques by adaptively balancing the impact of meta-learning and task-specific learning within each task. By learning the balancing variables, Bayesian TAML determines whether to rely on meta-knowledge or on task-specific learning to obtain solutions. Dablain et al. (2022) proposed DeepSMOTE, an oversampling algorithm for deep learning models, which addresses the challenge of imbalanced data by combining an encoder/decoder framework, SMOTE-based oversampling, and a penalty-enhanced loss function.

Another approach is re-weighting, which addresses the long-tail or imbalanced distribution problem by modifying the loss. It is often simple to implement: changing only a few lines of code in the loss can yield very competitive results. Cui et al. (2019) proposed a theoretical framework for addressing long-tailed data distributions that measures data overlap using small neighboring regions instead of single points. To establish class balance in the loss function, a re-weighting scheme is built on the effective number of samples, which is computed from the volume of data. Jamal et al. (2020) proposed a meta-learning method that explicitly estimates the differences between class-conditional distributions, which enhances classical class-balanced learning by linking class balancing methods to domain adaptation. Cao et al. (2019) proposed the label-distribution-aware margin (LDAM) loss together with a deferred re-weighting training schedule. The loss minimises a margin-based generalisation bound, and the schedule allows the model to learn an initial representation before re-weighting is applied, thus improving the performance of imbalanced learning.
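Re-weighting by the effective number of samples can be written in a few lines. In the sketch below, the effective number of a class with n samples is taken as (1 - beta^n) / (1 - beta) and each class weight is its inverse, rescaled so that the weights sum to the number of classes. The value of beta, the normalisation, and the function names are assumptions made for illustration.

```python
import math


def effective_number(n, beta):
    """Effective number of samples for a class with n samples, with 0 <= beta < 1."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class loss weights inversely proportional to the effective number."""
    inverse = [1.0 / effective_number(n, beta) for n in class_counts]
    scale = len(class_counts) / sum(inverse)  # weights sum to the number of classes
    return [w * scale for w in inverse]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy of one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])
```

As beta approaches 0 every class has an effective number of 1 and the weights become uniform; as beta approaches 1 the weights approach inverse class frequency.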

Class rebalancing is a simple but well-performing approach to long-tail learning, especially when inspired by class-sensitive learning. This makes it an attractive option for real-world applications. However, class rebalancing methods usually involve performance trade-offs: improving the performance of the tail classes may decrease the performance of the head classes. To overcome this problem, different approaches can be combined, but the pipeline needs to be designed carefully to avoid performance degradation (Zhang et al. 2023). This suggests that the long-tail problem may require information-enhanced approaches to deal effectively with the deficiencies of tail classes.

5.2 Information enhancement

To increase the performance of deep learning models in long-tailed learning scenarios, information enhancement methods enrich the information available during model training. Transfer learning and data augmentation are the two primary areas that this approach covers (Zhang et al. 2023).

Transfer learning aims to learn generic knowledge from the sample-rich head classes and then transfer it to the sample-poor tail classes. Recently, there has been growing interest in applying transfer learning to scenarios with deep long-tail distributions. For example, Liu et al. addressed the challenge of learning deep features from long-tailed data by proposing a method that expands the distribution of tail classes in the feature space. The method augments each instance of the tail classes with disturbances, creating a “feature cloud” that provides higher intra-class variation. This enhances deep representation learning on long-tailed data, since it reduces the distortion of the feature space caused by the unequal distribution between head and tail classes. Xiang et al. (2020) introduced a framework for self-paced knowledge distillation. This approach includes two levels of adaptive learning schedules, namely self-paced expert selection and curriculum instance selection, aiming to effectively transfer knowledge from multiple “experts” to a unified student model. Wang et al. (2020) addressed the challenge of imbalanced classification in long-tailed data by proposing a new classifier called RoutIng Diverse Experts (RIDE). To lower model variance, decrease model bias, and reduce computing costs, RIDE makes use of multiple experts, a distribution-aware diversity loss, and a dynamic expert routing module. Experimental results further indicate that self-supervised learning plays a positive role in learning a balanced feature space for long-tailed data (Kang et al. 2020). In addition, methods for handling long-tailed data with noisy labels are being investigated (Karthik et al. 2021).

Data augmentation in the context of deep long-tailed distributions refers to techniques that aim to improve the size and quality of the datasets used for model training. It involves applying pre-defined transformations, such as rotation, scaling, cropping, or flipping, to each data point or feature in the dataset (Shorten and Khoshgoftaar 2019; Zhang et al. 2023). This category is divided into head-to-tail transfer augmentation and non-transfer augmentation. In head-to-tail transfer augmentation, augmented samples are transferred from the head classes to the tail classes. By applying pre-defined transformations to samples from the head classes and adding them to the tail class data, the tail class data is enriched, allowing for better representation and learning of the tail classes. This approach helps to mitigate the class imbalance issue and improves the generalization ability of the model on the tail classes. For example, Kim et al. (2020) proposed to enhance less frequent classes by translating samples from more frequent classes. This enables the network to learn more generalisable features for minority classes as a way to address class imbalance in deep neural networks. Chen et al. (2022) proposed a reasoning-based implicit semantic data augmentation method to address the performance degradation of existing classification algorithms caused by long-tailed data distributions. By borrowing transformation directions from similar categories using covariance matrices and a knowledge graph, they generate diverse instances for tail categories. Zhang et al. (2022) proposed a data augmentation method based on Bidirectional Encoder Representation from Transformers (BERT) to address the long-tailed and imbalanced distribution problem in Mandarin Chinese polyphone disambiguation. They incorporate weighted sampling and filtering techniques to balance the data distribution and improve prediction accuracy. Dablain et al. (2023) proposed a three-stage CNN training framework with extended oversampling (EOS). It aims to close the generalisation gap of minority classes in imbalanced image data by combining end-to-end training, data augmentation learned in the embedding space, and fine-tuning of the classifier head.

Information enhancement is compatible with other methods such as class rebalancing and module improvement. Its two subtypes, transfer learning and data augmentation, can with careful design improve the performance of the tail categories without degrading the performance of the head categories. However, simply applying category-independent augmentation techniques may not be effective enough: they ignore the category imbalance problem, may add more samples from the head categories than from the tail categories, and may introduce additional noise. How to better perform data augmentation for long-tail learning therefore still requires further research.

5.3 Module improvement

In addition to class rebalancing and information enhancement methods, researchers have in recent years explored ways to improve the model itself. These methods can be divided into representation learning, classifier design and decoupled training (Zhang et al. 2023). They involve analyzing the challenges posed by the distribution and making targeted modifications to the model’s architecture, loss function, training strategies, or data augmentation techniques to better handle the inherent biases and class imbalances.

Representation learning aims to learn feature representations that capture the underlying structure and discriminative information in the data, while also addressing the imbalance issue. For example, Chen et al. (2021) proposed a method based on the principles of causality, leveraging a meta-distributional scenario to enhance sample efficiency and model generalization. Liu et al. (2023) proposed the Transfer Learning Classifier (TLC) to address the challenges of class-imbalanced data and real-time visual data in computer vision. The TLC model incorporates an active sampling module to dynamically adjust the skewed distribution and a DenseNet module for efficient relearning. Kuang et al. (2021) proposed a class-imbalance adversarial transfer learning (CIATL) network to address the challenges of cross-domain fault diagnosis when dealing with class-imbalanced machine fault data. The CIATL network incorporates class-imbalanced learning and double-level adversarial transfer learning to learn domain-invariant and class-separable diagnostic knowledge. Recent studies have also explored contrastive learning approaches for addressing long-tailed problems. Methods such as KCL (Kang et al. 2020), PaCo (Cui et al. 2021), Hybrid (Wang et al. 2021), and DRO-LT (Samuel and Chechik 2021) have been proposed, introducing techniques such as k-positive contrastive loss, parametric learnable class centers, prototypical contrastive learning, and distributionally robust optimization, respectively. These methods seek to reduce class imbalance, boost model generalization, and strengthen the learnt models’ resistance to distribution shift.

Traditional deep learning classification algorithms prioritize the majority class, resulting in poor minority class performance. In addition, the loss functions of most classifiers are based on linear functions. To overcome these problems, different classifier designs have been proposed in recent years for the deep long-tail distribution problem. The Realistic Taxonomic Classifier (RTC) (Wu et al. 2020) uses hierarchical classification, mapping images into a class taxonomic tree structure. Samples are adaptively classified at different levels based on difficulty and confidence, prioritizing correct decisions at intermediate levels. The causal classifier applies a multi-head strategy to capture bias information and mitigate long-tailed bias accumulation (Tang et al. 2020). The GIST classifier (Liu et al. 2021) transfers the geometric structure of head classes to tail classes, improving performance on tail classes by enhancing tail-class weight centers through displacements from head classes. Zhou et al. (2022) proposed a debiased approach called DSDI to address the dual imbalance problem in scene graph generation (SGG). By adding a biased resistance loss and a causal intervention tree, the strategy addresses the uneven distribution of both foreground-background occurrences and foreground relationship categories in SGG datasets.

Decoupled training addresses the long-tail problem by dividing the training process into two stages: representation learning and classifier learning. This approach allows the model to learn a more discriminative representation of the data, effectively capturing the inherent characteristics of both the majority and minority classes. Decoupled training methods have shown promising results in improving the classification accuracy and generalization of models in the context of deep long-tail distributions. Nam et al. (2023) demonstrated the effectiveness of learning the representation and the classifier separately in long-tail classification. Their approach trains the feature extractor with stochastic weight averaging (SWA) to obtain a generalised representation, and uses a classifier retraining algorithm based on stochastic representations and uncertainty estimation to construct robust decision boundaries. Kang et al. (2020) used a k-positive contrastive loss to create a more balanced and discriminative feature space, which improved long-tailed learning performance. MiSLAS found that data mixup enhances representation learning but has a negative or negligible effect on classifier training, and therefore proposed a two-stage approach with data mixup for representation learning and label-aware smoothing for better classifier generalization.
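The data mixup operation mentioned above for the representation learning stage can be illustrated with a minimal sketch. It shows only the generic mixup interpolation between two samples, with a mixing coefficient drawn from a Beta(alpha, alpha) distribution; it is not the MiSLAS training procedure, and the function name and the list-based feature and one-hot label formats are assumptions.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha, rng):
    """Convexly combine two samples and their one-hot labels.

    The mixing coefficient lam is drawn from Beta(alpha, alpha); the same lam
    is applied to the features and to the labels.
    """
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam
```

In a two-stage pipeline of the kind described above, mixed samples would be used only while the feature extractor is trained; the classifier stage would then run on unmixed data.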

Module improvement methods solve long-tail problems by changing network modules or objective functions. These techniques complement decoupled training and provide a conceptually simple approach to solving real-world long-tail application problems. However, such methods tend to suffer from high computational complexity, complex model design, limited generalizability, and a risk of overfitting, and they do not guarantee substantial improvements. They therefore require careful consideration and customization for specific scenarios.

6 Imbalanced learning in data streams

Imbalanced data streams refer to continuously arriving data instances in a streaming environment, characterized by a highly skewed and uneven class distribution. These data streams present unique challenges due to the amalgamation of streaming and imbalanced data characteristics, including concept drift and evolving class ratios. Effective mining of imbalanced data streams necessitates adaptable algorithms capable of swiftly adapting to changing decision boundaries, imbalance ratios, and class roles, while maintaining efficiency and cost-effectiveness. In the last few years, several reviews (Fernández et al. 2018 ; Aguiar et al. 2023 ; Alfhaid and Abdullah 2021 ) have provided comprehensive insights into the development of techniques for imbalanced data streams. These reviews offer an overview of data stream mining methods, discuss learning difficulties, explore data-level and algorithm-level approaches for handling skewed data streams, and address challenges such as emerging and disappearing classes, as well as the limited availability of ground truth in streaming scenarios. Aguiar et al. ( 2023 ) have presented a comprehensive experimental framework, evaluating the performance of 24 state-of-the-art algorithms on 515 imbalanced data streams. This framework covers various scenarios involving static and dynamic class imbalance, concept drift, and incorporates both real-world and semi-synthetic datasets. Recent methods for addressing imbalanced data streams can be classified into three categories: (1) online or ensemble learning approaches, (2) incremental learning approaches, and (3) concept drift handling methods.

6.1 Online or ensemble learning

Online or ensemble learning for imbalanced data streams refers to algorithms and techniques that continuously update and adapt classifiers to handle the evolving and imbalanced nature of streaming data. They do so either by combining multiple classifiers or by updating the classifier’s model online with newly arriving data, in order to improve classification accuracy and performance in imbalanced scenarios. For example, Du et al. (2021) introduced a cost-sensitive online ensemble learning algorithm that incorporates several balancing techniques, such as initializing the base classifiers, dynamically calculating misclassification costs, sampling data stream samples, and determining base classifier weights. Furthermore, some researchers have explored algorithmic techniques for online cost-sensitive learning, integrating online ensemble algorithms with batch-mode methods to obtain cost-sensitive bagging or boosting algorithms (Wang and Pineau 2016; Klikowski and Woźniak 2022; Jiang et al. 2022). Zyblewski et al. (2021) presented a framework that combines non-stationary data stream classification with data analysis of skewed class distributions, using stratified bagging, data preprocessing, and dynamic ensemble selection methods. In addition to exploring high-dimensional imbalanced data streams, recent research has also explored the problem of incomplete imbalanced data streams. You et al. (2023) proposed an algorithm called OLI2DS for learning from incomplete and imbalanced data streams, addressing the limitations of existing approaches. The method uses empirical risk minimisation to detect informative features in the missing data space.
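One simple way to make an online ensemble sensitive to imbalance is to vary how often each incoming example is presented to the base learners. The sketch below follows the general online bagging idea, in which each learner is updated a Poisson-distributed number of times per example, and uses a larger Poisson rate for the minority class. It does not reproduce any of the cited algorithms; the toy base learner, the class names, and the fixed per-class rates are assumptions made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


class RunningMeanClassifier:
    """Toy base learner: predicts the class whose running feature mean is closest."""

    def __init__(self):
        self.sums, self.counts = {}, {}

    def learn(self, x, y):
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        if not self.counts:
            return None
        return min(self.counts, key=lambda c: abs(x - self.sums[c] / self.counts[c]))


class ImbalanceAwareOnlineBagging:
    """Online bagging in which the update rate depends on the example's class."""

    def __init__(self, n_learners, class_rates, seed=0):
        self.learners = [RunningMeanClassifier() for _ in range(n_learners)]
        self.class_rates = class_rates  # e.g. {majority: 1.0, minority: 5.0}
        self.rng = random.Random(seed)

    def learn(self, x, y):
        lam = self.class_rates.get(y, 1.0)
        for learner in self.learners:
            # Each learner sees the example a Poisson(lam) number of times.
            for _ in range(poisson(lam, self.rng)):
                learner.learn(x, y)

    def predict(self, x):
        votes = {}
        for learner in self.learners:
            label = learner.predict(x)
            if label is not None:
                votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get) if votes else None
```

In a streaming setting the per-class rates would typically be recomputed from a running estimate of the class proportions rather than fixed in advance.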

The combination of ensemble learning and active learning presents a powerful strategy for addressing imbalanced data streams. It not only enhances the learning process by actively selecting informative samples but also utilizes the collective knowledge of multiple classifiers to improve classification performance. This integrated approach offers a valuable solution for applications where data arrives continuously and exhibits imbalanced characteristics. For example, Halder et al. (2023) introduced an autonomous active learning strategy (AACE-DI) for handling concept drift in imbalanced data streams. The method incorporates a cluster-based ensemble classifier to select informative instances, minimizing expert involvement and costs. It prioritizes uncertain, representative, and minority class data using an automatically adjusting uncertainty strategy. Zhang et al. (2020) presented a method called Reinforcement Online Active Learning Ensemble for Drifting Imbalanced data stream (ROALE-DI). The approach addresses concept drift and class imbalance by integrating a stable classifier and a dynamic classifier group, prioritizing better performance on the minority class.

Combining ensemble learning methods with sampling methods has proved to be a very effective and popular solution for imbalanced data stream problems in recent years (Krawczyk et al. 2017; Aguiar and Cano 2023). By updating or adding classifiers and by setting dedicated policies for differently skewed data, the combination of ensemble learning and sampling can provide a distinctive solution to concept drift and class imbalance (Aguiar et al. 2023). Aguiar et al. (2023) provide a detailed categorisation of ensemble learning methods combined with data-level approaches in the field of imbalanced data streams. One of the most popular and effective solutions in recent years is the combination of sampling methods with Bagging. For example, the robust online self-tuning ensemble introduced by Cano and Krawczyk (2022) addresses the challenges of concept drift, evolving class distributions, and non-smooth class imbalances by combining online training, concept drift detection, sliding windows for class-specific adaptation, and self-tuning bagging. Klikowski and Woźniak (2022) proposed deterministic sampling classifiers with weighted Bagging, which demonstrated excellent performance across a variety of imbalance ratios, label noise levels, and concept drift types through data preprocessing and weighted Bagging. On the other hand, the effectiveness of the ensemble approach in dealing with imbalanced data streams can be further enhanced by employing specialised combination strategies or block-based adaptive learning (Aguiar et al. 2023). For example, Yan et al. (2022) proposed a dynamically weighted selection ensemble that dynamically adjusts the decay factor of the base classifiers by resampling a small number of samples from previous data blocks. Feng et al. (2022) proposed an incremental learning algorithm, DME, which uses distribution matching and adaptive weighted ensembling to deal efficiently with concept drift in real-world streaming datasets.

Online learning approaches have lower runtime and model complexity in solving imbalanced data streams, but are relatively less robust; in contrast, ensemble learning approaches are typically more robust, but may require more computational resources and time.

6.2 Concept drift

Concept drift in an imbalanced data stream refers to the phenomenon where the underlying data distribution, particularly the class distribution, changes over time. This poses a significant challenge for classifiers trained on imbalanced data, as they may struggle to adapt to the evolving patterns and imbalance ratios (Agrahari and Singh 2022; Lu et al. 2018).
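A changing class distribution can be monitored with a time-decayed estimate of the class proportions, updated after every labelled example. The sketch below is a generic illustration of this idea and not a drift detector from the cited works; the class name, the decay value, and the uniform initial estimate are assumptions.

```python
class DecayedClassPrior:
    """Exponentially decayed estimate of class proportions in a stream."""

    def __init__(self, classes, decay=0.9):
        self.decay = decay
        # Start from a uniform estimate; the proportions always sum to 1.
        self.prior = {c: 1.0 / len(classes) for c in classes}

    def update(self, label):
        """Move the estimate toward the class of the newly observed label."""
        for c in self.prior:
            hit = 1.0 if c == label else 0.0
            self.prior[c] = self.decay * self.prior[c] + (1.0 - self.decay) * hit
        return self.prior

    def minority(self):
        """Class that currently has the smallest estimated proportion."""
        return min(self.prior, key=self.prior.get)
```

Because old observations are discounted geometrically, the estimated minority class switches when the roles of the classes are exchanged, which a static count over the whole stream would miss.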

In recent years, a growing number of researchers have focused on this problem and proposed new methods. For example, Jiao et al. (2022) address concept drift and class imbalance in data streams with a dynamic ensemble selection approach. Their approach incorporates a novel synthetic minority oversampling technique (AnnSMOTE) to generate new minority instances, adapts base classifiers to changing concepts, and constructs an optimal combination of classifiers based on local performance. Korycki and Krawczyk (2021) introduced a taxonomy of obstacles in multi-class imbalanced data streams affected by concept drift. They also put forth a trainable concept drift detector based on a Restricted Boltzmann Machine, capable of independently monitoring multiple classes and detecting changes through reconstruction error. Liu et al. (2021) proposed CALMID, a comprehensive active learning method for multiclass imbalanced data streams with concept drift, which combines an ensemble classifier, a drift detector, a label sliding window, a sample sliding window and an initialised training sample sequence. CALMID addresses multiclass imbalance and concept drift using a variable threshold uncertainty strategy and a novel sample weight formulation. Ren et al. (2018) proposed the Gradual Resampling Ensemble (GRE). GRE selectively resamples previous minority examples using a DBSCAN clustering approach, which avoids the influence of small disjuncts and outliers and ensures that only minority examples with a low probability of overlapping with the current majority set are selected.

For imbalanced data streams with concept drift, classifiers built on sampling methods are prone to overfitting, which makes drift adaptation inefficient. Dynamic ensemble selection can generate a variety of minority instances based on the current distribution of the data stream, providing more valuable information for classification under concept drift. However, this also increases model complexity and requires more runtime, both in the data space and in classifier design and selection.

6.3 Incremental learning

Incremental learning for imbalanced data streams is the process of continually updating and adapting the learning model to cope with concept drift, a phenomenon where the underlying data distribution changes, while also addressing class imbalance in the data stream (Ditzler and Polikar 2012).

In response to the challenges posed by imbalanced data streams, researchers have increasingly turned to incremental learning techniques. By gradually incorporating new information, incremental learning algorithms can dynamically adjust their decision boundaries, assign appropriate weights to different samples or classes, and prioritize learning from the minority class. These adaptive mechanisms improve the overall performance and accuracy of the model on imbalanced data streams. For example, Li et al. (2020) proposed the Dynamic Updated Ensemble (DUE), a chunk-based algorithm that addresses concept drift and class imbalance in nonstationary data streams. DUE learns one chunk at a time without accessing previous data, emphasises misclassified examples during model updates, adapts to multiple types of concept drift in a timely manner, handles the switch from majority to minority class, and maintains efficiency with a limited number of classifiers. Lu et al. (2017) proposed Dynamic Weighted Majority Imbalanced Learning (DWMIL) to address concept drift and class imbalance in data streams. DWMIL is a chunk-based incremental learning approach that uses an ensemble framework with dynamically weighted base classifiers, which provides stability in non-drifting streams and rapid adaptation to new concepts. It is fully incremental, does not require storage of previous data, uses a limited number of classifiers to maintain high efficiency, and has a simple implementation with only one threshold parameter.
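The chunk-based, dynamically weighted ensembles described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the published DUE or DWMIL procedure: each member's weight is shrunk by its plain error rate on the newest chunk, members whose weight falls below a single threshold are dropped, and the model trained on the newest chunk joins with weight 1. The function names, the error measure, and the threshold value are hypothetical; in an imbalanced setting the plain error would normally be replaced by an imbalance-aware measure.

```python
import numpy as np


def update_ensemble(ensemble, new_model, X, y, threshold=0.01):
    """One chunk-wise update of a dynamically weighted ensemble.

    `ensemble` is a list of (predict_fn, weight) pairs. No earlier chunk is
    needed: only the newest chunk (X, y) is used to re-weight the members.
    """
    kept = []
    for predict_fn, weight in ensemble:
        error = float(np.mean(predict_fn(X) != y))
        weight *= (1.0 - error)          # poor members fade out over time
        if weight >= threshold:          # single pruning threshold
            kept.append((predict_fn, weight))
    kept.append((new_model, 1.0))        # model trained on the newest chunk
    return kept


def ensemble_predict(ensemble, X):
    """Weighted vote for binary labels in {0, 1}."""
    total = sum(w for _, w in ensemble)
    score = sum(w * predict_fn(X) for predict_fn, w in ensemble) / total
    return (score >= 0.5).astype(int)


# Toy stand-ins for trained base classifiers.
always_zero = lambda X: np.zeros(len(X), dtype=int)
threshold_rule = lambda X: (X[:, 0] > 0).astype(int)

X = np.array([[-1.0], [2.0], [3.0], [-0.5]])
y = np.array([0, 1, 1, 0])

ensemble = [(always_zero, 1.0)]
ensemble = update_ensemble(ensemble, threshold_rule, X, y)
weights = [w for _, w in ensemble]
pred = ensemble_predict(ensemble, X)
```

In this toy run the outdated member misclassifies half of the new chunk, so its weight is halved, and the weighted vote follows the newer member.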

Incremental learning aims to incrementally update models to accommodate new instances, adjust decision boundaries, and mitigate the impact of concept drift and class imbalance on classification performance. The research area focuses on developing methods that can learn from streaming data in an online manner, make timely and accurate predictions, and adapt to changing data characteristics. The incremental learning approach allows for adaptive updating of the model with new chunks of data without re-training and tuning, avoiding cumbersome training iteration sessions and therefore saving a significant amount of runtime.

7 Evaluation indicators for imbalanced learning

In this section, we describe the evaluation metrics used in imbalanced learning. Accuracy is not a suitable evaluation metric for imbalanced learning, for two main reasons: (1) On an imbalanced dataset, a model that tends to predict the majority class can obtain a relatively high accuracy even if the minority classes are completely ignored. Accuracy is therefore insensitive to classification performance on the minority classes, which biases the evaluation. (2) Even if a model classifies the minority classes poorly, accuracy may remain high as long as the majority classes are classified well, so the effect of the imbalance goes unrecognized. The evaluation metrics for imbalanced learning are shown in Table 6.

Most of the evaluation metrics in Table 6 are calculated from the confusion matrix (CM). The CM is a table for evaluating the performance of a classification model. It contains the counts of True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN), which are used to analyze the correct and incorrect classifications of the model. Metrics such as AUC and G-mean are preferred because they are largely unaffected by imbalance in the class distribution: they are calculated from the entire ROC curve or from separate parts of the confusion matrix. This makes them suitable for situations where the numbers of positive and negative samples differ greatly. In the formulas for AUC and F1-score, Precision denotes the proportion of predicted positives that are truly positive, \(TP/(TP+FP)\). \(TP_r\) is defined as the true positive rate and \(TN_r\) as the true negative rate.
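A small worked example may help to show why accuracy misleads on imbalanced data while confusion-matrix metrics such as G-mean do not. The sketch below uses the standard definitions of precision, true positive rate, true negative rate, F1-score and G-mean; the function name `binary_metrics` and the example counts are illustrative assumptions.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Common imbalance-aware metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # recall / sensitivity
    tnr = tn / (tn + fp) if (tn + fp) else 0.0   # specificity
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    g_mean = math.sqrt(tpr * tnr)
    return {"accuracy": accuracy, "precision": precision, "TPr": tpr,
            "TNr": tnr, "F1": f1, "G-mean": g_mean}


# 990 negatives and 10 positives; the model predicts "negative" for everything.
trivial = binary_metrics(tp=0, fp=0, fn=10, tn=990)
# A model that finds 8 of the 10 positives at the cost of 20 false alarms.
useful = binary_metrics(tp=8, fp=20, fn=2, tn=970)
```

The trivial model reaches an accuracy of 0.99 but a G-mean of 0, whereas the useful model has a slightly lower accuracy (0.978) and a G-mean of roughly 0.89.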

In multi-class imbalanced learning, MAUC combines the pairwise AUC values of multiple classes into a single global metric, which enables a more comprehensive evaluation of the model’s performance across classes. Similarly, for multi-class imbalance problems the G-mean is usually used in its extended multi-class form, mainly because it provides a unified metric without the need to consider each class individually. In Table 6, m denotes the total number of classes, M denotes an \(N \times m\) matrix, and A(i, j) is the AUC between classes i and j computed from column i of M.
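The MAUC computation can be sketched as follows. Since Table 6 is not reproduced here, the sketch assumes the commonly used pairwise formulation, in which A(i, j) and A(j, i) are averaged for every pair of classes and the result is averaged over all \(m(m-1)/2\) pairs. The function names are hypothetical, M is assumed to hold one score per class for each of the N samples, and every class is assumed to occur at least once in y.

```python
import numpy as np


def pairwise_auc(scores_pos, scores_neg):
    """Probability that a positive outscores a negative (ties count 1/2)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def mauc(M, y):
    """Average pairwise AUC over all class pairs.

    M is an N x m matrix of class scores; y holds integer labels 0..m-1.
    A(i, j) uses column i of M to separate class i from class j.
    """
    m = M.shape[1]
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            a_ij = pairwise_auc(M[y == i, i], M[y == j, i])
            a_ji = pairwise_auc(M[y == j, j], M[y == i, j])
            total += 0.5 * (a_ij + a_ji)
    return 2.0 * total / (m * (m - 1))


y = np.array([0, 0, 1, 2])
M_perfect = np.eye(3)[y]            # one-hot scores: every pair is separated
M_flat = np.full((4, 3), 1.0 / 3)   # uninformative scores: all ties

mauc_perfect = mauc(M_perfect, y)
mauc_flat = mauc(M_flat, y)
```

Because every class pair contributes equally regardless of class size, a minority class affects the score as much as a majority class.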

8 Application scenarios of imbalanced learning

Numerous real-world applications of imbalanced learning have led to the development of methods for learning from imbalanced data. This section concentrates on recent research pertaining to the practical applications of imbalanced learning. Figure 7 illustrates the categorization of the primary application domains into seven major areas, each encompassing specific real-life application challenges.

Fig. 7 Real-world application classification

8.1 Biomedical area

In the biomedical field, where imbalanced data distributions are typical and accurate identification of minority class instances is essential for accurate decision-making and patient care, imbalanced learning techniques have been successfully used for tasks like rare disease detection, cancer diagnosis, DNA identification, and others. Figure 8 illustrates the architecture of the application of imbalanced learning in the biological area.

Fig. 8 Architecture of possible applications in the biological area

8.1.1 Protein subcellular assays

Subcellular localization of human proteins is critical for understanding their function, for diagnostic and prognostic studies of pathological conditions, and for clinical decision-making. However, because of data imbalance, multi-label classifiers suffer from severe bias and reduced predictive power when dealing with proteins that are present in multiple locations simultaneously (Rana et al. 2023). Wang and Wei (2022) used imbalanced multi-label learning on immunohistochemical images to predict protein subcellular localization. Rana et al. (Ahsan et al. 2022) use a multi-label oversampling approach to cope with this type of problem. Recent solutions for protein classification have also been proposed by Yin et al. (2022) and Hung et al. (2022).

8.1.2 Genetic testing

Genetic testing now makes use of imbalanced learning techniques to handle the class imbalance caused by the unequal distribution of genetic variations or profiles, thereby improving the accuracy and reliability of genetic analysis and identification procedures. Wang et al. (2015) classified imbalanced miRNA data using an ensemble learning approach. Li et al. (2023) and Liu et al. (2016) combined ensemble learning methods with other models to identify DNA-binding proteins.

8.1.3 Disease diagnosis

Early diagnosis of cancer is crucial for patients, yet misclassification caused by imbalance in data quantity and quality between majority and minority classes is a major challenge in medical data analysis. Although classifiers tend to favour majority class samples and overall correct classification, cancer diagnosis relies on minority class samples. Researchers are therefore studying the imbalanced data problem from a medical perspective in order to explore new approaches to cancer diagnosis (Fotouhi et al. 2019). Fotouhi et al. (2019) and Behrad and Abadeh (2022) summarize machine learning and deep learning approaches to imbalance in disease diagnosis and outline future challenges. Various studies (Xiao et al. 2021; Saini and Susan 2022; Singh et al. 2020; Saini and Susan 2020) have proposed novel approaches for imbalanced data in disease diagnosis, including an optimized SqueezeNet with bald eagle search optimization, generative adversarial networks, deep transfer networks, and transfer learning with minority data augmentation, all showing promising results in improving classification accuracy on imbalanced breast cancer and melanoma datasets.

8.2 Information Security

Imbalanced learning techniques are finding valuable applications in the field of information security. With the increasing complexity and sophistication of cyber threats, traditional classification models often struggle to handle imbalanced datasets effectively in tasks such as network intrusion detection, malicious detection and fraud detection. Imbalanced learning techniques can identify rare and critical security events, enhance the accuracy of anomaly detection, and help detect previously unseen threats. Figure 9 illustrates the architecture of the application of imbalanced learning in the information security area.

Fig. 9 Architecture of the application in the information security area

8.2.1 Network intrusion detection

Imbalanced learning has shown significant potential in the field of Network Intrusion Detection (NID). NID aims to identify and prevent unauthorized access or malicious activities within a computer network. However, traditional detection methods often struggle with imbalanced datasets, where the majority of network traffic is normal while a small portion represents intrusions. Imbalanced learning techniques address this class imbalance and improve detection performance on minority class instances. Several recent surveys have explored the landscape of intrusion detection techniques, providing insights into application domains, attack detection techniques, evaluation metrics and datasets (Yang et al. 2022; Di Mauro et al. 2021). These surveys also discuss the concept of intrusion detection systems (IDS), present a classification of machine learning (ML) and deep learning (DL) techniques used in network-based IDS, and highlight the advantages and limitations of ML- and DL-based IDS. Cui et al. (2023) proposed a new multi-module integrated intrusion detection system. Other recent work has proposed dynamic ensemble-based methods, resampling-based methods, and combinations of LSTM with other techniques to deal with class imbalance in network intrusion detection (Ren et al. 2023; Bagui and Li 2021; Gupta et al. 2021).

8.2.2 Malicious detection

The goal of malicious detection is to recognize and counteract different forms of criminal activity, such as malware, phishing, and cyberattacks, and imbalanced learning is a critical component of this process. By leveraging imbalanced learning, malicious detection systems can achieve better performance in identifying and mitigating malicious activities, enhancing the security of computer systems, networks, and online platforms. Identifying malicious domains is of great importance, as they are one of the main resources needed to conduct cyber attacks. Zhauniarovich et al. (2018) and Sharma and Rattan (2021) systematically studied and synthesized existing methods, which differ in data sources, analysis methods and evaluation strategies. Phung and Mimura (2021) proposed a new approach to detect malicious JavaScript using machine learning techniques and oversampling methods. Chapaneri and Shah (2022) proposed regularized Wasserstein generative adversarial networks (WGAN) to solve the imbalanced malicious detection problem. Cui et al. (2021) proposed a multi-objective Restricted Boltzmann Machine (RBM) model combined with the Non-Dominated Sorting Genetic Algorithm (NSGA-II) to detect malicious code attacks efficiently.

8.2.3 Fraud detection

In the field of fraud detection, where it is important to spot fraudulent activity in a variety of contexts, including financial transactions, insurance claims, and internet transactions, imbalanced learning plays a key role. Imbalanced learning techniques address the class imbalance problem and improve detection performance on the small number of fraudulent instances. Pourhabibi et al. (2020) survey current trends and identify key challenges that require significant research effort to improve the trustworthiness of these techniques. Statistical and machine learning techniques are widely applied in fraud detection: as e-commerce systems grow and financial transactions move online, fraud is increasing, and although prevention techniques are effective, fraud detection methods remain critical. Li et al. (2021) proposed a hybrid approach to imbalanced fraud detection with dynamic weighted entropy. Zhu et al. (2023) proposed a clustering-based noise-sample removal undersampling scheme (NUS) for imbalanced credit card fraud detection.

8.3 Computer Vision

The discipline of computer vision has made substantial use of imbalanced learning, notably for tasks like object detection and image classification. By leveraging imbalanced learning in computer vision, researchers and practitioners can improve the robustness and reliability of vision-based systems, enabling applications in areas such as surveillance, medical imaging, autonomous vehicles, and object recognition in diverse real-world scenarios. Figure 10 illustrates the architecture of the application of imbalanced learning in the computer vision area.

Fig. 10 Architecture of the application in the computer vision area

8.3.1 Image classification

In image classification tasks, it is common for images to be distributed unevenly across classes, with some classes having significantly fewer samples than others. This imbalance can harm the performance of conventional classification models, leading to biased predictions and lower accuracy on minority classes. Over the past few years, a growing number of researchers have developed new imbalanced learning algorithms for image classification. For example, Wang et al. (2021) introduced Deep Attention-based Imbalanced Image Classification (DAIIC), which automatically allocates more attention to minority classes in a data-driven manner. Huang et al. (2019) mitigated the face classification problem for class-imbalanced data by combining classical strategies (such as class resampling or cost-sensitive training) with deep networks that are forced to learn more discriminative representations. Jin et al. (2022) proposed the Balanced Active Learning (BAL) method, which alleviates class imbalance by compensating the minority classes in the labeled queries and achieves state-of-the-art active learning performance on an imbalanced image classification dataset.

8.3.2 Object detection

Object detection involves the identification and localization of objects within images or videos. However, in many real-world scenarios, the distribution of objects across classes is highly imbalanced, with some classes having significantly fewer instances than others. This class imbalance poses a challenge for object detection models, which may struggle to detect and classify objects from the minority classes accurately. Oksuz et al. (2020) provide a systematic review of imbalance problems in object detection, introduce a problem-based taxonomy, discuss each problem in detail, and offer a critical view of existing solutions. Huang and Liu (2022) proposed a dense detector for small object detection, which addresses the scale imbalance of samples and features through Libra ellipse sampling and residual low-level feature enhancement.

8.4 Business management

In business management, imbalanced learning is widely used for user churn prediction and consumer behavior analysis. Churn prediction refers to predicting which users are likely to stop using an organization’s products or services, which is important because the organization can then take steps to retain these users (Haixiang et al. 2017). Consumer behavior analytics aims to understand and predict consumers’ purchasing decisions and preferences to help companies develop effective marketing strategies. Since the ratio of purchasers to non-purchasers is usually imbalanced, imbalanced learning can improve the ability to identify and predict purchasers, thus optimizing marketing campaigns and improving sales. For example, customer data can be used to identify prospects who are more likely to buy caravan insurance (Almas et al. 2012). Other studies (Wu and Meng 2016; De Caigny et al. 2018) use various consumer data to analyze customers’ behavioral characteristics and habits.

8.5 Text classification

Imbalanced learning is widely used in spam filtering and sentiment analysis. In spam filtering, traditional classification algorithms may be ineffective in recognizing spam due to the relatively small number of spam emails and the large number of normal emails. In sentiment analysis, for which the distribution of samples representing different sentiment polarities (such as positive, negative and neutral) in textual data is usually uneven, imbalanced learning can be used to address this problem and improve the ability to accurately predict and recognize the sentiments of a few categories. Figure 11 illustrates the architecture of the application of imbalanced learning in the text classification area.

Fig. 11 Architecture of the application in the text classification area

8.5.1 Spam filtering

Spam filtering is essentially an imbalanced classification task designed to identify and block emails that are useless or fraudulent, in order to protect users from the nuisance and threat of spam. Jáñez-Martino et al. (2023) highlighted the challenges in developing robust spam email filters, including the dynamic nature of the environment and the presence of spammers as adversarial figures, and provided an analysis of spammer strategies and state-of-the-art machine learning techniques. Barushka and Hajek (2018) proposed a regularised deep multilayer perceptron NN model (DBB-RDNN-ReL) based on feature selection and rectified linear units for spam filtering. Rao et al. (2023) suggested a hybrid framework for identifying social media spam that incorporates dataset balancing, sophisticated word embedding techniques, machine learning and deep learning methodologies, and the self-attention mechanism.

8.5.2 Emotional analysis

Sentiment analysis aims to identify and understand the emotional tendencies expressed in a text and to determine whether they are positive, negative or neutral. Deng and Ren (2023) provide a comprehensive overview of recent advances in deep neural network text emotion recognition (TER), covering word embedding, architecture and training. They highlight remaining challenges and opportunities in the areas of dataset availability, sentiment boundaries, extractable sentiment information, and TER in conversations. Ding et al. (2021) proposed a feature extraction-based EEG emotion recognition method that uses the dispersion entropy of different frequency bands of EEG signals and balances the data with a random oversampling algorithm, which resulted in improved characterisation and faster recognition compared to other methods. Lin et al. (2022a, 2022b) proposed novel multi-label sentiment classification methods that address the class imbalance problem and validated them experimentally.

8.6 Energy management

In energy management, and especially in electricity theft detection, imbalanced learning can help to address the fact that theft events are rare in energy systems. Researchers study and analyze energy consumption data and related characteristics to identify and monitor electricity theft by a small number of users. For example, Yan and Wen (2021) proposed a power theft detector that applies Extreme Gradient Boosting (XGBoost) to metering data. Cai et al. (2023) proposed an ensemble learning model based on random forest and weighted support vector description to analyze electricity theft detection in a complex grid environment. Pereira and Saraiva (2021) explored the application of Convolutional Neural Networks (CNNs) combined with various techniques for balancing imbalanced datasets to detect power theft activity.

8.7 Industrial Inspection

The application of imbalanced learning in industrial inspection is mainly focused on anomaly detection and fault diagnosis (Yang et al. 2023, 2022; Chen et al. 2021). Class imbalance arises because normal samples usually dominate in industrial environments, while abnormal or faulty samples are relatively few. Imbalanced learning enables the model to better learn and recognize anomalous or faulty samples during training. Ren et al. (2023) and Zhang et al. (2022) offered thorough reviews of research on fault detection under data imbalance, including data processing methods, model construction methods, and training optimization approaches. Kuang et al. (2021) proposed a class-imbalanced adversarial transfer learning (CIATL) network to address cross-domain fault diagnosis when class-balanced data are scarce. Liu et al. (2021) proposed a novel imbalanced data classification method based on weakly supervised learning, which uses the Bagging algorithm to generate balanced subsets and employs Support Vector Machine (SVM) classification to improve fault diagnosis performance. Recent work also includes meta-autonomous learning methods based on multi-view sampling, deep reinforcement learning algorithms that optimize the sample distribution, and dynamically balanced domain adversarial networks for anomaly detection (Lyu et al. 2022; Fan et al. 2021; Ren et al. 2022).

9 Future research directions

Based on the review of relevant algorithms and the analysis of their strengths, weaknesses, runtimes, and model complexity in the previous sections, this section explores the current challenges and future perspectives of imbalanced learning. These research directions offer a deeper understanding of the complex issues in imbalanced learning, help to address the challenges that have emerged in recent years, and promote more comprehensive and in-depth progress in imbalanced learning research.

Increased adaptability in transfer learning scenarios

Transfer learning methods focus on the discriminability of feature representations, but most deal with domain alignment and class discriminability independently. Striking a balance between alignment and discriminability is critical, as overemphasising one can lead to the loss of the other. If the imbalance in sample size between domains is ignored, training can suffer from bias, poor alignment and poor discrimination, and ultimately from negative transfer (Li et al. 2023). As discussed earlier, changes in the data distribution over time can also affect the performance of transfer learning.

Current approaches (Singh et al. 2020; Liu et al. 2023) predominantly address adaptation to concept drift and static class imbalance. However, it is worth exploring the feasibility of predicting changes in advance. Can we anticipate the evolution of class imbalance over time and develop proactive methods to respond effectively? Such proactive measures would substantially reduce the recovery time following shifts in imbalanced data streams and yield more resilient classifiers.

Active learning for imbalanced data

Active learning faces unique challenges when dealing with imbalanced data. Consider the following formulation of the prediction problem: \(p(y_{n},{\tilde{y}}_{n}|x)=\frac{p(y_{n}|{\tilde{y}}_{n},x)\,p(x|{\tilde{y}}_{n})\,p({\tilde{y}}_{n})}{\sum _{k=1}^{N_c}p(x|{\tilde{y}}_{k})\,p({\tilde{y}}_{k})}\), where \(y_n\) denotes the event that the true label of sample x is the \(n^{th}\) class, \({\tilde{y}}_n\) the event that the model predicts the \(n^{th}\) class, and \(N_c\) the number of classes. Here \(p({\tilde{y}}_{n})\) is the probability that the model predicts a sample to be in the \(n^{th}\) class and \(p(x|{\tilde{y}}_{n})\) is the corresponding generative model; both can be easily calculated. However, \(p(y_n|{\tilde{y}}_n,x)\), the probability that sample x truly belongs to the \(n^{th}\) class when the model predicts the \(n^{th}\) class, is impractical to calculate, because the true label remains unknown in such cases. Active learning algorithms generally require an initial model trained on a randomly chosen sample, which serves as the basis for assessing uncertainty or feature space coverage. However, when minority classes are present, a random selection often includes very few or even no minority class samples. A model trained on such a subset produces inaccurate yet highly confident predictions for these infrequent samples, so uncertainty-based methods tend to avoid selecting them (Liu et al. 2021; Aguiar and Cano 2023). Regarding feature space coverage, the limited number of minority class samples, their ineffective separation from the majority classes after training, and their mingling with majority class samples make coverage-based methods more inclined to choose majority class samples to cover that region.
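The claim that a random initial sample may contain no minority examples can be quantified with a short calculation. The sketch below assumes that the seed set is drawn independently from a large pool in which a fraction `minority_ratio` of the samples belongs to the minority class; the function name and the numbers are illustrative.

```python
def prob_no_minority(minority_ratio, n_seed):
    """Chance that n_seed independent draws contain no minority-class sample."""
    return (1.0 - minority_ratio) ** n_seed


# With 1% minority samples, how often does the seed set miss the class entirely?
miss_rates = {n_seed: prob_no_minority(0.01, n_seed) for n_seed in (50, 100, 500)}
```

With a 1% minority class, a seed set of 50 samples misses the class in roughly 61% of draws and a seed set of 100 in roughly 37%; only at around 500 samples does the risk fall below 1%.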

Moreover, the scarcity of minority class samples can introduce significant noise during the labeling process, thereby impacting the classifier’s performance. Additionally, class imbalance may introduce bias into the active learning process, causing the classifier to prioritize majority class samples while overlooking the significance of minority class samples (Aguiar and Cano 2023). To address these challenges, active learning strategies and algorithms specifically tailored to imbalanced data are required. Such approaches aim to enhance classifier performance and make the most of the valuable information in the minority class.

Addressing class imbalance in federated learning

Federated learning (FL) uses heterogeneous edge devices and a central server for collaborative learning: each device trains a local model on the data it collects, keeping the data local rather than transmitting it. The edge devices transmit the trained models to a central server, which then performs global model aggregation. Although FL performs well in handling cross-device data migration, its performance on imbalanced data is typically poorer than that of standard centrally trained models (Duan et al. 2019). Data imbalance can occur locally on a device or across multiple devices, and this complexity challenges federated learning techniques, which must address the imbalance while maintaining data privacy. In addition, the imbalanced data stream problem discussed in Section 6 also affects FL performance. Duan et al. (2019) have shown theoretically that data imbalance can seriously degrade the performance of federated learning.
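The aggregation step and the two kinds of imbalance mentioned above can be illustrated with a minimal sketch. It assumes the widely used scheme in which the server averages client parameters weighted by local sample counts; the source does not specify the aggregation rule, and the function names, client models and label counts below are hypothetical.

```python
import numpy as np


def aggregate(client_weights, client_sizes):
    """Sample-size-weighted average of client model parameters."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)             # (n_clients, n_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def class_distributions(client_label_counts):
    """Per-client and global class proportions from label counts."""
    counts = np.asarray(client_label_counts, dtype=float)
    local = counts / counts.sum(axis=1, keepdims=True)
    overall = counts.sum(axis=0) / counts.sum()
    return local, overall


# Three hypothetical clients with two model parameters each.
weights = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [100, 100, 200]
global_weights = aggregate(weights, sizes)

# Rows: clients; columns: number of class 0 and class 1 samples held locally.
label_counts = [[95, 5], [10, 90], [198, 2]]
local, overall = class_distributions(label_counts)
```

In this toy setting class 1 is a minority globally (about 24% of all samples), yet it is the majority on the second client. The server only sees model parameters, so it cannot observe either distribution directly, which is why imbalance is harder to correct in FL than in centralised training.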

The main reasons why FL is challenging on imbalanced data are as follows: (1) As Wang et al. (2021) pointed out, FL, owing to its decentralised nature, may suffer from class imbalance at different levels: locally on one or more client devices and globally across multiple clients, which constitutes a system-level imbalance problem. (2) A mismatch in class distribution between clients and servers may also degrade FL performance. Differences in class distribution among the clients participating in FL can degrade the overall performance of the global model and increase convergence latency. Addressing these issues requires effective management and tuning of class imbalance at the different levels, and of distribution mismatches between clients and servers, to improve the robustness and performance of the FL system.

Addressing dynamic class changes in data streams

Aguiar et al. (2023) have demonstrated experimentally that many current learning methods designed for static imbalanced data perform poorly on imbalanced data streams, especially when confronted with multiple classes. While numerous researchers have examined the impact of the number of classes in imbalance problems, the issue of dynamic changes in the number of classes remains largely unexplored. In real-world scenarios, classes may emerge, vanish, and reoccur over time. The dynamic nature of class changes, coupled with class imbalance, presents a formidable challenge that calls for flexible models. Such models should be able to detect new classes and incorporate them seamlessly into the model structure, and to forget obsolete classes while retaining knowledge of recurring classes.

The main problem is the natural shift in the data stream, which changes as the relevance of the data generating the minority class changes, leading to performance degradation. Programmes that detect and measure dataset shift may be developed in the future, but the real challenge is how to tune the learner so that it focuses more on the minority class. Addressing this issue requires a deeper understanding of how data streams change, together with methods that can be tuned effectively to keep the focus on the minority class, so that performance does not suffer under data stream shift.
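
To illustrate what it means for classes to emerge, vanish, and recur in a stream, the sketch below tracks which labels are present in a sliding window of the most recent observations. It is a simplified illustration only: the window-based definition of "vanished", the function name `track_classes`, and the toy stream are assumptions, and the sketch presumes that labels are available for every instance, which is often not the case in practice.

```python
from collections import Counter, deque


def track_classes(stream, window=3):
    """Return (time, event, label) tuples for class changes in a label stream.

    A class "emerges" the first time it is seen, "vanishes" when it no longer
    occurs in the last `window` labels, and "recurs" when it reappears later.
    """
    recent = deque(maxlen=window)
    counts = Counter()
    seen = set()
    events = []
    for t, label in enumerate(stream):
        was_present = counts[label] > 0
        evicted = recent[0] if len(recent) == window else None
        recent.append(label)  # the deque drops its oldest item automatically
        counts[label] += 1
        if evicted is not None:
            counts[evicted] -= 1
            if counts[evicted] == 0:
                events.append((t, "vanished", evicted))
        if not was_present:
            kind = "recurred" if label in seen else "emerged"
            events.append((t, kind, label))
        seen.add(label)
    return events


# Hypothetical stream: "b" is a minority class that disappears and later returns.
stream = ["a", "a", "b", "a", "a", "a", "a", "c", "b"]
events = track_classes(stream, window=3)
```

A model with the flexibility described above would react to such events, for example by adding an output for an emerged class or by restoring stored knowledge when a class recurs.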

Addressing the multi-modal imbalanced learning problem

Multimodal learning achieves comprehensive perception and understanding by processing different types of data. Most currently available multimodal learning methods give the same importance to the features of each modality, and during optimisation multimodal models tend to update their parameters using the modalities with smaller loss values. This biases the model towards one dominant modality during training (Behrad and Abadeh 2022). The main cause is that the better-performing modality suppresses the contribution of the weaker modalities to the update, thus creating an imbalance problem.

To date, few studies have simultaneously addressed the challenges associated with class-imbalanced and multimodal data (Sleeman et al. 2022). When confronted with multimodal data, models must prioritize relevant features extracted from each data domain (Kim and Sohn 2020). This emphasis on capturing pertinent information from multiple modalities can introduce sensitivities, particularly when working with a limited number of classes.

Addressing the multi-label cross-domain imbalanced learning problem

In multi-label learning (MLD), the imbalance problem covers three levels: intra-label, inter-label and label set. Intra-label imbalance refers to an individual label having few positive samples, inter-label imbalance refers to differences in the number of positive samples between labels, and label-set imbalance arises from label sparsity, which causes certain label sets to occur much more frequently than others. Research on imbalanced multi-label classification has been relatively limited, with most existing work on imbalance focusing on single-label classification (Rana et al. 2023; Tarekegn et al. 2021). Despite the increasing demand for multi-label classification in different domains, a comprehensive framework for dealing effectively with imbalance in multi-label classification has not yet been fully investigated.
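
The three levels of multi-label imbalance can be quantified with simple counts. The sketch below is one possible way to do so under stated assumptions: the function name `multilabel_imbalance`, the specific ratios chosen for each level, and the toy data are illustrative and are not taken from the cited works.

```python
from collections import Counter


def multilabel_imbalance(samples, labels):
    """Summarise intra-label, inter-label and label-set imbalance.

    `samples` is a list of label sets, one per instance.
    """
    n = len(samples)
    positives = {lab: sum(lab in s for s in samples) for lab in labels}

    # Intra-label: share of positive samples within each label.
    intra = {lab: positives[lab] / n for lab in labels}

    # Inter-label: positives of the most frequent label over each label's positives.
    top = max(positives.values())
    inter = {lab: top / positives[lab] for lab in labels if positives[lab] > 0}

    # Label-set: how often each exact combination of labels occurs.
    label_sets = Counter(frozenset(s) for s in samples)
    return intra, inter, label_sets


# Hypothetical toy dataset with a dominant label "x" and rare labels "y" and "z".
samples = [{"x"}, {"x"}, {"x", "y"}, {"x"}, {"z"}, {"x", "y"}, {"x"}, {"x"}]
intra, inter, label_sets = multilabel_imbalance(samples, ["x", "y", "z"])
```

Here label "z" is positive in only one of eight samples (intra-label), label "x" has seven times as many positives as "z" (inter-label), and the label set containing only "x" accounts for five of the eight samples (label set).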

Moreover, in the context of multi-label multimodal datasets, the presence of class correlations and modal variances poses a significant challenge (Tarekegn et al. 2021). Such datasets exhibit inter-label correlations and intra-modality differences, adding complexity to model learning and prediction tasks. It is therefore crucial to capture label correlations and modal differences accurately in order to improve categorization and prediction accuracy.

10 Conclusion

In this survey, we present a comprehensive analysis of the challenges posed by class imbalance. We examine its characteristics and associated issues, and propose a novel classification approach to comprehensively review and analyze the existing methods for addressing imbalanced learning. Notably, unlike previous surveys in the field, which primarily focus on data mining and machine learning perspectives, we also delve into the advancements of imbalanced learning in the context of deep learning, specifically long-tail learning. Furthermore, we explore the current state of research on the practical application of imbalanced learning in seven distinct domains, and identify new research directions and areas of innovation for methods and tasks. Our aim is to provide researchers and the community with a comprehensive understanding of imbalanced learning, thereby facilitating future research endeavors in this domain.

References

Abedin MZ, Guotai C, Hajek P, Zhang T (2022) Combining weighted smote with ensemble learning for the class-imbalanced prediction of small business credit risk. Complex Intell Syst, 1–21

Agrahari S, Singh AK (2022) Concept drift detection in data stream mining: a literature review. Journal of King Saud University-Computer and Information Sciences 34(10):9523–9540

Aguiar G, Cano A (2023) An active learning budget-based oversampling approach for partially labeled multi-class imbalanced data streams. In: Proceedings of the 38th ACM/SIGAPP symposium on applied computing, pp 382–389

Aguiar G, Krawczyk B, Cano A (2023) A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework. Mach Learn, 1–79

Ahsan R, Ebrahimi F, Ebrahimi M (2022) Classification of imbalanced protein sequences with deep-learning approaches; application on influenza a imbalanced virus classes. Inform Med Unlocked 29:100860

Akbani R, Kwek S, Japkowicz N (2004) Applying support vector machines to imbalanced datasets. In: Machine learning: ECML 2004: 15th European conference on machine learning, Pisa, Italy, September 20-24, 2004. Proceedings 15. Springer, pp 39–50

Akila S, Reddy US (2018) Cost-sensitive risk induced bayesian inference bagging (ribib) for credit card fraud detection. J Comput Sci 27:247–254

Alfhaid MA, Abdullah M (2021) Classification of imbalanced data stream: techniques and challenges. Artif Intell 9(2):36–52

Almas A, Farquad M, Avala NR, Sultana J (2012) Enhancing the performance of decision tree: a research study of dealing with unbalanced data. In: Seventh international conference on digital information management (ICDIM 2012). IEEE, pp 7–10

Bader-El-Den M, Teitei E, Perry T (2018) Biased random forest for dealing with the class imbalance problem. IEEE Trans Neural Netw Learn Syst 30(7):2163–2172

Bagui S, Li K (2021) Resampling imbalanced data for network intrusion detection datasets. J Big Data 8(1):1–41

Barushka A, Hajek P (2018) Spam filtering using integrated distribution-based balancing approach and regularized deep neural networks. Appl Intell 48:3538–3556

Behrad F, Abadeh MS (2022) An overview of deep learning methods for multimodal medical data mining. Expert Syst Appl 200:117006

Błaszczyński J, Stefanowski J (2015) Neighbourhood sampling in bagging for imbalanced data. Neurocomputing 150:529–542

Branco P, Torgo L, Ribeiro RP (2019) Preprocessing approaches for imbalanced distributions in regression. Neurocomputing 343:76–99

Branco P, Torgo L, Ribeiro RP (2017) Smogn: a pre-processing approach for imbalanced regression. In: First international workshop on learning with imbalanced domains: theory and applications. PMLR, pp 36–50

Branco P, Torgo L, Ribeiro RP (2018) Rebagg: resampled bagging for imbalanced regression. In: Second international workshop on learning with imbalanced domains: theory and applications. PMLR, pp 67–81

Breiman L (1996) Bagging predictors. Mach Learn 24:123–140

Bunkhumpornpat C, Sinapiromsaran K, Lursinsap C (2009) Safe-level-smote: safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem. In: Advances in knowledge discovery and data mining: 13th Pacific-Asia Conference, PAKDD 2009 Bangkok, Thailand, April 27-30, 2009 Proceedings 13. Springer, pp 475–482

Cai L, Wang H, Jiang F, Zhang Y, Peng Y (2022) A new clustering mining algorithm for multi-source imbalanced location data. Inf Sci 584:50–64

Cai Q, Li P, Wang R (2023) Electricity theft detection based on hybrid random forest and weighted support vector data description. Int J Electr Power Energy Syst 153:109283

Cano A, Krawczyk B (2022) Rose: robust online self-adjusting ensemble for continual learning on imbalanced drifting data streams. Mach Learn 111(7):2561–2599

Cao B, Liu Y, Hou C, Fan J, Zheng B, Yin J (2020) Expediting the accuracy-improving process of svms for class imbalance learning. IEEE Trans Knowl Data Eng 33(11):3550–3567

Cao B, Liu Y, Hou C, Fan J, Zheng B, Yin J (2021) Expediting the accuracy-improving process of svms for class imbalance learning. IEEE Trans Knowl Data Eng 33(11):3550–3567

Cao K, Wei C, Gaidon A, Arechiga N, Ma T (2019) Learning imbalanced datasets with label-distribution-aware margin loss. In: Proceedings of the 33rd international conference on neural information processing systems, pp 1567–1578

Castro CL, Braga AP (2013) Novel cost-sensitive approach to improve the multilayer perceptron performance on imbalanced data. IEEE Trans Neural Netw Learn Syst 24(6):888–899

Chapaneri R, Shah S (2022) Enhanced detection of imbalanced malicious network traffic with regularized generative adversarial networks. J Netw Comput Appl 202:103368

Chawla NV, Lazarevic A, Hall LO, Bowyer KW (2003) Smoteboost: improving prediction of the minority class in boosting. In: Knowledge Discovery in Databases: PKDD 2003: 7th European conference on principles and practice of knowledge discovery in databases, Cavtat-Dubrovnik, Croatia, September 22-26, 2003. Proceedings 7. Springer, pp 107–119

Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357

Chen J, Xiu Z, Goldstein B, Henao R, Carin L, Tao C (2021) Supercharging imbalanced data learning with energy-based contrastive representation transfer. Adv Neural Inf Process Syst 34:21229–21243

Chen W, Yang K, Yu Z, Zhang W (2022a) Double-kernel based class-specific broad learning system for multiclass imbalance learning. Knowl-Based Syst 253:109535

Chen W, Yang K, Zhang W, Shi Y, Yu Z (2022b) Double-kernelized weighted broad learning system for imbalanced data. Neural Comput Appl 34(22):19923–19936

Chen W, Yang K, Shi Y, Feng Q, Zhang C, Yu Z (2021) Kernel-based class-specific broad learning system for software defect prediction. In: 2021 8th International conference on information, cybernetics, and computational social systems (ICCSS). IEEE, pp 109–114

Chen X, Zhou Y, Wu D, Zhang W, Zhou Y, Li B, Wang W (2022) Imagine by reasoning: a reasoning-based implicit semantic data augmentation for long-tailed classification. In: Proceedings of the AAAI conference on artificial intelligence, vol 36, pp 356–364

Choudhary R, Shukla S (2021) A clustering based ensemble of weighted kernelized extreme learning machine for class imbalance learning. Expert Syst Appl 164:114041

Cui Z, Zhao Y, Cao Y, Cai X, Zhang W, Chen J (2021) Malicious code detection under 5g hetnets based on a multi-objective rbm model. IEEE Network 35(2):82–87

Cui J, Zong L, Xie J, Tang M (2023) A novel multi-module integrated intrusion detection system for high-dimensional imbalanced data. Appl Intell 53(1):272–288

Cui Y, Jia M, Lin T-Y, Song Y, Belongie S (2019) Class-balanced loss based on effective number of samples. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9268–9277

Cui J, Zhong Z, Liu S, Yu B, Jia J (2021) Parametric contrastive learning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 715–724

Dablain DA, Bellinger C, Krawczyk B, Chawla NV (2023) Efficient augmentation for imbalanced deep learning. In: 2023 IEEE 39th international conference on data engineering (ICDE). IEEE, pp 1433–1446

Dablain D, Krawczyk B, Chawla NV (2022) Deepsmote: fusing deep learning and smote for imbalanced data. IEEE Trans Neural Netw Learn Syst

Datta S, Das S (2015) Near-bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs. Neural Netw 70:39–52

Datta S, Das S (2019) Multiobjective support vector machines: handling class imbalance with pareto optimality. IEEE Trans Neural Netw Learn Syst 30(5):1602–1608

Datta S, Ghosh A, Sanyal K, Das S (2017) A radial boundary intersection aided interior point method for multi-objective optimization. Inf Sci 377:1–16

Datta S, Nag S, Das S (2020) Boosting with lexicographic programming: addressing class imbalance without cost tuning. IEEE Trans Knowl Data Eng 32(5):883–897

De Caigny A, Coussement K, De Bock KW (2018) A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees. Eur J Oper Res 269(2):760–772

Deng J, Ren F (2023) A survey of textual emotion recognition and its challenges. IEEE Trans Affect Comput 14(1):49–67

Desai A, Wu T-Y, Tripathi S, Vasconcelos N (2021) Learning of visual relations: the devil is in the tails. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 15404–15413

Di Mauro M, Galatro G, Fortino G, Liotta A (2021) Supervised feature selection techniques in network intrusion detection: a critical review. Eng Appl Artif Intell 101:104216

Ding X-W, Liu Z-T, Li D-Y, He Y, Wu M (2021) Electroencephalogram emotion recognition based on dispersion entropy feature extraction using random oversampling imbalanced data processing. IEEE Trans Cogn Dev Syst 14(3):882–891

Ditzler G, Polikar R (2012) Incremental learning of concept drift from streaming imbalanced data. IEEE Trans Knowl Data Eng 25(10):2283–2301

Dixit A, Mani A (2023) Sampling technique for noisy and borderline examples problem in imbalanced classification. Appl Soft Comput 142:110361

Dong X, Yu Z, Cao W, Shi Y, Ma Q (2020) A survey on ensemble learning. Front Comp Sci 14:241–258

Douzas G, Bacao F (2017) Self-organizing map oversampling (somo) for imbalanced data set learning. Expert Syst Appl 82:40–52

Douzas G, Bacao F, Last F (2018) Improving imbalanced learning through a heuristic oversampling method based on k-means and smote. Inf Sci 465:1–20

Du H, Zhang Y, Gang K, Zhang L, Chen Y-C (2021) Online ensemble learning algorithm for imbalanced data stream. Appl Soft Comput 107:107378

Duan M, Liu D, Chen X, Tan Y, Ren J, Qiao L, Liang L (2019) Astraea: self-balancing federated learning for improving classification accuracy of mobile deep learning applications. In: 2019 IEEE 37th International conference on computer design (ICCD). IEEE, pp 246–254

Fan Q, Wang Z, Li D, Gao D, Zha H (2017) Entropy-based fuzzy support vector machine for imbalanced datasets. Knowl-Based Syst 115:87–99

Fan S, Zhang X, Song Z (2021) Imbalanced sample selection with deep reinforcement learning for fault diagnosis. IEEE Trans Industr Inf 18(4):2518–2527

Fan W, Stolfo SJ, Zhang J, Chan PK (1999) Adacost: misclassification cost-sensitive boosting. In: Icml, vol 99, pp 97–105

Feng B, Gu Y, Yu H, Yang X, Gao S (2022) Dme: an adaptive and just-in-time weighted ensemble learning method for classifying block-based concept drift steam. IEEE Access 10:120578–120591

Fernández A, García S, Galar M, Prati RC, Krawczyk B, Herrera F (2018) Learning from imbalanced data streams. Learning from imbalanced data sets, 279–303

Fotouhi S, Asadi S, Kattan MW (2019) A comprehensive data level analysis for cancer diagnosis on imbalanced data. J Biomed Inform 90:103089

Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139

Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2012) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans Syst, Man, and Cybernetics, Part C (Applications and Reviews) 42(4):463–484

Ghosh K, Bellinger C, Corizzo R, Branco P, Krawczyk B, Japkowicz N (2022) The class imbalance problem in deep learning. Mach Learn, 1–57

Guo Y, Feng J, Jiao B, Cui N, Yang S, Yu Z (2022) A dual evolutionary bagging for class imbalance learning. Expert Syst Appl 206:117843

Gupta N, Jindal V, Bedi P (2021) Lio-ids: handling class imbalance using lstm and improved one-vs-one technique in intrusion detection system. Comput Netw 192:108076

Gutiérrez-Tobal GC, Álvarez D, Vaquerizo-Villar F, Crespo A, Kheirandish-Gozal L, Gozal D, Campo F, Hornero R (2021) Ensemble-learning regression to estimate sleep apnea severity using at-home oximetry in adults. Appl Soft Comput 111:107827

Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H, Bing G (2017) Learning from class-imbalanced data: review of methods and applications. Expert Syst Appl 73:220–239

Halder B, Hasan KA, Amagasa T, Ahmed MM (2023) Autonomic active learning strategy using cluster-based ensemble classifier for concept drifts in imbalanced data stream. Expert Syst Appl 120578

Han M, Guo H, Li J, Wang W (2023) Global-local information based oversampling for multi-class imbalanced data. Int J Mach Learn Cybern 14(6):2071–2086

Han H, Wang W-Y, Mao B-H (2005) Borderline-smote: a new over-sampling method in imbalanced data sets learning. In: Advances in intelligent computing: international conference on intelligent computing, ICIC 2005, Hefei, China, August 23-26, 2005, Proceedings, Part I 1. Springer, pp 878–887

He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

He H, Bai Y, Garcia EA, Li S (2008) Adasyn: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International joint conference on neural networks (IEEE World Congress on Computational Intelligence). IEEE, pp 1322–1328

Hospedales T, Antoniou A, Micaelli P, Storkey A (2021) Meta-learning in neural networks: a survey. IEEE Trans Pattern Anal Mach Intell 44(9):5149–5169

Huang S, Liu Q (2022) Addressing scale imbalance for small object detection with dense detector. Neurocomputing 473:68–78

Huang C, Li Y, Loy CC, Tang X (2019) Deep imbalanced learning for face recognition and attribute prediction. IEEE Trans Pattern Anal Mach Intell 42(11):2781–2794

Hung L-C, Hu Y-H, Tsai C-F, Huang M-W (2022) A dynamic time warping approach for handling class imbalanced medical datasets with missing values: a case study of protein localization site prediction. Expert Syst Appl 192:116437

Jamal MA, Brown M, Yang M-H, Wang L, Gong B (2020) Rethinking class-balanced methods for long-tailed visual recognition from a domain adaptation perspective. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7610–7619

Jáñez-Martino F, Alaiz-Rodríguez R, González-Castro V, Fidalgo E, Alegre E (2023) A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artif Intell Rev 56(2):1145–1173

Jiang J, Liu F, Liu Y, Tang Q, Wang B, Zhong G, Wang W (2022) A dynamic ensemble algorithm for anomaly detection in iot imbalanced data streams. Comput Commun 194:250–257

Jiao B, Guo Y, Gong D, Chen Q (2022) Dynamic ensemble selection for imbalanced data streams with concept drift. IEEE Trans Neural Netw Learn Syst 1–14

Jin Q, Yuan M, Wang H, Wang M, Song Z (2022) Deep active learning models for imbalanced image classification. Knowl-Based Syst 257:109817

Kang Q, Chen X, Li S, Zhou M (2016) A noise-filtered under-sampling scheme for imbalanced classification. IEEE Trans Cybern 47(12):4263–4274

Kang Q, Shi L, Zhou M, Wang X, Wu Q, Wei Z (2017) A distance-based weighted undersampling scheme for support vector machines and its application to imbalanced classification. IEEE Trans Neural Netw Learn Syst 29(9):4152–4165

Kang B, Li Y, Xie S, Yuan Z, Feng J (2020) Exploring balanced feature spaces for representation learning. In: International conference on learning representations

Kang B, Xie S, Rohrbach M, Yan Z, Gordo A, Feng J, Kalantidis Y (2019) Decoupling representation and classifier for long-tailed recognition. arXiv:1910.09217

Karthik S, Revaud J, Chidlovskii B (2021) Learning from long-tailed data with noisy labels. arXiv:2108.11096

Kaur H, Pannu HS, Malhi AK (2019) A systematic review on imbalanced data challenges in machine learning: applications and solutions. ACM Comput Surv (CSUR) 52(4):1–36

Kim KH, Sohn SY (2020) Hybrid neural network with cost-sensitive support vector machine for class-imbalanced multimodal data. Neural Netw 130:176–184

Kim M-J, Kang D-K, Kim HB (2015) Geometric mean based boosting algorithm with over-sampling to resolve data imbalance problem for bankruptcy prediction. Expert Syst Appl 42(3):1074–1082

Kim D, Yu H, Lee H, Beighley E, Durand M, Alsdorf DE, Hwang E (2019) Ensemble learning regression for estimating river discharges using satellite altimetry data: central congo river as a test-bed. Remote Sens Environ 221:741–755

Kim J, Hur Y, Park S, Yang E, Hwang SJ, Shin J (2020) Distribution aligning refinery of pseudo-label for imbalanced semi-supervised learning. Adv Neural Inf Process Syst 33:14567–14579

Kim J, Jeong J, Shin J (2020) M2m: imbalanced classification via major-to-minor translation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13896–13905

Klikowski J, Woźniak M (2022) Deterministic sampling classifier with weighted bagging for drifted imbalanced data stream classification. Appl Soft Comput 122:108855

Korycki L, Krawczyk B (2021) Concept drift detection from multi-class imbalanced data streams. In: 2021 IEEE 37th International conference on data engineering (ICDE). IEEE, pp 1068–1079

Krawczyk B (2016) Learning from imbalanced data: open challenges and future directions. Progress in Artif Intell 5(4):221–232

Krawczyk B, Minku LL, Gama J, Stefanowski J, Woźniak M (2017) Ensemble learning for data stream analysis: a survey. Inf Fusion 37:132–156

Kuang J, Xu G, Tao T, Wu Q (2021) Class-imbalance adversarial transfer learning network for cross-domain fault diagnosis with imbalanced data. IEEE Trans Instrum Meas 71:1–11

Lee HB, Lee H, Na D, Kim S, Park M, Yang E, Hwang SJ (2019) Learning to balance: Bayesian meta-learning for imbalanced and out-of-distribution tasks. arXiv:1905.12917

Lee H, Shin S, Kim H (2021) Abc: auxiliary balanced classifier for class-imbalanced semi-supervised learning. Adv Neural Inf Process Syst 34:7082–7094

Li L, He H, Li J (2019) Entropy-based sampling approaches for multi-class imbalanced problems. IEEE Trans Knowl Data Eng 32(11):2159–2170

Li Z, Huang W, Xiong Y, Ren S, Zhu T (2020) Incremental learning imbalanced data streams with concept drift: the dynamic updated ensemble algorithm. Knowl-Based Syst 195:105694

Li Z, Huang M, Liu G, Jiang C (2021) A hybrid method with dynamic weighted entropy for handling the problem of class imbalance with overlap in credit card fraud detection. Expert Syst Appl 175:114750

Li F, Liu S, Li K, Zhang Y, Duan M, Yao Z, Zhu G, Guo Y, Wang Y, Huang L et al (2023) Epiteamdna: sequence feature representation via transfer learning and ensemble learning for identifying multiple dna epigenetic modification types across species. Comput Biol Med 160:107030

Liang Z, Wang H, Yang K, Shi Y (2022) Adaptive fusion based method for imbalanced data classification. Front Neurorobot 16:827913

Lin W-C, Tsai C-F, Hu Y-H, Jhang J-S (2017) Clustering-based undersampling in class-imbalanced data. Inf Sci 409:17–26

Lin N, Fu S, Lin X, Wang L (2022b) Multi-label emotion classification based on adversarial multi-task learning. Inf Process Manag 59(6):103097

Lin N, Fu Y, Lin X, Yang A, Jiang S (2022) Cl-xabsa: contrastive learning for cross-lingual aspect-based sentiment analysis. arXiv:2204.00791

Liu X-Y, Wu J, Zhou Z-H (2009) Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 39(2):539–550

Liu B, Wang S, Dong Q, Li S, Liu X (2016) Identification of dna-binding proteins by combining auto-cross covariance transformation and ensemble learning. IEEE Trans Nanobiosci 15(4):328–334

Liu Z, Wei P, Jiang J, Cao W, Bian J, Chang Y (2020) Mesa: boost ensemble imbalanced learning with meta-sampler. Adv Neural Inf Process Syst 33:14463–14474

Liu W, Zhang H, Ding Z, Liu Q, Zhu C (2021) A comprehensive active learning method for multiclass imbalanced data streams with concept drift. Knowl-Based Syst 215:106778

Liu H, Liu Z, Jia W, Zhang D, Tan J (2021) A novel imbalanced data classification method based on weakly supervised learning for fault diagnosis. IEEE Trans Industr Inf 18(3):1583–1593

Liu R, Liu Y, Duan J, Hou F, Wang L, Zhang X, Li G (2022) Ensemble learning directed classification and regression of hydrocarbon fuels. Fuel 324:124520

Liu Y, Yang G, Qiao S, Liu M, Qu L, Han N, Wu T, Yuan G, Peng Y (2023) Imbalanced data classification: using transfer learning and active sampling. Eng Appl Artif Intell 117:105621

Liu Z, Cao W, Gao Z, Bian J, Chen H, Chang Y, Liu T-Y (2020) Self-paced ensemble for highly imbalanced massive data classification. In: 2020 IEEE 36th international conference on data engineering (ICDE). IEEE, pp 841–852

Liu B, Li H, Kang H, Hua G, Vasconcelos N (2021) Gistnet: a geometric structure transfer network for long-tailed recognition. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8209–8218

Li Z, Yu Z, Yang K, Shi Y, Xu Y, Chen CP (2021) Local tangent generative adversarial network for imbalanced data classification. In: 2021 International joint conference on neural networks (IJCNN). IEEE, pp 1–8

Longadge R, Dongre S (2013) Class imbalance problem in data mining review. arXiv:1305.1707

Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2018) Learning under concept drift: a review. IEEE Trans Knowl Data Eng 31(12):2346–2363

Lu Y, Cheung Y-M, Tang YY (2019) Self-adaptive multiprototype-based competitive learning approach: a k-means-type algorithm for imbalanced data clustering. IEEE Trans Cybern 51(3):1598–1612

Lu Y, Cheung Y-M, Tang YY (2019) Adaptive chunk-based dynamic weighted majority for imbalanced data streams with concept drift. IEEE Trans Neural Netw Learn Syst 31(8):2764–2778

Lu Y, Cheung Y-m, Tang YY (2017) Dynamic weighted majority for incremental learning of imbalanced data streams with concept drift. In: IJCAI, pp 2393–2399

Lyu P, Zheng P, Yu W, Liu C, Xia M (2022) A novel multiview sampling-based meta self-paced learning approach for class-imbalanced intelligent fault diagnosis. IEEE Trans Instrum Meas 71:1–12

Mani I, Zhang I (2003) knn approach to unbalanced data distributions: a case study involving information extraction. In: Proceedings of workshop on learning from imbalanced datasets, vol 126. ICML, pp 1–7

Mullick SS, Datta S, Das S (2019) Generative adversarial minority oversampling. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 1695–1704

Mullick SS, Datta S, Das S (2018) Adaptive learning-based k-nearest neighbor classifiers with resilience to class imbalance. IEEE Trans Neural Netw Learn Syst 29(11):5713–5725

Nam G, Jang S, Lee J (2023) Decoupled training for long-tailed classification with stochastic representations. arXiv:2304.09426

Ng WW, Zhang J, Lai CS, Pedrycz W, Lai LL, Wang X (2018) Cost-sensitive weighting and imbalance-reversed bagging for streaming imbalanced and concept drifting in electricity pricing classification. IEEE Trans Industr Inf 15(3):1588–1597

Nguwi Y-Y, Cho S-Y (2010) An unsupervised self-organizing learning with support vector ranking for imbalanced datasets. Expert Syst Appl 37(12):8303–8312

Oksuz K, Cam BC, Kalkan S, Akbas E (2020) Imbalance problems in object detection: a review. IEEE Trans Pattern Anal Mach Intell 43(10):3388–3415

Pan T, Zhao J, Wu W, Yang J (2020) Learning imbalanced datasets based on smote and gaussian distribution. Inf Sci 512:1214–1233

Pereira J, Saraiva F (2021) Convolutional neural network applied to detect electricity theft: a comparative study on unbalanced data handling techniques. Int J Electr Power Energy Syst 131:107085

Phung NM, Mimura M (2021) Detection of malicious javascript on an imbalanced dataset. Internet of Things 13:100357

Pourhabibi T, Ong K-L, Kam BH, Boo YL (2020) Fraud detection: a systematic literature review of graph-based anomaly detection approaches. Decis Support Syst 133:113303

Rana P, Sowmya A, Meijering E, Song Y (2023) Imbalanced classification for protein subcellular localization with multilabel oversampling. Bioinformatics 39(1):841

Rao S, Verma AK, Bhatia T (2023) Hybrid ensemble framework with self-attention mechanism for social spam detection on imbalanced data. Expert Syst Appl 217:119594

Razavi-Far R, Farajzadeh-Zanajni M, Wang B, Saif M, Chakrabarti S (2019) Imputation-based ensemble techniques for class imbalance learning. IEEE Trans Knowl Data Eng 33(5):1988–2001

Razavi-Far R, Farajzadeh-Zanajni M, Wang B, Saif M, Chakrabarti S (2021) Imputation-based ensemble techniques for class imbalance learning. IEEE Trans Knowl Data Eng 33(5):1988–2001

Ren S, Liao B, Zhu W, Li Z, Liu W, Li K (2018) The gradual resampling ensemble for mining imbalanced data streams with concept drift. Neurocomputing 286:150–166

Ren H, Wang J, Dai J, Zhu Z, Liu J (2022) Dynamic balanced domain-adversarial networks for cross-domain fault diagnosis of train bearings. IEEE Trans Instrum Meas 71:1–12

Ren Z, Lin T, Feng K, Zhu Y, Liu Z, Yan K (2023) A systematic review on imbalanced learning methods in intelligent fault diagnosis. IEEE Trans Instrum Meas 72:1–35

Ren H, Tang Y, Dong W, Ren S, Jiang L (2023) Duen: dynamic ensemble handling class imbalance in network intrusion detection. Expert Syst Appl 229:120420

Ren J, Zhang M, Yu C, Liu Z (2022) Balanced mse for imbalanced visual regression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7926–7935

Rezvani S, Wang X (2023) A broad review on class imbalance learning techniques. Appl Soft Comput 110415

Sáez JA, Luengo J, Stefanowski J, Herrera F (2015) Smote-ipf: addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering. Inf Sci 291:184–203

Sağlam F, Cengiz MA (2022) A novel smote-based resampling technique trough noise detection and the boosting procedure. Expert Syst Appl 200:117023

Sahani M, Dash PK (2019) Fpga-based online power quality disturbances monitoring using reduced-sample hht and class-specific weighted rvfln. IEEE Trans Industr Inf 15(8):4614–4623

Saini M, Susan S (2020) Deep transfer with minority data augmentation for imbalanced breast cancer dataset. Appl Soft Comput 97:106759

Saini M, Susan S (2022) Vggin-net: deep transfer network for imbalanced breast cancer dataset. IEEE/ACM Trans Comput Biol Bioinf 20(1):752–762

Samuel D, Chechik G (2021) Distributional robustness loss for long-tail learning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9495–9504

Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2009) Rusboost: a hybrid approach to alleviating class imbalance. IEEE transactions on systems, man, and cybernetics-part A: systems and humans 40(1):185–197

Sharma T, Rattan D (2021) Malicious application detection in android—a systematic literature review. Comput Sci Rev 40:100373

Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):1–48

Singh R, Ahmed T, Kumar A, Singh AK, Pandey AK, Singh SK (2020) Imbalanced breast cancer classification using transfer learning. IEEE/ACM Trans Comput Biol Bioinf 18(1):83–93

Sleeman WC IV, Kapoor R, Ghosh P (2022) Multimodal classification: current landscape, taxonomy and future directions. ACM Comput Surv 55(7):1–31

Smith MR, Martinez T, Giraud-Carrier C (2014) An instance level analysis of data complexity. Mach Learn 95:225–256

Stefanowski J, Wilk S (2008) Selective preprocessing of imbalanced data for improving classification performance. In: Data warehousing and knowledge discovery: 10th international conference, DaWaK 2008 Turin, Italy, September 2-5, 2008 Proceedings 10. Springer, pp 283–292

Steininger M, Kobs K, Davidson P, Krause A, Hotho A (2021) Density-based weighting for imbalanced regression. Mach Learn 110:2187–2211

Sun J, Lang J, Fujita H, Li H (2018) Imbalanced enterprise credit evaluation with dte-sbd: decision tree ensemble based on smote and bagging with differentiated sampling rates. Inf Sci 425:76–91

Sun Y, Cai L, Liao B, Zhu W (2020) Minority sub-region estimation-based oversampling for imbalance learning. IEEE Trans Knowl Data Eng 34(5):2324–2334

Sun Y, Cai L, Liao B, Zhu W, Xu J (2022) A robust oversampling approach for class imbalance problem with small disjuncts. IEEE Trans Knowl Data Eng

Tang K, Huang J, Zhang H (2020) Long-tailed classification by keeping the good and removing the bad momentum causal effect. Adv Neural Inf Process Syst 33:1513–1524

Tarekegn AN, Giacobini M, Michalak K (2021) A review of methods for imbalanced multi-label classification. Pattern Recogn 118:107965

Torgo L, Ribeiro R (2009) Precision and recall for regression. In: Discovery science: 12th international conference, DS 2009, Porto, Portugal, October 3-5, 2009 12. Springer, pp 332–346

Tsai C-F, Lin W-C, Hu Y-H, Yao G-T (2019) Under-sampling class imbalanced datasets by combining clustering analysis and instance selection. Inf Sci 477:47–54

Van Hulse J, Khoshgoftaar T (2009) Knowledge discovery from imbalanced and noisy data. Data Knowl Eng 68(12):1513–1542

Viola P, Jones M (2001) Fast and robust classification using asymmetric adaboost and a detector cascade. Adv Neural Inf Process Syst 14

Wang B, Pineau J (2016) Online bagging and boosting for imbalanced data streams. IEEE Trans Knowl Data Eng 28(12):3353–3366

Wang F, Wei L (2022) Multi-scale deep learning for the imbalanced multi-label protein subcellular localization prediction based on immunohistochemistry images. Bioinformatics 38(9):2602–2611

Wang C, Hu L, Guo M, Liu X, Zou Q (2015) imdc: an ensemble learning method for imbalanced classification with mirna data. Genet Mol Res 14(1):123–133

Wang Y, Yao Q, Kwok JT, Ni LM (2020) Generalizing from a few examples: a survey on few-shot learning. ACM Comput Surv (CSUR) 53(3):1–34

Wang Z, Cao C, Zhu Y (2020) Entropy and confidence-based undersampling boosting random forests for imbalanced problems. IEEE Trans Neural Netw Learn Syst 31(12):5178–5191

Wang L, Zhang L, Qi X, Yi Z (2021) Deep attention-based imbalanced image classification. IEEE Trans Neural Netw Learn Syst 33(8):3320–3330

Wang Y, Gan W, Yang J, Wu W, Yan J (2019) Dynamic curriculum learning for imbalanced data classification. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 5017–5026

Wang P, Han K, Wei X-S, Zhang L, Wang L (2021) Contrastive learning based hybrid networks for long-tailed image classification. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 943–952

Wang X, Lian L, Miao Z, Liu Z, Yu SX (2020) Long-tailed recognition by routing diverse distribution-aware experts. arXiv:2010.01809

Wang L, Xu S, Wang X, Zhu Q (2021) Addressing class imbalance in federated learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, pp 10165–10173

Wang S, Yao X (2009) Diversity analysis on imbalanced data sets by using ensemble models. In: 2009 IEEE symposium on computational intelligence and data mining. IEEE, pp 324–331

Wei C, Sohn K, Mellina C, Yuille A, Yang F (2021) Crest: a class-rebalancing self-training framework for imbalanced semi-supervised learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10857–10866

Wen G, Li X, Zhu Y, Chen L, Luo Q, Tan M (2021) One-step spectral rotation clustering for imbalanced high-dimensional data. Inf Process Manag 58(1):102388

Wilson DL (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans Syst Man Cybern (3):408–421

Woźniak M, Grana M, Corchado E (2014) A survey of multiple classifier systems as hybrid systems. Inf Fusion 16:3–17

Wu T-Y, Morgado P, Wang P, Ho C-H, Vasconcelos N (2020) Solving long-tailed recognition with deep realistic taxonomic classifier. In: Computer vision-ECCV 2020: 16th European conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part VIII 16. Springer, pp 171–189

Wu X, Meng S (2016) E-commerce customer churn prediction based on improved smote and adaboost. In: 2016 13th International conference on service systems and service management (ICSSSM). IEEE, pp 1–5

Xiang L, Ding G, Han J (2020) Learning from multiple experts: self-paced knowledge distillation for long-tailed classification. In: Computer vision-ECCV 2020: 16th European conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part V 16. Springer, pp 247–263

Xiao Y, Wu J, Lin Z (2021) Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data. Comput Biol Med 135:104540

Xu Y, Yu Z, Chen CP, Liu Z (2021) Adaptive subspace optimization ensemble method for high-dimensional imbalanced data classification. IEEE Trans Neural Netw Learn Syst

Yan Z, Wen H (2021) Electricity theft detection base on extreme gradient boosting in ami. IEEE Trans Instrum Meas 70:1–9

Yan Y, Zhu Y, Liu R, Zhang Y, Zhang Y, Zhang L (2023) Spatial distribution-based imbalanced undersampling. IEEE Trans Knowl Data Eng 35(6):6376–6391

Yang Y, Xu Z (2020) Rethinking the value of labels for improving class-imbalanced learning. Adv Neural Inf Process Syst 33:19290–19301

Yang K, Yu Z, Wen X, Cao W, Chen CP, Wong H-S, You J (2019) Hybrid classifier ensemble for imbalanced data. IEEE Trans Neural Netw Learn Syst 31(4):1387–1400

Yang K, Yu Z, Chen CP, Cao W, Wong H-S, You J, Han G (2021) Progressive hybrid classifier ensemble for imbalanced data. IEEE Trans Syst Man Cybern Syst 52(4):2464–2478

Yang K, Yu Z, Chen CP, Cao W, You J, Wong H-S (2021) Incremental weighted ensemble broad learning system for imbalanced data. IEEE Trans Knowl Data Eng 34(12):5809–5824

Yang K, Shi Y, Yu Z, Yang Q, Sangaiah AK, Zeng H (2022) Stacked one-class broad learning system for intrusion detection in industry 4.0. IEEE Trans Ind Inform 19(1):251–260

Yang Z, Liu X, Li T, Wu D, Wang J, Zhao Y, Han H (2022) A systematic literature review of methods and datasets for anomaly-based network intrusion detection. Comput Secur 116:102675

Yang K, Chen W, Bi J, Wang M, Luo F (2023) Multi-view broad learning system for electricity theft detection. Appl Energy 352:121914

Yang Y, Lv H, Chen N (2023) A survey on ensemble learning under the era of deep learning. Artif Intell Rev 56(6):5545–5589

Yang Y, Zha K, Chen Y, Wang H, Katabi D (2021) Delving into deep imbalanced regression. In: International conference on machine learning. PMLR, pp 11842–11851

Yan Z, Hongle D, Gang K, Lin Z, Chen Y-C (2022) Dynamic weighted selective ensemble learning algorithm for imbalanced data streams. J Supercomput 1–26

Yin L, Du X, Ma C, Gu H (2022) Virtual screening of drug proteins based on the prediction classification model of imbalanced data mining. Processes 10(7):1420

You D, Xiao J, Wang Y, Yan H, Wu D, Chen Z, Shen L, Wu X (2023) Online learning from incomplete and imbalanced data streams. IEEE Trans Knowl Data Eng

Zang Y, Huang C, Loy CC (2021) Fasa: feature augmentation and sampling adaptation for long-tailed instance segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3457–3466

Zhang X, Hu B-G (2014) A new strategy of cost-free learning in the class imbalance problem. IEEE Trans Knowl Data Eng 26(12):2872–2885

Zhang H, Li M (2014) Rwo-sampling: a random walk over-sampling approach to imbalanced data classification. Inf Fusion 20:99–116

Zhang T, Ma F, Yue D, Peng C, O’Hare GM (2019) Interval type-2 fuzzy local enhancement based rough k-means clustering considering imbalanced clusters. IEEE Trans Fuzzy Syst 28(9):1925–1939

Zhang H, Liu W, Liu Q (2020) Reinforcement online active learning ensemble for drifting imbalanced data streams. IEEE Trans Knowl Data Eng 34(8):3971–3983

Zhang T, Chen J, Li F, Zhang K, Lv H, He S, Xu E (2022) Intelligent fault diagnosis of machines with small & imbalanced data: a state-of-the-art review and possible extensions. ISA Trans 119:152–171

Zhang Z, Wang G, Carranza EJM, Fan J, Liu X, Zhang X, Dong Y, Chang X, Sha D (2022) An integrated framework for data-driven mineral prospectivity mapping using bagging-based positive-unlabeled learning and bayesian cost-sensitive logistic regression. Nat Resour Res 31(6):3041–3060

Zhang Y, Kang B, Hooi B, Yan S, Feng J (2023) Deep long-tailed learning: a survey. IEEE Trans Pattern Anal Mach Intell

Zhang J, Tao H, Hou C (2023) Imbalanced clustering with theoretical learning bounds. IEEE Trans Knowl Data Eng

Zhang X, Wu Z, Weng Z, Fu H, Chen J, Jiang Y-G, Davis LS (2021) Videolt: large-scale long-tailed video recognition. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7960–7969

Zhang Y, Zhang H, Lin Y (2022) Data augmentation for long-tailed and imbalanced polyphone disambiguation in mandarin. In: ICASSP 2022-2022 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 7137–7141

Zhauniarovich Y, Khalil I, Yu T, Dacier M (2018) A survey on malicious domains detection through dns data analysis. ACM Comput Surv (CSUR) 51(4):1–36

Zhou H, Zhang J, Luo T, Yang Y, Lei J (2022) Debiased scene graph generation for dual imbalance learning. IEEE Trans Pattern Anal Mach Intell 45(4):4274–4288

Zhou B, Cui Q, Wei X-S, Chen Z-M (2020) Bbn: bilateral-branch network with cumulative learning for long-tailed visual recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9719–9728

Zhu T, Lin Y, Liu Y, Zhang W, Zhang J (2019) Minority oversampling for imbalanced ordinal regression. Knowl-Based Syst 166:140–155

Zhu H, Zhou M, Liu G, Xie Y, Liu S, Guo C (2023) Nus: noisy-sample-removed undersampling scheme for imbalanced classification and application to credit card fraud detection. IEEE Trans Comput Soc Syst

Zyblewski P, Sabourin R, Woźniak M (2021) Preprocessed dynamic classifier ensemble selection for highly imbalanced drifted data streams. Inf Fusion 66:138–154


Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2023YFA1011601, in part by the Major Key Project of PCL, China under Grant PCL2023AS7-1, in part by the National Natural Science Foundation of China under Grants U21A20478, 62106224, and U21B2029, and in part by the Open Research Project KFKT2022B11 of the State Key Lab. for Novel Software Technology, Nanjing University, China.

Author information

Authors and affiliations.

School of Future Technology, South China University of Technology, Panyu district, Guangzhou, 511442, Guangdong, China

Wuxing Chen

Peng Cheng Laboratory, Nanshan district, Shenzhen, 518066, Guangdong, China

School of Computer Science and Engineering, South China University of Technology, Panyu district, Guangzhou, 510006, Guangdong, China

Kaixiang Yang, Zhiwen Yu & C. L. Philip Chen

College of Engineering, Huaqiao University, Fengze District, Quanzhou, 362021, Fujian, China


Contributions

Wuxing Chen (WC) and Kaixiang Yang (KY) conceived the research idea and designed the study. WC, KY, Zhiwen Yu (ZY), Yifan Shi (YS), and C. L. Philip Chen (PC) analyzed the research. WC and KY wrote the main manuscript text. ZY and YS prepared figures 1-6. All authors reviewed and provided critical feedback on the manuscript. All authors contributed to discussions regarding the research and revised the manuscript for intellectual content.

Corresponding author

Correspondence to Kaixiang Yang.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Chen, W., Yang, K., Yu, Z. et al. A survey on imbalanced learning: latest research, applications and future directions. Artif Intell Rev 57, 137 (2024). https://doi.org/10.1007/s10462-024-10759-6


Accepted: 07 April 2024

Published: 09 May 2024

DOI: https://doi.org/10.1007/s10462-024-10759-6


  • Imbalanced learning
  • Ensemble learning
  • Multiclass imbalanced learning
  • Machine learning
  • Imbalance regression
  • Long-tailed learning


COMMENTS

  1. Effective Learning Behavior in Problem-Based Learning: a Scoping Review

    Introduction. Problem-based learning (PBL) is an educational approach that utilizes the principles of collaborative learning in small groups, first introduced by McMaster Medical University. The shift of the higher education curriculum from traditional, lecture-based approaches to an integrated, student-centered approach was triggered by concern over the content-driven nature of medical ...

  2. Problem-Based Learning: An Overview of its Process and Impact on

    Problem-based learning (PBL) has been widely adopted in diverse fields and educational contexts to promote critical thinking and problem-solving in authentic learning situations. Its close affiliation with workplace collaboration and interdisciplinary learning contributed to its spread beyond the traditional realm of clinical education 1 to ...

  3. Effectiveness of problem-based learning methodology in undergraduate

    Problem-based learning (PBL) is a pedagogical approach that shifts the role of the teacher to the student (student-centered) and is based on self-directed learning. Although PBL has been adopted in undergraduate and postgraduate medical education, the effectiveness of the method is still under discussion. The authors' purpose was to appraise available international evidence concerning the ...

  4. Full article: The process of implementing problem-based learning in a

    Problem-based learning (PBL) is a student-centred instructional approach in which complex real-world problems are used as the vehicle to promote students' learning of concepts and principles. This paper presents a case study that explored the learning experiences of 18 pre-service teachers and how the instructor was affected when implementing ...

  5. The effectiveness of problem based learning in improving critical

    Background The adaptation process for first-year medical students is an important problem because it significantly affects educational activities. The previous study showed that 63% of students had difficulties adapting to the learning process in their first year at medical school. Therefore, students need the most suitable learning style to support the educational process, such as Problem ...

  6. Problem-Based Learning

    Definition. Problem-based learning (PBL) is an instructional method aimed at preparing students for real-world settings. By requiring students to solve problems, PBL enhances students' learning outcomes by promoting their abilities and skills in applying knowledge, solving problems, practicing higher order thinking, and self-directing their ...

  7. The critical thinking-oriented adaptations of problem-based learning

    Critical thinking is a significant twenty-first century skill that is prioritized by higher education. Problem-based learning is becoming widely accepted as an effective way to enhance critical thinking. However, as the results of studies that use PBL to develop CT have had mixed success, PBL models need to be modified to guarantee positive outcomes. This study is a systematic review that ...

  8. A Bibliometric Analysis of the Landscape of Problem-Based Learning

    Background: Problem-Based Learning (PBL) is an instructional method of hands-on, active learning centered on investigating and resolving messy, real-world problems. This study aims to systematically analyze the current status and hotspots of PBL research and provide insights for research in the field.

  9. Principles of Problem-Based Learning (PBL) in STEM Education: Using

    Developing teacher knowledge, skills, and confidence in Science, Technology, Engineering, and Mathematics (STEM) education is critical to supporting a culture of innovation and productivity across the population. Such capacity building is also necessary for the development of STEM literacies involving the ability to identify, apply, and integrate concepts from STEM domains toward understanding ...

  10. Conceptualizing Problem-Based Learning: An Overview

    Problem-based learning (PBL) is an important aspect of this new model and a contributor to competency-based learning methods. PBL was introduced by McMaster University, Canada, in 1969 as a unique, hands-on approach to learning medicine. [3] It is pertinent to mention here that PBL is not the same as "problem-solving," as the goal of ...

  11. PDF Problem Based Learning: A Student-Centered Approach

    Problem based learning is a student-centered educational method which aims to develop problem-solving skills ... Students can begin their research with an "easy" problem and the teacher can introduce the expectations. The teacher can organize some sessions regarding the problem assigned to them (background knowledge) research topics, ...

  12. The Effectiveness of the Project-Based Learning (PBL) Approach as a Way

    The PBL approach is a typical form of cooperative and research-based learning technique, characterized by active student engagement and comparative learning (Loyens ... (2009). Group collaboration in an online problem based university course. In Tan O.-S. (Ed.), Problem-based learning and creativity (pp. 173-192). Cengage Learning Asia ...

  13. Problem-Based Learning: What and How Do Students Learn?

    Problem-based approaches to learning have a long history of advocating experience-based education. Psychological research and theory suggests that by having students learn through the experience of solving problems, they can learn both content and thinking strategies. Problem-based learning (PBL) is an instructional method in which students learn through facilitated problem solving. In PBL ...

  14. Interdisciplinary Journal of Problem-Based Learning

    The Interdisciplinary Journal of Problem-based Learning (IJPBL) publishes relevant, interesting, and challenging peer-reviewed articles of research, analysis, or promising practice related to all aspects of implementing problem-based learning (PBL) in K-12 and post-secondary classrooms. ISSN 1541-5015.

  15. Effectiveness of problem-based learning methodology in undergraduate

    After the research questions and a search strategy were defined, the searches were conducted in PubMed and Web of Science using the MeSH terms "problem-based learning" and "Medicine" (the Boolean operator "AND" was applied to the search terms). No limits were set on language, publication date, study design or country of origin.

  16. Full article: Integrated problem-based learning versus lectures: a path

    Research Article. Integrated problem-based learning versus lectures: a path analysis modelling of the relationships between educational context and learning approaches. Marie-Paule Gustin Department of public health, Biostatistics, Institute of pharmaceutic and biological sciences, ...

  17. Effective Learning Behavior in Problem-Based Learning: a Scoping Review

    Problem-based learning (PBL) emphasizes learning behavior that leads to critical thinking, problem-solving, communication, and collaborative skills in preparing students for a professional medical career. However, learning behavior that develops these skills has not been systematically described. This review aimed to unearth the elements of effective learning behavior in a PBL context, using ...

  18. Systematic review of problem based learning research in fostering

    According to data from the articles that have been analyzed, Indonesia has performed a good portion of research on problem-based learning methods and critical thinking skills. Sumatra (Aceh, Bengkulu, Riau, Padang, Jambi), Java (Semarang, Yogyakarta, Banyumas, Purwokerto, Malang, Surabaya), and West Nusa Tenggara (Mataram) are the Indonesian ...

  19. An Action Research Study of the Effectiveness of Problem-Based Learning

    This action research study compares the efficacies of problem-based learning (PBL), traditional lecture, and a combination of PBL and traditional lecture to promote understanding and retention of the principle content of an elective science course, biochemistry, taught at a school for talented students.

  20. PDF Problem Based Learning (Pbl) in Teacher Education: a Review of The

    In recent years, Problem Based Learning (PBL), a learning approach aligned with the social constructivist framework, has become one of the promising innovations in higher education teaching and learning settings. PBL stands on premises advocating a learner-centred learning approach where students are problem-solvers, think in critical and

  21. New Research Explores the Impact of PBL

    Proponents of project-based learning (PBL) argue that it fosters a sense of purpose in young learners, pushes them to think critically, and prepares them for modern careers that prize skills like collaboration, problem-solving, and creativity. Critics say that the pedagogy places too much responsibility on novice learners, and ignores the ...

  22. Does problem-based learning education improve knowledge, attitude, and

    Background Patient safety is a top priority for any health care system. Most universities are looking for teaching methods through which they would be able to enhance students' clinical decision-making capabilities and their self-centered learning to ensure safe and quality nursing care. Therefore, this study aimed to determine the effect of patient safety education through problem-based ...

  23. Research article Outcomes of problem-based learning in nurse education

    Problem-based learning is a student-centered method and strategy that allows nursing students to collaborate in small groups with the goal of improving their clinical skills and cognitive ... How theory and design-based research can mature PBL practice and research. Adv. Health Sci. Educ., 24 (5) (2019), pp. 879-891, 10.1007/s10459-019-09940-2.

  24. Contextual Design: Pengembangan Model Pembelajaran Problem Based

    Abstract: This research is a learning model development research that aims to achieve learning outcomes in high-rise building design course studios, using a contextual design approach based on the Problem Based Learning model, this research was conducted by applying the principles of contextual inquiry, interpretation, and data consolidation on the assignment, as well as applying storyboarding ...

  25. A Survey of Deep Learning Methods for Estimating the Accuracy of ...

    The quality prediction of quaternary structure models of a protein complex, in the absence of its true structure, is known as the Estimation of Model Accuracy (EMA). EMA is useful for ranking predicted protein complex structures and using them appropriately in biomedical research, such as protein-protein interaction studies, protein design, and drug discovery. With the advent of more ...

  26. A survey on imbalanced learning: latest research ...

    Imbalanced learning constitutes one of the most formidable challenges within data mining and machine learning. Despite continuous research advancement over the past decades, learning from data with an imbalanced class distribution remains a compelling research area. Imbalanced class distributions commonly constrain the practical utility of machine learning and even deep learning models in ...

  27. Full article: Deep Recurrent Reinforcement Learning for Intercept

    However, for practical tasks, the whole system state is not always available due to limited measurements, and the resulting partial observability would cause significant performance degradation of the RL policy. To address this problem, we propose a historical observation sequence-based deep recurrent reinforcement learning algorithm.

  28. Flawed machine-learning confounds coding sequence annotation

    Detecting protein coding genes in genomic sequences is a significant challenge for understanding genome functionality, yet the reliability of bioinformatic tools for this task remains largely unverified. This is despite some of these tools having been available for several decades, and being widely used for genome and transcriptome annotation. We perform an assessment of nucleotide sequence ...