Center for Teaching

Peer review of teaching.

Bandy, J. (2015). Peer Review of Teaching. Vanderbilt University Center for Teaching. Retrieved [todaysdate] from https://cft.vanderbilt.edu/guides-sub-pages/peer-review-of-teaching/.

Introduction


In higher education, peer review stands as the prime means of ensuring that scholarship is of the highest quality, and from it flow consequential assessments that shape careers, disciplines, and entire institutions. While peer review is well established as a means of evaluating research across the disciplines, it is less common in the assessment of teaching. Yet it is no less useful there, since it can improve what Ernest Boyer has called the “scholarship of teaching and learning” by enhancing instructional and faculty development, by bolstering the integrity of personnel decisions, and by enabling more intentional and mutually supportive communities of scholar-teachers. This guide is intended as an introduction to the basics of peer review, including its purposes, challenges, and common practices. The primary audience consists of departments, programs, or schools considering implementing peer review, although individual faculty, staff, and students are likely to find it of interest as well.

What Is Peer Review?

Peer review is often identified with peer observation, but it is more broadly a method of assessing a portfolio of information about the teaching of the instructor under review. This portfolio typically includes a curriculum vitae, student evaluations, self-evaluative statements, peer observations, and other evidence such as syllabi, assignments, student work, and letters solicited from former students. That said, peer observations will figure prominently in what follows.

It is also worth noting a common distinction between two very different forms of peer review: formative and summative. Formative evaluation is typically oriented solely toward the improvement of teaching and is part of instructional mentorship and development. Summative evaluation, in contrast, is done to inform personnel decisions. To preserve the freedom and exploration of individual faculty, formative reviews may be shielded from scrutiny for a period of years, until accountability to standards of excellence is required for personnel decisions. At present, summative evaluations are the more common of the two, since they are tied to decisions related to reappointment, promotion, or tenure (Bernstein et al. 2000). Because the more consequential nature of summative evaluation tends to diminish the formative value of the peer review process, it is important to maintain a clear distinction between the two types and to be transparent with those under review. It is also common to have different faculty involved in each form of assessment: mentor faculty in formative evaluations, and departmental or program administrators, such as chairs, in summative evaluations.

Why Peer Review?

Peer review serves many functions in the process of evaluating faculty, courses, or entire programs.

What’s good for research is good for teaching. As in peer review of research, peer review of teaching is a vital means of receiving expert assessments of one important part of scholarly practice. As with research, peer review ensures that faculty internalize, in the words of Pat Hutchings, scholarly “habits of mind”: identifying goals, posing questions for inquiry, exploring alternatives, taking appropriate risks, and assessing the outcomes with learned colleagues. When this process of scholarly engagement and deliberate improvement is part of the institutional expectations for teaching, as it is for research, it can support a community of scholarship around teaching (Hutchings 1996).

Enables teaching to be a community endeavor. Relatedly, teaching in higher education too often suffers from what Hutchings has called “pedagogical isolation”; peer review provides opportunities to open our teaching to a community of colleagues who can nurture improvement (Hutchings 1996).

Peer review allows for less exclusive reliance on student evaluations. Student evaluations have become institutionalized in higher education and for the most part provide extremely useful information for evaluating faculty, courses, and even entire curricula. However, students may not always be the best evaluators: they often have limited disciplinary training, they can hold biases against certain faculty unrelated to teaching effectiveness, and they can be less cognizant of institutional goals or values than faculty. Indeed, it is for these reasons that the American Sociological Association, along with other professional societies, has cautioned universities not to rely too heavily on student evaluations (see the ASA’s statement on student evaluations of teaching).

Greater faculty experimentation and rigor. Just as importantly, over-reliance on student evaluations in processes of professional review can make faculty overly concerned with receiving positive evaluations. At worst, this can lead faculty to adopt a consumer model of education, shaping teaching to meet the expectations of students over the needs of disciplines or institutions (Hutchings 1996). Faculty may then become overly cautious: declining to challenge student expectations, retreating to conventional teaching methods, relaxing their standards, and, at worst, feeling a need to entertain more than educate. Peer review, when done in formative and summative forms alongside student evaluations, ensures that both faculty and students have a voice in evaluation, and that faculty have greater autonomy to innovate and to teach rigorously. This gives faculty the opportunity to focus more intentionally on what helps students learn best, and therefore more directly on the quality of their teaching.

Allows for both formative and summative evaluation. When done well, peer review involves both formative and summative evaluations.  The inclusion of greater formative evaluation allows for more significant faculty and instructional development by encouraging more critical reflection on teaching and by providing a safer, less risky, and more collegial setting for assessment.

Improves faculty approaches to teaching. Daniel Bernstein, Jessica Jonson, and Karen Smith (2000), in their examination of peer review processes, found that such processes positively affect faculty attitudes and approaches toward teaching. While their study did not reveal a consistent shift in faculty attitudes toward student learning and grading, it did document changes in several important aspects of teaching practice. First, peer review markedly affected in-class practices, particularly through the incorporation of more active and collaborative learning and less reliance on lecturing. Second, it increased faculty willingness to ask students to demonstrate higher-order intellectual and critical thinking skills. Third, for some faculty it improved the quality of feedback given to students on assignments, and thus student understanding and performance. And lastly, faculty enjoyed discussing substantive disciplinary and teaching issues with their colleagues, enhancing the scholarly community in their departments and programs. Peer review therefore can increase faculty joy in teaching by improving relations among faculty and students, and among faculty themselves.

How to Select Peer Reviewers

Peer review may take many forms, but it usually begins with the selection of peer reviewers, drawn most often from the same department or program as the instructor being reviewed. The reviewers are typically senior faculty, though sometimes junior faculty as well, who have significant expertise in teaching. These faculty may be chosen to undertake all peer teaching reviews for the department or program during a specific period, or they may be selected because they share particular expertise with the instructor being reviewed. The person under review also may be granted some choice of one or more of the reviewers. The number of reviewers varies but usually includes at least two and rarely more than four.

In selecting reviewers, one must be mindful of several criteria.

Institutional Experience. It helps if reviewers are highly familiar with the department or program, school, and institutional goals, and particularly the processes of peer review itself and the criteria that form the basis of the assessment.

Integrity. Peer reviews also function best when reviewers have commitments to integrity, fair-mindedness, privacy, and understanding the reasoning behind the teaching choices of the person under review.

Trust. Peer reviewers, especially in formative reviews, work collaboratively with the faculty under review to establish a clear process of evaluation and reporting; therefore, reviewers who can establish trust are particularly effective.

Mentorship. Those under review are particularly vulnerable and often anxious; therefore, reviewers who bring grace and tact to the assessment process, who can offer feedback with integrity and support, and who can advise on strategies for faculty development will be most helpful.

Thorough and Practical. Peer reviewers should be able to provide summary reports that clearly and thoroughly represent all phases of the process, and that make recommendations that are specific and practical (Center for Teaching Effectiveness, University of Texas, Austin).

How to Evaluate?

The peer evaluation itself usually focuses on several aspects of teaching through a sequence of activities. The following list represents a sequential, reasonably thorough, and maximal model for peer review; not every step is necessary.

Develop Departmental Standards for Teaching. Without a clear set of learning goals for all departmental programs, it is difficult to assess teaching with any validity or reliability, and their absence can leave departments open to biases, inconsistencies, and miscommunications in peer evaluation. One of the greatest benefits of peer review of teaching is that it provides an occasion for departments and programs, if not entire schools and universities, to be more intentional, specific, and clear about quality teaching and learning and the various means to achieve it. This may be the work of an entire department or of a special teaching committee that researches disciplinary and institutional benchmarks and proposes guidelines for review.

Preliminary Interview. Peer review processes usually begin with a conversation, sometimes framed as an interview, between the peer reviewers and the teacher being reviewed. Its primary purpose is to give the teacher an understanding of the peer review process and an opportunity to provide input on it. The conversation also allows the peer reviewers to begin collecting information about the teaching context, particularly the courses, of the teacher being reviewed. This context supports a better understanding of the teacher’s goals and teaching choices, and may be divided into several dimensions related to course design (Fink 2005).

Logistical contexts. How many students are enrolled? Is the course lower division, upper division, graduate level, etcetera? How frequent and how long are the class meetings? Is it a distance-learning course? What are the physical elements of the learning environment?

Goals. How have the learning goals of the course(s) been shaped by the department, college, university, or discipline? Are the courses required or electives? What kinds of intellectual and skill outcomes are the focus of the course(s)?

Characteristics of the learners. What are their ages and other demographic factors that may bear upon teaching?  What is their prior experience in the subject?  What are their interests and goals?  What are their life situations?

Characteristics of the teacher. What expertise do they have in the subject areas? What are their own assessments of their strengths and weaknesses? What models of teaching did they encounter as students? What theoretical or practical orientations ground their approach to teaching and learning? What teaching and learning scholarship has influenced their teaching? How do these influences take shape in the instructor’s different courses?

Class Observations. The goal of class observations is to collect a sample of information about in-class practices of teaching and learning. Reviews typically include two to four class visits to gain reliable data. If the teacher being reviewed teaches multiple courses, as is often the case, the process may involve fewer observations per course (e.g., two).

What to observe? The goal is to create a thorough inventory of instructor and student practices that define the teaching and learning environment. These may vary widely across disciplines and teachers, and can be drawn from a broad array of pedagogies, depending on learning goals. That said, there are several categories of instructor and student practices to note during the observation(s).

– Content knowledge – Use of instructional materials – Class organization – Presentation form and substance – Teacher-Student interactions – Student participation – Assessment practices

How to assess teaching practices? In many institutions, inventories of teaching practices are combined with assumptions about what is conducive to student learning. It is important for the peer reviewers, and the administrators who guide them, to be conscious of what they regard as effective teaching and the appropriate evidence for it before committing to an observation process, lest the review gather invalid or unreliable data, or invite peer biases and unexamined pedagogy into the evaluation. A reasonably representative list of teaching practices, each carrying a more or less explicit value for learning, would include the following:

Content knowledge

– Selection of class content worth knowing and appropriate to the course – Provision of appropriate context and background – Mastery of class content – Citation of relevant scholarship – Presentation of divergent viewpoints

Clear and effective class organization

– Clear statement of learning goals – Relationship of lesson to course goals, and past and future lessons – Logical sequence – Appropriate pace for student understanding – Summary

Varied methods for engagement, which may include…

– In-class writing – Analysis of quotes, video, artifacts – Group discussions – Student-led discussions – Debates – Case studies – Concept maps – Book clubs – Role plays – Poster sessions – Think-aloud problem solving – Jigsaws – Field trips – Learning logs, journals – Critical incident questionnaire (see Brookfield)

Presentation

– Projected voice – Varied intonation – Clarity of explanation – Eye contact – Effective listening – Definition of difficult terms, concepts, and principles – Use of examples – Varied explanations for difficult material – Appropriate use of humor

Teacher-Student Interactions

– Effective questioning – Warm and welcoming rapport – Use of student names – Encouraging of questions – Encouraging of discussion – Engaged student attention – Answered students effectively – Responsive to student communications – Pacing appropriate for student level, activity – Restating questions, comments – Suggestion of further questions, resources – Concern for individual student needs – Emotional awareness of student interests, needs

Appropriateness of instructional materials

– Content that matches course goals – Content that is rigorous and challenging – Content that is appropriate to student experience and knowledge – Adequate preparation required – Handouts and other materials that are thorough and facilitate learning – Effective audio/visual materials – Written assignments

Student engagement

– Student interest – Enthusiasm – Participation – Student-to-student interaction

Support of departmental/program/school instructional efforts

– Appropriate content – Appropriate pedagogy – Appropriate practice

In-class, formative assessment practices

– Background knowledge probes, muddiest-point exercises, defining-features matrices, and other “classroom assessment techniques” described in greater detail in the CFT’s guide to classroom assessment techniques – Ungraded in-class writing exercises, such as minute papers – Discussions – Questioning

Out-of-class, summative assessment practices

– Class participation – In-class writing exercises, graded – Presentations – Examinations – Projects

Use of observation forms. To make the process more transparent, reliable, and valid, many departments and programs use observation forms, constructed from items like those listed above, to help peer evaluators track and evaluate teaching and learning practices. These may be nothing more than checklists of activities; they may provide rating scales (e.g., Likert scales) to assist the evaluation; they may offer open-ended prompts that provide space for general commentary and analysis; or they may combine all three. The most thorough forms guide the observer in exactly what to observe and prompt some synthesis and evaluation of the observations. Several example forms can be found with a broad online search, including a useful one from Wayne State University.
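To illustrate how rating-scale data from such forms might be tallied, here is a minimal sketch in Python. It is illustrative only: the item names, scores, and aggregation choice (a simple mean per item across visits) are hypothetical, not drawn from any published instrument.

    from statistics import mean

    # Hypothetical 1-5 Likert items drawn from an observation checklist
    # (1 = not observed / weak, 5 = consistently strong).
    ITEMS = [
        "Clear statement of learning goals",
        "Logical sequence and appropriate pace",
        "Effective questioning",
        "Student participation",
    ]

    # One dict of ratings per class visit (invented data).
    observations = [
        {"Clear statement of learning goals": 4,
         "Logical sequence and appropriate pace": 5,
         "Effective questioning": 3,
         "Student participation": 4},
        {"Clear statement of learning goals": 5,
         "Logical sequence and appropriate pace": 4,
         "Effective questioning": 4,
         "Student participation": 5},
    ]

    # Average each item across visits to summarize for the written report.
    for item in ITEMS:
        scores = [visit[item] for visit in observations]
        print(f"{item}: mean {mean(scores):.1f} over {len(scores)} visits")

Numerical summaries like these supplement, but do not replace, the open-ended commentary that gives an observation its interpretive value.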

Evidence of Student Learning.

End-of-course student work. To more thoroughly assess the effectiveness of instruction, peer reviewers may collect evidence of student learning in the form of examinations, written assignments, and other projects from the course of the teacher under review.  Collecting this evidence may be helpful in assessing core competencies expected from the course.

Student work throughout the course. Evidence of student learning may be assessed more thoroughly by collecting examples of student work at various points during a course, so as to gain perspective on student growth and development. Doing so requires some preparation and lead time to ensure the teacher under review collects work from students and gains their consent to share it.

Grades. Student grades also may be used as an indicator of student performance if they are accompanied by contextual information such as a grade distribution, the criteria used to assign the grades, and samples of student work at the A, B, C, D, and failing levels.
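As a simple illustration of the kind of contextual summary meant here, the following Python sketch tabulates a grade distribution from a list of final grades; the data are invented.

    from collections import Counter

    # Invented final grades for one course section.
    grades = ["A", "A", "B", "B", "B", "C", "C", "C", "D", "F", "A", "B"]

    # Tabulate the distribution that would accompany samples of student work.
    counts = Counter(grades)
    total = len(grades)
    for letter in ["A", "B", "C", "D", "F"]:
        n = counts.get(letter, 0)
        print(f"{letter}: {n:2d} students ({n / total:.0%})")

A distribution alone says little about teaching quality; paired with the grading criteria and graded samples, it helps reviewers judge whether standards are rigorous and consistently applied.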

Student Evaluations. In addition to reviewing standard end-of-course evaluations, peer reviewers may choose to solicit letters of evaluation from a sample of students, current or alumni, who have had at least one course with the teacher in question, preferably two or more. Requesting letters from graduates, who have a more mature perspective on the effectiveness and impact of the teacher under review, can be especially useful. The request can be more or less specific in its prompts, but at a minimum it typically introduces the importance of the evaluation process for the individual and the institution and asks students to assess how effective the teacher was as an instructor, what limitations he or she may have, and what impact he or she had on their education.

Engagement with Centers for Teaching.  If the person under review has attended consultations, workshops, or other programs offered by a campus center for teaching and learning, the evaluation process may consider this to be part of the analysis.

Advising Activity. Peer evaluators may wish to make note of the advising activities and load of the teacher in question, along with any special service to the teaching mission of the department, school, or institution.  This may involve some data collection from students the teacher has advised and peers with whom the teacher has collaborated in their teaching service.  For some faculty, this kind of teaching outside typical course structures can be a substantial contribution to the teaching mission of the department.

Professional Publications, Presentations, and Recognitions. Peer reviewers also may wish to collect evidence of the scholarly activities in teaching and learning by the teacher in question, such as professional publications, presentations, or awards for their teaching.

Collaborative Analysis. Together, the activities above provide information that can be assembled into an overall picture of the teacher under review. The peer evaluators then meet to review the data collected, seek any missing information, and resolve outstanding questions. It is then incumbent upon the evaluators to discuss the form and substance of a final assessment and to divide the work of writing it.

Overall Recommendation. Typically the written evaluation includes some clarification of the process, the methods, and the data collected, along with any necessary positive feedback, constructive criticism, and suggested improvements. This forms the substance of a formative or summative assessment by the peer evaluators, one that may be shared with the relevant administrators and the teacher under review, depending on the process adopted. If the evaluation is formative, it may be accompanied by a series of suggested improvements for teaching and a plan for instructional or curricular development, which could include ongoing mentorship, the use of professional development resources such as the Center for Teaching, and further peer evaluation. If it is summative, the recommendation will be used by departmental and university committees and administrators as the basis for a reappointment, promotion, or tenure decision.

Possible Limitations of Peer Review?

Limitations of Peer Observations. While peer review allows for a more rigorous evaluation of a teaching portfolio, peer observations alone are often insufficient data on which to base an entire assessment of a teacher. Observations represent merely a snapshot of teaching, and thus should be only one component of a portfolio subject to peer evaluation, alongside student evaluations, evidence of student learning, course materials, and self-evaluations, to name a few.

Bias. All methods of teaching evaluation risk biases of one form or another. One common criticism of peer review processes is that they invite bias when they involve limited or unprofessional approaches to information collection and analysis. This can occur for several reasons. Personal relationships between reviewers and those being reviewed can produce either hyper- or hypo-critical evaluation. Standards of excellence, and their application, can be highly subjective, and individual teaching styles vary widely; evaluations can therefore become contentious if standards are not defined in advance through rigorous research and open, collaborative processes. Power relations in departments or programs also can unduly influence open and thorough evaluation, and other factors may introduce evaluator bias as well. To avoid the worst cases of bias, peer review must be established through processes that guarantee the greatest rigor, openness, and transparency.

Collegiality Issues. Under the best of circumstances, peer review can shape a dialogue about teaching that fosters community among educators and leads to more growth-oriented forms of professional development. However, when it is implemented in less collaborative, more adversarial forms, or when it carries unavoidable consequences for promotion or job security, it can trigger anxieties and frustrations for both reviewers and those being reviewed. Peer review therefore must adhere to the highest standards of transparency, integrity, and care for the sake of those under review.

Time and Effort. Possibly the most common critique of peer review processes, and the reason they are not more widely used in the academy, is that they require significant time and effort. Departmental and campus administrators must define the process, establish standards, train and prepare reviewers, perform peer observations, review portfolios, draft assessments, and hold multiple dialogues with those under review. Each step requires preparation if it is to be fair, transparent, and professional, and any shortcut may compromise the rigor, care, or goals of the evaluation. That said, there are several possible shortcuts, each with potential costs.

Rely on the expertise of senior colleagues, administrators, and the Center for Teaching. There are typically people on campus who may have sufficient knowledge to assist in defining departmental learning or teaching goals, determining what data to include in a teaching portfolio, training peer observers, drafting assessments, etcetera. These sources of expertise can streamline the process with little cost to its integrity, as long as their suggestions are tailored to the needs of the department or program in question.

Use predefined standards for teaching and learning. Rather than spend significant time adjudicating which learning and teaching goals are appropriate, department or program leaders may decide to use existing language in university or departmental missions, course catalogs, accreditation reports, other constituting documents, or the operating principles of the Center for Teaching. This may gain some efficiency at limited cost to the integrity of the peer review process. However, the vague and imprecise learning goals that sometimes characterize such documents (e.g., “critical thinking”) may be of little help in benchmarking a specific set of courses or teaching strategies. Likewise, departments and programs may face particular teaching challenges that broad standards do not take into consideration. Both difficulties can leave departments or programs open to unclear standards, unfair or inconsistent judgments, and miscommunications.

Collect data judiciously. One of the more time-consuming tasks of peer review is combing through all facets of a teaching portfolio, particularly if it includes samples of student work. To save time, some peer review processes rely largely upon peer observation, in addition to student evaluations of teaching, and do not collect teaching portfolios or examples of student work. Others collect only limited samples of student work, such as grade distributions and examples of work at the A, B, C, and D levels, to evaluate an instructor’s assessment and grading strategies. Other data-collection shortcuts may be possible as well. However, more limited data allow fewer contextual interpretations of a teaching career, and peer observations alone are merely in-class snapshots of instructional performance rather than an encompassing perspective on all phases of teaching. Either limitation can lead a department or program to less informed and less fair judgments.

Use templates for written peer evaluation reports. Final written reports need not be highly expansive analyses; they may instead take the form of a thorough checklist with brief sections of commentary on challenges and successes that become points of discussion between the peer reviewers and the instructor under review. This form of report can save valuable time, but it also may provide limited feedback to the instructor under review, affording less useful guidance on where to improve.

Only summative evaluation. A department or program may limit peer evaluation to summative rather than formative assessments of teaching. This limits opportunities for faculty development, hinders data collection, creates more tension between reviewers and those being evaluated, and thwarts the formation of the collegial cultures that improve teaching across departments and programs. Nevertheless, many departments and programs have used this shortcut to conduct peer review.

Concluding Thoughts

Peer review of teaching, when done well, has many benefits: fostering teaching excellence, creating collegial communities of scholar-teachers, and building fairer and more transparent cultures of professional development. The challenges of peer review, while not insignificant, are small by comparison. Peer review of teaching, as in research, enhances the integrity and innovation of teaching, and it is a practice whose institutionalization is long overdue.

Bibliography

  • Bernstein, Daniel J. 2008. “Peer Review and Evaluation of the Intellectual Work of Teaching.” Change, March/April.
  • Bernstein, Daniel J., Jessica Jonson, and Karen Smith. 2000. “An Examination of the Implementation of Peer Review of Teaching.” New Directions for Teaching and Learning 83: 73-86.
  • Bernstein, Daniel, A. N. Burnett, A. Goodburn, and P. Savory. 2006. Making Teaching and Learning Visible: Course Portfolios and the Peer Review of Teaching. Anker.
  • Center for Teaching Effectiveness. “Preparing for Peer Observation: A Guidebook.” University of Texas, Austin.
  • Chism, Nancy V. 2007. Peer Review of Teaching: A Sourcebook. 2nd edition. Anker.
  • Glassick, C., M. T. Huber, and G. Maeroff. 1997. Scholarship Assessed: Evaluation of the Professoriate. Jossey-Bass.
  • Hutchings, Pat. 1995. From Idea to Prototype: The Peer Review of Teaching. Stylus.
  • Hutchings, Pat. 1996. “The Peer Collaboration and Review of Teaching.” ACLS Occasional Paper No. 33.
  • Hutchings, Pat. 1996. Making Teaching Community Property: A Menu for Peer Collaboration and Peer Review. Stylus.
  • Hutchings, Pat. 1998. The Course Portfolio. Stylus.
  • Perlman, Baron, and Lee I. McCann. 1998. “Peer Review of Teaching: An Overview.” Office of Teaching Resources in Psychology, Department of Psychology, Georgia Southern University.
  • Seldin, P. 1997. The Teaching Portfolio. 2nd edition. Anker.
  • Seldin, P. 1999. Changing Practices in Evaluating Teaching: A Practical Guide to Improved Faculty Performance and Promotion/Tenure Decisions. Jossey-Bass.
  • Shulman, Lee S. 2004. Teaching as Community Property: Essays on Higher Education. Jossey-Bass.



The Review of Higher Education

Penny A. Pasque, The Ohio State University; Thomas F. Nelson Laird, Indiana University, Bloomington

Journal Details

The Review of Higher Education is interested in empirical research studies, empirically based historical and theoretical articles, and scholarly reviews and essays that move the study of colleges and universities forward. The most central consideration for RHE is the salience of the subject matter to other scholars in the field, as well as its usefulness to academic leaders and public policymakers. Manuscripts submitted to RHE need to extend the literature in the field of higher education and may connect across fields and disciplines when relevant. Selection of articles for publication is based solely on the merits of the manuscripts with regard to conceptual or theoretical frameworks, methodological accuracy and suitability, and the clarity of the ideas and evidence presented. Additionally, our publications center on issues within US higher education, and any manuscript we send for review must have clear implications for US higher education.

Guidelines for Contributors

Manuscripts should be typed in a serif or sans serif typeface as recommended by the APA 7th edition (e.g., 11-point Calibri, 11-point Arial, 10-point Lucida Sans Unicode, 12-point Times New Roman, 11-point Georgia, or 10-point Computer Modern), double-spaced throughout, including block quotes and references. Pages should be numbered consecutively at the top right and include a running head. Please supply the title of your submission, an abstract of 100 or fewer words, and keywords as the first page of your manuscript submission (this page does not count toward your page limit). The names, institutional affiliations, addresses, phone numbers, email addresses, and short biographies of authors should appear on a separate cover page to aid proper masking during the review process. Initial and revised submissions should not run more than 32 pages (excluding abstract, keywords, and references; including tables, figures, and appendices). Authors should follow the instructions in the 7th edition Publication Manual of the American Psychological Association; manuscripts not following all APA guidelines will not be reviewed. Please do not change fonts, spacing, or margins or use style-formatting features at any point in the manuscript except for tables. All tables should be submitted in a mutable format (i.e., not as fixed images). Please upload your manuscript as a Word document. All supporting materials (i.e., tables, figures, appendices) should be editable in the manuscript or in a separate Word document (i.e., do not embed tables or figures as images). For a fixed image, please upload a separate high-resolution JPEG.

Authors should use their best judgment when masking citations. Masking some or all citations that include an author’s name can help prevent reviewers from knowing the identities of the authors. However, in certain circumstances masking citations is unnecessary or could itself reveal the identities of manuscript authors. Because authors are in the best position to know when masking citations will be effective, the editorial team will generally defer to them for these decisions.

Manuscripts are to be submitted in Word online at  mc.manuscriptcentral.com/rhe . (If you have not previously registered on this website, click on the “Register here” link to create a new account.) Once you log on, click on the “Author Center” link and then follow the printed instructions to submit your manuscript.

The term “conflict of interest” means any financial or other interest which conflicts with the work of the individual because it (1) could significantly impair the individual’s objectivity or (2) could create an unfair advantage for any person or organization. We recommend that all authors review and adhere to the ASHE Conflict of Interest Policy before submitting any work. Please refer to the policy at ashe.ws/ashe_coi.

Please note that The Review of Higher Education does not require potential contributors to pay an article submission fee in order to be considered for publication. Any other website that purports to be affiliated with the journal and requires you to pay an article submission fee is fraudulent. Do not provide payment information. Instead, please contact the RHE editorial office at [email protected] or William Breichner, the Journals Publisher at Johns Hopkins University Press, at [email protected].

Author Checklist for New Submissions

Page Limit.  Manuscripts should not exceed 32 pages (excluding abstract, keywords, and references; including tables, figures, and appendices).

Masked Review.  All author information (i.e., name, affiliation, email, phone number, address) should appear on a separate cover page of the manuscript. The manuscript should have no indication of authorship. Any indication of authorship will result in your manuscript being unsubmitted.

Formatting.  Manuscripts should be typed in a serif or sans serif typeface as recommended by the APA 7th edition (e.g., 11-point Calibri, 11-point Arial, 10-point Lucida Sans Unicode, 12-point Times New Roman, 11-point Georgia, or 10-point Computer Modern), double-spaced throughout, including block quotes and references, with each page numbered consecutively at the top right. Authors should follow the instructions in the 7th edition Publication Manual of the American Psychological Association; this includes running heads, heading levels, spacing, margins, etc. Any manuscript not following APA 7th edition will be unsubmitted. [Please note, the RHE editorial team recommends 12-point Times New Roman font to ensure proper format conversion within the ScholarOne system.]

Abstract.  All manuscripts must include an abstract of 100 words or fewer, and keywords as the first page of your manuscript submission (this page does not count towards your page limit).

Author Note.  An Author’s note may include Land Acknowledgments, Disclosure Statement (i.e., funding sources), or other acknowledgments. This should appear on your title page (not in the masked manuscript).  

Tables.  All tables should be editable. Tables may be uploaded in the manuscript itself or in a separate Word document. All tables must be interpretable by readers without reference to the manuscript. Do not duplicate information from the manuscript in tables; tables must present information additional to what has already been stated in the manuscript.

Figures.  Figures should be editable in the manuscript or in a separate Word document (i.e., not embedded as fixed images). For fixed images, please upload high-resolution JPEGs separately.

References.  The reference page should follow 7th edition APA guidelines and be double spaced throughout (reference pages do not count toward your page limit). 

Appendices.  Appendices should generally run no more than 3 manuscript pages. 

Additional Checklist for Revised Submissions

Revised manuscripts should follow the checklist above, with the following additional notes: 

Page Limit.  Revised manuscripts should stay within the page limit for new submissions (32 pages). However, we do realize that this is not always possible, and we may allow for a couple of extra pages for your revisions. Extensions to your page length will be subject to editor approval upon resubmission, but may not exceed 35 pages (excluding abstract, keywords, and references).

  • Author Response to Reviewer Comments.  At the beginning of your revised manuscript file, please include a separate masked statement that indicates fully [1] all changes that have been made in response to the reviewer and editor suggestions, and the pages on which those changes may be found in the revised manuscript, and [2] those reviewer and editor suggestions that are not addressed in the revised manuscript, with a rationale for why you think such revisions are not necessary. This can take the form of a table or text paragraphs and must appear at the front of your revised manuscript document. Your response to reviewer and editor comments will not count toward your manuscript page limit. Please note that, because you will be adding your response to the beginning of your submission, this may change the page numbers of your document unless you change the pagination and start the manuscript itself on page 1. The choice is yours, but either way, please ensure that you reference the appropriate page numbers within your manuscript in these responses. Additionally, when you submit your revised manuscript, there will be a submission box labeled “Author Response to Decision Letter”. You are not required to duplicate information already provided in the manuscript, but may instead use this box to send a note to the reviewer team (e.g., an anonymous cover letter or note of appreciation for feedback). Please maintain anonymity throughout the review process by NOT including your name and by masking any potentially identifying information when providing your response to the reviewers’ feedback (both in documents and in the ScholarOne system).

Editorial Correspondence

Please address all correspondence about submitting articles (no subscriptions, please) to one or both of the following editors:

Dr. Penny A. Pasque, PhD Editor, Review of Higher Education 341 C Ramseyer Hall 29 W. Woodruff Avenue The Ohio State University Columbus, OH 43210 email:  [email protected]

Dr. Thomas F. Nelson Laird, PhD Editor, Review of Higher Education 201 North Rose Avenue Indiana University School of Education Bloomington, IN 47405-100 email:  [email protected]

Submission Policy

RHE publishes original works that are not available elsewhere. Manuscripts submitted to the journal must not be published, in press, or under review at other journals while under our review. Additionally, reprints and translations of previously published articles will not be accepted.

Type of Preliminary Review

RHE uses a collaborative review process in which several members of the editorial team ensure that submitted manuscripts are suitable before being sent out for masked peer review. This team includes an Editor, Associate Editors, and Managing Editors. Managing Editors complete an initial review of manuscripts to ensure authors meet RHE’s Author Guidelines and work with submitting authors to address preliminary issues and concerns (e.g., APA formatting). The Editors and Associate Editors then decide together whether a manuscript should be sent out for review and select appropriate reviewers.

Type of Review

When a manuscript is determined suitable for review by the collaborative decision of the editorial team, Editors and/or Associate Editors will assign reviewers. Both authors’ and reviewers’ identities are masked throughout the review and decision process.

Criteria for Review

Criteria for review include, but are not limited to, the significance of the topic to higher education, completeness of the literature review, appropriateness of the research methods or historical analysis, and the quality of the discussion concerning the implications of the findings for theory, research, and practice. In addition, we look for the congruence of thought and approach throughout the manuscript components.

Type of Revisions Process

Some authors will receive a “Major Revision” or “Minor Revision” decision. Authors who receive such decisions are encouraged to attend carefully to reviewers’ comments and recommendations and to resubmit their revised manuscripts for another round of reviews. When submitting a revised manuscript, authors are asked to include a response letter indicating how they have responded to reviewer comments and recommendations. In some instances, authors may be asked to revise and resubmit a manuscript more than once.

Review Process Once Revised

Revised manuscripts are sent, whenever possible, to the reviewers who originally commented on the manuscript. We rely on our editorial board and ad hoc reviewers, who volunteer their time, and we give reviewers a month to provide thorough feedback. Please see the attached PDF for a visual representation of the RHE workflow.

Timetable (approx.)

  • Managing Editor Technical Checks – 1-3 days
  • Editor reviews and assigns manuscript to Associate Editors – 3-5 days
  • Associate Editor reviews and invites reviewers – 3-5 days
  • Reviewer comments due – 30 days provided for reviews
  • Associate Editor makes a recommendation –  5-7 days
  • Editor makes decision – 5-7 days
  • If R&R, authors revise and resubmit manuscript – 90 days provided for revisions
  • Repeat the process above until the manuscript is accepted or rejected
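Summing the per-stage estimates above gives a rough sense of a single review round. The Python sketch below is illustrative arithmetic only, using the figures from the list; actual times vary.

    # Rough one-round turnaround from the stage estimates above,
    # expressed as (low, high) in days; illustrative only.
    stages = {
        "Managing Editor technical checks": (1, 3),
        "Editor assigns to Associate Editor": (3, 5),
        "Associate Editor invites reviewers": (3, 5),
        "Reviewer comments due": (30, 30),
        "Associate Editor recommendation": (5, 7),
        "Editor decision": (5, 7),
    }

    low = sum(lo for lo, hi in stages.values())
    high = sum(hi for lo, hi in stages.values())
    print(f"One review round: roughly {low}-{high} days")  # about 47-57 days

    # A revise-and-resubmit adds up to 90 days of author revision time
    # before the cycle repeats.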

Type of Review for Book Reviews

Book reviews are the responsibility of the associate editor of book reviews. Decisions about acceptance of a book review are made by that associate editor.

The Hopkins Press Journals Ethics and Malpractice Statement can be found at the ethics-and-malpractice  page.

The Review of Higher Education expects all authors to review and adhere to ASHE’s Conflict of Interest Policy, as defined above, before submitting any work. Please refer to the policy at ashe.ws/ashe_coi.

Guidelines for Book Reviews

RHE publishes book reviews of original research, summaries of research, or scholarly thinking in book form. We do not publish reviews of books or media that would be described as expert opinion or advice for practitioners.

The journal publishes reviews of current books, meaning books published no more than 12 months prior to submission to the associate editor in charge of book reviews.

If you want to know whether the RHE would consider a book review before writing it, you may email the associate editor responsible for book reviews with the citation for the book.

Reviewers should have scholarly expertise in the higher education research area they are reviewing.

Graduate students are welcome to co-author book reviews, but with faculty or seasoned research professionals as first authors.

Please email the review to the associate editor in charge of book reviews (Timothy Reese Cain, [email protected]), who will work through necessary revisions with you if your submission is accepted for publication.

In general, follow the APA Publication Manual, 7th edition.

Provide a brief but clear description and summary of the contents so that the reader has a good idea of the scope and organization of the book. This is especially important when reviewing anthologies that include multiple sections with multiple authors.

Provide an evaluation of the book, noting both positive and negative points. What has been done well? What has not? For example, the following are some questions you might address, as appropriate (the list is not exhaustive):

What are the important contributions that this book makes?

What contributions could have been made, but were not made?

What arguments or claims were problematic, weak, etc.?

How is the book related to, how does it supplement, or how does it complicate current work on the topic?

To which audience(s) will this book be most helpful?

How well has the author achieved their stated goals?

Use quotations efficiently to provide a flavor of the writing style and/or statements that are particularly helpful in illustrating the author’s or authors’ points.

If you cite any other published work, please provide a complete reference.

Please include a brief biographical statement immediately after your name, usually title and institution. Follow the same format for co-authored reviews. The first author is the contact author.

Please follow this example for the headnote of the book(s) you are reviewing: Stefan M. Bradley. Upending the Ivory Tower: Civil Rights, Black Power, and the Ivy League. New York: New York University Press, 2018. 465 pp. $35. ISBN 97814798739999.

Our preferred length is 2,000–2,500 words, in order for authors to provide a complete, analytical review. Reviews of shorter books may not need to be of that length.

We recommend that all book reviewers read and adhere to the ASHE Conflict of Interest Policy, as defined above, before submitting any work. Please refer to the policy at ashe.ws/ashe_coi.

NOTE: If the Editor has sent a book to an author for review, but the author is unable to complete the review within a reasonable timeframe, we would appreciate the return of the book as soon as possible; thanks for your understanding.

Please send book review copies to the contact above. Review copies received by the Johns Hopkins University Press office will be discarded.

Editors

Penny A. Pasque, The Ohio State University

Thomas F. Nelson Laird, Indiana University Bloomington

Associate Editors

Angela Boatman, Boston College

Timothy Reese Cain (including Book Reviews), University of Georgia

Milagros Castillo-Montoya, University of Connecticut

Tania D. Mitchell, University of Minnesota

Chrystal A. George Mwangi, George Mason University

Federick Ngo, University of Nevada, Las Vegas

Managing Editors

Stephanie Nguyen, Indiana University Bloomington

Monica Quezada Barrera, The Ohio State University

Editorial Board

Sonja Ardoin, Clemson University

Peter Riley Bahr, University of Michigan

Vicki Baker, Albion College

Allison BrckaLorenz, Indiana University Bloomington

Nolan L. Cabrera, The University of Arizona

Brendan Cantwell, Michigan State University

Rozana Carducci, Elon University

Deborah Faye Carter, Claremont Graduate University

Ashley Clayton, Louisiana State University

Regina Deil-Amen, The University of Arizona

Jennifer A. Delaney, University of Illinois Urbana-Champaign

Erin E. Doran, Iowa State University

Antonio Duran, Arizona State University

Michelle M. Espino, University of Maryland

Claudia García-Louis, University of Texas at San Antonio

Deryl Hatch-Tocaimaza, University of Nebraska-Lincoln

Nicholas Hillman, University of Wisconsin-Madison

Cindy Ann Kilgo, Indiana University Bloomington

Judy Marquez Kiyama, University of Arizona

Román Liera, Montclair State University

Angela Locks, California State University, Long Beach

Demetri L. Morgan, Loyola University Chicago

Rebecca Natow, Hofstra University

Z Nicolazzo, The University of Arizona

Elizabeth Niehaus, University of Nebraska-Lincoln

Robert T. Palmer, Howard University

Rosemary Perez, University of Michigan

OiYan Poon, Spencer Foundation

Kelly Rosinger, The Pennsylvania State University

Vanessa Sansone, The University of Texas at San Antonio

Tricia Seifert, Montana State University

Barrett Taylor, University of North Texas

Annemarie Vaccaro, University of Rhode Island

Xueli Wang, University of Wisconsin-Madison

Stephanie Waterman, University of Toronto

Rachelle Winkle-Wagner, University of Wisconsin-Madison

Association for the Study of Higher Education Board of Directors

The Review of Higher Education is the journal of the Association for the Study of Higher Education (ASHE) and follows the ASHE Bylaws and Statement on Diversity.


Abstracting & Indexing Databases

  • Current Contents
  • Web of Science
  • Dietrich's Index Philosophicus
  • IBZ - Internationale Bibliographie der Geistes- und Sozialwissenschaftlichen Zeitschriftenliteratur
  • Internationale Bibliographie der Rezensionen Geistes- und Sozialwissenschaftlicher Literatur
  • Academic Search Alumni Edition, 9/1/2003-
  • Academic Search Complete, 9/1/2003-
  • Academic Search Elite, 9/1/2003-
  • Academic Search Premier, 9/1/2003-
  • Current Abstracts, 9/1/2003-
  • Education Research Complete, 3/1/1997-
  • Education Research Index, Sep.2003-
  • Education Source, 3/1/1997-
  • Educational Administration Abstracts, 3/1/1991-
  • ERIC (Education Resources Information Center), 1977-
  • MLA International Bibliography (Modern Language Association)
  • Poetry & Short Story Reference Center, 3/1/1997-
  • PsycINFO, 2001-, dropped
  • Russian Academy of Sciences Bibliographies
  • TOC Premier (Table of Contents), 9/1/2003-
  • Scopus, 1996-
  • Gale Academic OneFile
  • Gale OneFile: Educator's Reference Complete, 12/2001-
  • Higher Education Abstracts (Online)
  • ArticleFirst, vol.15, no.3, 1992-vol.35, no.2, 2011
  • Electronic Collections Online, vol.20, no.1, 1996-vol.35, no.2, 2011
  • Periodical Abstracts, v.26, n.4, 2003-v.33, n.3, 2010
  • PsycFIRST, vol.24, no.3, 2001-vol.33, no.1, 2009
  • Personal Alert (E-mail)
  • Education Collection, 7/1/2003-
  • Education Database, 7/1/2003-
  • Health Research Premium Collection, 7/1/2003-
  • Hospital Premium Collection, 7/1/2003-
  • Periodicals Index Online, 1/1/1981-7/1/2000
  • Professional ProQuest Central, 07/01/2003-
  • ProQuest 5000, 07/01/2003-
  • ProQuest 5000 International, 07/01/2003-
  • ProQuest Central, 07/01/2003-
  • Psychology Database, 7/1/2003-
  • Research Library, 07/01/2003-
  • Social Science Premium Collection, 07/01/2003-
  • Educational Research Abstracts Online
  • Research into Higher Education Abstracts (Online)
  • Studies on Women and Gender Abstracts (Online)

Abstracting & Indexing Sources

  • Contents Pages in Education   (Ceased)  (Print)
  • Family Index   (Ceased)  (Print)
  • Psychological Abstracts   (Ceased)  (Print)

Source: Ulrichsweb Global Serials Directory.

  • Journal Impact Factor: 1.8 (2022)
  • Five-Year Impact Factor: 3.2
  • Eigenfactor™ Score: 0.00195
  • Rank in Category (by Journal Impact Factor): 185 of 269 journals in “Education & Educational Research”

© Clarivate Analytics 2023

Published quarterly

Readers include: Scholars, academic leaders, administrators, public policy makers involved in higher education, and all members of the Association for the Study of Higher Education (ASHE)

Print circulation: 761

Print Advertising Rates

Full Page: (4.75 x 7.5") - $450.00

Half Page: (4.75 x 3.5") - $338.00

2 Page Spread - $675.00

Print Advertising Deadlines

September Issue – July 15

December Issue – October 15

March Issue – January 15

June Issue – April 15

Online Advertising Rates (per month)

Promotion (400x200 pixels) – $338.00

Online Advertising Deadline

Online advertising reservations are placed on a month-to-month basis.

All online ads are due on the 20th of the month prior to the reservation.

General Advertising Info

For more information on advertising or to place an ad, please visit the Advertising page. 

eTOC (Electronic Table of Contents) alerts can be delivered to your inbox when this or any Hopkins Press journal is published via your ProjectMUSE MyMUSE account. Visit the eTOC instructions page for detailed instructions on setting up your MyMUSE account and alerts.  


Peer review of teaching: A rapid appraisal

Peer review of teaching (PRT) as a quality enhancement and review process was first adopted by higher education institutions (HEIs) in the 1990s, driven in part by Quality Assurance Agency for Higher Education (QAA) expectations. PRT serves two principal purposes: (i) to assure the institution and others of the provision of quality teaching and assessment, and (ii) to improve teaching and assessment through the sharing of good practice, staff support, and the enhancement of teaching practice.

While there are numerous definitions of PRT, it is widely accepted that PRT is a purposeful process of collaboration between academics that provides constructive feedback on the effectiveness of interventions to promote student learning. PRT is generally applied more widely than classroom-based teaching, encompassing all approaches used to support student learning, while peer observation of teaching (POT) is used more specifically to describe observation of formal teaching in a classroom, laboratory, workplace, or fieldwork setting.

This rapid appraisal of peer review of teaching aims to examine the processes and schemes currently in operation in different disciplines across the expanded and differentiated structure of UK higher education (HE). The rapid appraisal includes a non-systematic review of the literature, a survey of UK HEIs, a review of HEI policies on PRT, and a series of short telephone interviews with key informants to identify innovative approaches to peer review.


The materials published on this page were originally created by the Higher Education Academy.


University of Pittsburgh

University Center for Teaching and Learning

Peer review.

Below is the Assessment of Teaching Initiative's Guide on Conducting Faculty Peer Review.

The Purposes and Benefits of Peer Review

Peer review may consist of any combination of the following assessments performed by a faculty member’s colleague(s):

  • Review of curriculum
  • Review of teaching materials (e.g. syllabi, lesson plans, assignments, course shells)
  • Review of student artifacts (e.g. examples of student work)
  • Review of a teaching portfolio
  • Classroom observations

Research has shown that peer review has several benefits:

  • It provides another perspective on teaching effectiveness beyond student opinion of teaching surveys. Unlike students, peer reviewers are often experts in the faculty member’s discipline and/or pedagogy (Berk et al., 2004).
  • It helps faculty better recognize areas of teaching strength and weakness and determine how to improve (Al Qahtani et al., 2011).
  • Faculty report that participating in peer review is useful and helps them improve their teaching (Bell & Mladenovic, 2008; DiVall et al., 2012) and that the benefits outweigh the effort of participating (DiVall et al., 2012).
  • Peer observations benefit the observer as well as the faculty member being observed (Bell & Mladenovic, 2008; Hendry & Oliver, 2012; Swinglehurst et al., 2008).
  • Observers have reported that observations help them learn about new teaching strategies and increase their confidence in their abilities to perform new teaching strategies themselves in class (Hendry & Oliver, 2012).

Peer review may be performed formatively, to provide a faculty member with feedback to help them improve, or summatively, as part of a formal evaluation like annual, promotion, or tenure review.

The purpose, process, tools, and expectations for peer reviews should always be communicated clearly to faculty early enough to allow them to prepare.

Considerations for Developing a Peer Review Process

Whose teaching will be reviewed, and when?

Academic units should establish review cycles that balance giving faculty regular feedback against feasibility and sustainability. Review cycles may vary based on faculty rank, appointment, experience teaching the course, or experience teaching at Pitt. For example, academic units may decide to conduct formative peer reviews of teaching during the first semester for new faculty or faculty teaching new courses, in order to provide feedback that can be used for improvement, but conduct less frequent summative peer reviews for experienced faculty teaching established courses.

Units may also opt to design peer review cycles to coincide or integrate with existing schedules for curricular review or assessment of student learning. Some review processes like syllabi review, for example, can generate data for multiple types of reviews.

Units that are conducting new peer review processes should plan on piloting peer review, collecting reviewer and reviewee feedback, and making revisions as needed.

Who will conduct the review? What will be reviewed?

The purpose of peer review should help units determine who conducts it. Units should be cognizant that summative peer reviews are higher stakes and may be more anxiety-provoking for faculty. Faculty may be more concerned about how bias affects summative peer review (Berk et al., 2004). Academic units may address these concerns by:

  • Developing peer review tools that focus more on providing qualitative feedback than ratings
  • Training reviewers to recognize potential bias and apply review tools as equitably and consistently as possible
  • Assigning reviewers from a different department within the same school
  • Using review teams rather than a single reviewer
  • Inviting the faculty member who was reviewed to respond to feedback as part of the review process

Regardless of whether the review is formative or summative, all reviewers should receive training on how to conduct reviews, use review tools, and how to provide their colleagues with meaningful, constructive feedback. In addition to increasing reviewer competence, conducting training can also increase faculty trust in reviewers (Kohut et al., 2007).

Units should also determine whether peer review consists of an assessment of teaching materials, student artifacts, and/or a classroom observation. Will reviews of teaching materials and observations be part of the same process or different processes? For example, units may specify that a curriculum committee is responsible for review of syllabi, a promotion and tenure committee will conduct evaluations of teaching portfolios, and peer observations should be performed by a faculty mentor or colleague. Conversely, review of teaching materials and observations can be combined into a single peer review process completed by the same reviewer or team of reviewers.

What is the process for peer review of teaching?

Faculty should be engaged to help create and revise specific peer review processes, policies, and tools to ensure that they are feasible, relevant, and equitable (Bingham & Ottewill, 2001).

Most peer reviews broadly consist of these steps:

Preliminary activities: Prior to the review, reviewers should talk to the faculty member being reviewed to learn more about their teaching and learning goals and to provide some context for the review. If performing an observation, a reviewer may ask about:

  • the course, course description, and learning objectives
  • how the course is delivered (e.g. lecture, recitation, lab or clinical, face-to-face, hybrid, online, flipped)
  • the number of students
  • what students have been doing in the course prior to the observation being performed
  • the lesson plan or agenda for the session to be observed
  • challenges that the reviewee may have experienced teaching the class up to that point or areas the reviewee would like the reviewer to focus on during the observation

These conversations may take place synchronously or asynchronously and face-to-face or remotely. Academic units should determine what type of information reviewers will need beforehand to conduct a successful review and provide them with questions to ask or types of information to seek out.

The peer review: The reviewer(s) apply a peer review tool to teaching materials and/or a classroom observation. The review may consist of more than one step, based on the review process. Depending on the tool being used, there may be a specific protocol for reviewers to follow. Reviewers should take notes that are detailed enough to allow them to cite specific examples to support their feedback. Academic units should create resources and training to help faculty learn how to complete reviews.

Follow-up activities: Reviewer(s) should conduct a debrief or follow-up to discuss and share the results of their review with the faculty member who was reviewed and, potentially, others in the department. Depending on the type of peer review, this might consist of a short, informal conversation or could involve a formal reporting process. It may involve the reviewee self-assessing or reflecting on their teaching and/or documenting the steps they will take to improve teaching based on feedback. This meeting should take place as soon after the peer review as possible. If a peer observation was performed, the follow-up meeting should occur within the same week.

Sample Tools

Although there is always some degree of subjectivity involved in conducting peer review, tools that identify specific, observable criteria help reviewers apply them in a consistent manner. To ensure validity, peer review tools must align with the academic unit's definition of teaching effectiveness. Units should engage faculty in the process of selecting, constructing, and revising tools.

Lists of question prompts, checklists, and rubrics can be used to guide peer review. Academic units may want to adopt or adapt existing tools or create their own. Tools may vary depending on the purpose of the assessment and the course or rank/appointment of the faculty member being assessed. For example, an academic unit may decide to develop a different observation tool for didactic versus clinical courses.
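To make the idea of specific, observable criteria concrete, here is a minimal sketch of a structured observation checklist in Python. The criterion names, the three-point scale, and the sample ratings are all invented for illustration; this is not a validated instrument or any particular institution's tool.

```python
# Illustrative sketch of a structured observation checklist.
# Criteria, scale, and ratings are invented examples, not a validated tool.

OBSERVATION_CRITERIA = [
    "States the session's learning objectives",
    "Checks for student understanding before moving on",
    "Uses at least one active learning strategy",
    "Provides students with opportunities to ask questions",
]

SCALE = {0: "not observed", 1: "partially observed", 2: "consistently observed"}

def summarize(ratings):
    """Pair each criterion with its rating label and an evidence note."""
    return [
        {"criterion": c, "rating": SCALE[score], "evidence": note}
        for c, (score, note) in zip(OBSERVATION_CRITERIA, ratings)
    ]

# A reviewer records a score and a brief evidence note per criterion,
# which supports the guidance above about citing specific examples.
report = summarize([
    (2, "Objectives shown on the opening slide and revisited at the close"),
    (1, "One mid-lecture poll, but responses were not discussed"),
    (2, "Think-pair-share activity at the midpoint of the session"),
    (2, "Paused for questions three times during the hour"),
])
for item in report:
    print(item)
```

Pairing every rating with an evidence note mirrors the advice above that reviewers take notes detailed enough to cite specific examples when delivering feedback.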

Sample Peer Review Tools (validated tools and examples from other institutions)

  • UC Berkeley Peer Review of Course Instruction
  • Peer Observation and Assessment Tool (POET) (MS Word)
  • Reformed Teaching Observation Protocol (RTOP) (PDF)
  • Classroom Observation Protocol for Undergraduate STEM (COPUS) (PDF)
  • Harvard Medical School Peer Observation of Teaching Handbook (PDF)
  • University of Minnesota Peer Review Tools
  • University of Colorado Boulder Teaching Observation Protocol Template (MS Word)
  • Elon University’s Narrative Log Peer Review Tool (DOCX)

Sample Tools for Self and Peer Assessment of Online Course Design and Teaching

  • SUNY’s Online Course Quality Review Rubric (OSCQR)
  • Evaluation of Online Courses/Teaching in the Department of Clinical Sciences at Colorado State University (PDF)
  • Illinois Central College’s Quality Online Course Initiative Guide and Rubric
  • Penn State’s Peer Review of Online Teaching Guides and Rubrics

Resources for Inclusive Peer Review

New research and resources on conducting peer review of inclusive teaching are constantly emerging. Here are some examples of ongoing projects:

  • Teaching in Psych at the University of Pittsburgh
  • The Teaching in Higher Ed podcast interview of Dr. Tracie Addy on Inclusive Teaching Visualization and Observation, the Inclusive Teaching Visualization Project website, which includes video vignettes that can be used for observation training, and the Protocol for Advancing Inclusive Teaching Efforts (PAITE) inclusive teaching observation protocol. PAITE is unique because it is designed to be used by trained student observers, but faculty (alone or in conjunction with student observers) could also use the tools.

Tips for Offering Peer Review Feedback

For peer review to be effective, faculty need to receive meaningful, constructive feedback from a reviewer. The method and manner of delivery affects how feedback is received. There are several steps that reviewers can take (adapted from Newman et al., 2012) to offer effective feedback:

  • Approach giving feedback as a collaborative problem-solving experience that will allow both you and the person you reviewed to learn more about teaching. Avoid positioning yourself as an expert giving advice to a novice unless that is explicitly the purpose and process for the peer review (e.g. a mentor observing a mentee).
  • Ask the reviewee to share their thoughts about the experience. What do they identify as strengths and areas of improvement? Which activities went well? Which would they do differently next time? This prompts reflection and allows you to build your feedback on their self-reflection. It also gives the reviewee the chance to correct anything that you misperceived during your review.
  • Start with positive feedback. Receiving feedback can be an anxiety-producing process for the recipient. Beginning by discussing strengths can lower anxiety and increase the likelihood that the reviewee will be engaged and receptive to additional feedback.
  • When delivering constructive feedback, avoid judgment. Offer examples of observed behaviors. Be specific and improvement-focused. Vague comments often do not give the reviewee enough information to make improvements. For example, if you say, “Students gradually became less engaged,” that might be true, but it does not give the reviewee sufficient information to make changes. If you say, “I noticed that students started to become less engaged after you had been lecturing for about 20 minutes. They were more engaged during the shorter lectures and class discussions. It might be helpful to break up longer lectures with some short active learning strategies,” you are telling the reviewee when and (likely) why students became disengaged and what they could do differently to prevent that from happening in the future.
  • It can also be helpful to deliver constructive feedback as questions to help the faculty member who was reviewed reflect on what they did and consider how they might make changes. For example, instead of, “The tone of this syllabus is harsh and would be off-putting to students” you could say, “When reviewing your syllabus, I noticed that the language used did not align with what you told me about your teaching style. Can you tell me more about why you chose that language and what you are trying to communicate?”
  • Focus suggestions for improvement on things the reviewee can change. For example, if you tell a large lecture faculty member to move away from multiple choice exams as their primary mode of assessment, that might not be feasible given the class size.

Depending on the purpose of the peer review, it may also be appropriate to limit constructive feedback to a few of the most impactful changes that the reviewee could make rather than pointing out every perceived area for improvement.

You might conduct a peer review and determine that the reviewee would benefit from sustained resources and support. In these cases, you can refer them to faculty development resources in your academic unit or to the Center for Teaching and Learning ( [email protected] ).

How will results be shared? With whom?

How and with whom results are communicated may depend on the purpose of the peer review and how the data they generate are used. Academic units may decide that formative peer reviews should remain confidential, with results only being shared with the reviewee. Summative review results may be shared with academic unit leaders, promotion and tenure review committees, or other leaders in the department so that they can be used for formal evaluations or to inform program- or unit-level decision-making.

Academic units will also need to determine what type of artifact is ultimately shared. Will the reviewer generate a narrative statement, a completed checklist or rubric, lists of strengths and areas for improvement, or a report? Will the faculty member compose a response? Will aggregate data be shared with unit leaders? Units may adopt some combination of these approaches. Focusing on how the faculty member who was reviewed will use results may reduce faculty anxiety about peer review and encourage iterative improvement of teaching. For example, a unit may decide that detailed feedback should remain confidential between the reviewer and reviewee, but that the reviewee should submit a statement documenting how they used feedback to improve their teaching as part of a teaching portfolio.

References, Resources, and Readings

Al Qahtani, S., Kattan, T., Al Harbi, K., & Seefeldt, M. (2011). Some thoughts on educational peer evaluation. South-East Asian Journal of Medical Education, 5(1), 47–49.

Bell, A., & Mladenovic, R. (2008). The benefits of peer observation of teaching for tutor development. Higher Education, 55(6). doi:10.1007/s10734-007-9093-1

Berk, R. A., Naumann, P. L., & Appling, S. A. (2004). Beyond student ratings: Peer observation of classroom and clinical teaching. International Journal of Nursing Education Scholarship, 1(1). doi:10.2202/1548-923x.1024

Bingham, R., & Ottewill, R. (2001). Whatever happened to peer review? Revitalising the contribution of tutors to course evaluation. Quality Assurance in Education, 9, 32–39. doi:10.1108/09684880110381319

DiVall, M., Barr, J., Gonyeau, M., Matthews, S. J., Van Amburgh, J., Qualters, D., & Trujillo, J. (2012). Follow-up assessment of a faculty peer observation and evaluation program. American Journal of Pharmaceutical Education, 76(4). doi:10.5688/ajpe76461

Gosling, D. (2002). Models of peer observation of teaching. LTSN Generic Centre, Learning and Teaching Support Network.

Hendry, G. D., & Oliver, G. R. (2012). Seeing is believing: The benefits of peer observation. Journal of University Teaching & Learning Practice, 9(1).

Kohut, G. F., Burnap, C., & Yon, M. G. (2007). Peer observation of teaching: Perceptions of the observer and the observed. College Teaching, 55, 19–25. doi:10.3200/CTCH.55.1.19-25

Kuo, F., Crabtree, J. L., & Scott, P. J. (2016). Peer observation and evaluation tool (POET): A formative peer review supporting scholarly teaching. The Open Journal of Occupational Therapy, 4(3). doi:10.15453/2168-6408.1273

Lund, T. J., Pilarz, M., Velasco, J. B., Chakraverty, D., Rosploch, K., Undersander, M., & Stains, M. (2015). The best of both worlds: Building on the COPUS and RTOP observation protocols to easily and reliably measure various levels of reformed instructional practice. CBE—Life Sciences Education, 14(2). doi:10.1187/cbe.14-10-0168

Marchant, G. J. (1989). StRoBe: A classroom-on-task measure.

Newman, L. R., Roberts, D. H., & Schwartzstein, R. M. (2012). Peer observation of teaching handbook. Shapiro Institute for Education and Research at Harvard Medical School and Beth Israel Deaconess Medical Center.

O’Leary, M. (2020). Classroom observation: A guide to the effective observation of teaching and learning. Taylor and Francis.

Pembridge, J. J., & Rohrbacher, C. M. (2020). Faculty peer review of teaching for the 21st century. In S. M. Linder, C. M. Lee, & S. K. Stefl (Eds.), Handbook of STEM faculty development (pp. 207–220). Information Age Publishing.

Smith, M. K., Jones, F. H. M., Gilbert, S. L., & Wieman, C. E. (2013). The classroom observation protocol for undergraduate STEM (COPUS): A new instrument to characterize university STEM classroom practices. CBE—Life Sciences Education, 12(4), 618–627. doi:10.1187/cbe.13-08-0154

Swinglehurst, D., Russell, J., & Greenhalgh, T. (2008). Peer observation of teaching in the online environment: An action research approach. Journal of Computer Assisted Learning, 24(5), 383–393. doi:10.1111/j.1365-2729.2007.00274.x

Van Note Chism, N. (1999). Peer review of teaching: A sourcebook.



A Review of Peer Code Review in Higher Education

School of Computer Science, The University of Auckland, New Zealand; Department of Informatics, Universitas Atma Jaya Yogyakarta, Indonesia

Peer review is the standard process within academia for maintaining publication quality, but it is also widely employed in other settings, such as education and industry, for improving work quality and for generating actionable feedback to content authors. For example, in the software industry peer review of program source code—or peer code review—is a key technique for detecting bugs and maintaining coding standards. In a programming education context, although peer code review offers potential benefits to both code reviewers and code authors, individuals are typically less experienced, which presents a number of challenges. Some of these challenges are similar to those reported in the educational literature on peer review in other academic disciplines, but reviewing code presents unique difficulties. Better understanding these challenges and the conditions under which code review can be taught and implemented successfully in computer science courses is of value to the computing education community. In this work, we conduct a systematic review of the literature on peer code review in higher education to examine instructor motivations for conducting peer code review activities, how such activities have been implemented in practice, and the primary benefits and difficulties that have been reported. We initially identified 187 potential studies and analyzed 51 empirical studies pertinent to our goals. We report the most commonly cited benefits (e.g., the development of programming-related skills) and barriers (e.g., low student engagement), and we identify a wide variety of tools that have been used to facilitate the peer code review process. While we argue that more empirical work is needed to validate currently reported results related to learning outcomes, there is also a clear need to address the challenges around student motivation, which we believe could be an important avenue for future research.
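As a concrete illustration of the practice the abstract describes, the sketch below shows what line-level peer code review feedback might look like in an introductory programming course. The submission, the bug, and the reviewer remarks are all invented for illustration; they are not drawn from the studies in the review.

```python
# Hypothetical student submission under peer code review (invented example).
# Reviewer feedback appears inline as REVIEW comments, mirroring how code
# review tools attach remarks to specific lines.

def mean_score(scores):
    """Return the arithmetic mean of a list of scores."""
    total = 0
    for s in scores:
        total += s
    # REVIEW (bug): if `scores` is empty this raises ZeroDivisionError.
    # Consider validating the input or documenting the precondition.
    return total / len(scores)

# REVIEW (style): the loop re-implements built-in behaviour; sum(scores)
# would be shorter and clearer -- a coding-standards point, not a bug.

if __name__ == "__main__":
    print(mean_score([80, 90, 100]))  # 90.0
```

The two remarks correspond to the two goals the abstract attributes to peer code review in industry: detecting bugs and maintaining coding standards.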

Index Terms: Social and professional topics; Professional topics; Computing education


Published 9 September 2020 in ACM Transactions on Computing Education (Association for Computing Machinery, New York, NY). https://dl.acm.org/doi/10.1145/3403935

Author tags: peer review; code review; higher education; peer code review; programming course; systematic literature review; systematic review



The Relationship between Self- and Peer Assessment in Higher Education: A Systematic Review

1. Introduction

2. Materials and Methods

2.1. Inclusion Criteria

  • The documents should have been published in peer-reviewed scientific journals;
  • They had to be published in the last decade (from 2011 to 2022);
  • The terms “higher education/university”, “assessment strategies”, “self-assessment”, and “peer assessment” should be included in the keywords or in the main topics of the documents;
  • The documents should be written in English.

2.2. Searching Process

2.3. Analysis of the Documents

  • An analysis of the frequencies of the mentioned features, in order to examine the main trends in current research on the topic;
  • An examination of the main research topics that appear in the literature, in order to answer the three research questions.

3. Results

3.1. Years of Publication

3.2. Nationality of the Research

3.3. Type of Document

3.4. Main Themes in the Documents

  • Papers mainly focused on self-assessment and its core features (n = 3);
  • Papers mainly focused on peer assessment and its core features (n = 7);
  • Papers mainly focused on the correlations between peer assessment and self-assessment (n = 20).

3.4.1. Self-Assessment and Its Core Features

3.4.2. Peer Assessment and Its Core Features

3.4.3. Correlations between Peer Assessment and Self-Assessment

4. Discussion

5. Conclusions

  • Concina, E. Educational Outcomes Assessment and Validity Testing. In Quality Education, Encyclopedia of the UN Sustainable Development Goals; Leal Filho, W., Azul, A., Brandli, L., Özuyar, P., Wall, T., Eds.; Springer Nature: London, UK, 2019; pp. 284–294.
  • Black, P.; Wiliam, D. Assessment for Learning in the Classroom. In Assessment and Learning, 2nd ed.; Gardner, J., Ed.; Sage: London, UK, 2012; pp. 11–32.
  • Jonsson, A. Facilitating productive use of feedback in higher education. Act. Learn. High. Educ. 2013, 14, 63–76.
  • Bailey, R.; Garner, M. Is the feedback in higher education assessment worth the paper it is written on? Teachers’ reflections on their practices. Teach. High. Educ. 2010, 15, 187–198.
  • McConlogue, T. Assessment and Feedback in Higher Education: A Guide for Teachers; UCL Press: London, UK, 2020.
  • Winstone, N.E.; Boud, D. The need to disentangle assessment and feedback in higher education. Stud. High. Educ. 2020, 47, 656–667.
  • McDonald, B. Peer Assessment that Works: A Guide for Teachers; Rowman & Littlefield: Lanham, MD, USA, 2015.
  • Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Syst. Rev. 2021, 10, 89.
  • Adachi, C.; Tai, J.H.-M.; Dawson, P. Academics’ perceptions of the benefits and challenges of self and peer assessment in higher education. Assess. Eval. High. Educ. 2018, 43, 294–306.
  • Bourke, R. Self-assessment to incite learning in higher education: Developing ontological awareness. Assess. Eval. High. Educ. 2018, 43, 827–839.
  • Bozkurt, F. Teacher Candidates’ Views on Self and Peer Assessment as a Tool for Student Development. Aust. J. Teach. Educ. 2020, 45, 47–60.
  • Cheong, C.M.; Luo, N.; Zhu, X.; Lu, Q.; Wei, W. Self-assessment complements peer assessment for undergraduate students in an academic writing task. Assess. Eval. High. Educ. 2022, 1–14.
  • Cirit, D.K. An Analysis of Self-, Peer-, and Teacher-Assessment within the Scope of Classroom Teaching Activities. Shanlax Int. J. Educ. 2021, 9, 150–163.
  • Gonzalez de Sande, J.C.; Godino Llorente, J.I. Peer Assessment and Self-assessment: Effective Learning Tools in Higher Education. Int. J. Eng. Educ. 2014, 30, 711–721.
  • González-Betancor, S.M.; Bolívar-Cruz, A.; Verano-Tacoronte, D. Self-assessment accuracy in higher education: The influence of gender and performance of university students. Act. Learn. High. Educ. 2019, 20, 101–114.
  • Gunning, T.K.; Conlan, X.A.; Collins, P.K.; Bellgrove, A.; Antlej, K.; Cardilini, A.P.A.; Fraser, C.L. Who engaged in the team-based assessment? Leveraging EdTech for a self and intra-team peer-assessment solution to free-riding. Int. J. Educ. Technol. High. Educ. 2022, 19, 38.
  • Kearney, S. Transforming the first-year experience through self and peer assessment. J. Univ. Teach. Learn. Pract. 2019, 16, 20–35. Available online: https://ro.uow.edu.au/jutlp/vol16/iss5/3 (accessed on 31 October 2022).
  • Kiliç, D. An Examination of Using Self-, Peer-, and Teacher-Assessment in Higher Education: A Case Study in Teacher Education. High. Educ. Stud. 2016, 6, 136.
  • Ma, N.; Du, L.; Lu, Y.; Sun, Y.-F. The influence of social network prestige on in-service teachers’ learning outcomes in online peer assessment. Comput. Educ. Open 2022, 3, 100087.
  • Makovskaya, L. Towards Sustainable Assessment in Higher Education: Teachers’ and Students’ Perspectives. Discourse Commun. Sustain. Educ. 2022, 13, 88–103.
  • Misiejuk, K.; Wasson, B. Backward evaluation in peer assessment: A scoping review. Comput. Educ. 2021, 175, 104319.
  • Nawas, A. Grading anxiety with self and peer-assessment: A mixed-method study in an Indonesian EFL context. Issues Educ. Res. 2020, 30, 224–244.
  • Ndoye, A. Peer/Self-Assessment and Student Learning. Int. J. Teach. Learn. High. Educ. 2017, 29, 255–269.
  • Nulty, D.D. Peer and self-assessment in the first year of university. Assess. Eval. High. Educ. 2011, 36, 493–507.
  • Panadero, E.; Alqassab, M. An empirical review of anonymity effects in peer assessment, peer feedback, peer review, peer evaluation and peer grading. Assess. Eval. High. Educ. 2019, 44, 1253–1278.
  • Planas-Lladó, A.; Feliu, L.; Arbat, G.; Pujol, J.; Suñol, J.J.; Castro, F.; Martí, C. An analysis of teamwork based on self and peer evaluation in higher education. Assess. Eval. High. Educ. 2021, 46, 191–207.
  • Ratminingsih, N.M.; Artini, L.P.; Padmadewi, N.N. Incorporating Self and Peer Assessment in Reflective Teaching Practices. Int. J. Instr. 2017, 10, 165–184.
  • Rico-Juan, J.R.; Cachero, C.; Macià, H. Influence of individual versus collaborative peer assessment on score accuracy and learning outcomes in higher education: An empirical study. Assess. Eval. High. Educ. 2022, 47, 570–587.
  • Seifert, T.; Feliks, O. Online self-assessment and peer-assessment as a tool to enhance student-teachers’ assessment skills. Assess. Eval. High. Educ. 2019, 44, 169–185.
  • Serrano-Aguilera, J.; Tocino, A.; Fortes, S.; Martín, C.; Mercadé-Melé, P.; Moreno-Sáez, R.; Muñoz, A.; Palomo-Hierro, S.; Torres, A. Using Peer Review for Student Performance Enhancement: Experiences in a Multidisciplinary Higher Education Setting. Educ. Sci. 2021, 11, 71.
  • Simonsmeier, B.A.; Peiffer, H.; Flaig, M.; Schneider, M. Peer Feedback Improves Students’ Academic Self-Concept in Higher Education. Res. High. Educ. 2020, 61, 706–724.
  • Stančić, M. Peer assessment as a learning and self-assessment tool: A look inside the black box. Assess. Eval. High. Educ. 2021, 46, 852–864.
  • To, J.; Panadero, E. Peer assessment effects on the self-assessment process of first-year undergraduates. Assess. Eval. High. Educ. 2019, 44, 920–932.
  • Tait-McCutcheon, S.; Knewstubb, B. Evaluating the alignment of self, peer and lecture assessment in an Aotearoa New Zealand pre-service teacher education course. Assess. Eval. High. Educ. 2018, 43, 772–785.
  • Tsunemoto, A.; Trofimovich, P.; Blanchet, J.; Bertrand, J.; Kennedy, S. Effects of benchmarking and peer-assessment on French learners’ self-assessments of accentedness, comprehensibility, and fluency. Foreign Lang. Ann. 2022, 55, 135–154.
  • Wanner, T.; Palmer, E. Formative self- and peer assessment for improved student learning: The crucial factors of design, teacher participation and feedback. Assess. Eval. High. Educ. 2018, 43, 1032–1047.
  • Yang, A.C.; Chen, I.Y.; Flanagan, B.; Ogata, H. How students’ self-assessment behavior affects their online learning performance. Comput. Educ. Artif. Intell. 2022, 3, 100058.
  • Zhan, Y.; Wan, Z.H.; Sun, D. Online formative peer feedback in Chinese contexts at the tertiary level: A critical review on its design, impacts and influencing factors. Comput. Educ. 2022, 176, 104341.
  • Sullivan, K.; Hall, C. Introducing Students to Self-assessment. Assess. Eval. High. Educ. 1997, 22, 289–305.
  • Boud, D. The Role of Self-Assessment in Student Grading. Assess. Eval. High. Educ. 1989, 14, 20–30.
  • Boud, D.; Falchikov, N. Quantitative studies of student self-assessment in higher education: A critical analysis of findings. High. Educ. 1989, 18, 529–549.
Summary of the reviewed studies, listing authors, year, nationality, type of document, main topic, and main findings:

  • Adachi, C., Hong-Meng Tai, J., and Dawson, P. (2018, Australia; research paper, qualitative). Main topic: benefits and challenges of self- and peer assessment perceived by university educators. Main findings: referred benefits include the enhancement of the learning process; the promotion of transferable skills, cooperation, and self-regulated learning; and the possibility to have more feedback. Perceived challenges focus on the need for more time and effort, the role of teachers’ and students’ motivation, the risk of underestimating the importance of assessment and learning, and the difficulties connected with online learning environments.
  • Bourke, R. (2018, New Zealand; research paper, qualitative). Main topic: defining and testing some self-assessment tasks from the perspective of students. Main findings: for students, it is not easy to engage in self-assessment tasks, but these are very useful skills for helping them develop self-regulation in their learning. These tasks also focus students’ attention more on the quality of the learning process than on the grades that they are awarded for their performance.
  • Bozkurt, F. (2020, Turkey; research paper, qualitative). Main topic: examining the beliefs that teacher candidates have about peer and self-assessment. Main findings: self- and peer assessments are viewed as processes that can support students’ autonomous learning, social exchange, and critical thinking. They are not only useful for better understanding teacher assessment but also for improving the individual process of learning.
  • Cheong, C.M., Luo, N., Zhu, X., Lu, Q., and Wei, W. (2022, China; research paper, mixed methods). Main topic: the relationship between self- and peer assessment in higher education, and the way in which self-assessment integrates peer assessment. Main findings: self-assessment completes feedback from peers: giving specific suggestions when peers’ indications are lacking or too generic; offering a different perspective (individual vs. others); correcting the effect of the “social desirability” bias in peer assessment; and offering more support for high-achieving students.
  • Cirit, D.K. (2021, Turkey; research paper, mixed methods). Main topic: possible correlations among self-, peer, and teacher assessments for university students. Main findings: there were no correlations found for self- and teacher assessment, nor for peer and teacher assessment. In most cases, there was a correlation between peer and self-assessment.
  • González de Sande, J.C., and Godino-Llorente, J.I. (2014, Spain; research paper, quantitative). Main topic: effectiveness of self- and peer assessment in enhancing the learning process in higher education. Main findings: comparing the different assessment methods, peer assessment seems more effective in enhancing students’ performance than self-assessment. In addition, both seem to be more effective and useful than instructor assessment only. Conversely, students believe that self-assessment can be more useful for them than peer assessment.
  • González-Betancor, S.M., Bolívar-Cruz, A., and Verano-Tacoronte, D. (2019, Spain; research paper, quantitative). Main topic: relationship between the accuracy of self-assessment feedback, gender, and the performance results of university students. Main findings: in self-assessment, students tend to give themselves higher scores than those given by the instructors for their performance. Male students tend to evaluate themselves more highly than their female classmates; in addition, male students tend to overrate their own performance relative to the scores awarded by both peers and teachers.
  • Gunning, T.K., et al. (2022, Australia; research paper, mixed methods). Main topic: evaluation of online strategies for promoting effective self- and intra-team peer assessment in higher education. Main findings: the online strategy developed for enhancing self- and peer assessment in collaborative tasks in STEM disciplines has been effective in supporting students in these processes.
  • Kearney, S. (2019, Australia; research paper, mixed methods). Main topic: evaluation of a pedagogical model for self- and peer assessment for helping students in the transition from secondary school to university. Main findings: the pedagogical model proposed helps students to better understand the teaching–learning process, offering them the opportunity to use assessment feedback for revising and improving their learning.
  • Kılıç, D. (2016, Turkey; research paper, quantitative). Main topic: self- and peer assessment in teacher education, and the correlation between self-, peer, and teacher assessment scores. Main findings: peer-assessment scores are higher than self- and teacher-assessment scores (which were almost equivalent). The role of the integration of these three processes (self-, peer, and teacher assessment) is to promote formative assessment for preservice teachers.
  • Ma, N., Du, L., Lu, Y., and Sun, Y.-F. (2022, China; research paper, mixed methods). Main topic: differences in peer assessment, considering the learners’ characteristics. Main findings: high achievers (HAs) and low achievers (LAs) show different levels of reflection in peer assessment: while LAs reflect first on peers’ feedback and then engage in self-reflection, HAs reflect on others’ performance and on the feedback they received as a result of peer-assessment practices.
  • Makovskaya, L. (2022, Uzbekistan; research paper, qualitative). Main topic: teachers’ and students’ opinions about sustainable assessment practices (including peer and self-assessment tasks). Main findings: there are contrasting opinions regarding self- and peer assessment. Some participants find them difficult to perform and not as useful in their current forms, although others recognize their crucial role in enhancing students’ learning. Most difficulties are probably associated with the lack of competencies for accomplishing peer and self-evaluation tasks.
  • Misiejuk, K., and Wasson, B. (2021, Norway; scoping literature review). Main topic: backward evaluation in peer assessment. Main findings: backward evaluation may have a relevant role in peer assessment since it allows students to evaluate the usefulness and the quality of peer feedback, developing their own assessment skills.
  • Nawas, A. (2020, Australia; research paper, mixed methods). Main topic: students’ anxiety in peer and self-assessment tasks. Main findings: students tend to be more anxious about self-assessment than about peer assessment. Assessment tools may be useful for reducing stress among students.
  • Ndoye, A. (2017, US; research paper, qualitative). Main topic: how students believe that self- and peer assessment may affect their learning process. Main findings: students recognize the importance of constructive feedback from themselves and others. In addition, several social benefits may be derived from peer assessment.
  • Nulty, D.D. (2011, Australia; literature review). Main topic: use and effectiveness of peer and self-assessment in the first year of university. Main findings: the literature shows that the use of self- and peer assessment in the first year of university has several benefits; however, these processes should be specifically introduced and presented to the students.
  • Panadero, E., and Alqassab, M. (2019, China/Spain; literature review). Main topic: the role of anonymity in peer assessment. Main findings: anonymity in peer assessment may lead to more critical feedback and help to develop different kinds of feedback for classmates. This can increase the perceived value of peer assessment in specific contexts and with specific learning goals.
  • Planas-Lladó, A., et al. (2021, Spain; research paper, quantitative). Main topic: the relationship between self- and peer assessment in teamwork and team performance. Main findings: the quality of the learning results is positively related to the performance of the team, also in terms of self- and team evaluation. The performance grade seems to be higher in those groups where the members assigned similar ratings to each other’s work.
  • Ratminingsih, N.M., Artini, L.P., and Padmadewi, N.N. (2017, Indonesia; research paper, mixed methods). Main topic: role of self- and peer assessment in student teachers’ reflective practices. Main findings: self- and peer assessment practices are crucial for encouraging reflective thinking in student teachers, not only with reference to their academic learning but also for their future profession.
  • Rico-Juan, J.R., Cachero, C., and Macià, H. (2022, Spain; research paper, quantitative). Main topic: effects on the accuracy of peer assessment, considering individual vs. collaborative feedback. Main findings: working collaboratively with classmates in peer assessment tasks is positively correlated with more accurate self-evaluations. This may also have an impact on the learning process, but only for those students who have already shown good performance.
  • Seifert, T., and Feliks, O. (2019, Israel; research paper, mixed methods). Main topic: effects of online self- and peer assessment on the development of students’ assessment skills. Main findings: online self- and peer assessment can be useful for helping students to understand the assessment process and develop assessment skills. Students are not fully aware of the potential of these assessment processes.
  • Serrano-Aguilera, J., et al. (2021, Spain; research paper, quantitative). Main topic: effectiveness of a peer assessment strategy based on collaborative learning in different higher-education contexts. Main findings: involvement in cooperative peer-assessment strategies has positive effects on students’ performance. Peer evaluations obtained in this way are very similar to those given by teachers.
  • Simonsmeier, B.A., Peiffer, H., Flaig, M., and Schneider, M. (2020, Germany; research paper, quantitative). Main topic: effects of peer assessment on academic self-concept. Main findings: participation in training on academic writing, with peer feedback, enhanced the academic self-concept related to the field discipline.
  • Stančić, M. (2021, Serbia; research paper, qualitative). Main topic: students’ perceptions of the benefits and challenges of peer and self-assessment in higher education contexts. Main findings: peer assessment has a more positive impact on the learning process than self-assessment. Peer assessment may be uncomfortable and challenging for some students, but when they develop a sense of responsibility and motivation, students may experience many benefits from peer assessment, including improving their own assessment skills.
  • Tait-McCutcheon, S., and Knewstubb, B. (2018, New Zealand; research paper, mixed methods). Main topic: features that allow self-, peer, and teacher assessments to be consistent with each other. Main findings: the alignment of self-assessment with peer and teacher evaluations may depend on individual aspects related to self-efficacy, cultural background, personal expectations, and others. It is important to consider this when training students to reflect in the self-assessment task.
  • To, J., and Panadero, E. (2019, China/Spain; research paper, qualitative). Main topic: effects of peer assessment on self-assessment in first-year university students. Main findings: participation in peer assessment activity may improve students’ understanding of the main features of the assessment process and their ability to assess academic performance. The peer assessment process may be impaired by competition among students, misunderstandings related to the feedback received, and a lack of trust in classmates’ opinions.
  • Tsunemoto, A., et al. (2021, Canada; research paper, quantitative). Main topic: effects of peer assessment and benchmarking on the self-assessment process in second-language learning. Main findings: participating in peer assessment activities in second-language learning may be useful for improving individual self-assessment abilities, bringing students’ self-ratings of their performance more in line with ratings received from external judges.
  • Wanner, T., and Palmer, E. (2018, Australia; research paper, mixed methods). Main topic: the relationship between self- and peer assessment and academic learning and performance for university students. Main findings: self- and peer assessment in higher education are crucial for promoting a formative vision of assessment and a student-centered perspective. Participation in activities focused on self- and peer assessment may not only help students understand how the assessment process works but also support them in enhancing their learning process and performance.
  • Yang, A.C.M., et al. (2022, Taiwan/Japan; research paper, mixed methods). Main topic: effects of online self-assessment on the learning process and performance. Main findings: students who regularly took the online assessment activities showed improvements in their performance, resulting in higher ratings.
  • Zhan, Y., Wan, Z.H., and Sun, D. (2022, China; literature review). Main topic: effects of online peer assessment on the learning process. Main findings: online peer feedback is fundamental for improving the students’ learning process. Educators should promote specific training for supporting peer assessment activities and creating a collaborative and formative climate.

Concina, E. The Relationship between Self- and Peer Assessment in Higher Education: A Systematic Review. Trends High. Educ. 2022, 1, 41–55. https://doi.org/10.3390/higheredu1010004



A systematic review of peer support interventions for student mental health and well-being in higher education

BJPsych Open, Volume 10, Issue 1

Published online by Cambridge University Press:  15 December 2023


Higher education institutions (HEIs) are seeking effective ways to address the rising demand for student mental health services. Peer support is widely considered a viable option to increase service capacity; however, there are no agreed definitions of peer support, making it difficult to establish its impact on student mental health and well-being.

This systematic review aims to better understand and evaluate peer support in HEIs.

Five databases, OpenGrey, and Grey Matters were searched in May 2021. Included studies were quantitative, and either longitudinal (with or without a control) or cross-sectional with a control. The vote-counting method was used for synthesis. The risk of bias was assessed with the National Institutes of Health Quality Assessment Tool.
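Vote counting, mentioned above, synthesises evidence by tallying studies according to the direction and significance of their reported effects rather than by pooling effect sizes. The sketch below illustrates the basic idea; the study records and outcome labels are invented for illustration and do not reproduce the review's actual coding.

```python
# A minimal illustration of vote-counting synthesis (invented data).
# Each study is classified by the direction and significance of its
# reported effect on an outcome; tallies, not pooled effect sizes,
# are then compared across outcomes.
from collections import Counter

# (study, outcome, reported result) -- hypothetical records for illustration.
STUDIES = [
    ("Study A", "anxiety", "positive-significant"),
    ("Study B", "anxiety", "null"),
    ("Study C", "stress", "positive-significant"),
    ("Study D", "stress", "positive-significant"),
    ("Study E", "anxiety", "negative-significant"),
]

def vote_count(records, outcome):
    """Tally reported results for a single outcome."""
    return Counter(result for _, o, result in records if o == outcome)

for outcome in ("anxiety", "stress"):
    print(outcome, dict(vote_count(STUDIES, outcome)))
# anxiety {'positive-significant': 1, 'null': 1, 'negative-significant': 1}
# stress {'positive-significant': 2}
```

Note that vote counting ignores sample size and effect magnitude, which limits the strength of the conclusions it can support.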

Three types of peer support were represented in 28 papers: peer-led support groups, peer mentoring and peer learning. Peer learning and peer mentoring had more positive, significant results reported for the outcomes of anxiety and stress. Peer-led support groups were the only type targeting students with mental health difficulties.

The heterogeneity of measures and outcomes prevents firm conclusions on the effectiveness of peer support for mental health and well-being. Most studies were rated ‘poor’ or ‘fair’ in their risk of bias. There is not a solid evidence base for the effectiveness of peer support. Nonetheless, HEIs can use the terminology developed in this review for shared discussions that guide more robust research and evaluation of peer support as an intervention.

There are growing concerns for students’ mental health in higher education, Reference Brown 1 with significant numbers of students reporting distress. Reference Neves and Hillman 2 Higher education institutions (HEIs) refer to any tertiary education leading to an academic degree award. 3 In the World Health Organization's international college student survey, a third of first-year students screened positive for at least one common anxiety, mood or substance use disorder as defined by the DSM-IV. Reference Auerbach, Mortier, Bruffaerts, Alonso, Benjet and Cuijpers 4 Correspondingly, British HEIs reported a 94% increase in demand for counselling services from 2012 to 2017. Reference Thorley 5 Despite service demand rising, only 4.9% of students disclosed a mental health condition to their HEI as a disability in the 2019–2020 enrolment, 6 indicating that barriers to student help-seeking still exist. HEIs are seeking effective ways to support students, considering the increased demand and low disclosure rates. Globally, a settings-based, whole-systems approach to improving health has been widely advocated for. Reference Whitelaw, Baxendale, Bryce, MacHardy, Young and Witney 7 – 10 In UK HEIs, this has gained momentum with the ‘University Mental Health Charter’, which outlines how institutions can take a ‘whole-university’ approach to mental health and encourages peer support to be represented in their strategies. Reference Hughes and Spanner 11

Peer support is ‘support provided by and for people with similar conditions, problems or experiences’. Reference Gulliver and Byrom 12 It can be delivered in various ways, including one-to-one mentoring and self-help groups. Reference Solomon 13 Convening people with similar experiences creates a supportive space underpinned by respect, collective responsibility and an agreement on what is helpful. Reference Mead, Hilton and Curtis 14 Two approaches exist: informal and formal. Informal peer support happens naturally within communities when people help others in similar circumstances based on their lived experience. Reference Gulliver and Byrom 12 Without structure, this form of peer support is challenging to evaluate. In contrast, formal peer support brings people with similar experiences together intentionally to share knowledge for mutual benefit, building social connection and reducing loneliness. Reference Solomon 13 , Reference Schubert, Borkman and Powell 15 Formal peer support will be the focus of this review, with the term generally describing higher education students helping each other based on their common lived experience of being a student.

Students find peer support easy to use, and recent research suggests it can increase support service accessibility. Reference Suresh, Karkossa, Richard and Karia 16 Students disclose more to peers than to their HEIs: 75% of students who experienced mental health difficulties reported telling a peer. 17 Since students prefer seeking help from friends more than professional services, Reference Rickwood, Deane, Wilson and Ciarrochi 18 , Reference Ebert, Mortier, Kaehlke, Bruffaerts, Baumeister and Auerbach 19 HEIs want to harness this natural preference through peer support, as recommended in the University Mental Health Charter. Reference Hughes and Spanner 11 A quantitative meta-analysis of 23 peer-run programmes for depression in community health settings found that the interventions produced significant reductions in depressive symptoms, performing as well as professional-led interventions and significantly better than no treatment. Reference Bryan and Arkowitz 20 Although peer support is used by many and seems promising, its effectiveness in higher education settings is unknown. Reference John, Page, Martin and Whittaker 21

There is currently no comprehensive quantitative review of the published and grey literature on peer support interventions evaluated in higher education settings. Peer support in clinical settings is well defined, with competency standards and fidelity assessments providing an emerging standard of practice. Reference Fortuna, Solomon and Rivera 22 In contrast, different forms of peer support exist in HEIs, and guidance is still needed to delineate between models. Reference Monk and Purnell 23 Limited search terms in a previous systematic review, Reference John, Page, Martin and Whittaker 21 which included only three studies, missed relevant research on other forms of peer support. Although studies outline individual benefits for specific types of peer support in higher education settings, Reference Byrom 24 – Reference Bosmans, Young and McLoughlin 27 no current reviews collate all forms of peer support in HEIs that target mental health and well-being in the literature.

Defining a ‘peer’ is also critical to understanding how the kinds of peer support in higher education differ. In broader contexts, definitions of a peer most commonly refer to those who have lived experience of mental health difficulties or have used mental health services in clinical settings. Reference King and Simmons 28 In HEIs, however, other identities, such as ethnicity, sexual orientation or course of study, may provide an additional point of connection. For example, research recommends creating more peer support spaces for Black students. Reference Stoll, Yalipende, Byrom, Hatch and Lempp 29 , Reference Stoll, Yalipende and Haas 30 A synthesis of the definitions of peer support, and of what it means to be a peer, is needed to inform and evaluate current practice, direct future research and clarify the role of peer support in a whole-university approach to student mental health and well-being.

The aim of this review was to screen relevant literature on peer support interventions evaluated in higher education settings worldwide, to identify current practice and assess its effect on measures of student mental health and well-being, by undertaking the following objectives: (a) to synthesise and categorise types of peer support and define peers according to study characteristics; and (b) to evaluate the effectiveness of peer support in higher education for improving student mental health and well-being according to the developed intervention categories.

For the purpose of this review, mental health and well-being are defined according to the University Mental Health Charter. Mental health refers to ‘a full spectrum of experiences ranging from good mental health to mental illness’ and well-being encompasses ‘a wider framework, of which mental health is an integral part, but which also includes physical and social wellbeing’. Reference Hughes and Spanner 11

Method

The systematic review protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO; identifier: CRD42021256552). No amendments were made. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Reference Brennan and Munn 31 , Reference Page, McKenzie, Bossuyt, Boutron, Hoffmann and Mulrow 32 and Synthesis Without Meta-Analysis (SWiM) guidance. Reference Campbell, McKenzie, Sowden, Katikireddi, Brennan and Ellis 33

Eligibility criteria

Studies with a quantitative longitudinal design, with or without a control or comparator, were included, as were cross-sectional studies with a control condition. Cross-sectional studies lacking a control were excluded, as were qualitative-only studies. Any students (aged ≥18 years) in HEIs were included. Interventions delivering peer support in higher education were included; interventions that provided a one-off psychoeducation initiative were excluded.

Studies with and without a comparator, or control, were included. Comparator conditions included those not participating in peer support, a waitlist, informal groups, website access only, year group or faculty mentoring. Where a study used a comparator, the population had to be from a similar higher education setting as the primary intervention.

The outcome of this review was a change in the quantitative measure of well-being or mental health for HEI students, such as stress, anxiety, depression, well-being, loneliness and belonging. Studies were excluded if no quantitative measures were reported. Outcomes for anyone other than students receiving the peer support intervention were excluded.

Information sources

In May 2021, a worldwide systematic search of studies written in English was conducted in the following databases: Ovid (PsycINFO, Medline, EMBASE), Web of Science (Core Collection) and the Education Resources Information Center (ERIC). The search was limited to the past 30 years, in alignment with a previous review that included a study from 1991. Reference Bryan and Arkowitz 20 Grey literature was searched for through OpenGrey 34 and Grey Matters. 35

Search strategy

Search terms were developed in PsycINFO and adapted for the other databases. Key words included population terms (e.g. ‘university’ or ‘student’), intervention terms (e.g. ‘peer support’, ‘peer mentoring’, ‘peer-assisted learning’, ‘peer to peer’, ‘peer tutoring’ or ‘peer health education’) and outcome terms (e.g. ‘mental health’ or ‘well-being’). A complete search strategy (see Supplementary Table 1 available at https://doi.org/10.1192/bjo.2023.603 ) was developed with reference to existing systematic reviews with similar keywords, to identify relevant MeSH and free-text terms. Reference John, Page, Martin and Whittaker 21 , Reference Upsher, Nobili, Hughes and Byrom 36 , Reference Lyons, Cooper and Lloyd-Evans 37 Free-text terms identified in relevant studies from a scoping review were also included (e.g. Reference Monk and Purnell 23 – Reference Jacobi 26 ). Grey literature was identified through OpenGrey 34 and Grey Matters, 35 a scoping review and backward citation tracking of included full-text studies. Authors were contacted via email during the search process for clarification or full-text articles.
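To illustrate how these term groups combine, the short sketch below assembles a generic Boolean query in Python. It is a schematic based only on the example keywords quoted above, not the registered strategy in Supplementary Table 1.

```python
# Schematic only: the registered strategy appears in Supplementary Table 1.
# Term lists are limited to the example keywords quoted in the text.
population = ["university", "student"]
intervention = ["peer support", "peer mentoring", "peer-assisted learning",
                "peer to peer", "peer tutoring", "peer health education"]
outcome = ["mental health", "well-being"]

def or_block(terms):
    # Join one concept group into a parenthesised OR clause
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# Concept groups are combined with AND, as in a typical database search
query = " AND ".join(or_block(group) for group in (population, intervention, outcome))
print(query)
```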

Selection process

In stage 1, titles and abstracts of papers identified by the electronic searches were exported from all databases to the Windows desktop version of Clarivate EndNote 20 (London, UK; see https://endnote.com/downloads ) to remove duplicates. Reference Hupe 38 The citations were then exported, via a Windows browser, to the web-based software-as-a-service application ‘Rayyan-intelligent systematic review’ (Qatar Computing Research Institute, Boston, USA; see www.rayyan.ai ), where independent screening by two researchers was conducted. Reference Ouzzani, Hammady, Fedorowicz and Elmagarmid 39 The lead reviewer (J.P.-H.) screened all titles and abstracts, and the second researcher (L.W.) screened 50%. If there was any uncertainty at this stage, papers were included for full-text review. In stage 2, full texts of all papers included in stage 1 were independently screened for inclusion by both researchers (J.P.-H. and L.W.). Any discrepancies were resolved by a third researcher (J.F.).

Data collection process

Data extraction was managed in Windows Microsoft Excel (version 2309), with tables (e.g. study characteristics) and figures (e.g. risk-of-bias data) created. The team developed and approved a data extraction form before it was piloted on five studies independently by two researchers (J.P.-H. and L.W.). Data extraction for these studies was compared and refined before the form was applied to all included studies.

The following data items were extracted upon availability and reported:

(a) Publication characteristics: year of publication, country and HEI of recruitment;

(b) Methodology and study design: longitudinal or cross-sectional with a control;

(c) Population characteristics: sample size, attrition, the mental health status of the population, level of study, students’ year of study, gender, mean age and ethnicity;

(d) Intervention characteristics: type and objective of peer support, number of peer support sessions, duration of intervention, format of delivery and who the peer support is for;

(e) Outcome characteristics/measures: quantitative measures of well-being and/or mental health at pre- and post-intervention for longitudinal studies (with or without a control) or at a particular time point with a control for cross-sectional studies;

(f) Results: mean and standard deviation at baseline and follow-up, P -value and confidence intervals from the intervention group and comparator (where applicable).

Missing data was denoted as ‘not reported’ to indicate its absence for the risk-of-bias assessment.
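As an illustration of items (a) to (f), the sketch below models one row of such an extraction form as a Python data structure; the field names are illustrative shorthand, not the authors' actual form, and None stands in for ‘not reported’.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractionRecord:
    """One row of a hypothetical data extraction form; None marks 'not reported'."""
    # (a) Publication characteristics
    year: Optional[int] = None
    country: Optional[str] = None
    institution: Optional[str] = None
    # (b) Methodology and study design
    design: Optional[str] = None              # e.g. 'pre-post with control'
    # (c) Population characteristics
    sample_size: Optional[int] = None
    attrition: Optional[float] = None         # proportion lost to follow-up
    # (d) Intervention characteristics
    peer_support_type: Optional[str] = None   # e.g. 'peer mentoring'
    n_sessions: Optional[int] = None
    # (e) and (f) Outcome measures and results
    outcome_measure: Optional[str] = None     # e.g. 'Perceived Stress Scale'
    baseline_mean: Optional[float] = None
    followup_mean: Optional[float] = None
    p_value: Optional[float] = None
```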

Study risk-of-bias assessment

The methodological quality of studies included in the review was assessed independently by two reviewers (J.P.-H. and L.W.), using a modified version of the American National Institutes of Health (NIH) National Heart, Lung and Blood Institute Quality Assessment Tool for ‘Before-After (Pre-Post) Studies With No Control Group’. 40 This approach to assessing the risk of bias was chosen because many of the studies lacked a control, and similar reviews have demonstrated its utility in higher education settings. Reference Upsher, Nobili, Hughes and Byrom 36

The following outlines the 12 items from the tool used to determine the risk of bias: (item 1) clear study question; (item 2) prespecified eligibility criteria; (item 3) study participants representative; (item 4) all eligible participants enrolled; (item 5) sample size sufficiently large; (item 6) intervention clearly described and delivered consistently; (item 7) outcomes measures prespecified, valid, reliable and assessed consistently across all participants; (item 8) blinding; (item 9) 20% or less attrition in follow-up; (item 10) statistical methods examined changes in outcome measures/statistical tests conducted that provided P -values; (item 11) outcome measures taken multiple times before and after intervention; and (item 12) group level intervention took into account individual-level data to determine effects. 40 For this review, items 8 and 12 were excluded, as they were irrelevant to any of the included studies.

For each study, all items were rated according to the guidance as ‘yes’ (met criteria), ‘no’ (did not satisfy criteria), ‘not reported’, ‘cannot determine’ (unclear from information) or ‘not applicable’ (not relevant to the particular study). 40 Reviewers used these ratings to make a qualitative assessment of the overall risk of bias, using the ratings of ‘good’, ‘fair’ or ‘poor’. The risk-of-bias scorings for each study are outlined in Supplementary Table 2.
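To make the rating step concrete, here is a minimal bookkeeping sketch in Python for one hypothetical study, using the rating labels from the guidance above. The overall ‘good’/‘fair’/‘poor’ call in the review was a qualitative judgement by two reviewers, so this tally is illustrative only and does not reproduce that judgement.

```python
from collections import Counter

# Item ratings for one hypothetical study (items 8 and 12 excluded, as in the review)
ratings = {
    "clear study question": "yes",
    "prespecified eligibility criteria": "yes",
    "representative participants": "yes",
    "all eligible participants enrolled": "no",
    "sufficient sample size": "not reported",
    "intervention clearly described": "yes",
    "valid, reliable outcome measures": "yes",
    "attrition of 20% or less": "no",
    "statistical tests with P-values": "yes",
    "multiple pre/post measurements": "no",
}

# Tally the labels to support (not replace) the reviewers' qualitative call
print(Counter(ratings.values()))  # Counter({'yes': 6, 'no': 3, 'not reported': 1})
```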

Effect measures

Only the baseline and post-intervention time points were used in data extraction and synthesis, treated as pre (time point 1) and post (time point 2). The mean differences and P -values between pre and post for the intervention and control group (when applicable) were calculated from the raw data reported in individual longitudinal studies (if available). For cross-sectional studies with a control group, mean differences were calculated between groups at the post-intervention time point (as baseline data was not reported). Outcome data beyond post-intervention were not synthesised. When data were unavailable for calculating mean differences, ‘CD’ (cannot determine) was used.

Standardised mean differences (Cohen's d ) with 95% confidence intervals were calculated when longitudinal studies included a control group. The calculations were made in StataMP version 17 for Windows, 41 using the raw data for each intervention/control measure: sample size, mean difference and s.d. For longitudinal studies without a control group, available data such as P -values, Cohen's d and t -values were extracted. The significance of outcomes was also reported, including the directionality of an improvement or decline.
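For exposition, the sketch below shows the standard pooled-s.d. formula for Cohen's d together with the common large-sample approximation for its 95% confidence interval. The review's calculations were run in StataMP 17, so this Python version, its function name and the example numbers are assumptions for illustration, not the authors' code.

```python
import math

def cohens_d_with_ci(n1, mean1, sd1, n2, mean2, sd2, z=1.96):
    """Cohen's d from group summary statistics, with an approximate 95% CI.

    Uses the pooled standard deviation and the usual large-sample
    variance approximation for d; illustrative only.
    """
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    # Approximate standard error of d (large-sample formula)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d, (d - z * se, d + z * se)

# Hypothetical example: pre-post change scores, intervention vs control
d, ci = cohens_d_with_ci(n1=40, mean1=-4.2, sd1=5.1, n2=38, mean2=-1.0, sd2=5.5)
print(f"d = {d:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```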

Synthesis methods

A meta-analysis was not appropriate because of the heterogeneity of study methodologies. The vote-counting method outlined in the SWiM reporting guidelines was used. Reference Campbell, McKenzie, Sowden, Katikireddi, Brennan and Ellis 33 Missing data are denoted in the tables. Outcome data were tabulated for each included study and stratified by type of peer support intervention. The most common outcomes assessed in this review were stress, anxiety and depression. In each vote-counting synthesis, the following was reported: the number and percentage of studies that affected the most common outcome for each peer support category, the binomial test indicating the probability of the results if the intervention was ineffective (i.e. a probability equal to 0.5) and the 95% confidence intervals for the percentage of effects favouring the intervention. Reference McKenzie and Brennan 42 The binomial test was calculated in StataMP version 17, 41 using the syntax ‘bitesti X Y 0.5’, and the 95% confidence intervals were calculated with the syntax ‘cii proportions X Y, level (95)’, where X is the number of effects and Y is the number of effects favouring the intervention.
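As a worked example of the vote-counting arithmetic, the following sketch reproduces the two Stata commands with SciPy (an assumed stand-in, not part of the review's methods). With X = 4 effects and Y = 1 effect favouring the intervention, it returns the same values reported below for depression in peer-led support groups: 25% favouring, 95% CI 0.63–80.59% and P = 0.625.

```python
from scipy.stats import binomtest

def vote_count(n_effects, n_favouring):
    """Two-sided binomial test against 0.5 plus an exact (Clopper-Pearson)
    95% CI, mirroring Stata's 'bitesti' and 'cii proportions'."""
    result = binomtest(k=n_favouring, n=n_effects, p=0.5, alternative="two-sided")
    ci = result.proportion_ci(confidence_level=0.95, method="exact")
    return result.pvalue, (ci.low, ci.high)

# Depression, peer-led support groups: 1 of 4 effects favoured the intervention
p, (lo, hi) = vote_count(n_effects=4, n_favouring=1)
print(f"25.0% favouring, 95% CI {lo:.2%}-{hi:.2%}, P = {p:.3f}")
# -> 95% CI 0.63%-80.59%, P = 0.625, matching the reported synthesis
```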

Results

Study selection

As summarised in Fig. 1 , 12 763 records remained after duplicates were removed. A total of 57 papers were included for full-text screening, and a final 28 papers were included.


Fig. 1 Process of identifying eligible studies for inclusion.

Study design characteristics

The study characteristics are outlined in Table 1 alphabetically by author, with a reference number in square brackets used in the results section only. The most common study type was the pre–post with a control design. Many studies ( n = 12) adopted this approach [1, 4, 6–8, 14, 17, 20, 24–26, 28], whereas others ( n = 7) employed a pre–post design without controls [2–3, 10–12, 15, 27]. Although some studies ( n = 8) used a randomised controlled trial design [5, 9, 16, 18–19, 21–23], one of these [19] used relevant mental health measures only at time point 2, so it was analysed as a cross-sectional study with a control design, along with one other study [13].

Table 1 Summary of study characteristics in review


Population characteristics

Many studies ( n  = 13) targeted students by year of study, with the majority of studies offering peer support for lower-year students such as ‘first year’ [1, 4, 11–12, 18–19, 22–24] or ‘freshmen and sophomores’ [17]. Students were also recruited by discipline ( n  = 12); ‘nursing/nurse anaesthetists’ [2, 13, 20] and ‘psychology’ [4, 8, 26] courses were the most common. Other population criteria included ‘lived experience of mental health difficulties’ [5, 9], ‘student status’ [22, 28], ‘ethnicity’ [15, 24] and ‘age’ [2, 5, 20, 22]. The complete list is included in Table 1 .

Other population characteristics were also extracted. One study focused on postgraduate students [13]. Others invited both undergraduate and postgraduate students to participate [1, 9]. All other studies were for undergraduate students. The majority of studies reported binary biological sex (male versus female). Of these, five reported only the percentage of females in their sample, leaving the reader to infer that the remaining percentage were males. Of the 22 studies that reported on binary sex in the baseline intervention group, the average proportion was 64.1% females and 35.9% males. Only one study used the term gender instead of sex in reporting; it was still presented in a binary way (44.4% men and 55.6% women) [8]. Three studies [3, 13, 15] reported beyond binary sex, with options like ‘other’, ‘non-binary’ or ‘unspecified’ making up an average of 6.9%, along with 59.3% females and 33.8% males. The average mean age across the 20 studies that reported this for the intervention group was 21.6 years. Not enough studies reported clearly on gender, sex or mean age in the control group to disaggregate these data. Similarly, few studies reported on ethnicity.

Intervention characteristics

Two intervention characteristics were important during this review: how a peer was defined and what type of peer support was investigated. To understand how the studies described a peer, we investigated how students were recruited for peer support (the population) and who facilitated the interventions. The studies referred to these students in various ways, including ‘leaders’, ‘peer supporters’ and ‘peer mentors’. This review uses the term ‘peer facilitators’ to describe any peer leading the intervention. Each study's population and peer facilitator are presented in Table 1 . The shared experiences or identities between the peer facilitators and those accessing peer support helped to define a peer. Peer facilitators were frequently defined by their ‘seniority/year’ ( n = 13) [1, 4, 6, 11–13, 19–24, 27] or ‘course of study’ ( n = 11) [2, 6–8, 18–22, 24, 27]. A smaller number of studies recruited peer facilitators by ‘interest’ [3, 23], ‘gender’ [8, 16], ‘age’ [2], ‘lived experience of mental health difficulties’ [5, 9] or ‘heritage’ [15, 28]. Five studies created groups where all students participated and supported each other equally [9, 14, 17, 25–26]. One study did not specify how it recruited [10]. These experiences and identities further defined being a peer, beyond being a student in higher education.

The three categories of peer support created for this review to delineate between types are outlined below. A definition of each type is provided, along with the nomenclature process. The assigned category and each study's terminology (when different) are provided in Table 1 .

Peer-led support group

This type of peer support gathers groups of students for mutual support. The most frequently used terms, ‘peer-led/peer leader’ groups [5, 8, 14–16, 18, 19, 21] and ‘support groups’ [9, 14, 18–19, 25] or ‘group support’ [3, 16, 22], each featured in eight studies.

Peer mentoring

Peer mentoring relies on higher-year/more experienced students to support lower-year/less experienced students. Eight studies used the term ‘peer mentoring’ [4, 11–13, 20, 23–24, 28], whereas two others used similar terms such as ‘specialised peer mentoring’ [27] or ‘peer dyad mentoring’ [2]. One study used ‘peer helper’ [7], but this was a one-to-one pairing of a more experienced student with a less experienced student.

Peer learning

This describes peer support that convenes students based on academic objectives. Terms used for this included ‘cooperative learning’ [17], ‘peer-assisted learning’ [1] and ‘peer-led team learning’ [6]. As the terms ‘peer’ and ‘learning’ were used across these studies, this category was named ‘peer learning’.

Most studies were categorised as a peer-led support group ( n  = 14) or peer mentoring ( n  = 11). The least common category of peer support was peer learning ( n  = 3).

The categorisation of these three types of peer support was most challenging for peer mentoring delivered in small groups. Most peer mentoring occurred on a one-to-one basis; however, one study [24] paired mentors with one to three students. The potentially small-group, mutual nature of this type of peer support made it necessary to consider whether it was a peer-led support group. Because this study was designed for incoming at-risk Latino students, its objective, and ultimately its self-identification as a form of peer mentoring, decided its final categorisation.

Comparator (control) characteristics

In total, 21 studies used a control group. Comparators in this review varied and included examples such as groups not participating in peer support [1, 6, 7, 13, 17, 20–22, 24, 25–26, 28], a waitlist [5, 8, 16], a group that met informally on occasion [18–19], a separate HEI without peer support [4], students given access to a website only [9], students in a different course or year (without peer support) [14] and faculty mentor pairing [23].

Outcome characteristics

There were 18 outcomes identified. Stress was most commonly measured with the Perceived Stress Scale Reference Cohen, Kessler and Gordon 43 ( n  = 8) [4, 10, 12, 14, 20, 22–24], with other measures being used only once, including the Chipas’ 2011 Survey Reference Chipas and McKenna 44 [13] and the Depression, Anxiety and Stress Scale (DASS-21 Reference Lovibond 45 ) [25]. One study assessed stress by using two measures: the three-item House and Rizzo measure Reference House and Rizzo 46 and Allen's Reference Allen, McManus and Russell 47 two-item measure of mentor-related stress [11].

For anxiety, six measures were used: the State-Trait Anxiety Inventory (STAI) Reference Spielberger 48 ( n  = 4) [1, 2, 6, 8], Generalised Anxiety Disorder-7 scale Reference Spitzer, Kroenke, Williams and Löwe 49 ( n  = 3) [1, 5, 21], Social Anxiety Questionnaire for Adults Reference Caballo, Salazar, Arias, Irurtia, Calderero and Graña 50 ( n  = 1) [6], Liebowitz Social Anxiety Scale Reference Liebowitz, Coryell and Winokur 51 ( n  = 1) [17], DASS-21 Reference Lovibond 45 ( n  = 1) [25] and the Adult Manifest Anxiety Scale – College Version Reference Reynolds, Richmond and Lowe 52 ( n  = 1) [27].

Depression was assessed with the Beck Depression Inventory, Second Edition Reference Beck, Steer and Brown 53 ( n  = 1) [15], Center for Epidemiologic Studies Short Depression Scale 10 Reference Kohout, Berkman, Evans and Cornoni-Huntley 54 ( n  = 1) [5], DASS-21 Reference Lovibond 45 ( n  = 1) [25], ten-item Edinburgh Postnatal Depression Scale Reference Martin and Redshaw 55 ( n  = 1) [24] and Patient Health Questionnaire-9 Reference Adewuya, Ola and Afolabi 56 ( n  = 1) [21].

Well-being was measured with three instruments: the Shortened Warwick–Edinburgh Scale of Wellbeing Reference Stewart-Brown, Tennant, Tennant, Platt, Parkinson and Weich 57 , Reference Tennant, Hiller, Fishwick, Platt, Joseph and Weich 58 ( n = 1) [3], the Positive and Negative Affect Schedule Reference Gençöz 59 , Reference Watson, Clark and Tellegen 60 (PANAS; n = 1) [7] and the Satisfaction with Life Scale Reference Diener 61 , Reference Koker 62 (SWLS; n = 1) [7].

Loneliness was assessed with only one measure, the revised University of California – Los Angeles Loneliness Scale Reference Russell, Peplau and Cutrona 63 ( n  = 3) [17–19].

Psychological distress was measured with the Clinical Outcomes in Routine Evaluation – Outcome Measure Reference Evans, Connell, Barkham, Margison, McGrath and Mellor-Clark 64 ( n  = 1) [9], Brief Symptom Inventory Reference Derogatis and Melisaratos 65 ( n  = 1) [15] and the 12-item General Health Questionnaire Reference Goldberg and Williams 66 ( n  = 1) [26].

The Index of General Affect from the Index of Wellbeing Scale Reference Campbell, Converse and Rodgers 67 ( n  = 1) [4] and the PANAS Reference Crawford and Henry 68 ( n  = 1) [16] measured negative affect.

These outcomes were measured in one study each: eating disorder pathology, measured with the Eating Disorder Examination Questionnaire Reference Fairburn and Beglin 69 [16]; resilience, measured with the 25-item Resilience Questionnaire Reference Wagnild and Young 70 [21]; quality of life, measured with the Linear Analogue Self-Assessment Reference Flugel Colle, Vincent, Cha, Loehrer, Bauer and Wahner-Roedler 71 [21]; satisfaction with life, measured with the SWLS Reference Pavot and Diener 72 [9]; perceived social support, measured with the Social Provisions Scale Reference Cutrona, Russell, Jones and Perlman 73 [18]; domains of functioning, measured with the Outcomes Questionnaire Reference Lambert, Hansen, Umphress, Lunnen, Okiishi and Burlingame 74 [22]; belonging, measured with a 13-item questionnaire adapted for the study and based on the Institutional Integration Scale Reference French and Oakes 75 [24]; self-efficacy, measured with a 13-item adapted questionnaire Reference Sherer, Maddux, Mercandante, Prentice-Dunn, Jacobs and Rogers 76 , Reference Tipton and Worthington 77 [24]; and self-esteem, measured with Rosenberg's Self-Esteem Scale 78 [4].

One study used multiple measurements for outcomes [28]. It explored psychological adaptation by using a six-item questionnaire similar to the PANAS Reference Koenig-Lewis, Palmer, Dermody and Urbye 79 and a four-item scale gauging life satisfaction. Reference Esses, Burstein, Ravanera, Hallman and Medianu 80 It also measured acculturative stress by using the homesickness and perceived discrimination subscales from the Acculturative Stress Scale for International Students, Reference Sandhu and Asrabadi 81 the language difficulty subscale from the Index of Life Stress Reference Yang and Clum 82 and the Perceived Language Discrimination Scale Reference Wei, Wang and Ku 83 [28].

Quality assessment: risk of bias

The overall risk of bias for each study is outlined in Table 1 . Out of the 28 included studies, five were rated ‘good’ and four were rated ‘good/fair’. In addition, 12 were rated as ‘fair’, one was rated as ‘fair/poor’ and six were rated as ‘poor’.

All studies stated their objective, clarified eligibility criteria, described the representativeness of the population, presented entry criteria, referred to the intervention and defined the well-being or mental health outcome. The quality ratings were thus determined according to sample size, attrition rate, statistical values and measurement at multiple time points. Most studies ( n = 22) were not adequately powered or did not report a power analysis [1, 2, 4, 6–15, 17, 20, 22, 23–28]. Many ( n = 13) had low or indeterminate retention: loss to follow-up after baseline exceeded 20% in some studies [3–4, 9, 13, 18–19, 25], whereas others did not report enough information to determine attrition rates [1, 11, 14, 24, 26–27]. The statistical tests were not reported in five studies [12, 14, 19–20, 24]. Other studies did not report basic statistics, such as the number of participants in the intervention/control group at the pre- and post- time points, P -values, or the mean and s.d. at both baseline and follow-up [11–12, 14, 24]. Most studies ( n = 20) had two time points and did not assess the outcome beyond the pre–post intervention [1, 2, 4, 6, 9–11, 13–15, 17, 19–22, 24–28].

If our synthesis were constrained to studies rated as ‘good’ or ‘good/fair’, we would retain nine studies. Of these, peer learning would not be represented: we identified only three peer learning studies, all rated as ‘fair’, with no power analysis reported and only two time points measured. Constraining the synthesis does not change the proportional representation of peer mentoring and peer-led support group studies.

Individual study results

Every included study is outlined in Table 2 , with the well-being and mental health outcome effect estimates provided where possible. A complete list of the acronyms and definitions of the mental health and well-being measures is provided in Supplementary Table 3.

Table 2 Effect estimates for mental health/well-being outcomes per individual study


Note: Information not reported within the table was not reported in the reviewed studies.

SWEMWBS, Shortened Warwick–Edinburgh Scale of Wellbeing; CES-D, Center for Epidemiologic Studies Depression Scale; GAD-7, Generalised Anxiety Disorder-7 scale; SMD, standardised mean difference; STAI, State-Trait Anxiety Inventory; CORE-OM, Clinical Outcomes in Routine Evaluation – Outcome Measure; SWLS, Satisfaction with Life Scale; PSS, Perceived Stress Scale; BDI-II, Beck Depression Inventory, Second Edition; BSI, Brief Symptom Inventory; PANAS, Positive and Negative Affect Schedule; EDE-Q, Eating Disorder Examination Questionnaire; UCLA, University of California – Los Angeles; SPS, Social Provisions Scale; PHQ-9, Patient Health Questionnaire-9; LASA, Linear Analogue Self-Assessment; RS15, Resilience Scale; OQ-45.2, Outcomes Questionnaire; DASS-21, Depression, Anxiety and Stress Scale; GHQ-12, General Health Questionnaire; EPDS, Edinburgh Postnatal Depression Scale; AMAS-C, Adult Manifest Anxiety Scale – College Version; ASSIS, Acculturative Stress Scale for International Students; SAQ, Social Anxiety Questionnaire for Adults; R-UCLA, Revised University of California – Los Angeles; LSAS, Liebowitz Social Anxiety Scale.

* P  < 0.05.

Results of syntheses

The most frequent outcomes evaluated were stress, anxiety and depression (depression for peer-led support groups only). Vote counting for these outcomes is reported below by peer support category, based on the direction of effect, with the binomial probability test and 95% confidence intervals. Effect estimates for less frequently reported outcomes with sufficient data available are reported in Table 2 .

Peer-led support groups

Four studies analysed the effect of the intervention on depression. One study had significant results (25%, 95% CI 0.63–80.59%, P = 0.625), with a decline in depression symptoms; however, its risk of bias was rated ‘fair’. The other three studies found no significant results.

Three studies reported the effect of the intervention on anxiety. One study (33.3%, 95% CI 0.84–90.57%, P  = 1.00) favoured the intervention with reduced anxiety and was rated as ‘good/fair’ in the risk-of-bias assessment. Two studies found no significant results for anxiety.

Three studies analysed stress as an outcome. One study (33.3%, 95% CI 0.84–90.57%, P  = 1.00) had a significant decline in stress, but it was rated as ‘fair’ in the risk-of-bias assessment. One study in this category did not have any significant findings for stress, whereas the other had mixed results, with no significant findings for stress but significant improvements in functioning.

Peer mentoring

Two peer mentoring studies measured anxiety. One had a significant decrease in anxiety; the other found non-significant results.

Eight peer mentoring studies measured stress. Five (62.5%, 95% CI 24.49–91.48%, P = 0.73) found significant results favouring the intervention. Of these, two studies were rated as ‘fair’ in the risk-of-bias assessment; the others were rated as ‘poor’. Three studies found no significant reduction in stress, although one of these had mixed results, with significant improvements in negative affect.

Peer learning

All three peer learning studies measured anxiety. Although one study had mixed results, the other two reported significant intervention effects (66.67%, 95% CI 9.43–99.16%, P = 1.00). All three studies were rated as ‘fair’ in their risk-of-bias assessment, with no power analysis reported and only two time points measured.

Discussion

This review demonstrates a wide variation in the interventions and terminology used to describe peer support. Although many use the single label ‘peer support’ to encapsulate all of its forms, this does not capture the nuances of different peer support interventions. Previous reviews using only peer support as a search term exhibit this, finding just three studies and missing relevant work. Reference John, Page, Martin and Whittaker 21 We found peer support for student mental health and well-being referred to as everything from cooperative learning to peer-led social support groups. There is little consistency in the terminology. Without a shared vocabulary, it is difficult to understand how different forms of peer support may benefit higher education students. This review identified three main categories of peer support: peer-led support groups, peer mentoring and peer learning. A shared understanding and use of these categorical terms, rather than the umbrella term alone, is imperative for future research and dissemination. However, the definition of a peer first needs to be clarified.

Defining a peer

The lack of consistent terminology brings into question how HEIs define a peer. Although peer support is broadly about people supporting each other based on shared experiences, Reference Gulliver and Byrom 12 more is required to define a peer in HEIs. This review defined peer support as higher education students helping each other, since all peer facilitators and students accessing peer support shared this identity. However, other identities were also used to define a peer: ‘course’, ‘year/seniority’, ‘heritage’, ‘age’ and ‘lived experience of mental health difficulties’.

Of the studies that defined peers based on their year of study ( n = 13), ten were for first-year students. Although this may not be surprising for peer mentoring, which is defined by a higher-year student supporting a lower-year student, this was also seen in peer-led support groups and peer learning. Described as an ‘acute stressor’, the transition into higher education strains well-being, as students face many changes and can struggle to settle in. Reference Gall, Evans and Bellerose 111 Perhaps this is why so many peer support interventions focus on first-year students; however, each year in higher education presents new challenges, with stress levels fluctuating throughout a degree. Conley et al Reference Conley, Shapiro, Huguenel and Kirsch 112 found that students in the USA enrolled in a 4-year degree had the poorest psychological functioning across the first two years of study, with improvements seen in the final two years. In England, anxiety triggered by higher education and psychological well-being fluctuated for 3-year degree students; however, depression rates were highest in the final year of study. Reference Bewick, Koutsopoulou, Miles, Slaa and Barkham 113 This finding raises questions about whether students would also benefit from peer support beyond their first year. Four papers offered peer support to higher-year students: second- and third-year students, Reference Moir, Henning, Hassed, Moyes and Elley 102 , Reference Pinks, Warren-James and Katsikitis 106 third-year students Reference Short, Kinman and Baker 107 and seniors. Reference Humphrey 95 All were part of the peer-led support group category, and none had significant results for improved student mental health and well-being outcomes. Although a need might exist, more research is needed to understand whether peer support does improve the mental health and well-being of higher-year students.

Another common way to define a peer was through a course of study. Healthcare studies and psychology were the most frequent courses to offer peer support, with nine of the 12 studies falling within these disciplines. Studies indicate that, compared with students from other degrees, medical students have higher rates of mental and emotional difficulties, experience increased levels of mental distress during training and are less likely to seek help. Reference MacLean, Booza and Balon 114 , Reference Jacob, Li, Martin, Burren, Watson and Kant 115 In one study, however, students from the sciences and the arts and humanities had significantly higher mean levels of depression than students from the health sciences and social sciences. Reference Ruiz-Hernández, Guillén, Pina and Puente-López 116 A study of nursing students in Spain and Chile found that levels of mental distress reduced over time, indicating that nursing education may be a protective factor against mental health disorders. Reference Reverté-Villarroya, Ortega, Raigal-Aran, Sauras-Colón, Ricomà-Muntané and Ballester-Ferrando 117 Therefore, peer support should be evaluated with students across various courses to understand any differences.

Peer-led support groups

This review defined peer-led support groups as a type of peer support that aims to gather groups of students together for mutual support, a factor unique to this category. Mutual support is ‘a process by which persons voluntarily come together to help each other address common problems or shared concerns’. Reference Davidson, Chinman, Kloos, Weingarten, Stayner and Kraemer Tebes 118 Peers form self-help support groups by meeting for mutual assistance. Reference Byrom 24 Although group settings offer mutual support for those attending, the review did not include outcomes for peer facilitators, so the mutuality of these groups warrants further investigation. From descriptions alone, it is hard to discern the extent of mutuality in support provision. Of the four studies that had all students act as facilitators, Reference Humphrey 95 , Reference Kocak 98 , Reference Pinks, Warren-James and Katsikitis 106 , Reference Short, Kinman and Baker 107 all were part of the peer-led support group category except one, which was categorised as peer learning. Reference Kocak 98 One study had facilitators take turns leading, Reference Eren-Sisman, Cigdemoglu and Geban 87 whereas the others trained all students with no set facilitator for the group sessions, so that everyone was expected to participate equally. Reference Pinks, Warren-James and Katsikitis 106 , Reference Short, Kinman and Baker 107

Peer-led support groups had the most mixed findings, so their efficacy remains uncertain. In this most frequently evaluated intervention type, 20 measures were used to explore 14 mental health and well-being outcomes. The variety of measures might reflect a lack of consensus on the objective of peer-led support groups. Similarly, the different measures could also be explained by the different delivery methods. Although the ‘group’ aspect of this category was the defining feature, the studies represented a range of interventions, such as a peer-run self-help group, Reference Byrom 24 a mutual support group Reference Freeman, Barker and Pistrang 90 and peer-facilitated/-led stress management groups/peer education. Reference Fontana, Hyra, Godfrey and Cermak 89 , Reference Frohn, Turecka, Katz and Noehammer 91 The diverse delivery methods may explain the difference in outcome measures assessed and the mixed results of this category. However, many studies lacked detailed descriptions of the interventions, so it is difficult to assess whether they are indeed distinct or whether differences in the nomenclature used to describe interventions explain these results. Given the heterogeneous literature for this type of peer support, it is impossible to identify when or why some forms improve student mental health. The peer-led support group is therefore a category of peer support that warrants further investigation, using shared terminology and clear descriptions of the interventions, to understand the factors associated with its efficacy.

Peer mentoring

In this review, we defined peer mentoring as a type of peer support that relies on higher-year/more experienced students to support lower-year/less experienced students. Mentoring is known broadly as a transfer of knowledge, Reference Parsloe and Wray 119 where a more experienced, usually older, individual guides a mentee with less experience. Reference Budge 120 Depending on the institution, peer mentoring goes by names such as a ‘parent’ programme, ‘buddy’ scheme or ‘family’ programme. No matter the title, peer mentoring programmes operate on the same belief that students who have more experience in higher education can mentor less experienced students.

Whereas peer-led support groups were defined by their group nature, peer mentoring showed more heterogeneity in approach. Most peer mentoring happened with mentors supporting mentees on a one-to-one basis, but three of the 11 papers took alternative approaches. These studies paired one mentor with up to three mentees, Reference Phinney, Torres Campos, Padilla Kallemeyn and Kim 105 connected a group of mentees with one mentor Reference Fullick, Smith-Jentsch, Yarbrough and Scielzo 92 or took a mixed approach with one-to-one meetings and homework assigned to the students receiving support. Reference Eryilmaz 88 All alternative approaches to one-to-one peer mentoring had significant results in the assessed mental health and well-being measures. Overall, the included studies used 17 measures to evaluate ten outcomes. Stress was the predominant outcome, with 62.5% of the studies demonstrating significant, positive results for stress. Peer mentoring therefore appears to benefit student stress levels and mostly takes a one-to-one structure; however, other approaches can also be helpful. The literature largely agrees on ‘peer mentoring’ as the terminology to describe this type of peer support.

Peer learning

This review defined peer learning as a type of peer support that convenes students based on academic objectives and tends to be situated in departments. Peer-led team learning Reference Eren-Sisman, Cigdemoglu and Geban 87 and cooperative learning Reference Kocak 98 contributed to this category. Cooperative learning creates spaces where students work toward a common purpose and assist each other in learning. Reference Johnson and Johnson 121 , Reference Lee, Ng and Jacobs 122 Peer-led team learning is an experiential learning environment where students build knowledge, talk to each other and develop higher-level reasoning and problem-solving skills by thinking together about the conceptual side of learning. Reference Tien, Roth and Kampmeier 123 – Reference Varma-Nelson, Coppola, Greenbowe, Pienta and Cooper 125 Peer-assisted learning is also part of this category, Reference Bosmans, Young and McLoughlin 27 with Bournemouth University defining it as ‘a scheme that fosters cross-year support between students on the same course’ while encouraging students to learn together and help each other. 126 The approach to learning is socially focused. Reference Hilsdon 127 In this way, peer learning is distinguished from other supportive activities because it is facilitative of student learning; structured and purposeful, with training and support; reliant on small groups; open to everyone and non-compulsory; and takes place in a safe, more relaxed environment. Reference Ody, Carey, Dunne and Owen 128

Peer learning traditionally focuses on academic objectives. As such, there are few studies assessing the impact of this type of peer support on mental health and well-being. The data captured here, however, suggest that peer learning interventions may improve student mental health, with a significant impact on reducing anxiety. Thus, the position of peer learning within departments may offer HEIs an opportunity to take a settings-based approach to improving student mental health in the classroom.

Peer support in higher education versus community peer support

Although the promise of peer support in higher education is underpinned by the more established body of research on peer support in community health settings, two issues were raised through this review. First, the measures used differ. Two meta-analyses found significant reductions in depressive symptoms for peer support as an intervention in community settings, Reference Bryan and Arkowitz 20 , Reference Pfeiffer, Heisler, Piette, Rogers and Valenstein 129 which have been used to justify further exploration of peer support in higher education. Depression was measured as an outcome in only five studies in the higher education context. Of these, one was a peer mentoring study, which significantly favoured the intervention. The others were peer-led support groups, with only one of the four studies reporting significant benefits for depression. The lack of depression measures makes comparing findings in community settings with those in HEIs difficult.

Second, the definition of a peer differs. Only two peer-led support group studies defined a peer based on their lived experience of mental health difficulties, Reference Conley, Hundert, Charles, Huguenel, Al-khouja and Qin 86 , Reference Freeman, Barker and Pistrang 90 bringing them together with peer facilitators who self-identified as living with a ‘mental illness’ or ‘psychological problem’. This contrasts with the definitions of a peer used in community mental health settings. The NHS website defines peer support workers as ‘people who have lived experience of mental health challenges themselves’ who use their experiences to empathise with and support others. This inconsistency between how HEIs classify a peer and how a peer is defined in community mental health settings in the UK is an essential distinction. Because peer support in higher education does not generally recruit facilitators or students based on lived experience of mental health difficulties, the basic definitions of a peer in a community setting and in a HEI differ. This disparity in definitions and the lack of shared outcome measures mean that a comparison between community programmes and peer support in higher education cannot currently be made with the literature.

Limitations of evidence included in the review

No grey literature met the inclusion criteria. A search was undertaken in OpenGrey 34 and Grey Matters, 35 but no results were found. In addition, no relevant grey literature was encountered through cross-referencing the included full-text studies. Although five reports were discovered in a scoping review, all were excluded after screening: they reported on peer support in higher education generally, undertook qualitative evaluation only or did not use a measure of student mental health or well-being that fitted the study criteria. Reference Gulliver and Byrom 12 , Reference Stoll, Yalipende and Haas 30 , Reference Andrews, Clark and Davies 130 – Reference Biggers, Yilmaz and Sweat 132 Although grey literature can reduce publication bias and improve the comprehensiveness of a systematic review, Reference Paez 133 more robust reporting in the grey literature is needed for it to meet basic efficacy standards in higher education peer support.

Most included studies lacked a power analysis to assess whether sample sizes were sufficient to detect intervention effects; many also had poor retention and/or small sample sizes, which may explain the many non-significant results of this review. Of the 28 included studies, 21 did not report a power analysis. One included study was a primarily qualitative study whose quantitative element met the inclusion criteria, but its sample size was small ( n = 2), affecting its quality. Reference Geng, Midford, Buckworth and Kersten 93 Of the seven studies that did report a power analysis, one did not achieve the required sample size. Reference Head 94 Four of these seven were rated as ‘good’ in the risk-of-bias assessment, but two others were rated ‘fair’ Reference Kilpela, Blomquist, Verzijl, Wilfred, Beyl and Becker 97 and ‘fair/poor’ Reference Mattanah, Brooks, Brand, Quimby and Ayers 100 because of low retention and poor reporting of outcome measures. A similar review in higher education settings also found many underpowered studies, indicating the need to run interventions with broader cohorts of students across faculties, programmes or similar institutions to improve power. Reference Upsher, Nobili, Hughes and Byrom 36 With only six studies reporting on the funding received, more funding may be required to make adequately powered studies a reality.

Many studies presented incomplete data, for example, unclear sample sizes and missing statistics/raw data (i.e. means and s.d.). Demographics were also poorly reported, so it was not possible to disaggregate gender, age or ethnicity for a helpful discussion. Despite many studies missing integral parts, available data were extracted when possible to calculate mean differences, P -values and standardised mean differences for a more consistent synthesis. The reporting encountered in this review may indicate that better guidelines are required. One review of higher education interventions for student mental health and well-being recommended that medical reporting guidelines Reference Groves 134 , Reference Schulz, Altman and Moher 135 be adapted to improve standards. Reference Upsher, Nobili, Hughes and Byrom 36

Outcome measures were too heterogeneous for meaningful comparison. Anxiety and stress were the most common outcomes investigated in the literature, yet there was little consistency in measures. Although the Perceived Stress Scale was used most often to measure stress ( n = 8) and the STAI was used most often to measure anxiety ( n = 4), many other measures were also applied to assess these common outcomes. Some measures, such as the PANAS, were used to measure different outcomes. For example, Eryilmaz Reference Eryilmaz 88 used the PANAS and the SWLS to measure subjective well-being, Kilpela et al Reference Kilpela, Blomquist, Verzijl, Wilfred, Beyl and Becker 97 used the PANAS to measure negative affect and Thomson and Esses Reference Thomson and Esses 109 used the PANAS to measure psychological adaptation. This lack of consistency is an obstacle to comparing interventions and drawing conclusions about their effectiveness. A ‘core set’ of well-being measures validated in higher education student populations has been recommended. Reference Dodd, Priestley, Tyrrell, Cygan, Newell and Byrom 136 Similar guidance is needed for stress, anxiety and perhaps depression, the most common outcomes in this review, to complement existing toolkits. Reference Broglia, Nisbet, Chow, Bone, Simmonds-Buckley and Knowles 137

Limitations of the review process

A meta-analysis was not possible because of the heterogeneity of outcome measures, the few reported effect sizes (or raw data to calculate them) and the limited information on the interventions with which to compare similar studies. Vote counting is considered a less robust way to synthesise evidence in a systematic review: no information is given on the magnitude of effects, sample sizes are not considered and combining P -values is a more robust method. Reference Borenstein, Hedges, Higgins, Rothstein, Borenstein, Hedges, Higgins and Rothstein 138 This systematic review is limited by the narrative synthesis taken; however, using the SWiM guidelines improved reporting transparency. Reference Campbell, McKenzie, Sowden, Katikireddi, Brennan and Ellis 33 Nonetheless, the synthesis method, dictated by the evidence currently available in the field, limits the conclusions that can be drawn.

Although the Cochrane tool for assessing risk of bias in randomised trials and other such tools are widely used, Reference Higgins, Thomas, Chandler, Cumpston, Li and Page 139 most do not support multiple study designs. Reference Upsher, Nobili, Hughes and Byrom 36 As this review had seven randomised controlled trials, two cross-sectional studies with a control and 19 pre–post designs with and without a control, a different tool was required. A modified NIH Quality Assessment Tool for ‘Before-After (Pre-Post) Studies With No Control Group’ was used for the risk-of-bias assessment. 40 The chosen method was limited in practice because it is designed for studies without a control group, so there were no criteria acknowledging whether a study had a control group, which would strengthen its quality. This approach to risk-of-bias assessment was best suited to the heterogeneity of our included studies; Reference Upsher, Nobili, Hughes and Byrom 36 however, as some studies also had a cross-sectional design with a comparator, the chosen tool was an imperfect option.

In this study’s synthesis, only the initial baseline and post-intervention measures were included for pre–post outcome measures, and only the post-intervention measures were synthesised for the cross-sectional with a control design. This approach was used because studies included a mix of interim and follow-up measures at varying durations that did not allow for comparison. Although all time points were extracted to see whether comparable data were available, only the pre–post measures for longitudinal studies and the post-intervention data (with control) for cross-sectional studies could be used for synthesis. Using the pre–post time points allowed for greater comparability and generalisability in the extraction and synthesis process.

Finally, the methodology has an additional limitation. This paper focused on quantitative studies to meet the second of our objectives: to evaluate the effectiveness of peer support in higher education. Future work may benefit from reviewing qualitative studies to confirm our categorisation of the types of peer support and the definitions of a peer.

Implications of the results for practice, policy and future research

This systematic review found that peer support in higher education is defined in the literature according to three categories: peer-led support groups, peer mentoring and peer learning. By identifying this nomenclature, HEIs can start using a shared language when evaluating interventions and communicating best practice. It will also improve understanding of the strengths and limitations of peer support in more detail so that areas for further research can be prioritised.

Peer-led support groups come together for mutual support. Exploring the mutuality of peer support for the peer facilitators and those attending was beyond the scope of this review, but should be studied further. In addition, although this form of peer support was the only one to measure depression outcomes multiple times, results were mixed, which may indicate that the category is too broad. Alternatively, as this form of peer support is most comparable to community mental health settings, it may be that the gap in how HEIs and healthcare settings define a peer and measure different outcomes is the barrier to the identification of effective interventions. Further investigation is needed into what specific peer-led support group components improve efficacy.

Peer mentoring is mostly for incoming students to receive support from a higher-year/more experienced student. This type of peer support was the most homogeneous in both the terminology used and its implementation (one-to-one). Of the three peer support types, it was also the most promising for improving stress outcomes. Nonetheless, alternative approaches to peer mentoring (e.g. small groups) demonstrated significant results in other measures (e.g. affect and depression), indicating that more research is needed to understand how the structure of peer mentoring affects mental health and well-being outcomes.

Peer learning operates in groups and convenes for academic objectives. Results indicate that significant improvements in anxiety were linked to peer learning. HEIs should consider incorporating relevant measures into existing peer learning programmes so that further investigations of its benefits to mental health and academic outcomes can be made.

In conclusion, despite hopes that peer support in higher education would offer an accessible, settings-based solution to improving student support, the findings of this review are mixed. Of the three types of peer support, two showed the most significant positive results: peer learning reduced anxiety and peer mentoring reduced stress levels. Results for peer-led support groups, however, were varied. Although peer-led support group interventions assessed depression more than any other type of peer support, they did not show a majority of significant results for any of the outcomes measured.

Peer support interventions aimed at improving student mental health and well-being were set up with specific objectives, such as easing the transition into higher education (peer mentoring), meeting academic objectives (peer learning) or enhancing mutual support (peer-led support groups). Furthermore, how a peer was defined in the higher education context varied, which is crucial to understanding the intervention. Students’ years of study and discipline were common features of defining a peer. However, peer-led support groups were the only type that brought together students with lived experiences of mental health difficulties as peers, which is most similar to community mental health settings. This comparability warrants further investigation, as this type of peer support shows promising applications in wider communities.

Various modes of peer support, using specific definitions of a peer, will be more or less useful for different needs. As HEIs consider peer support as a potential addition to support services, they must define both the type of peer support and what a peer is. Next, researchers and educators need to set standardised mental health and well-being metrics for the various types of peer support, so that more robust studies can be conducted. These findings should be shared widely, using better reporting guidance to elevate best practice. With this, HEIs can start to assess which types of peer support are helpful, when and for whom, as part of a whole-university approach to supporting all students’ mental health and well-being. The definitions of peer support provided in this review are a first step toward the consistently shared vocabulary needed to tackle these challenges.

Supplementary material

Supplementary material is available online at https://doi.org/10.1192/bjo.2023.603

The data that support the findings of this study are available from the corresponding author, J.P.-H., upon reasonable request.

J.P.-H. conceptualised the study, wrote, reviewed and edited the manuscript and was responsible for data curation, formal analysis, study investigation and methodology, project administration, resources, validation and visualisation. L.W. contributed to the study methodology and data curation. R.U. supervised the study and reviewed and edited the manuscript. J.F. and N.B. conceptualised and supervised the study, acquired funding and reviewed and edited the manuscript. J.O. conceptualised and supervised the study and reviewed and edited the manuscript.

This work was supported by the UK Research and Innovation Economic and Social Research Council London Interdisciplinary Social Science Doctoral Training Partnership (grant number ES/P000703/1). N.B. was partially supported by a grant from the Economic and Social Research Council: ES/S00324X/1.


Julia Pointon-Haas, Luqmaan Waqar, Rebecca Upsher, Juliet Foster, Nicola Byrom and Jennifer Oates. BJPsych Open, Volume 10, Issue 1. DOI: https://doi.org/10.1192/bjo.2023.603


Open access | Published: 06 June 2024

Heterogeneous peer effects of college roommates on academic performance

Yi Cao (ORCID: 0009-0003-4811-8788), Tao Zhou (ORCID: 0000-0003-0561-2316) & Jian Gao (ORCID: 0000-0001-6659-5770)

Nature Communications, volume 15, Article number: 4785 (2024)


  • Complex networks
  • Interdisciplinary studies
  • Statistical physics

Understanding how student peers influence learning outcomes is crucial for effective education management in complex social systems. The complexities of peer selection and evolving peer relationships, however, pose challenges for identifying peer effects using static observational data. Here we use both null-model and regression approaches to examine peer effects using longitudinal data from 5,272 undergraduates, where roommate assignments are plausibly random upon enrollment and roommate relationships persist until graduation. Specifically, we construct a roommate null model by randomly shuffling students among dorm rooms and introduce an assimilation metric to quantify similarities in roommate academic performance. We find significantly larger assimilation in actual data than in the roommate null model, suggesting roommate peer effects, whereby roommates have more similar performance than expected by chance alone. Moreover, assimilation exhibits an overall increasing trend over time, suggesting that peer effects become stronger the longer roommates live together. Our regression analysis further reveals the moderating role of peer heterogeneity. In particular, when roommates perform similarly, the positive relationship between a student’s future performance and their roommates’ average prior performance is more pronounced, and their ordinal rank in the dorm room has an independent effect. Our findings contribute to understanding the role of college roommates in influencing student academic performance.


Introduction

Peer effects, or peer influence 1 , 2 , 3 , 4 , 5 , have long been studied in the literature on social contagions 6 , 7 , 8 , 9 , 10 , 11 and education 12 , 13 , 14 , 15 , 16 , 17 , 18 . Understanding the influence of student peers on social behavior and learning outcomes is crucial for effective education management 18 , 19 , 20 , 21 , 22 , as it can inform policy decisions on how to improve learning environments inside and outside the classroom 23 , 24 , 25 , 26 , 27 , 28 . Student peers can have both positive and negative effects, depending on their characteristics and behaviors 29 , 30 . For example, when surrounded by high-achieving peers, students may be motivated to improve their academic performance 31 , 32 . Meanwhile, some well-known examples of human behaviors adopted through social influence, such as smoking 33 , 34 , substance abuse 35 , 36 , and alcohol use 37 , 38 , 39 , are often associated with negative student performance. Moreover, student peers may have indirect and lasting effects, for instance, on political ideology 40 , persistence in STEM majors 41 , 42 , 43 , 44 , 45 , occupational preferences 46 , labor market outcomes 47 , 48 , 49 , and earnings 50 , 51 , 52 , 53 . A thorough understanding of peer effects on learning outcomes can inform education management strategies, such as implementing behavioral interventions to mitigate the negative influence of disruptive peers 54 , 55 . Yet, using traditional methods and observational data to study peer effects causally is a challenge.

Dynamic educational and social environments make it difficult to separate peer influence from peer selection due to reverse causality, confounding factors, and complex mechanisms 1 , 2 , 3 , 56 . In particular, similarities in academic performance among student peers may be due to homophily (i.e., the selection of peers based on academic performance similarity) rather than the influence of peers 57 , 58 , 59 . Unlike open and evolving educational environments such as classrooms 23 , 24 , 25 , 26 , dormitories in universities provide a close-knit living environment for students to interact and potentially learn from each other 60 , 61 . While dorm rooms may not be the primary learning place like classrooms and libraries, they offer a highly interpersonal and spillover environment for a small group of stable student peers. In contrast to Western universities, in which freshman students usually have the flexibility to choose dormitories and suite-mates according to their lifestyle and personal preferences, most Chinese universities randomly assign students to dorm rooms 61 , 62 , 63 . There, a typical 4-person dorm room contains four beds and some public areas, providing a more interactive environment than a Western dorm suite containing four separate bedrooms (Supplementary Fig.  1 ).

Research on student peer effects, on the one hand, has primarily relied on static observational data of campus behaviors and performance metrics 11 , 64 . This reliance stems from various factors, such as the high cost and impracticality of conducting large-scale field experiments in learning environments, the dynamic nature of peer relationships 65 , and the scarcity of longitudinal data on student learning outcomes 66 , 67 , 68 , 69 , 70 . The close-knit dormitory environment of Chinese universities, however, provides a unique opportunity to observe a stable group of student peers and track their academic performance over time 61 , 63 . On the other hand, while regression models are widely employed in studying peer effects within the social sciences, methodologies from other disciplines may help expand the functional form in which peer effects can be estimated 64 . Particularly, null models are well suited for studying nontrivial features of complex systems by providing an unbiased random structure of other features 71 , 72 , 73 . Null-model approaches have been applied to test causal effects in complex social systems 74 , 75 , 76 . For instance, in the social network literature, randomizations are used to study the impact of network interventions on social relationships 77 . Utilizing a null model to test whether roommates exhibit similar performance could offer a promising approach to identifying peer effects and quantifying their magnitude, facilitating comparisons across diverse datasets.

One advantage of regression models is their capability to address the issue of reverse causality by utilizing longitudinal data and controlling for confounding factors 68 , 78 . For example, a student’s future performance may be influenced not only by the average prior performance of roommates but also by their own prior performance. Additionally, the composition of roommates may have independent effects 79 . Yet, it remains relatively underexplored whether heterogeneity in performance among roommates provides a ladder for a student to catch up with high-achieving roommates or hampers their motivation, whether due to inconsistent signals from roommates or the negative impact of disruptive roommates 29 , 30 . Moreover, dorm rooms provide an interactive yet local environment where a student’s ordinal rank in the dorm room, conditional on academic performance, may independently affect learning outcomes 80 , 81 . Therefore, a more comprehensive understanding of the factors contributing to roommate peer effects may help inform education policy and student management strategies, such as designing interventions for dormitories that effectively leverage the influence of high-achieving peers to improve student performance.

In this study, we quantify roommate peer effects using both null models and regression approaches to analyze a longitudinal dataset of student accommodation and academic performance. Sourced from a public research-intensive university in China, our data covers 5,272 undergraduate students residing in 4-person dorm rooms following the random assignment of roommates (see “Methods”). The initialization is plausibly random since the roommate assignment takes into account neither students’ academic performance before college admission nor their personal preferences, and there is no significant reassignment later (see Supplementary Information Section  1.2 for details). Here, we demonstrate the presence of roommate peer effects by showing that roommates with similar performance are more likely to be observed in the actual data than expected by chance alone. We then measure the size of roommate peer effects by developing an assimilation metric of academic performance and contrasting its value in the actual data with that in the roommate null model that we construct by randomly shuffling students among dorm rooms while retaining their controlled characteristics. Further, we use regression models to examine factors influencing roommate peer effects and explore the role of peer heterogeneity in moderating the effects.

Tier combinations within a dorm room

We start by studying the roommate composition of a typical 4-person dorm room in terms of their academic performance. For comparisons across student cohorts (i.e., those who were admitted by the university in the same year), majors, and semesters, we transform each student’s grade point average (GPA) in a semester into the GPA percentile R among students in the same cohort and major, where \(R\) = 0 and \(R\) = 1 correspond to the lowest and highest academic performance, respectively. We then divide students into equal-sized tiers based on their GPA percentiles, where those with better performance are in higher-numbered tiers. For instance, under the 4-tier classification, students with \(R\) = 0.3 (i.e., GPA is above 30% of students) and \(R\) = 0.9 (i.e., GPA is above 90% of students) are in Tier 2 and Tier 4, respectively. Accordingly, each dorm room has a tier combination in which the order of tier numbers does not matter. For example, 3444 (i.e., one student is in Tier 3, and the other three are in Tier 4) is identical to 4344 and 4434. Here we use the combination in ascending order of tier numbers to represent all identical ones. Under the 2-tier classification, there are five unique tier combinations (1111, 1112, 1122, 1222, and 2222). The numbers are 15 and 35 under the 3-tier and 4-tier classifications, respectively (Fig. 1a; see Supplementary Information Section 2.1 for details).
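To make the combinatorics concrete, the short Python sketch below (ours, not the authors’ code) enumerates the unique tier combinations of a 4-person room and their theoretical probabilities \({P}_{t}\), assuming roommates’ tiers are independent and uniform over \(k\) equal-sized tiers:

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial

def theoretical_probs(k, room_size=4):
    """Unique tier combinations of a room and their probabilities P_t,
    assuming roommates' tiers are i.i.d. uniform over k equal-sized tiers."""
    probs = {}
    for combo in combinations_with_replacement(range(1, k + 1), room_size):
        counts = Counter(combo)
        orderings = factorial(room_size)  # multinomial coefficient:
        for c in counts.values():         # orderings of this multiset
            orderings //= factorial(c)
        probs[combo] = orderings / k ** room_size
    return probs

for k in (2, 3, 4):
    print(k, len(theoretical_probs(k)))  # 5, 15 and 35 combinations
```

Running it reproduces the counts above and, for example, gives \({P}_{t}=1/4\) for combination 1112 and \({P}_{t}=1/16\) for 1111 under the 2-tier classification.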

Figure 1

a The relative ratio \({\mathbb{E}}\) of each combination under the 2-tier, 3-tier, and 4-tier classifications of GPA, respectively. The x-axis shows all unique combinations in ascending order of tier numbers under a tier classification, and the y-axis shows the relative ratio \({\mathbb{E}}\), which compares the actual frequency of a combination with its theoretical value. The horizontal dashed line marks 0. Positive and negative \({\mathbb{E}}\) are marked by ‘+’ and ‘−’, respectively. b Combinations in ascending order of the relative difference \(D\), which measures the average pairwise difference between tier numbers of a combination. The staggered shading marks groups of combinations with the same \(D\). c The negative relationship between the relative ratio \({\mathbb{E}}\) and the relative difference \(D\) based on the actual data. Data points show the \({\mathbb{E}}\) for each combination, and the hollow circles show the mean \({\mathbb{E}}\) for each group with the same \(D\).

Given a tier classification, the probability \({P}_{a}\) of observing a combination in the actual data can be calculated as the fraction of dorm rooms with that combination. The actual probabilities \({P}_{a}\) of different combinations (i.e., the frequencies of observations), however, should not be compared directly. This is because their theoretical probabilities \({P}_{t}\) are not always the same even when the tier numbers of roommates are independent of each other, i.e., when there is no roommate peer effect (see Supplementary Table 1 and Supplementary Information Section 2.1). To give a simple example: under the 2-tier classification, the theoretical probability \({P}_{t}\) of combination 1112 is \({C}_{4}^{1}{\left(\frac{1}{2}\right)}^{3}\left(\frac{1}{2}\right)=\frac{1}{4}\), which is four times as large as that of combination 1111, namely \({\left(\frac{1}{2}\right)}^{4}=\frac{1}{16}\). This makes it difficult to assess, from the value of \({P}_{a}\) alone, whether a combination is over-represented or under-represented in the actual data. To address this challenge, we calculate the relative ratio \({\mathbb{E}}\) for a combination by comparing the actual probability with its theoretical probability:

$${\mathbb{E}}=\frac{{P}_{a}-{P}_{t}}{{P}_{t}},$$

where \({P}_{a}\) and \({P}_{t}\) are the actual and theoretical probability of the same combination, respectively. A positive (negative) value of \({\mathbb{E}}\) suggests that the combination is more (less) likely to be observed in data than expected by chance alone (see Supplementary Information Section  2.2 ).
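Given observed room assignments, \({\mathbb{E}}\) follows directly. The sketch below reuses `theoretical_probs` from the previous snippet; the data layout (a list of 4-tuples of tier numbers, one per dorm room) is illustrative:

```python
from collections import Counter

def relative_ratio(rooms, k):
    """E = (P_a - P_t) / P_t per combination, where P_a is the observed
    fraction of dorm rooms with that (order-free) tier combination."""
    p_t = theoretical_probs(k)
    counts = Counter(tuple(sorted(room)) for room in rooms)
    n = len(rooms)
    return {c: (counts.get(c, 0) / n - pt) / pt for c, pt in p_t.items()}
```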

We analyze the student accommodation and academic performance data under the 2-tier, 3-tier, and 4-tier classifications and calculate the relative ratio \({\mathbb{E}}\) for each combination (Fig. 1a). We find that \({\mathbb{E}}\) varies substantially across combinations and deviates significantly from 0 for some of them according to the results of statistical tests (see “Methods” and Supplementary Information Section 3.2 for details). For example, under the 2-tier classification, \({\mathbb{E}}\) of combinations 1111 and 2222 is significantly above 0 and \({\mathbb{E}}\) of combinations 1112 and 1122 is significantly below 0 (P value < 0.001; see Supplementary Table 2 for the statistical testing results for each combination). More notably, we find that combinations with the same or nearby tier numbers (e.g., 1111 and 1112) tend to have larger \({\mathbb{E}}\) and those with distant tier numbers (e.g., 1122) have smaller \({\mathbb{E}}\), prompting us to study the relationship between a combination’s tier heterogeneity and its \({\mathbb{E}}\). Specifically, we first calculate the relative difference \(D\) in the tier numbers for each combination:

$$D=\frac{1}{6}\sum _{u < v}\left|{l}_{u}-{l}_{v}\right|,$$

where \({l}_{u}\) and \({l}_{v}\) are the tier numbers of roommates \(u\) and \(v\), respectively, and the average runs over all six roommate pairs. A smaller \(D\) indicates that roommates have closer tier numbers and thus a smaller difference in their academic performance. We then group combinations with the same \(D\) and arrange them in ascending order of \(D\). We find that combinations with positive and negative \({\mathbb{E}}\) are overall separated (Fig. 1b), where those with a smaller \(D\) tend to have positive \({\mathbb{E}}\) (i.e., over-represented in the actual data) and those with a larger \(D\) tend to have negative \({\mathbb{E}}\) (i.e., under-represented in the actual data). Inspired by this observation, we calculate the mean value of \({\mathbb{E}}\) for each group with the same \(D\), finding a negative relationship between \(D\) and \({\mathbb{E}}\) (Fig. 1c). These results demonstrate that roommates tend to have more similar academic performance than expected by chance, suggesting the presence of roommate peer effects.
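Reading \(D\) as the average pairwise absolute difference in tier numbers, as described in Fig. 1b, a room’s \(D\) can be computed as follows (a sketch under that assumption):

```python
from itertools import combinations

def relative_difference(tiers):
    """Average pairwise |l_u - l_v| over the six roommate pairs of a
    4-person room; smaller D means more similar tier numbers."""
    pairs = list(combinations(tiers, 2))
    return sum(abs(u - v) for u, v in pairs) / len(pairs)

print(relative_difference((1, 1, 2, 2)))  # 4 unequal pairs / 6 ~= 0.67
```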

Assimilation of roommate academic performance

We generalize the tier combination analysis to its most granular form by working directly with the GPA percentile \(R\in\) [0, 1] (hereafter GPA for short). Specifically, similar to calculating the relative difference \(D\) in the tier combination for each dorm room, we develop an assimilation metric \(A\) to quantify the extent to which the GPAs of roommates differ from each other. Formally, the assimilation metric \(A\) for a 4-person dorm room is calculated by

$$A=1-\frac{1}{4}\sum _{u < v}\left|{R}_{u}-{R}_{v}\right|,$$

where \({R}_{u}\) and \({R}_{v}\) are the GPAs of roommates \(u\) and \(v\) , respectively. The assimilation \(A\) of a dorm room is between 0 and 1, with a larger value indicating that roommates have more similar academic performance. If there is no roommate peer effect, each roommate’s GPA should be independent and identically distributed (i.i.d.), and the theoretical assimilation \(A\) of all dorm rooms has a mean value of 0.5 (see Supplementary Information Section  4.1 for detailed explanations).
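A quick Monte Carlo check illustrates the i.i.d. baseline. The `assimilation` function below implements the pairwise-difference form of \(A\) given above (our reconstruction of the formula), and its mean under uniformly random GPA percentiles comes out near the theoretical value of 0.5:

```python
import random
from itertools import combinations

def assimilation(gpas):
    """A = 1 - (1/4) * sum of pairwise |R_u - R_v| for a 4-person room;
    A = 1 when all GPAs are equal, A = 0 for two GPAs at 0 and two at 1."""
    return 1 - sum(abs(u - v) for u, v in combinations(gpas, 2)) / 4

# Sanity check of the i.i.d. baseline: the mean should be close to 0.5
rooms = [[random.random() for _ in range(4)] for _ in range(100_000)]
print(sum(assimilation(r) for r in rooms) / len(rooms))  # ~0.5
```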

Inspired by permutation tests, often referred to as the “quadratic assignment procedure” in social network studies 74 , 75 , we perform a statistical hypothesis test to check whether the assimilation of dorm rooms in the actual data deviates significantly from its theoretical value. Specifically, we proxy theoretical assimilation via null-model assimilation that is calculated based on a roommate null model and compare it with actual assimilation. An appropriate null model of a complex system satisfies a collection of constraints and offers a baseline to examine whether displayed features of interest are truly nontrivial 71 , 72 , 73 . We start with the actual roommate configuration and randomly shuffle students between dorm rooms while preserving their compositions of cohort, gender, and major. By repeating this process, we construct a plausible roommate null model that consists of 1000 independent implementations (see Supplementary Information Section  3.1 for details). We find that the mean of actual assimilation (0.549) of all dorm rooms is 10.7% larger than that of null-model assimilation (0.496; Fig.  2a ). A Student’s t -test confirms that the two assimilation distributions have significantly different means ( P value < 0.001; see Supplementary Information Section  4.2 for details). These results suggest that roommate assimilation in academic performance is greater than expected by chance alone, demonstrating significant roommate peer effects.
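The random shuffling underlying the null model can be sketched as follows: students are permuted across dorm-room slots within each (cohort, gender, major) stratum, so that every room keeps its original composition of those attributes (the field names are illustrative):

```python
import random
from collections import defaultdict

def shuffle_rooms(students):
    """One null-model draw. `students` is a list of dicts with keys
    'cohort', 'gender', 'major', 'room' and 'gpa' (assumed layout)."""
    strata = defaultdict(list)
    for s in students:
        strata[(s["cohort"], s["gender"], s["major"])].append(s)
    null_rooms = defaultdict(list)
    for members in strata.values():
        slots = [s["room"] for s in members]  # room slots to refill
        random.shuffle(slots)
        for s, room in zip(members, slots):
            null_rooms[room].append(s["gpa"])
    return null_rooms  # room id -> GPAs of its null-model occupants
```

Repeating this draw 1000 times and recomputing the assimilation of each null-model configuration yields the kind of null distribution shown in gray in Fig. 2a.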

Figure 2

a The density distribution p(A) of assimilation \(A\) for all dorm rooms. Larger assimilation means roommates have more similar academic performance. The upper half (in blue) shows the actual assimilation and the lower half (in gray) shows the null-model assimilation. Vertical dashed lines mark the statistically different means of the two assimilation distributions based on a Student’s t-test (***P value < 0.001). The mean actual assimilation is 10.7% larger than the mean null-model assimilation, which is close to its theoretical value of 0.5. The plot is based on the data from all five semesters. b The overall increasing trend in the actual assimilation from semester 1 to semester 5. The y-axis shows the percentage difference between the mean actual assimilation and the mean null-model assimilation. Error bars represent standard errors over 100 independent implementations.

The extent to which the mean of the actual assimilation exceeds that of the null-model assimilation indicates the magnitude of roommate peer effects, allowing us to examine temporal trends over the five semesters. First, we find that roommate peer effects remain significant when measured using data from each semester (see Supplementary Information Section 4.1 for details). Second, we hypothesize that before the first semester (i.e., the first day of college), roommate peer effects should be 0 due to the plausibly random assignment of roommates, so the actual assimilation should be close to the null-model assimilation. As roommates live together longer and establish stronger interactions with each other, the actual assimilation of roommate academic performance should become larger, and the magnitude of roommate peer effects should grow. To test this hypothesis, for each semester we calculate the percentage difference between the means of the actual assimilation and the null-model assimilation, the latter serving as a proxy for the assimilation expected before the first semester (see Supplementary Information Section 4.1 for details). We find that the percentage difference exhibits an overall increasing trend over time (Fig. 2b), which supports the hypothesis that, as roommates live together longer, the magnitude of roommate peer effects on academic performance becomes larger. These results are robust when we use an alternative way to estimate the magnitude of roommate peer effects, in which we calculate the share of dorm rooms with larger-than-null-model assimilation (see Supplementary Information Section 4.3 for details). Moreover, our further analysis shows that female and male students have similar assimilation, suggesting no significant gender differences (see Supplementary Information Section 4.4 for details).

The effects of heterogeneous peers

The increasing assimilation of roommates in their academic performance raises the question of how a student’s future performance is affected by their roommates’ prior performance, especially when there is substantial peer heterogeneity in performance, e.g., when there are both high-achieving and underachieving roommates. To answer this question, we employ regression models to perform a Granger-causality type of statistical analysis. Specifically, we first examine the relationship between a student’s post-GPA (GPA_Post; e.g., their own GPA in the second semester) and the average prior GPA of their roommates (RM_Avg; e.g., their roommates’ average GPA in the first semester) by calculating pairwise correlations for all consecutive semesters and dorm rooms. We find that dorm rooms tend to occupy the diagonal of the “GPA_Post – RM_Avg” plane (Fig. 3a), suggesting that a student’s post-GPA is positively associated with the average prior GPA of their roommates. We then use an ordinary least squares (OLS) model to study the relationship between GPA_Post and RM_Avg (see “Methods” for the empirical specification) and summarize the regression results in Table 1. We find that, without controlling for the effects of other factors (see column (1) of Table 1), the average prior GPA of roommates has a significantly positive effect on a student’s post-GPA (regression coefficient b = 0.365; P value < 0.001; Fig. 3b).

Figure 3

a The two-dimensional histogram shows the distributions of dorm rooms on the “GPA_Post – RM_Avg” plane. The y-axis shows the student’s post-GPA (GPA_Post), and the x-axis shows the average prior GPA of roommates (RM_Avg). It shows a positive correlation between GPA_Post and RM_Avg (Pearson’s \(r\)  = 0.244; P value < 0.001). b The regression plot for the relationship between GPA_Post and RM_Avg (center line) with the 95% confidence intervals (error bands), where the model includes no controls. c The plot for the relationship between GPA_Post and RM_Avg, where the model includes controls and fixed effects (see Table  1 for details). The “Low” and “High” on the x-axis represent 1 standard deviation (SD) below and above the mean (“Mid”) of RM_Avg, respectively. The horizontal dashed line marks the regression constant. d The plot for the moderating effects of peer heterogeneity. The relationship between GPA_Post and RM_Avg is moderated by the differences in roommate prior GPAs (RM_Diff). The “Low” and “High” in the legend represent 1 SD below and above the mean (“Mid”) of RM_Diff, respectively. The horizontal dashed line marks the regression constant.

Other factors may independently affect a student’s post-GPA and confound its association with the average prior GPA of their roommates. Therefore, we add controls and fixed effects to the OLS model (see “Methods”). The regression results shown in Table 1 convey several findings. First, a student’s prior GPA has the strongest effect on their post-GPA (\(b\) = 0.801, which is 16 times as large as \(b\) = 0.050 for roommate average prior GPA; see column (2) of Table 1), suggesting a significant path dependence in academic achievement. Second, the positive effect of roommate average prior GPA on a student’s post-GPA remains significant after controlling for the student’s prior GPA, gender, cohort, major, and semester (P value < 0.01; see column (2) of Table 1 and Fig. 3c). Notably, female students perform better than male students on average (see Supplementary Information Section 4.4 for details). Third, the differences in roommate prior GPAs (RM_Diff) have no significant direct effect (P value > 0.1; see columns (3) and (4) of Table 1), but they significantly moderate the relationship between roommate average prior GPA and post-GPA (see column (5) of Table 1 and Fig. 3d) such that the positive relationship is more pronounced (slope b = 0.055; 95% CI = [0.040, 0.070]) when RM_Diff is high (i.e., 1 SD above its mean) and less pronounced (slope b = 0.028; 95% CI = [\(-\)0.001, 0.057]) when RM_Diff is low (i.e., 1 SD below its mean; see Supplementary Information Section 5.1 for detailed results of a simple slope test). The results also show that high post-GPA is associated with large differences in roommates’ prior GPAs when the roommates’ average prior GPA is low (see the red line on the lower left of Fig. 3d).

While the regression results suggest that roommate peer effects are significant, it is worth noting that the effect size appears to be modest. Specifically, a 100-point increase in roommate average prior GPA is associated with a 5-point increase in post-GPA ( \(b\)  = 0.050; see column (4) of Table  1 ). The effect is about 6% as large as the effect of a 100-point increase in prior GPA ( \(b\)  = 0.801), and it is about 10% of the average post-GPA. The magnitude is at a similar scale as reported by prior studies for various environments (e.g., dormitories and classrooms) and cultures (e.g., Western universities; see Supplementary Information Section  5.1 for details). To demonstrate its significance, we perform a falsification test by running the same OLS regression on the roommate null model, finding that the reported results are nontrivial (see Supplementary Information Section  5.3 for details). Together, these regression results suggest that a student’s performance is impacted not only by the average performance of roommates but also by their heterogeneity in academic performance.

The effects of in-dorm ordinal rank

Dorm rooms provide a highly interpersonal yet local environment, where competitive dynamics between roommates may affect their academic performance. Conditional on absolute academic performance, the ordinal rank of a student in their dorm room could have an independent effect on future achievement 80 , 81 . For instance, when a student’s ordinal rank is consistently low across all semesters, even if their absolute performance is high (e.g., the student has a GPA \(R\) = 0.9 and their roommates all have \(R\) > 0.9), they may still feel discouraged and less motivated, leading to fewer interactions with others and a potential decline in performance (see Supplementary Information Section 5.2 for explanations). This motivates us to study how a student’s in-dorm ordinal rank (OR_InDorm, with 1 being the highest and 4 the lowest according to prior performance; i.e., the number of room members, including the student themselves, who perform at least as well) affects their post-GPA. Specifically, we employ an OLS model that controls for the student’s prior GPA, their roommates’ average prior GPA, the differences in prior GPAs, gender, and semester, and also includes fixed effects for cohort and major (see “Methods” for the empirical specification). We find that ordinal rank has a significantly positive effect on post-GPA (P value < 0.05; see columns (1) and (2) of Table 2 and Fig. 4a), suggesting that the number of better-achieving roommates in the dorm room predicts a student’s better academic performance in the future.
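For concreteness, the in-dorm ordinal rank can be computed as follows (a sketch under our reading of the definition above):

```python
def in_dorm_rank(prior_gpas, i):
    """OR_InDorm for student i: the number of room members (including
    the student) whose prior GPA is at least student i's, so 1 is the
    highest rank and 4 the lowest in a 4-person room."""
    return sum(1 for g in prior_gpas if g >= prior_gpas[i])
```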

Figure 4

a The plot for the relationship between a student’s GPA in the current semester (GPA_Post) and their ordinal rank according to GPA in the previous semester (OR_InDorm), where a larger rank value corresponds to a lower GPA. The OLS regression model includes controls and fixed effects (see Table  2 for details). The “Low” and “High” on the x-axis represent 1 standard deviation (SD) below and above the mean (“Mid”) of OR_InDorm, respectively. The horizontal dashed line marks the regression constant. b The plot for the moderating effects of peer heterogeneity. The relationship between GPA_Post and OR_InDorm is moderated by the differences in roommate GPAs in the previous semester (RM_Diff). The “Low” and “High” in the legend represent 1 SD below and above the mean (“Mid”) of RM_Diff, respectively. The horizontal dashed line marks the regression constant.

Through regression, we further examine whether the positive relationship between ordinal rank and post-GPA is moderated by other factors. We find that neither the interaction term of ordinal rank and own prior GPA nor that of ordinal rank and roommates’ average prior GPA is significant (P value > 0.1; see columns (3) and (4) of Table 2). Yet, the interaction term of ordinal rank and differences in roommate prior GPAs (RM_Diff) is significantly negative (P value < 0.05; see column (5) of Table 2). Specifically, the effect of ordinal rank on post-GPA is more pronounced (slope b = 0.007; 95% CI = [0.002, 0.012]) when RM_Diff is low (Fig. 4b), while the effect is not significant (slope b = \(-\)0.000; 95% CI = [\(-\)0.007, 0.007]) when RM_Diff is high (see Supplementary Information Section 5.2 for detailed results of a simple slope test). The results also show that high post-GPA is associated with large differences in roommate prior GPAs when ordinal rank is low (see the red line on the lower left of Fig. 4b). Although the effect size is modest, our falsification test on the roommate null model demonstrates that the results are nontrivial and significant (see Supplementary Information Section 5.3 for details). Taken together, these results suggest that roommate peer effects tend to disproportionately benefit underachieving students with homogeneous roommates (i.e., those whose roommates have similar performance) and high-achieving students with heterogeneous roommates (i.e., those whose roommates have widely varied performance).

Discussion

We quantified roommate peer effects on academic performance by applying both null-model and regression approaches to a longitudinal dataset of student accommodation and academic performance, where roommate assignments are plausibly random upon enrollment and roommate relationships persist until graduation. We found evidence that roommates directly influence a student’s performance, with the strength of that influence varying with the heterogeneity of the roommates’ performance and the student’s own baseline achievement. Specifically, by constructing a roommate null model and calculating an assimilation metric, we showed that roommates have more similar performance than expected by chance alone. Moreover, the average assimilation of roommate academic performance exhibits an overall increasing trend over time, suggesting that peer effects become stronger as roommates live together longer, become more familiar with each other, and establish stronger interactions that facilitate knowledge spillovers 61 , 65 , 82 . More specifically, the increase in assimilation is more pronounced in the third semester (Fig. 2b and Supplementary Fig. 8), which is consistent with previous literature showing that peer effects are strong and persistent when friendships last over a year 79 , 83 . The trend appears to be disrupted in the fifth semester, possibly because senior students are more likely to take different elective courses and engage in more outside activities, which may decrease the interactions between roommates 84 .

Our regression analysis further unpacks roommate peer effects, especially along the dimension of peer heterogeneity. We found that a student’s future performance is not only strongly predicted by their own prior performance, suggesting a significant path dependence in academic development 85 , 86 , 87 , but also affected by their roommates’ prior performance. Also, the positive relationship between a student’s future performance and the average prior performance of roommates is moderated by peer heterogeneity such that it is more pronounced when roommates are similar. In particular, when living with roommates who have, on average, low prior performance, a student benefits more if the roommates are more varied, suggesting a positive role of peer heterogeneity 88 , 89 , 90 . Moreover, ordinal rank in the dorm room has an independent effect, since the number of better-achieving roommates is positively associated with future performance. Yet, peer heterogeneity moderates this relationship such that it is significant only when roommates are more similar. The magnitudes of peer effects assessed using regression may appear modest, but they are significant and in line with the literature. Together, these results paint a rich picture of roommate peer effects and suggest that the most effective strategy for improving a student’s performance may depend on their position in a high-dimensional space of ordinal rank, peer average performance, and peer heterogeneity.

While our work helps better understand roommate peer effects, the results should be interpreted in light of the limitations of the data and analysis. First, the longitudinal data were limited to two cohorts of Chinese undergraduates at one university. The extent to which these findings generalize to other student populations, universities, and countries should be investigated further where relevant data on student accommodation and academic outcomes are available. Second, the roommate assignments were plausibly random according to the administrative procedures. While we provide some supporting evidence for this assumption (see Supplementary Information Section 1.2 for details), we lacked comprehensive data on student demographics, personal information, and pre-college academic performance to examine it directly. Third, the analysis relies on GPA percentiles normalized for each cohort and major, which allows for fair comparisons between disciplines but discards some information in the data. A better normalization that preserves the distribution of GPAs, for example, would be an improvement. Fourth, factors outside the dormitory environment may mediate the assimilation of roommates’ academic performance, such as orderliness, classroom interactions, social networks, behavior patterns, and common external factors 16 , 17 , 65 . Unraveling the mechanisms underlying roommate peer effects (e.g., peer pressure and student identity 91 ) was beyond the reach of this study but is desirable as future work.

In summary, we demonstrate the peer effect of college roommates and assess its magnitude by employing basic statistical methods to analyze new longitudinal data from a quasi-experiment. The university dorm room environment is ideal for identifying a group of frequently interacting and stable student peers whose learning outcomes can be easily tracked over time. The null model we use, which is essentially a permutation test 75 , 76 , does not assume linear relationships between variables and is flexible enough to be applied to study peer effects in other complex social systems. Also, effect sizes assessed by the null model can facilitate comparisons between different datasets. Moreover, the regression model allows us to address concerns about reverse causality and better understand peer effects. In particular, the regression findings have potential policy implications for education and dormitory management. For example, by adjusting the composition of roommates, such as reducing peer heterogeneity for students with, on average, high-achieving roommates, dorm rooms may be engineered, to some extent, to enhance the positive influence of roommates on students’ academic performance. Furthermore, our findings suggest the benefits of exposure to student role models and of learning from peers in everyday life, not only from teachers in classrooms.

Methods

Data description

Chinese universities provide on-campus dormitories for almost all undergraduates, allowing us to observe a large-scale longitudinal sample of student roommates and relate it to their academic performance. From a public university in China, we collected the accommodation and academic performance data of 5,272 undergraduates who lived in identical 4-person dorm rooms in the same or nearby dorm buildings on campus. Unlike a dorm suite, which contains four separate bedrooms, a 4-person dorm room is a single bedroom with four beds, where each student occupies one bed and shares the public areas with roommates (see Supplementary Fig. 1 for an example layout). Per the university’s student accommodation management regulations, newly admitted students were assigned to dorm rooms under the condition that those in the same administrative unit, major, or school live together as much as possible and that there is no gender mix within dorm rooms or buildings. The process neither allowed students to choose roommates or rooms nor took into account their academic performance before admission, socioeconomic backgrounds, or personal preferences. Students learned their accommodation only when they moved in before the first semester. As a quasi-experiment, the administrative procedure resulted in a plausibly, if not perfectly, random assignment of roommates with respect to their prior academic performance and personal information. Moreover, there was no significant individual selection later in the semesters. Once assigned together, roommates lived together until graduation; moving out or changing roommates happened only on rare occasions (see Supplementary Information Section 1.2 for more details).

The dataset covers two cohorts of Chinese undergraduates who were admitted by the university in 2011 and 2012, respectively. For each student, we solicited information about their cohort, gender, major, and dorm room, based on which we determined roommate relationships. As a measure of academic performance, we collected the GPA data of these students for the first five successive semesters up to 2014 and further normalized it for each semester to a GPA percentile for students in the same cohort and major (see Supplementary Information Section  1.2 for details). The stable roommate relationship and the longitudinal academic performance data allowed us to study how a student is affected by roommates over time. All students were anonymized in the data collection and analysis process, and the dataset contains no identifiable information. This study was approved by the Institutional Review Board (IRB) at the University of Electronic Science and Technology of China (IRB No. 1061420210802005).
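The percentile normalization can be expressed in a few lines. The sketch below assumes a long-format pandas DataFrame with one row per student-semester; the column names are ours, not the dataset’s:

```python
import pandas as pd

def add_gpa_percentile(df: pd.DataFrame) -> pd.DataFrame:
    """Add the GPA percentile R within each cohort, major and semester.
    rank(pct=True) yields values in (0, 1]; the paper's R in [0, 1] may
    handle ties and endpoints slightly differently."""
    out = df.copy()
    out["R"] = out.groupby(["cohort", "major", "semester"])["gpa"].rank(pct=True)
    return out
```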

Statistical hypothesis test

Given a tier classification of students’ GPA, following permutation tests 74 , 75 , 76 , we perform a statistical test to examine whether the relative ratio \({\mathbb{E}}\) of each combination (e.g., 1111) in the actual data deviates significantly from its theoretical value of 0. Specifically, we generate a roommate null model by implementing the random shuffling process and calculate the null-model relative ratio for each combination: \(\widetilde{{\mathbb{E}}}=\left({P}_{n}-{P}_{t}\right)/{P}_{t}\), where \({P}_{n}\) and \({P}_{t}\) are the null-model and theoretical probabilities of the combination, respectively. By construction of the null model, \({P}_{n}\) should approach \({P}_{t}\), and thus \(\widetilde{{\mathbb{E}}}\) should be close to 0. For each combination, we compare the actual \({\mathbb{E}}\) with its null-model \(\widetilde{{\mathbb{E}}}\). If \({\mathbb{E}}\) is significantly above 0, the probability of observing \({\mathbb{E}}\le \widetilde{{\mathbb{E}}}\) should be sufficiently small, e.g., less than 0.001. Accordingly, our null hypothesis (H0) is \({\mathbb{E}}\le \widetilde{{\mathbb{E}}}\), and the alternative hypothesis (H1) is \({\mathbb{E}} \, > \, \widetilde{{\mathbb{E}}}\). To empirically test H0, we generate 1000 roommate null models (each an independent implementation of the random shuffling process) and calculate \(\widetilde{{\mathbb{E}}}\) under the 2-tier, 3-tier, and 4-tier classifications, respectively. We find that \({\mathbb{E}}\) of some combinations is larger than \(\widetilde{{\mathbb{E}}}\) in all 1000 roommate null models, allowing us to reject H0 and support H1 (i.e., \({\mathbb{E}}\) is significantly larger than 0 with a P value < 0.001 in the one-sided statistical test; the combination is over-represented in the actual data). Similarly, we test whether \({\mathbb{E}}\) of a combination is significantly below 0. Under the 2-tier classification, for example, combinations with significantly positive \({\mathbb{E}}\) include 1111 and 2222 (P value < 0.001) and those with significantly negative \({\mathbb{E}}\) include 1112 and 1122 (P value < 0.001) as well as 1222 (P value < 0.05; see Supplementary Table 2 for the statistical testing results for each combination under these tier classifications). Overall, we find that significantly positive combinations have the same or nearby tier numbers and significantly negative ones have distant tier numbers.

To perform a single statistical test for all combinations together under a given tier classification, we calculate the total relative ratios \(\sum \left|{\mathbb{E}}\right|\) and \(\sum \left|\widetilde{{\mathbb{E}}}\right|\) by summing the absolute \({\mathbb{E}}\) and \(\widetilde{{\mathbb{E}}}\) of each combination, respectively. As \(\widetilde{{\mathbb{E}}}\) is close to 0, \(\sum \left|\widetilde{{\mathbb{E}}}\right|\) should also be close to 0. If \(\sum \left|{\mathbb{E}}\right|\le \sum \left|\widetilde{{\mathbb{E}}}\right|\), it follows that \(\sum \left|{\mathbb{E}}\right|\) is close to 0 and hence that each \({\mathbb{E}}\) is close to 0. In that case, \({\mathbb{E}}\) and \(\widetilde{{\mathbb{E}}}\) would not differ significantly, because both are close to 0. Thus, to conclude that \({\mathbb{E}}\) differs significantly from \(\widetilde{{\mathbb{E}}}\), the probability of observing \(\sum \left|{\mathbb{E}}\right|\le \sum \left|\widetilde{{\mathbb{E}}}\right|\) should be sufficiently small, e.g., less than 0.001. Accordingly, our null hypothesis (H0) is \(\sum \left|{\mathbb{E}}\right|\le \sum \left|\widetilde{{\mathbb{E}}}\right|\), and the alternative hypothesis (H1) is \(\sum \left|{\mathbb{E}}\right| > \sum \left|\widetilde{{\mathbb{E}}}\right|\). We find that, under the 2-tier, 3-tier, and 4-tier classifications, \(\sum \left|{\mathbb{E}}\right|\) is always larger than \(\sum \left|\widetilde{{\mathbb{E}}}\right|\) across all 1000 roommate null models, allowing us to reject H0 and support H1 with a P value < 0.001 (i.e., the overall \({\mathbb{E}}\) of all combinations differs from 0). Taken together, our hypothesis testing results suggest that \({\mathbb{E}}\) of some combinations in the actual data deviates significantly from 0, where combinations with nearby tier numbers are more likely to be observed and those with distant tier numbers are less likely to be observed than expected by chance, suggesting significant roommate peer effects (see Supplementary Information Section 3.2 for details).
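The empirical P value of such a one-sided permutation-style test can be computed as below. The +1 smoothing is a common convention (our choice, not necessarily the authors’) under which an actual statistic exceeding all 1000 null draws yields P < 0.001:

```python
def one_sided_p(actual, null_stats):
    """Share of null-model draws whose statistic reaches the actual one,
    e.g. actual = sum(|E|) and null_stats = 1000 values of sum(|E~|)."""
    exceed = sum(1 for s in null_stats if s >= actual)
    return (exceed + 1) / (len(null_stats) + 1)
```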

Regression model

We employ an ordinary least squares (OLS) model to study the relationship between a student’s future performance (GPA_Post) and the average prior performance of their roommates (RM_Avg), and how this relationship is moderated by the differences in roommate prior performance (RM_Diff). The OLS model includes several controls on student demographics and prior performance. Specifically, the empirical specification is given by

$${G}_{i}^{s+1}={\beta }_{0}+{\beta }_{1}{G}_{i}^{s}+{\beta }_{2}{{RA}}_{i}^{s}+{\beta }_{3}{{RD}}_{i}^{s}+{\beta }_{4}\left({{RA}}_{i}^{s}\times {{RD}}_{i}^{s}\right)+{\beta }_{5}{D}^{{Ge}}+{\gamma }_{{Ma}}{D}^{{Ma}}+{\gamma }_{{Co}}{D}^{{Co}}+{\gamma }_{{Se}}{D}^{{Se}}+{\epsilon }_{i},$$

where \({\epsilon }_{i}\) is the error term for student i , and the semester index s ranges from 1 to 4. The dependent variable \({G}_{i}^{s+1}\) is the student’s GPA in semester s  + 1 (GPA_Post), and the independent variable of interest \({G}_{i}^{s}\) is the student’s GPA in semester s (GPA_Prior). The variable \({{RA}}_{i}^{s}\) is the roommate average GPA in semester s (RM_Avg), \({{RD}}_{i}^{s}\) is the differences in roommate GPAs in semester s (RM_Diff), and \({{RA}}_{i}^{s}\times {{RD}}_{i}^{s}\) is their interaction term. The variable \({D}^{{Ge}}\) is a gender dummy, which is coded as 1 and 0 for females and males, respectively. The variables \({D}^{{Ma}}\) , \({D}^{{Co}}\) , and \({D}^{{Se}}\) are major, cohort, and semester dummies, respectively (see Supplementary Table  3 for details).
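In code, this specification maps onto a standard formula interface. A sketch using statsmodels, with the illustrative column names from the normalization sketch above, might read:

```python
import statsmodels.formula.api as smf

# GPA_Post on GPA_Prior, the RM_Avg x RM_Diff interaction, gender, and
# major/cohort/semester fixed effects (cf. column (5) of Table 1).
model = smf.ols(
    "gpa_post ~ gpa_prior + rm_avg * rm_diff + gender"
    " + C(major) + C(cohort) + C(semester)",
    data=df,
).fit()
print(model.summary())
```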

Moreover, we employ an OLS model to study the relationship between a student’s in-dorm ordinal rank (OR_InDorm) according to prior performance and their future performance, after controlling for their prior performance, the average of and differences in roommate prior performance, and their gender, major, cohort, and semester. Meanwhile, we examine how this relationship is moderated by other factors, including peer heterogeneity. Specifically, the empirical specification is given by

$${G}_{i}^{s+1}={\beta }_{0}+{\beta }_{1}{{OR}}_{i}^{s}+{\beta }_{2}{G}_{i}^{s}+{\beta }_{3}{{RA}}_{i}^{s}+{\beta }_{4}{{RD}}_{i}^{s}+{\beta }_{5}\left({{OR}}_{i}^{s}\times {G}_{i}^{s}\right)+{\beta }_{6}\left({{OR}}_{i}^{s}\times {{RA}}_{i}^{s}\right)+{\beta }_{7}\left({{OR}}_{i}^{s}\times {{RD}}_{i}^{s}\right)+{\beta }_{8}{D}^{{Ge}}+{\gamma }_{{Ma}}{D}^{{Ma}}+{\gamma }_{{Co}}{D}^{{Co}}+{\gamma }_{{Se}}{D}^{{Se}}+{\epsilon }_{i},$$

where \({{OR}}_{i}^{s}\) is the OR_InDorm of student \(i\) in semester \(s\) (ranging from 1 to 4) and \({\epsilon }_{i}\) is the error term. The interaction terms are \({{OR}}_{i}^{s}\times {G}_{i}^{s}\) between OR_InDorm and GPA_Prior, \({{OR}}_{i}^{s}\times {{RA}}_{i}^{s}\) between OR_InDorm and RM_Avg, and \({{OR}}_{i}^{s}\times {{RD}}_{i}^{s}\) between OR_InDorm and RM_Diff for student \(i\) in semester \(s\) . All other controls are the same as above (see Supplementary Information Section  5 for details on these variables and Supplementary Table  3 for summary statistics).
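The ordinal-rank specification is analogous in code; `or_indorm` and the other column names are again illustrative, and the patsy `*` operator expands to main effects plus the three interaction terms described above:

```python
rank_model = smf.ols(
    "gpa_post ~ or_indorm * (gpa_prior + rm_avg + rm_diff) + gender"
    " + C(major) + C(cohort) + C(semester)",
    data=df,
).fit()
```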

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

All data necessary to replicate the statistical analyses and main figures are available in Supplementary Information and have been deposited in the open-access repository Figshare ( https://doi.org/10.6084/m9.figshare.25286017 ) 92 . The raw data of anonymized student accommodation and academic performance are protected by a data use agreement. Those who are interested in the raw data may contact the corresponding authors for access after obtaining Institutional Review Board (IRB) approval.

Code availability

All code necessary to replicate the statistical analyses and main figures has been deposited in the open-access repository Figshare ( https://doi.org/10.6084/m9.figshare.25286017 ) 92 .

Lewis, K., Gonzalez, M. & Kaufman, J. Social selection and peer influence in an online social network. Proc. Natl Acad. Sci. USA 109 , 68–72 (2012).


Aral, S., Muchnik, L. & Sundararajan, A. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc. Natl Acad. Sci. USA 106 , 21544–21549 (2009).


Centola, D. The spread of behavior in an online social network experiment. Science 329 , 1194–1197 (2010).

Muchnik, L., Aral, S. & Taylor, S. J. Social influence bias: a randomized experiment. Science 341 , 647–651 (2013).

Wolske, K. S., Gillingham, K. T. & Schultz, P. Peer influence on household energy behaviours. Nat. Energy 5 , 202–212 (2020).


Falk, A. & Ichino, A. Clean evidence on peer effects. J. Labor Econ. 24 , 39–57 (2006).


Banerjee, A., Chandrasekhar, A. G., Duflo, E. & Jackson, M. O. The diffusion of microfinance. Science 341 , 1236498 (2013).


Contractor, N. S. & DeChurch, L. A. Integrating social networks and human social motives to achieve social influence at scale. Proc. Natl Acad. Sci. USA 111 , 13650–13657 (2014).

Aral, S. & Nicolaides, C. Exercise contagion in a global social network. Nat. Commun. 8 , 14753 (2017).

Quispe-Torreblanca, E. G. & Stewart, N. Causal peer effects in police misconduct. Nat. Hum. Behav. 3 , 797–807 (2019).

Bramoullé, Y., Djebbari, H. & Fortin, B. Peer effects in networks: a survey. Annual Review of Economics 12 , 603–629 (2020).

Zimmerman, D. J. Peer effects in academic outcomes: evidence from a natural experiment. Rev. Econ. Stat. 85 , 9–23 (2003).

Imberman, S. A., Kugler, A. D. & Sacerdote, B. I. Katrina’s children: evidence on the structure of peer effects from hurricane evacuees. Am. Economic Rev. 102 , 2048–2082 (2012).

Feld, J. & Zölitz, U. Understanding peer effects: on the nature, estimation, and channels of peer effects. J. Labor Econ. 35 , 387–428 (2017).

Golsteyn, B. H. H., Non, A. & Zölitz, U. The impact of peer personality on academic achievement. J. Political Econ. 129 , 1052–1099 (2021).

Cao, Y. et al. Orderliness predicts academic performance: behavioural analysis on campus lifestyle. J. R. Soc. Interface 15 , 20180210 (2018).


Yao, H., Lian, D., Cao, Y., Wu, Y. & Zhou, T. Predicting academic performance for college students: a campus behavior perspective. ACM Trans. Intell. Syst. Technol. 10 , 24 (2019).

Sacerdote, B. Peer effects in education: how might they work, how big are they and how much do we know thus far? Handb. Econ. Educ. 3 , 249–277 (2011).

Lewbel, A., Qu, X. & Tang, X. Social networks with unobserved links. J. Political Econ. 131 , 898–946 (2023).

Ha, W. Quasi-experimental evidence of academic peer effects at an Elite University in People’s Republic of China. Asia Pac. Educ. Rev. 17 , 703–718 (2016).

Ma, L. & Huang, Y. Roommate effects on students’ performance in higher education. Peking. Univ. Educ. Rev. 19 , 41–63 (2021).


Gao, J., Zhang, Y. C. & Zhou, T. Computational socioeconomics. Phys. Rep. 817 , 1–104 (2019).


Hoxby, C. M. Peer effects in the classroom: learning from gender and race variation. National Bureau of Economic Research. NBER Working Paper No. w7867 https://doi.org/10.3386/w7867 (2000).

Carman, K. G. & Zhang, L. Classroom peer effects and academic achievement: evidence from a Chinese middle school. China Economic Rev. 23 , 223–237 (2012).

Lavy, V., Paserman, M. D. & Schlosser, A. Inside the black box of ability peer effects: evidence from variation in the proportion of low achievers in the classroom. Economic J. 122 , 208–237 (2012).

Burke, M. A. & Sass, T. R. Classroom peer effects and student achievement. J. Labor Econ. 31 , 51–82 (2013).

Carrell, S. E., Fullerton, R. L. & West, J. E. Does your cohort matter? measuring peer effects in college achievement. J. Labor Econ. 27 , 439–464 (2009).

Lavy, V. & Schlosser, A. Mechanisms and impacts of gender peer effects at school. Am. Economic J.: Appl. Econ. 3 , 1–33 (2011).

Padilla-Walker, L. M. & Bean, R. A. Negative and positive peer influence: relations to positive and negative behaviors for African American, European American, and Hispanic adolescents. J. Adolescence 32 , 323–337 (2009).

Brady, R. R., Insler, M. A. & Rahman, A. S. Bad company: understanding negative peer effects in college achievement. Eur. Economic Rev. 98 , 144–168 (2017).

Eckles, D., Kizilcec, R. F. & Bakshy, E. Estimating peer effects in networks with peer encouragement designs. Proc. Natl Acad. Sci. USA 113 , 7316–7322 (2016).

Booij, A. S., Leuven, E. & Oosterbeek, H. Ability peer effects in university: evidence from a randomized experiment. Rev. Economic Stud. 84 , 547–578 (2017).

Christakis, N. A. & Fowler, J. H. The collective dynamics of smoking in a large social network. N. Engl. J. Med. 358 , 2249–2258 (2008).


Sotoudeh, R., Harris, K. M. & Conley, D. Effects of the peer metagenomic environment on smoking behavior. Proc. Natl Acad. Sci. USA 116 , 16302–16307 (2019).

Kawaguchi, D. Peer effects on substance use among American teenagers. J. Popul. Econ. 17 , 351–367 (2004).

Duncan, G. J., Boisjoly, J., Kremer, M., Levy, D. M. & Eccles, J. Peer effects in drug use and sex among college students. J. Abnorm. Child Psychol. 33 , 375–385 (2005).

Borsari, B. & Carey, K. B. Peer influences on college drinking: a review of the research. J. Subst. Abus. 13 , 391–424 (2001).


Kremer, M. & Levy, D. Peer effects and alcohol use among college students. J. Economic Perspect. 22 , 189–206 (2008).

Han, Y., Grogan-Kaylor, A., Delva, J. & Xie, Y. Estimating the heterogeneous relationship between peer drinking and youth alcohol consumption in Chile using propensity score stratification. Int. J. Environ. Res. Public Health 11 , 11879–11897 (2014).

Strother, L., Piston, S., Golberstein, E., Gollust, S. E. & Eisenberg, D. College roommates have a modest but significant influence on each other’s political ideology. Proc. Natl Acad. Sci. USA 118 , e2015514117 (2021).

Article   CAS   PubMed   Google Scholar  

Xie, Y., Fang, M. & Shauman, K. STEM education. Annu. Rev. Sociol. 41 , 331–357 (2015).

Dennehy, T. C. & Dasgupta, N. Female peer mentors early in college increase women’s positive academic experiences and retention in engineering. Proc. Natl Acad. Sci. USA 114 , 5964–5969 (2017).

Wu, D. J., Thiem, K. C. & Dasgupta, N. Female peer mentors early in college have lasting positive impacts on female engineering students that persist beyond graduation. Nat. Commun. 13 , 6837 (2022).

Bostwick, V. K. & Weinberg, B. A. Nevertheless she persisted? gender peer effects in doctoral STEM programs. J. Labor Econ. 40 , 397–436 (2022).

Thiemann, P. The persistent effects of short-term peer groups on performance: evidence from a natural experiment in higher education. Manag. Sci. 68 , 1131–1148 (2022).

Jones, T. R. & Kofoed, M. S. Do peers influence occupational preferences? evidence from randomly-assigned peer groups at West Point. J. Public Econ. 184 , 104154 (2020).

Herbst, D. & Mas, A. Peer effects on worker output in the laboratory generalize to the field. Science 350 , 545–549 (2015).

Anelli, M. & Peri, G. The effects of high school peers’ gender on college major, college performance and income. Economic J. 129 , 553–602 (2019).

Hasan, S. & Koning, R. Prior ties and the limits of peer effects on startup team performance. Strategic Manag. J. 40 , 1394–1416 (2019).

Chetty, R. et al. How does your kindergarten classroom affect your earnings? evidence from Project STAR. Q. J. Econ. 126 , 1593–1660 (2011).

Cornelissen, T., Dustmann, C. & Schnberg, U. Peer effects in the workplace. Am. Economic Rev. 107 , 425–456 (2017).

Carrell, S. E., Hoekstra, M. & Kuka, E. The long-run effects of disruptive peers. Am. Economic Rev. 108 , 3377–3415 (2018).

Cheng, S., Brand, J. E., Zhou, X., Xie, Y. & Hout, M. Heterogeneous returns to college over the life course. Sci. Adv. 7 , eabg7641 (2021).

Article   ADS   PubMed   PubMed Central   Google Scholar  

DeLay, D. et al. Peer influence on academic performance: a social network analysis of social-emotional intervention effects. Prev. Sci. 17 , 903–913 (2016).

Article   ADS   PubMed   Google Scholar  

Li, Z. A., Wang, G. & Wang, H. Peer effects in competitive environments: field experiments on information provision and interventions. Manag. Inf. Syst. Q. 45 , 163–191 (2021).

Kiessling, L., Radbruch, J. & Schaube, S. Self-selection of peers and performance. Manag. Sci. 68 , 8184–8201 (2022).

Currarini, S., Jackson, M. O. & Pin, P. Identifying the roles of race-based choice and chance in high school friendship network formation. Proc. Natl Acad. Sci. USA 107 , 4857–4861 (2010).

Mercken, L., Steglich, C., Sinclair, P., Holliday, J. & Moore, L. A longitudinal social network analysis of peer influence, peer selection, and smoking behavior among adolescents in British schools. Health Psychol. 31 , 450 (2012).

Jackson, M. O., Nei, S. M., Snowberg, E. & Yariv, L. The dynamics of networks and homophily. National Bureau of Economic Research, NBER Working Paper No. w30815 https://doi.org/10.3386/w30815 (2023).

Sacerdote, B. Peer effects with random assignment: results for Dartmouth roommates. Q. J. Econ. 116 , 681–704 (2001).

Zhang, L. & Pu, S. It takes two shining lights to brighten the room: peer effects with random roommate assignments. Educ. Econ. 25 , 3–21 (2017).

Chen, Y. & Snmez, T. Improving efficiency of on-campus housing: an experimental study. Am. Economic Rev. 92 , 1669–1686 (2002).

Pu, S., Yan, Y. & Zhang, L. Peers, study effort, and academic performance in college education: evidence from randomly assigned roommates in a flipped classroom. Res. High. Educ. 61 , 248–269 (2020).

Eckles, D. & Bakshy, E. Bias and high-dimensional adjustment in observational studies of peer effects. J. Am. Stat. Assoc. 116 , 507–517 (2021).

Article   MathSciNet   CAS   Google Scholar  

Stadtfeld, C., Vrs, A., Elmer, T., Boda, Z. & Raabe, I. J. Integration in emerging social networks explains academic failure and success. Proc. Natl Acad. Sci. USA 116 , 792–797 (2019).

Whitmore, D. Resource and peer impacts on girls’ academic achievement: evidence from a randomized experiment. Am. Economic Rev. 95 , 199–203 (2005).

Duflo, E., Dupas, P. & Kremer, M. Peer effects, teacher incentives, and the impact of tracking: evidence from a randomized evaluation in Kenya. Am. Economic Rev. 101 , 1739–1774 (2011).

Lomi, A., Snijders, T. A., Steglich, C. E. & Torlo, V. J. Why are some more peer than others? evidence from a longitudinal study of social networks and individual academic performance. Soc. Sci. Res. 40 , 1506–1520 (2011).

Sacerdote, B. Experimental and quasi-experimental analysis of peer effects: two steps forward? Annu. Rev. Econ. 6 , 253–272 (2014).

Article   MathSciNet   Google Scholar  

Raudenbush, S. W. & Schwartz, D. Randomized experiments in education, with implications for multilevel causal inference. Annu. Rev. Stat. Its Application 7 , 177–208 (2020).

Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P. Community structure in time-dependent, multiscale, and multiplex networks. Science 328 , 876–878 (2010).

Article   ADS   MathSciNet   CAS   PubMed   Google Scholar  

Cimini, G. et al. The statistical physics of real-world networks. Nat. Rev. Phys. 1 , 58–71 (2019).

Váša, F. & Mišić, B. Null models in network neuroscience. Nat. Rev. Neurosci. 23 , 493–504 (2022).

Krackhardt, D. Predicting with networks: nonparametric multiple regression analysis of dyadic data. Soc. Netw. 10 , 359–381 (1988).

Hubert, L. & Schultz, J. Quadratic assignment as a general data analysis strategy. Br. J. Math. Stat. Psychol. 29 , 190–241 (1976).

Raudenbush, S. W. & Bryk, A. S. Hierarchical Linear Models: Applications and Data Analysis Methods . Vol. 1 (Sage Publications, 2002).

Boda, Z., Elmer, T., Vörös, A. & Stadtfeld, C. Short-term and long-term effects of a social network intervention on friendships among university students. Sci. Rep. 10 , 2889 (2020).

Angrist, J. D. The perils of peer effects. Labour Econ. 30 , 98–108 (2014).

Patacchini, E., Rainone, E. & Zenou, Y. Heterogeneous peer effects in education. J. Economic Behav. Organ. 134 , 190–227 (2017).

Murphy, R. & Weinhardt, F. Top of the class: the importance of ordinal rank. Rev. Economic Stud. 87 , 2777–2826 (2020).

Bertoni, M. & Nisticò, R. Ordinal rank and the structure of ability peer effects. J. Public Econ . 217 , 104797 (2023).

Webb, N. M. Student interaction and learning in small groups. Rev. Educ. Res. 52 , 421–445 (1982).

Foster, G. It’s not your peers, and it’s not your friends: some progress toward understanding the educational peer effect mechanism. J. Public Econ. 90 , 1455–1475 (2006).

Deslauriers, L., McCarty, L. S., Miller, K., Callaghan, K. & Kestin, G. Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom. Proc. Natl Acad. Sci. USA 116 , 19251–19257 (2019).

Erikson, R., Goldthorpe, J. H., Jackson, M., Yaish, M. & Cox, D. R. On class differentials in educational attainment. Proc. Natl Acad. Sci. USA 102 , 9730–9733 (2005).

Petersen, A. M., Jung, W.-S., Yang, J.-S. & Stanley, H. E. Quantitative and empirical demonstration of the Matthew effect in a study of career longevity. Proc. Natl Acad. Sci. USA 108 , 18–23 (2011).

Miao, L. et al. The latent structure of global scientific development. Nat. Hum. Behav. 6 , 1206–1217 (2022).

Hamilton, B. H., Nickerson, J. A. & Owan, H. Team incentives and worker heterogeneity: an empirical analysis of the impact of teams on productivity and participation. J. Political Econ. 111 , 465–497 (2003).

Lyle, D. S. The effects of peer group heterogeneity on the production of human capital at West Point. Am. Economic J.: Appl. Econ. 1 , 69–84 (2009).

Chan, T. Y., Li, J. & Pierce, L. Compensation and peer effects in competing sales teams. Manag. Sci. 60 , 1965–1984 (2014).

Akerlof, G. A. & Kranton, R. E. Identity and schooling: some lessons for the economics of education. J. Economic Lit. 40 , 1167–1201 (2002).

Cao, Y., Zhou, T. & Gao, J. Heterogeneous peer effects of college roommates on academic performance. Figshare. https://doi.org/10.6084/m9.figshare.25286017 (2024).

Download references

Acknowledgements

The authors thank Min Nie, Shimin Cai, Defu Lian, Zhihai Rong, Huaxiu Yao, Yifan Wu, Lili Miao, and Linyan Zhang for their valuable discussions. This work was partially supported by the National Natural Science Foundation of China Grant Nos. 42361144718 and 11975071 (T.Z.) and the Ministry of Education of Humanities and Social Science Project Grant No. 21JZD055 (T.Z.).

Author information

Authors and affiliations

CompleX Lab, University of Electronic Science and Technology of China, Chengdu, China

Yi Cao & Tao Zhou

Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, China

Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA

Kellogg School of Management, Northwestern University, Evanston, IL, USA

Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA

Faculty of Social Sciences, The University of Hong Kong, Hong Kong SAR, China


Contributions

T.Z. and J.G. designed research; T.Z. collected data; Y.C. and J.G. performed research; Y.C., T.Z., and J.G. analyzed data; J.G. wrote the paper; Y.C. and T.Z. revised the paper.

Corresponding authors

Correspondence to Tao Zhou or Jian Gao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information and a peer review file are available for this article.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Cao, Y., Zhou, T. & Gao, J. Heterogeneous peer effects of college roommates on academic performance. Nat. Commun. 15, 4785 (2024). https://doi.org/10.1038/s41467-024-49228-7


Received: 13 September 2023

Accepted: 24 May 2024

Published: 06 June 2024

DOI: https://doi.org/10.1038/s41467-024-49228-7


  • Open access
  • Published: 30 May 2024

Differential attainment in assessment of postgraduate surgical trainees: a scoping review

  • Rebecca L. Jones,
  • Suwimol Prusmetikul &
  • Sarah Whitehorn

BMC Medical Education volume 24, Article number: 597 (2024)

Abstract

Solving disparities in assessments is crucial to a successful surgical training programme. The first step in levelling these inequalities is recognising in what contexts they occur, and what protected characteristics are potentially implicated.

This scoping review was based on Arksey & O’Malley’s guiding principles. Ovid MEDLINE and Embase were used to identify articles, which were then screened by three reviewers.

From an initial 358 articles, 53 reported on the presence of differential attainment in postgraduate surgical assessments. The majority were quantitative studies (77.4%), using retrospective designs; 11.3% were qualitative. Differential attainment affects a varied range of protected characteristics. The characteristics most often investigated were gender (85%), ethnicity (37%) and socioeconomic background (7.5%). Evidence of inequalities is present in many types of assessment, including academic achievements, assessments of progression in training, workplace-based assessments, logs of surgical experience and tests of technical skills.

Attainment gaps have been demonstrated in many types of assessment, including supposedly “objective” written examinations and revalidation. Further research is necessary to delineate the most effective methods of eliminating bias in higher surgical training. Surgical curriculum providers should be informed by the available literature on inequalities in surgical training, as well as that of neighbouring specialties such as medicine and general practice, when designing assessments and considering how to mitigate potential causes of differential attainment.

Introduction

Diversity in the surgical workforce has been a hot topic for the last 10 years, gaining traction following the Black Lives Matter movement in 2016 [1]. In the UK this culminated in the publication of the Kennedy report in 2021 [2]. Before this, the focus was principally on gender imbalance in surgery: the 2010 Surgical Workforce report gave only gender percentages by specialty, with no comment on racial profile, sexuality distribution, disability occurrence, or socioeconomic background [3].

Gender is not the only protected characteristic deserving of equity in surgery; many groups find themselves at a disadvantage during postgraduate surgical examinations [4] and at revalidation [5]. This phenomenon is termed ‘differential attainment’ (DA): the occurrence of disparities in educational outcomes, progression rates, or achievements between groups defined by protected characteristics [4]. This may be due to assessors’ subconscious bias, or to a deficit in training and education before assessment.

One of the four pillars of medical ethics is “justice”, emphasising that healthcare should be provided in a fair, equitable, and ethical manner, benefiting all individuals and promoting the well-being of society as a whole. This applies not only to our patients but also to our colleagues; training should be provided in a fair, equitable, and ethical manner, benefiting all. By applying the principle of justice to surgical trainees, we can create an environment that is supportive, inclusive, and conducive to professional growth and well-being.

A diverse consultant body is crucial for providing high-quality healthcare to a diverse patient population. It has been shown that patients are happier when cared for by a doctor with the same ethnic background [ 6 ]. Takeshita et al. [ 6 ] proposed this is due to a greater likelihood of mutual understanding of cultural values, beliefs, and preferences and is therefore more likely to cultivate a trusting relationship, leading to accurate diagnosis, treatment adherence and improved patient understanding. As such, ensuring that all trainees are justly educated and assessed throughout their training may contribute to improving patient care by diversifying the consultant body.

Surgery is well known to have its own specific culture, language, and social rules which are unique even within the world of medicine [ 7 , 8 ]. Through training, graduates develop into surgeons, distinct from other physicians and practitioners [ 9 ]. As such, research conducted in other medical domains is not automatically applicable to surgery, and behavioural interventions focused on reducing or eliminating bias in training need to be tailored specifically to surgical settings.

Consequently, it is important that the surgical community asks the following questions:

  • Does DA exist in postgraduate surgical training, and to what extent?
  • Why does DA occur?
  • What groups or assessments are under-researched?
  • How can we apply this knowledge, or acquire new knowledge, to provide equity for trainees?

This scoping review aims to provide the surgical community with robust answers to inform the future of surgical training.

Aims and research question

The aim of this scoping review is to understand the breadth of research about the presence of DA in postgraduate surgical education and to determine themes pertaining to causes of inequalities. A scoping review was chosen to provide a means to map the available literature, including published peer-reviewed primary research and grey literature.

Following the methodological framework set out by Arksey and O’Malley [10], our research was intended to characterise the literature addressing DA in higher surgical training (HST), including Ophthalmology and Obstetrics & Gynaecology (O&G). We included literature from English-speaking countries, including the UK and USA.

Search strategy

We used search terms tailored to our target population characteristics (e.g., gender, ethnicity), concept (i.e., DA) and context (i.e., assessment in postgraduate surgical education). Medline and Embase were searched with the assistance of a research librarian, with the addition of synonyms. The search was conducted in May 2023, and the results were exported to Microsoft Excel for further review. The reference lists of included articles were also searched for any relevant data sources that had not yet been considered. In addition, to identify grey literature, a search was performed for the terms “differential attainment” and “disparity” on relevant stakeholders’ websites (see Supplemental Table 1 for a full listing). Stakeholders were included on the basis of their involvement in the governance or training of surgical trainees.
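To make the block structure concrete, here is a minimal, hypothetical sketch in Python of how population, concept and context term blocks can be OR-ed internally and then AND-ed together into a boolean query. The terms shown are illustrative assumptions, not the review’s actual Medline/Embase strategy.

```python
# Hypothetical term blocks mirroring the population/concept/context
# structure described above; not the review's actual search strategy.
population = ["gender", "sex", "ethnicity", "race", "disability",
              "socioeconomic*"]
concept = ["differential attainment", "attainment gap", "disparit*"]
context = ["surgical training", "surgical residen*", "postgraduate surg*"]

def or_block(terms):
    # OR together the synonyms within one block
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# AND the three blocks into the final boolean query
query = " AND ".join(or_block(b) for b in (population, concept, context))
print(query)
```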

Study selection

To start, we excluded conference abstracts that were subsequently published as full papers, to avoid duplication (leaving n = 337). After an initial screen by title to exclude obviously irrelevant articles, articles were filtered against our inclusion and exclusion criteria (Table 1). The remaining articles (n = 47) were then reviewed in their entirety, with the addition of five reports found in the grey literature. Following the screening process, 45 studies were included in the scoping review (Fig. 1).

Charting the data

The extracted data included literature title, authors, year of publication, country of study, study design, population characteristic, case number, context, type of assessment, research question and main findings (Appendix 1). Extraction was performed initially by a single author and then checked by a second author to ensure thorough review. Group discussion was used to resolve any disagreements. As charting occurred, papers eligible for inclusion were discovered within the reference lists of included studies; these were assimilated into the data charting table and included in the data extraction (n = 8).

Collating, summarizing and reporting the results

The included studies were not formally assessed for quality or risk of bias, consistent with a scoping review approach [10]. However, group discussion was conducted during charting to aid interpretation and to identify themes and trends.

We conducted a descriptive numerical summary of the characteristics of included studies. Thematic analysis was then used to examine key details and to organise findings by type of attainment and population characteristic. The coding of themes was an iterative process involving discussion between authors to identify and refine codes and group them into themes.

We categorised the main themes as gender, ethnicity, country of graduation, individual and family background in education, socioeconomic background, age, and disability. The number of articles in each theme is shown in Table 2. Data were also organised into subtopics based on the assessment types included: academic achievement (e.g., MRCS, FRCS), assessments for progression (e.g., ARCP), workplace-based assessment (e.g., EPA, feedback), surgical experience (e.g., case volume), and technical skills (e.g., visuo-spatial tasks).
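As a small illustration of the descriptive tally behind a table like Table 2, the sketch below assumes a hypothetical charting table with one row per included study and counts how many studies address each characteristic theme; the data frame contents are invented for the example.

```python
import pandas as pd

# Hypothetical charting table: one row per included study
chart = pd.DataFrame({
    "study": ["Study A", "Study B", "Study C", "Study D"],
    "theme": ["gender", "gender", "ethnicity", "socioeconomic background"],
})

# Count of studies per theme, as summarised in the characteristics table
print(chart["theme"].value_counts())
```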

Figure 1: PRISMA flow diagram

Forty-four articles defined the number of included participants (89,399 participants in total; range across individual studies 16–34,755). Two articles reported the number of included studies for their meta-analyses (18 and 63 included articles, respectively). Two reports from the grey literature did not define the number of participants included in their analyses. The characteristics of the included articles are displayed in Table 2.

Figure 2: Growth in published literature on differential attainment over the past 40 years

Academic achievement

In the American Board of Surgery Certifying Exam (ABSCE), Maker [11] found no significant gender differences between general surgery trainees who passed on their first attempt and those who did not, a finding supported by Ong et al. [12]. Pico et al. [13] reported that in Orthopaedic training, Orthopaedic In-Training Examination (OITE) and American Board of Orthopaedic Surgery (ABOS) Part 1 scores were similar between genders, but that female trainees took more attempts to pass. In the UK, two studies reported significantly lower Membership of the Royal College of Surgeons (MRCS) pass rates for female trainees compared to males [4, 14], although Robinson et al. [15] found no significant gender differences in MRCS success rates. A study assessing Fellowship of the Royal College of Surgeons (FRCS) examination results found no significant gender disparities in pass rates [16]. In the MRCOG examination, no significant gender differences were found in Part 1 scores, but women had higher pass rates and scores in Part 2 [17].
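Pass-rate comparisons of this kind are typically tested with simple contingency-table statistics. The following sketch uses entirely hypothetical counts, not data from any cited study, to show how a first-attempt pass-rate gap between two groups might be tested with a chi-squared test.

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table of first-attempt outcomes (not real data):
#                 pass   fail
table = [
    [420, 180],  # group 1 (e.g., male candidates)
    [210, 140],  # group 2 (e.g., female candidates)
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```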

Assessment for progression

The Annual Review of Competence Progression (ARCP) is the annual process by which the progress of UK doctors in training is assessed. A satisfactory outcome (“outcome 1”) allows trainees to advance to the next training year, whereas non-satisfactory outcomes (“2–5”) indicate inadequate progress and recommend solutions, such as further time in training or release from the training programme. Two studies reported that women received 60% more non-satisfactory outcomes than men [16, 18]. In contrast, in O&G men had higher rates of non-satisfactory ARCP outcomes, without explicit reasons being given [19].

Regarding Milestone evaluations from the US Accreditation Council for Graduate Medical Education (ACGME), Anderson et al. [20] reported that men had higher ratings of knowledge of diseases at postgraduate year 5 (PGY-5), while women had lower mean scores. Similarly, another study found that men and women had similar competencies at PGY-1 to 3, and that it was only at PGY-5 that women were evaluated lower than men [21]. However, Kwasny et al. [22] found no difference in trainers’ ratings between genders, although women rated themselves lower. Salles et al. [23] demonstrated significant improvement in women’s scores following a values-affirmation intervention, while the intervention did not affect men.

Workplace-based assessment

Galvin et al. [24] reported better evaluation scores from nurses for PGY-2 male trainees, while female trainees received fewer positive and more negative comments. Gerull et al. [25] demonstrated that men received compliments containing superlatives or standout words, whereas women were more likely to receive compliments with mitigating phrases (e.g., “excellent” vs. “quite competent”).

Hayward et al. [26] investigated assessment of attributes of clinical performance (ethics, judgement, technical skills, knowledge and interpersonal skills) and found similar scoring between genders.

Several authors have studied the autonomy given to trainees in theatre [27, 28, 29, 30, 31]. Two groups found no difference in the level of autonomy granted between genders, but women rated their perceived autonomy lower on self-evaluation [27, 28]. Other studies found that assessors consistently gave female trainees lower autonomy ratings, but only in one paper was this replicated in lower performance scores [29, 30, 31].

Padilla et al. [32] reported no difference in entrustable professional activity (EPA) assessment levels between genders, yet women rated themselves much lower, which the authors regarded as evidence of imposter syndrome amongst female trainees. Cooney et al. [33] found that male trainers scored women’s EPAs significantly lower than men’s, while female trainers rated both genders similarly. Conversely, Roshan et al. [34] found that male assessors gave more positive feedback comments to female trainees than to male trainees, whereas comments from female assessors were comparable for each gender.

Surgical experience

Gong et al. [35] found that significantly fewer cataract operations were performed by women in ophthalmology residency programmes, which they suggested could be due to trainers being more likely to give cases to male trainees. Female trainees also participated in fewer robotic colorectal procedures and were afforded less operative time on the robotic console [36]. Similarly, a systematic review highlighted that female trainees in various specialties performed fewer cases per week and potentially had limited access to training facilities [37]. Eruchalu et al. [38] found that female trainees performed fewer cases until gender parity was reached, after which case logs were equivalent.

Technical skills

Antonoff et al. [39] found higher scores for men in coronary anastomosis skills, with women receiving more “fail” assessments. Dill-Macky et al. [40] analysed laparoscopic skill assessment using blinded videos of trainees alongside unblinded assessments. While there was no difference in blinded scores between genders, when blinded and unblinded scores were compared individually, assessors were less likely to agree on the scores of women than on those of men. However, another study of laparoscopic skills by Skjold-Ødegaard et al. [41] reported higher performance scores in female residents, particularly when rated by women; the lowest scores were given to male trainees rated by men. While some studies showed disparities in assessment, several reported no difference in technical skill assessments (arthroscopic, knot tying, and suturing skills) between genders [42, 43, 44, 45, 46].

Several studies investigated trainees’ abilities to complete isolated tasks associated with surgical skills. In laparoscopic tasks, men were initially more skilful than women in peg transfer and intracorporeal knot tying; following training, performance did not differ between genders [47]. A study on microsurgical skills reported better initial visual-spatial and perceptual ability in men, while women had better fine motor psychomotor ability; however, these differences were not significant, and all trainees improved significantly after training [48]. A study by Milam et al. [49] revealed that men performed better in mental rotation tasks while women outperformed men in working memory. They hypothesised that female trainees would experience stereotype threat, the fear of being reduced to a stereotype, which would impair their performance. They found no evidence of stereotype threat influencing female performance, disproving their hypothesis, a finding supported by Myers et al. [50].

Ethnicity and country of graduation

Most papers reported ethnicity and country of graduation concurrently, for example grouping trainees as White UK graduates (WUKG), Black and minority ethnicity UK graduates (BME UKG), and international medical graduates (IMG). Therefore, these areas will be addressed together in the following section.

When assessing the likelihood of passing American Board of Surgery (ABS) examinations on the first attempt, Yeo et al. [51] found that White trainees were more likely to pass than non-White trainees. They found that the influence of ethnicity was more pronounced in the end-of-training certifying exam than in the start-of-training qualifying exam. This finding was corroborated in a study of both the OITE and the ABOS certifying exam, suggesting widening inequalities during training [52].

Two UK-based studies reported significantly higher MRCS pass rates in White trainees compared to BME trainees [4, 14]. BME trainees were less likely to pass MRCS Parts A and B, though this was not true for Part A once variations in socioeconomic background were corrected for [14]. However, Robinson et al. [53] found no difference in MRCS pass rates based on ethnicity. Another study by Robinson et al. [15] demonstrated similar pass rates between WUKGs and BME UKGs, but IMGs had significantly lower pass rates than all UKGs. The FRCS pass rates of WUKGs, BME UKGs and IMGs were 76.9%, 52.9%, and 53.9%, respectively, though these differences were not statistically significant [16].

There was no difference in MRCOG results based on ethnicity, but higher success rates were found among UKGs [19]. In FRCOphth, WUKGs had a pass rate of 70%, higher than any other group of trainees, with a pass rate of only 45% for White IMGs [52].

By gathering data from training programmes reporting little to no DA due to ethnicity, Roe et al. [54] were able to provide a list of factors they felt were protective against DA, such as having supportive supervisors and developing peer networks.

Assessment for progression

RCOphth [55] found the highest rates of satisfactory ARCP outcomes among WUKGs, followed by BME UKGs and then IMGs. RCOG [19] found higher rates of non-satisfactory ARCP outcomes among non-UK graduates, particularly BMEs and those from the European Economic Area (EEA). Tiffin et al. [56] considered the difference in experience between UK graduates and UK nationals whose primary medical qualification was gained outside the UK, and found that the latter were more likely to receive a non-satisfactory ARCP outcome, even when compared to non-UK nationals.

Woolf et al. [57] explored the reasons behind DA by conducting interview studies with trainees. Investigating trainees’ perceptions of fairness in evaluation, they found that trainees felt the relationships developed with colleagues who gave feedback could affect ARCP results, which might disadvantage BME UKGs and IMGs who have less in common with their trainers.

Workplace-based assessment

Brooks et al. [58] surveyed the prevalence of microaggressions against Black orthopaedic surgeons during assessment and found that 87% of participants had experienced some level of racial discrimination during workplace-based performance feedback. Black women reported receiving more racially focused and devaluing statements from their seniors than men.

Surgical experience

Eruchalu et al. [38] found that White trainees performed more major surgical cases, and more cases as a supervisor, than did their BME counterparts.

Dill-Macky et al. [40] reported no significant difference in laparoscopic surgery assessments between ethnicities.

Individual and family background in education

Two studies [4, 16] concentrated on educational background, considering factors such as parental occupation and attendance at a fee-paying school. The MRCS Part A pass rate was significantly higher for trainees for whom Medicine was their first degree, those with university-educated parents, those in a higher POLAR (Participation In Local Areas) quintile, and those from fee-paying schools. A higher Part B pass rate was associated with graduating from non-Graduate-Entry Medicine programmes and with parents in managerial or professional occupations [4]. Trainees with higher degrees had an almost fivefold increase in FRCS success and seven times more scientific publications than their counterparts [16].

Socioeconomic background

Two studies used the Index of Multiple Deprivation (IMD) quintile, the official measure of relative deprivation in England, which grades socioeconomic level by geographical area. The area was defined at the time of medical school application. Deprivation quintiles (DQ) were calculated, ranging from DQ1 (most deprived) to DQ5 (least deprived) [4, 14].
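As a minimal sketch of how such quintiles can be derived, assuming a hypothetical data frame holding one IMD rank per trainee’s home area (in England, a lower rank means a more deprived area), pd.qcut splits the ranks into five equal-sized groups labelled DQ1 to DQ5:

```python
import pandas as pd

# Hypothetical IMD ranks for five trainees' home areas
# (lower rank = more deprived; values invented for illustration)
df = pd.DataFrame({"imd_rank": [1200, 30500, 8800, 22100, 15600]})

# Split ranks into quintiles: DQ1 (most deprived) to DQ5 (least deprived)
df["dq"] = pd.qcut(df["imd_rank"], q=5,
                   labels=["DQ1", "DQ2", "DQ3", "DQ4", "DQ5"])
print(df)
```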

Trainees from less deprived backgrounds had higher MRCS Part A pass rates. Success in Part B was associated with no history of requiring income support and with residence in less deprived areas [4]. Trainees from DQ1 and DQ2 had lower pass rates and required more attempts to pass [14]. A general trend of better examination outcomes was found among O&G trainees from less deprived quintiles [19].

Trainees from DQ1 and DQ2 also received significantly more non-satisfactory ARCP outcomes (24.4%) than those from DQ4 and DQ5 (14.2%) [14].

Age

Trainees who graduated under the age of 29 were more likely to pass MRCS than their counterparts [4].

Two studies [18, 56] found that older trainees received more non-satisfactory ARCP outcomes. Likewise, there was a higher percentage of non-satisfactory ARCP outcomes among O&G trainees aged over 45 compared with those aged 25–29, regardless of gender [19].

Disability

Trainees with a disability had significantly lower pass rates in MRCS Part A compared to candidates without a disability. However, the difference was not significant for Part B [59].

What have we learnt from the literature?

It is heartening to note the recent increase in interest in DA (27 studies in the last 4 years, compared to 26 in the preceding 40) (Fig. 2). The vast majority (77%) of studies are quantitative, based in the US or UK (89%), focus on gender (85%) and relate to clinical assessments (51%) rather than examination results. Therefore, the surgical community has invested primarily in researching the experience of women in the USA and UK.

Interestingly, a report by RCOG [19] showed that men were more likely to receive non-satisfactory ARCP outcomes than women, and a study by Rushd et al. [17] found that women were more likely to pass Part 2 of MRCOG than men. This may be because within O&G men are the “out-group” (a social group or category characterised by marginalisation or exclusion by the dominant cultural group), as 75% of O&G trainees are female [60].

This contrasts with other specialities, in which men are the in-group and women are seen to underperform. Outside of O&G, women are less likely than men to pass MRCS [4, 14], to receive a satisfactory ARCP outcome [16, 18], or to receive positive feedback [24], whilst not performing the same number of procedures as men [34, 35]. This often leads to poor self-confidence in women [32], which can then worsen performance [21].

It is difficult to comment on DA for many groups because of a lack of evidence. The current research suggests that being older, having a disability, entering medicine as a graduate, having parents without higher education, and living in a lower socioeconomic area at the time of entering medical school are all associated with lower MRCS pass rates. Being older and having a lower socioeconomic background are also associated with non-satisfactory ARCP outcomes, slowing progression through training.

These characteristics may have a compounding negative effect. For example, having a previous degree automatically makes a trainee older, and living in a lower socioeconomic area makes it more likely that their parents hold a non-professional job and no higher degree. When multiple protected characteristics interact to produce a compounded negative effect for a person, this is often referred to as “intersectional discrimination” or “intersectionality” [61]. This concept remains underrepresented in the current literature.
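One hedged illustration, not drawn from any of the reviewed studies, of how such compounding might be probed statistically is a logistic regression of examination success that includes an interaction between two protected characteristics, so that the joint effect is estimated rather than assumed to be the sum of the separate effects. All variable names and data below are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Hypothetical trainee-level data (invented for illustration)
df = pd.DataFrame({
    "passed": rng.integers(0, 2, n),   # exam outcome
    "older": rng.integers(0, 2, n),    # e.g., graduated aged 29 or over
    "low_ses": rng.integers(0, 2, n),  # e.g., DQ1-DQ2 background
})

# 'older * low_ses' expands to both main effects plus their interaction;
# a significant interaction term would suggest a compounded effect
model = smf.logit("passed ~ older * low_ses", data=df).fit(disp=0)
print(model.summary().tables[1])
```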

The literature is not yet in agreement over the presence of DA due to ethnicity. Many studies report perceived discrimination, but the data on examination and clinical assessment outcomes are equivocal. This may be due to the fluctuating nature of in-groups and out-groups, and to multiple intersecting characteristics. Nevertheless, the lived experience of BME surgeons should not be ignored and requires further investigation.

What are the gaps in the literature?

The overwhelming majority of the literature exploring DA addresses gender, ethnicity or country of medical qualification. Whilst bias related to these characteristics is crucial to recognise, studies of other protected characteristics are few and far between. The only paper on disability reported striking differences in attainment between disabled and non-disabled registrars [59]. There has also been increased awareness of neurodiversity amongst doctors, and yet an exploration of the experiences of neurodiverse surgeons and their progress through training has yet to be published [62].

The implications of being LGBTQ+ in surgical training have been neither recognised nor formally addressed in the literature. Promisingly, the experiences of LGBTQ+ medical students have been recognised at the undergraduate level, so one can hope that this will be translated into postgraduate education [63, 64]. While this is deeply entwined with experiences of gender discrimination, it is an important characteristic that the surgical community would benefit from addressing, along with disability. To a lesser extent, the effects of socioeconomic background and age have also been overlooked.

Characterising trainees for the purpose of research

Ethnicity is deeply personal, self-defined, and may change over time as personal identity evolves; arbitrarily grouping diverse ethnic backgrounds is therefore unlikely to capture an accurate representation of experiences. There are levels of discrimination even within minority groups: colourism in India means dark-skinned Indians will experience more discrimination than light-skinned Indians, even from those within their own ethnic group [65]. Therefore, although the studies included in this scoping review accepted self-definitions of ethnicity, this is likely not enough to fully capture the nuances of bias and discrimination present in society. For example, Ellis et al. [4] grouped participants as “White”, “Mixed”, “Asian”, “Black” and “Other”; they could also have assigned a skin tone value such as the NIS Skin Colour Scale [66], thus providing more detail.

Ethnicity is more than genetic heritage; it is also cultural expression. The experience of an IMG in UK postgraduate training will differ from that of a UKG, and the experience of an Indian UKG who grew up in India will differ from that of an Indian UKG who grew up in the UK. These important distinctions are noted in some of the literature (e.g., Woolf et al. [57]); however, some studies do not distinguish between ethnicity and graduate status [15], and none delve into an individual’s cultural expression (e.g., clothing choice) and how this affects the perception of their assessors.

Reasons for DA

Despite the recognition of inequalities in all specialties of surgery, there is a paucity of data explicitly addressing why DA occurs. The reasons behind the phenomenon must be explored to enable change and eliminate biases. Qualitative research is better attuned to capturing the complexities of DA through observation- or interview-based studies. Currently, most published data are quantitative and rely on performance metrics to demonstrate the presence of DA while leaving its causes unexamined. Promisingly, there is a gradually increasing number of qualitative, predominantly interview-based, studies (Fig. 2).

To create a map of DA in all its guises, an analysis of the themes reported to be contributory to its development is helpful. In our review of the literature, four themes have been identified:

Training culture

In higher surgical training, equality of outcomes requires equity of opportunities. Ellis et al. [4] recognised that variation in training experiences, such as the accessibility of supportive peers and senior role models, can have implications for attainment. Trainees would benefit from targeted support at times of transition, such as induction or examinations; it may be that the needs of certain groups are currently met before others, reinforcing differential attainment [4].

Experience of assessment

Most of the DA literature concerns the presence (or absence) of an attainment gap in assessments such as ARCP or MRCS. It is assumed that these assessments of trainee development are objective and free of bias, and indeed several authors have described a lack of bias in these high-stakes examinations (e.g., Ong et al. [12]; Robinson et al. [53]). However, in some populations, such as disabled trainees, differences in attainment persist [59]. This is despite legislation requiring professional bodies to make reasonable adjustments to examinations for disabled candidates, such as additional time, text formatting amendments, or wheelchair-accessible venues [67]. It would therefore be beneficial to investigate the implementation of these adjustments across higher surgical examinations and identify any deficits.

Social networks

Relationships between colleagues may influence DA in multiple ways. Several studies identified that the lack of a relatable and inspiring mentor may explain why female or BME doctors fail to excel in surgery [4, 55]. Certain groups may receive preferential treatment due to their perceived familiarity to seniors [35]. Robinson et al. [15] recognised that peer-to-peer relationships were also implicated in professional development, and that their absence could lead to poor learning outcomes. A non-discriminatory culture and the inclusion of trainees within the social network of training are therefore posited as beneficial.

Personal characteristics

Finally, personal factors directly related to protected characteristics have been suggested as a cause of DA. For example, IMGs may perform worse in examinations due to language barriers, and those from disadvantaged backgrounds may have less opportunity to attend expensive courses [14, 16]. Although it is impossible to exclude such factors from training, we may mitigate their influence by recognising their presence and providing solutions.

The causes of DA may also be grouped into three levels, as described by Regan de Bere et al. [68]: macro (the implications of high-level policy), meso (institutional or working environments) and micro (the influence of individual factors). These levels intersect with the four themes identified above. Training culture, for instance, can be enshrined at the institutional and individual levels, influencing decisions about trainees’ opportunities, or at the macro level, as in decisions about nationwide recruitment processes. The three levels can be used to explore each of the four themes more deeply and so enrich the discovery of causes of DA.

Discussions outside of surgery

Authors in General Practice (e.g., Unwin et al., 2019 [69]; Pattinson et al., 2019 [70]), postgraduate medical training (e.g., Andrews, Chartash, and Hay, 2021 [71]), and undergraduate medical education (e.g., Yeates et al., 2017 [72]; Woolf et al., 2013 [73]) have published more extensively on the aetiology of DA. A study by Hope et al. [74] evaluating bias in MRCP examinations used differential item functioning to identify individual questions that demonstrated an attainment gap between male and female, and between Caucasian and non-Caucasian, medical trainees. Conclusions drawn about the MRCP Part 1 examination may be generalisable to MRCS Part A or FRCOphth Part 1: all are multiple-choice examinations testing applied basic science, usually taken within the first few years of postgraduate training. It is therefore advisable that differential item functioning also be applied to these examinations. However, findings in some subspecialities may not be generalisable to others, as training environments can vary profoundly. The RCOphth [55] reported that in 2021, 53% of ophthalmic trainees identified as male, whereas in Orthopaedics 85% identified as male, suggesting very different training environments [5]. It is useful to identify commonalities of DA between surgical specialties and across the wider scope of medical training.
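For readers unfamiliar with the technique, one common form of differential item functioning (DIF) analysis is logistic regression: for each item, the item response is regressed on the candidate’s total score plus a group indicator, and a significant group term (uniform DIF) or group-by-score interaction (non-uniform DIF) flags the item for review. The sketch below is a minimal illustration with invented data, not a reproduction of Hope et al.’s analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000

# Hypothetical candidate-level data for a single exam item
df = pd.DataFrame({
    "item_correct": rng.integers(0, 2, n),  # response to the item
    "total_score": rng.normal(60, 10, n),   # overall ability proxy
    "group": rng.integers(0, 2, n),         # protected-group indicator
})

# Uniform DIF: does group membership shift item difficulty at equal ability?
uniform = smf.logit("item_correct ~ total_score + group", data=df).fit(disp=0)

# Non-uniform DIF: does the ability-item relationship differ by group?
nonuniform = smf.logit("item_correct ~ total_score * group", data=df).fit(disp=0)

print("uniform DIF term:", uniform.params["group"])
print("non-uniform DIF term:", nonuniform.params["total_score:group"])
```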

Limitations of our paper

Firstly, whilst we aimed to provide a review focused on the experience of surgical trainees, four papers contained data on either non-surgical trainees or medical students. It is difficult to isolate the surgeons within these data, so there may be issues with generalisability. Furthermore, we did not consider the backgrounds of each paper’s authors, whose own lived experience of the attainment gap could form the lens through which they commented on surgical education, colouring their interpretation. Despite our intention to include as many protected characteristics as possible, some lived experiences will inevitably have been missed. Lastly, the experiences of surgical trainees outside the English-speaking world were omitted; no studies were found that originated outside Europe or North America, and the presence or characteristics of DA elsewhere cannot be assumed.

Conclusion

Experiences of inequality in surgical assessment are prevalent in all surgical subspecialities. To further investigate DA, researchers should ensure that all protected characteristics, and the ways in which they interact, are considered, to gain insight into intersectionality. Given the paucity of current evidence, particular focus should be given to the implications of disability, and specifically neurodiversity, for progress through training, as these are yet to be explored in depth. In defining protected characteristics, future authors should be explicit and should avoid generalising cultural backgrounds, to allow an authentic appreciation of the attainment gap. Few authors have considered the driving forces behind bias in assessment and DA, so qualitative studies should be prioritised to uncover causes of, and protective factors against, DA. Once these influences have been identified, educational designers can develop new assessment methods that ensure equity across surgical trainees.

Data availability

All data generated or analysed during this study are included in the supplementary information files.

Abbreviations

ACGME: Accreditation Council for Graduate Medical Education

ABOS: American Board of Orthopaedic Surgery

ABS: American Board of Surgery

ABSCE: American Board of Surgery Certifying Exam

ARCP: Annual Review of Competence Progression

BME: Black, Asian, and Minority Ethnicity

CREOG: Council on Resident Education in Obstetrics and Gynecology

DA: Differential Attainment

DQ: Deprivation Quintile

EEA: European Economic Area

EPA: Entrustable Professional Activities

FRCOphth: Fellowship of The Royal College of Ophthalmologists

FRCS: Fellow of the Royal College of Surgeons

GMC: General Medical Council

HST: Higher Surgical Training

IMG: International Medical Graduate

ITER: In-Training Evaluation Report

MRCOG: Member of the Royal College of Obstetricians and Gynaecologists

MRCP: Member of the Royal College of Physicians

MRCS: Member of the Royal College of Surgeons

O&G: Obstetrics and Gynaecology

OITE: Orthopaedic In-Training Examination

POLAR: Participation In Local Areas

PGY: Postgraduate Year

RCOphth: The Royal College of Ophthalmologists

RCOG: The Royal College of Obstetricians and Gynaecologists

RCS England: The Royal College of Surgeons of England

UKG: United Kingdom Graduate

WUKG: White United Kingdom Graduate

1. Joseph JP, Joseph AO, Jayanthi NVG, et al. BAME underrepresentation in surgery leadership in the UK and Ireland in 2020: an uncomfortable truth. Bull R Coll Surg Engl. 2020;102(6):232–33.

2. Royal College of Surgeons of England. The Royal College – our professional home. An independent review on diversity and inclusion for the Royal College of Surgeons of England, conducted by Baroness Helena Kennedy QC. RCS England; 2021.

3. Sarafidou K, Greatorex R. Surgical workforce: planning today for the workforce of the future. Bull R Coll Surg Engl. 2011;93(2):48–9. https://doi.org/10.1308/147363511X552575.

4. Ellis R, Brennan P, Lee AJ, et al. Differential attainment at MRCS according to gender, ethnicity, age and socioeconomic factors: a retrospective cohort study. J R Soc Med. 2022;115(7):257–72. https://doi.org/10.1177/01410768221079018.

5. Hope C, Humes D, Griffiths G, et al. Personal characteristics associated with progression in trauma and orthopaedic specialty training: a longitudinal cohort study. J Surg Educ. 2022;79(1):253–59. https://doi.org/10.1016/j.jsurg.2021.06.027.

6. Takeshita J, Wang S, Loren AW, et al. Association of racial/ethnic and gender concordance between patients and physicians with patient experience ratings. JAMA Netw Open. 2020;3(11). https://doi.org/10.1001/jamanetworkopen.2020.24583.

7. Katz P. The Scalpel’s Edge: The Culture of Surgeons. Allyn and Bacon; 1999.

8. Tørring B, Gittell JH, Laursen M, et al. Communication and relationship dynamics in surgical teams in the operating room: an ethnographic study. BMC Health Serv Res. 2019;19:528. https://doi.org/10.1186/s12913-019-4362-0.

9. Veazey Brooks J, Bosk CL. Remaking surgical socialization: work hour restrictions, rites of passage, and occupational identity. Soc Sci Med. 2012;75(9):1625–32. https://doi.org/10.1016/j.socscimed.2012.07.007.

10. Arksey H, O’Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19–32.

11. Maker VK, Marco MZ, Dana V, et al. Can we predict which residents are going to pass/fail the oral boards? J Surg Educ. 2012;69(6):705–13.

12. Ong TQ, Kopp JP, Jones AT, et al. Is there gender bias on the American Board of Surgery general surgery certifying examination? J Surg Res. 2019;237:131–5. https://doi.org/10.1016/j.jss.2018.06.014.

13. Pico K, Gioe TJ, Vanheest A, et al. Do men outperform women during orthopaedic residency training? Clin Orthop Relat Res. 2010;468(7):1804–8. https://doi.org/10.1007/s11999-010-1318-4.

14. Vinnicombe Z, Little M, Super J, et al. Differential attainment, socioeconomic factors and surgical training. Ann R Coll Surg Engl. 2022;104(8):577–82. https://doi.org/10.1308/rcsann.2021.0255.

15. Robinson DBT, Hopkins L, James OP, et al. Egalitarianism in surgical training: let equity prevail. Postgrad Med J. 2020;96(1141):650–54. https://doi.org/10.1136/postgradmedj-2020-137563.

16. Luton OW, Mellor K, Robinson DBT, et al. Differential attainment in higher surgical training: scoping pan-specialty spectra. Postgrad Med J. 2022;99(1174):849–54. https://doi.org/10.1136/postgradmedj-2022-141638.

17. Rushd S, Landau AB, Khan JA, Allgar V, Lindow SW. An analysis of the performance of UK medical graduates in the MRCOG Part 1 and Part 2 written examinations. Postgrad Med J. 2012;88(1039):249–54. https://doi.org/10.1136/postgradmedj-2011-130479.

18. Hope C, Lund J, Griffiths G, et al. Differences in ARCP outcome by surgical specialty: a longitudinal cohort study. Br J Surg. 2021;108. https://doi.org/10.1093/bjs/znab282.051.

19. Royal College of Obstetricians and Gynaecologists. Report: Differential Attainment 2019. https://www.rcog.org.uk/media/jscgfgwr/differential-attainment-tef-report-2019.pdf [Last accessed 28/12/23].

20. Anderson JE, Zern NK, Calhoun KE, et al. Assessment of potential gender bias in general surgery resident milestone evaluations. JAMA Surg. 2022;157(12):1164–66. https://doi.org/10.1001/jamasurg.2022.3929.

21. Landau SI, Syvyk S, Wirtalla C, et al. Trainee sex and Accreditation Council for Graduate Medical Education milestone assessments during general surgery residency. JAMA Surg. 2021;156(10):925–31. https://doi.org/10.1001/jamasurg.2021.3005.

22. Kwasny L, Shebrain S, Munene G, et al. Is there a gender bias in milestones evaluations in general surgery residency training? Am J Surg. 2021;221(3):505–8. https://doi.org/10.1016/j.amjsurg.2020.12.020.

23. Salles A, Mueller CM, Cohen GL. A values affirmation intervention to improve female residents’ surgical performance. J Grad Med Educ. 2016;8(3):378–83. https://doi.org/10.4300/JGME-D-15-00214.1.

24. Galvin S, Parlier A, Martino E, et al. Gender bias in nurse evaluations of residents in obstetrics and gynecology. Obstet Gynecol. 2015;126:7S–12S. https://doi.org/10.1097/AOG.0000000000001044.

25. Gerull KM, Loe M, Seiler K, et al. Assessing gender bias in qualitative evaluations of surgical residents. Am J Surg. 2019;217(2):306–13. https://doi.org/10.1016/j.amjsurg.2018.09.029.

26. Hayward CZ, Sachdeva A, Clarke JR. Is there gender bias in the evaluation of surgical residents? Surgery. 1987;102(2):297–9.

27. Cookenmaster C, Shebrain S, Vos D, et al. Gender perception bias of operative autonomy evaluations among residents and faculty in general surgery training. Am J Surg. 2021;221(3):515–20. https://doi.org/10.1016/j.amjsurg.2020.11.016.

28. Olumolade OO, Rollins PD, Daignault-Newton S, et al. Closing the gap: evaluation of gender disparities in urology resident operative autonomy and performance. J Surg Educ. 2022;79(2):524–30. https://doi.org/10.1016/j.jsurg.2021.10.010.

29. Chen JX, Chang EH, Deng F, et al. Autonomy in the operating room: a multicenter study of gender disparities during surgical training. J Grad Med Educ. 2021;13(5):666–72. https://doi.org/10.4300/JGME-D-21-00217.1.

30. Meyerson SL, Sternbach JM, Zwischenberger JB, Bender EM. The effect of gender on resident autonomy in the operating room. J Surg Educ. 2017;74(6):e111–e118. https://doi.org/10.1016/j.jsurg.2017.06.014.

31. Hoops H, Heston A, Dewey E, et al. Resident autonomy in the operating room: does gender matter? Am J Surg. 2019;217(2):301–5. https://doi.org/10.1016/j.amjsurg.2018.12.023.

32. Padilla EP, Stahl CC, Jung SA, et al. Gender differences in entrustable professional activity evaluations of general surgery residents. Ann Surg. 2022;275(2):222–29. https://doi.org/10.1097/SLA.0000000000004905.

33. Cooney CM, Aravind P, Hultman CS, et al. An analysis of gender bias in plastic surgery resident assessment. J Grad Med Educ. 2021;13(4):500–6. https://doi.org/10.4300/JGME-D-20-01394.1.

34. Roshan A, Farooq A, Acai A, et al. The effect of gender dyads on the quality of narrative assessments of general surgery trainees. Am J Surg. 2022;224(1A):179–84. https://doi.org/10.1016/j.amjsurg.2021.12.001.

35. Gong D, Winn BJ, Beal CJ, et al. Gender differences in case volume among ophthalmology residents. JAMA Ophthalmol. 2019;137(9):1015–20. https://doi.org/10.1001/jamaophthalmol.2019.2427.

36. Foley KE, Izquierdo KM, von Muchow MG, et al. Colon and rectal surgery robotic training programs: an evaluation of gender disparities. Dis Colon Rectum. 2020;63(7):974–79. https://doi.org/10.1097/DCR.0000000000001625.

37. Ali A, Subhi Y, Ringsted C, et al. Gender differences in the acquisition of surgical skills: a systematic review. Surg Endosc. 2015;29(11):3065–73. https://doi.org/10.1007/s00464-015-4092-2.

38. Eruchalu CN, He K, Etheridge JC, et al. Gender and racial/ethnic disparities in operative volumes of graduating general surgery residents. J Surg Res. 2022;279:104–12. https://doi.org/10.1016/j.jss.2022.05.020.

39. Antonoff MB, Feldman H, Luc JGY, et al. Gender bias in the evaluation of surgical performance: results of a prospective randomized trial. Ann Surg. 2023;277(2):206–13. https://doi.org/10.1097/SLA.0000000000005015.

Dill-Macky A, Hsu C, Neumayer LA, et al. The Role of Implicit Bias in Surgical Resident Evaluations. Journal of Surgical Education. 2022;79 (3), 761–768. doi:10.1016/j.jsurg.2021.12.003.

Skjold-Ødegaard B, Ersdal HL, Assmus J et al. Comparison of Performance Score for Female and Male Residents in General Surgery Doing Supervised Real-Life Laparoscopic Appendectomy: Is There a Norse Shield-Maiden Effect? World Journal of Surgery. 2021;45 (4), 997–1005. doi:10.1007/s00268-020-05921-4.

Leape CP, Hawken JB, Geng X, et al. An investigation into gender bias in the evaluation of orthopedic trainee arthroscopic skills. Journal of Shoulder and Elbow Surgery. 2022;31 (11), 2402–2409. doi:10.1016/j.jse.2022.05.024.

Vogt VY, Givens VM, Keathley CA, et al. Is a resident’s score on a videotaped objective structured assessment of technical skills affected by revealing the resident’s identity? American Journal of Obstetrics and Gynecology. 2023;189 (3), 688–691. doi:10.1067/S0002-9378(03)00887-1.

Fjørtoft K, Konge L, Christensen J et al. Overcoming Gender Bias in Assessment of Surgical Skills. Journal of Surgical Education. 2022;79 (3), 753–760. doi:10.1016/j.jsurg.2022.01.006.

Grantcharov TP, Bardram L, Funch-Jensen P, et al. Impact of Hand Dominance, Gender, and Experience with Computer Games on Performance in Virtual Reality Laparoscopy. Surgical Endoscopy 2003;17 (7): 1082–85.

Rosser Jr JC, Rosser LE & Savalgi RS. Objective Evaluation of a Laparoscopic Surgical Skill Program for Residents and Senior Surgeons. Archives of Surgery. 1998; 133 (6): 657–61.

White MT & Welch K. Does gender predict performance of novices undergoing Fundamentals of Laparoscopic Surgery (FLS) training? The American Journal of Surgery. 2012;203 (3), 397–400. doi:10.1016/j.amjsurg.2011.09.020.

Nugent E, Joyce C, Perez-Abadia G, et al. Factors influencing microsurgical skill acquisition during a dedicated training course. Microsurgery. 2012;32 (8), 649–656. doi:10.1002/micr.22047.

Milam LA, Cohen GL, Mueller C et al. Stereotype threat and working memory among surgical residents. The American Journal of Surgery. 2018;216 (4), 824–829. doi:10.1016/j.amjsurg.2018.07.064.

Myers SP, Dasari M, Brown JB, et al. Effects of Gender Bias and Stereotypes in Surgical Training: A Randomized Clinical Trial. JAMA Surgery. 2020; 155(7), 552–560. doi.org/10.1001/jamasurg.2020.1127.

Yeo HL, Patrick TD, Jialin M, et al. Association of Demographic and Program Factors With American Board of Surgery Qualifying and Certifying Examinations Pass Rates. JAMA Surgery 2020; 155 (1): 22–30. doi:0.1001/jamasurg.2019.4081.

Foster N, Meghan P, Bettger JP, et al. Objective Test Scores Throughout Orthopedic Surgery Residency Suggest Disparities in Training Experience. Journal of Surgical Education 2021;78 (5): 1400–1405. doi:10.1016/j.jsurg.2021.01.003.

Robinson DBT, Hopkins L, Brown C, et al. Prognostic Significance of Ethnicity on Differential Attainment in Core Surgical Training (CST). Journal of the American College of Surgeons. 2019;229 (4), e191. doi:10.1016/j.jamcollsurg.2019.08.1254.

Roe V, Patterson F, Kerrin M, et al. What supported your success in training? A qualitative exploration of the factors associated with an absence of an ethnic attainment gap in post-graduate specialty training. General Medical Council. 2019. https://www.gmc-uk.org/-/media/documents/gmc-da-final-report-success-factors-in-training-211119_pdf-80914221.pdf [Last accessed 28/12/23].

Royal College of Ophthalmologists. Data on Differential attainment in ophthalmology and monitoring equality, diversity, and inclusion: Recommendations to the RCOphth. London, Royal College of Ophthalmologists. 2022. https://www.rcophth.ac.uk/wp-content/uploads/2023/01/Differential-Attainment-Report-2022.pdf [Last accessed 28/12/23].

Tiffin PA, Orr J, Paton LW, et al. UK nationals who received their medical degrees abroad: selection into, and subsequent performance in postgraduate training: a national data linkage study. BMJ Open. 2018;8:e023060. doi: 10.1136/bmjopen-2018-023060.

Woolf K, Rich A, Viney R, et al. Perceived causes of differential attainment in UK postgraduate medical training: a national qualitative study. BMJ Open. 2016;6 (11), e013429. doi:10.1136/bmjopen-2016-013429.

Brooks JT, Porter SE, Middleton KK, et al. The Majority of Black Orthopaedic Surgeons Report Experiencing Racial Microaggressions During Their Residency Training. Clinical Orthopaedics and Related Research. 2023;481 (4), 675–686. doi:10.1097/CORR.0000000000002455.

Ellis R, Cleland J, Scrimgeour D, et al. The impact of disability on performance in a high-stakes postgraduate surgical examination: a retrospective cohort study. Journal of the Royal Society of Medicine. 2022;115 (2), 58–68. doi:10.1177/01410768211032573.

Royal College of Obstetricians & Gynaecologists. RCOGWorkforceReport2022. Available at: https://www.rcog.org.uk/media/fdtlufuh/workforce-report-july-2022-update.pdf [Last accessed 28/12/23].

Crenshaw KW. On Intersectionality: Essential Writings. Faculty Books. 2017; 255.

Brennan CM & Harrison W. The Dyslexic Surgeon. The Bulletin of the Royal College of Surgeons of England. 2020;102 (3): 72–75. doi:10.1308/rcsbull.2020.72.

Toman L. Navigating medical culture and LGBTQ identity. Clinical Teacher. 2019;16: 335–338. doi:10.1111/tct.13078.

Torales J, Castaldelli-Maia JM & Ventriglio A. LGBT + medical students and disclosure of their sexual orientation: more than in and out of the closet. International Review of Psychiatry. 2022;34:3–4, 402–406. doi:10.1080/09540261.2022.2101881.

Guda VA & Kundu RV. India’s Fair Skin Phenomena. SKINmed. 2021;19(3), 177–178.

Massey D & Martin JA. The NIS skin color scale. Princeton University Press. 2003.

Intercollegiate Committee for Basic Surgical Examinations.AccessArrangementsandReasonableAdjustmentsPolicyforCandidateswithaDisabilityorSpecificLearningdifficulty. 2020. https://www.intercollegiatemrcsexams.org.uk/-/media/files/imrcs/mrcs/mrcs-regulations/access-arrangements-and-reasonable-adjustments-january-2020.pdf [Last accessed 28/12/23].

Regan de Bere S, Nunn S & Nasser M. Understanding differential attainment across medical training pathways: A rapid review of the literature. General Medical Council. 2015. https://www.gmc-uk.org/-/media/documents/gmc-understanding-differential-attainment_pdf-63533431.pdf [Last accessed 28/12/23].

Unwin E, Woolf K, Dacre J, et al. Sex Differences in Fitness to Practise Test Scores: A Cohort Study of GPs. The British Journal of General Practice: The Journal of the Royal College of General Practitioners. 2019; 69 (681): e287–93. doi:10.3399/bjgp19X701789.

Pattinson J, Blow C, Sinha B et al. Exploring Reasons for Differences in Performance between UK and International Medical Graduates in the Membership of the Royal College of General Practitioners Applied Knowledge Test: A Cognitive Interview Study. BMJ Open. 2019;9 (5): e030341. doi:10.1136/bmjopen-2019-030341.

Andrews J, Chartash D & Hay S. Gender Bias in Resident Evaluations: Natural Language Processing and Competency Evaluation. Medical Education. 2021;55 (12): 1383–87. doi:10.1111/medu.14593.

Yeates P, Woolf K, Benbow E, et al. A Randomised Trial of the Influence of Racial Stereotype Bias on Examiners’ Scores, Feedback and Recollections in Undergraduate Clinical Exams. BMC Medicine 2017;15 (1): 179. doi:10.1186/s12916-017-0943-0.

Woolf K, McManus IC, Potts HWW et al. The Mediators of Minority Ethnic Underperformance in Final Medical School Examinations. British Journal of Educational Psychology. 2013; 83 (1): 135–59. doi:10.1111/j.2044-8279.2011.02060.x.

Hope D, Adamson K, McManus IC, et al. Using Differential Item Functioning to Evaluate Potential Bias in a High Stakes Postgraduate Knowledge Based Assessment. BMC Medical Education. 2018;18 (1): 64. doi:10.1186/s12909-018-1143-0.

Download references

Funding

No sources of funding to be declared.

Author information

Authors and affiliations

Department of Surgery and Cancer, Imperial College London, London, UK

Rebecca L. Jones, Suwimol Prusmetikul & Sarah Whitehorn

Department of Ophthalmology, Cheltenham General Hospital, Gloucestershire Hospitals NHS Foundation Trust, Alexandra House, Sandford Road, Cheltenham, GL53 7AN, UK

Rebecca L. Jones

Department of Orthopaedics, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand

Suwimol Prusmetikul


Contributions

RJ, SP and SW conceived the study. RJ carried out the search. RJ, SP and SW reviewed and appraised articles. RJ, SP and SW extracted data and synthesized results from articles. RJ, SP and SW prepared the original draft of the manuscript. RJ and SP prepared Figs. 1 and 2. All authors reviewed and edited the manuscript and agreed to the final version.

Corresponding author

Correspondence to Rebecca L. Jones.

Ethics declarations

Ethics approval and consent to participate

Not required for this scoping review.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Jones, R.L., Prusmetikul, S. & Whitehorn, S. Differential attainment in assessment of postgraduate surgical trainees: a scoping review. BMC Med Educ 24, 597 (2024). https://doi.org/10.1186/s12909-024-05580-2


Received: 27 February 2024

Accepted: 20 May 2024

Published: 30 May 2024

DOI: https://doi.org/10.1186/s12909-024-05580-2


Keywords: Differential attainment, Postgraduate



Group-format, peer-facilitated mental health promotion interventions for students in higher education settings: a scoping review protocol

BMJ Open, Volume 14, Issue 6


  • Carrie Brooke-Sumner 1, 2 (http://orcid.org/0000-0002-9489-8717)
  • Mercilene T Machisa 2, 3 (http://orcid.org/0000-0001-7275-1100)
  • Yandisa Sikweyiya 3, 4
  • Pinky Mahlangu 2, 3
  • 1 Mental Health, Alcohol, Substance Use and Tobacco Research Unit, South African Medical Research Council, Cape Town, South Africa
  • 2 School of Nursing and Public Health, College of Health Sciences, Howard College Campus, University of KwaZulu Natal, Durban, South Africa
  • 3 Gender and Health Research Unit, South African Medical Research Council, Pretoria, South Africa
  • 4 School of Public Health, University of the Witwatersrand, Johannesburg, South Africa
  • Correspondence to Dr Carrie Brooke-Sumner; carrie.brooke-sumner@mrc.ac.za

Introduction Young people in higher education face various stressors that can make them vulnerable to mental ill-health. Mental health promotion in this group therefore has important potential benefits. Peer-facilitated and group-format interventions may be feasible and sustainable. The scoping review outlined in this protocol aims to map the literature on group-format, peer-facilitated, in-person interventions for mental health promotion for higher education students attending courses on campuses in high and low/middle-income countries.

Methods and analysis Relevant studies will be identified through conducting searches of electronic databases, including Medline, CINAHL, Scopus, ERIC and PsycINFO. Searches will be conducted using Boolean operators (AND, OR, NOT) and truncation functions appropriate for each database. We will include a grey literature search. We will include articles from student participants of any gender, and published in peer-reviewed journals between 2008 and 2023. We will include English-language studies and all study types including randomised controlled trials, pilot studies and descriptive studies of intervention development. A draft charting table has been developed, which includes the fields: author, publication date, country/countries, aims, population and sample size, demographics, methods, intervention type, comparisons, peer training, number of sessions/duration of intervention, outcomes and details of measures.

Ethics and dissemination No primary data will be collected from research participants to produce this review so ethics committee approval is not required. All data will be collated from published peer-reviewed studies already in the public domain. We will publish the review in an open-access, peer-reviewed journal accessible to researchers in low/middle-income countries. This protocol is registered on Open Science Framework ( https://osf.io/agbfj/ ).

Keywords: Mental health, Public health

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:  https://creativecommons.org/licenses/by/4.0/ .

https://doi.org/10.1136/bmjopen-2023-080629


STRENGTHS AND LIMITATIONS OF THIS STUDY

This scoping review will map literature on group-format, peer-facilitated and in-person interventions for mental health promotion among higher education students globally.

The methods are grounded in established guidance for scoping reviews (Preferred Reporting Items for Systematic Reviews and Meta-Analyses).

The review will include a search of published and grey literature to synthesise the range of interventions for group-based, in-person, peer-facilitated mental health promotion for students.

The protocol is limited in focusing on English-language articles.

Mental health of young people is a global public health priority, 1 with COVID-19 having accentuated the urgency of addressing this area. 2 3 Mental well-being is crucial for higher education (university/college) students to be resilient in facing the demands of academic life 4 as well as being a building block for every society through youth development. Young people entering higher education face a period of substantial change, with new and varied stressors and changing interpersonal relationships, financial circumstances and life challenges. 5–7 Academic and other stresses are associated with the development of mental health conditions. 5 7 Mental distress is thus prevalent in higher education institutions with young female students particularly vulnerable to conditions such as anxiety, depression and post-traumatic stress disorder (eg, 8–11 ). The stressors experienced in student life overlay other vulnerability factors and social determinants 12 for developing a mental health condition, including previous experiences of childhood adversity and trauma, 13 14 and being from a sexual minority group or gender non-conforming. 13 In low/middle-income countries (LMICs) in which human resources for mental health treatments are especially scarce, 15 the potential gains from prevention of mental disorders for young people are significant and may correspond to population-level benefits in the long term. 16 17

Mental health promotion (encompassing building well-being and prevention of development of mental health conditions) remains underprioritised in many settings in comparison with efforts for treatment provision. However, such work is a crucial component of the WHO mix of mental health services (which includes self-care and informal care, primary healthcare and specialist healthcare). 18 Good mental health for young people may be contextually nuanced, but is dependent on a range of domains, including mental health literacy, attitudes towards mental health conditions, self-perceptions, cognitive skills, academic performance, emotions, behaviours, self-management, social skills, quality of significant relationships, physical and sexual health, meaning in life and quality of life. 19 20 With this range of domains, transdiagnostic or common elements of treatment and prevention approaches are relevant, 17 underpinned conceptually by the building of adaptive social and emotional skills, stress management, positive self-perceptions and supportive relationships. Prevention in relation to mental ill-health also encompasses other health behaviours (eg, dietary health, physical activity, sleep) that can have an impact on mental health outcomes. 21 Harnessing common practice elements can enable an evidence-informed approach to prevention even when a specific evidence-based intervention may be lacking (as may be the case in many LMIC settings where this research area is less developed). 17 22

Given the breadth of domains of good mental health for higher education students, mental health promotion and prevention interventions necessarily cover a wide variety of interventions at the individual, community and societal levels 23 which can be universal (population based) or selective (targeting individuals or subgroups at higher risk). 16 24 Young people in higher education settings present a particular set of opportunities for preventive and promotive interventions given the social connectedness of student populations. Many mental health conditions have their onset in adolescence. This is a key point for intervention and although not the focus of this review, there is developing evidence that school-based prevention can be effective in improving mental health literacy and reducing mental health stigma. 25–28

There is also a developing systematised evidence base for higher education student mental health promotion from high-income countries (HICs). Systematic reviews of prevention programmes for student mental health showed moderate effects for common practice elements including psychoeducation (mental health literacy training), relaxation techniques and cognitive restructuring. 19 Guided mental health skills training programmes 22 29–31 and computer and web-based interventions delivered by a variety of professionals 32–34 similarly showed moderate effects. Current developments in mental health promotion include a movement towards a positive conceptualisation of mental health literacy which is not focused on ‘illness’ and symptoms. Within this conceptualisation, positive mental health literacy includes problem-solving ability, independence, relational skills and self-control. 35 A similar evidence base for student mental health promotion in LMICs is required 36 building on evidence for approaches from HICs, but with consideration for contextual differences in relation to availability of human and other resources.

Most notably, a key aspect of mental health prevention interventions is the delivery agent. In LMIC settings, specialists such as psychologists, social workers and counsellors are scarce 37 and task-sharing (service provision by non-specialists with upward referral if necessary) has long been advocated as a feasible approach for mental health service delivery. 23 38 39 Digital interventions have also been put forward as a low-resource approach for mental health promotion. 36 Evidence for digital interventions for mental health and access to technology in HICs and LMICs is growing and indicates potential for these approaches, in conjunction with other supportive service options. 40 41 However, in-person interventions enable building of rapport and support, and are needed in contexts where access to computer or mobile technology may be limited. Group approaches may be a feasible low-resource approach in some LMIC contexts. Hence, in-person group approaches are the focus of this review, which will inform development of in-person group interventions. The group format presents the opportunity for reaching higher numbers of students in a less resource-intensive approach than individual-level support, which may be advantageous for mental health promotion. 36 Further to this, peer-led approaches, in which people who share a common lived experience (eg, being a higher education student) are involved in providing services, have traction in other areas of public health, for young people in particular (eg, 42 43 ) and in other areas of mental health service provision (eg, 44 45 ). Peer-facilitated approaches may be particularly appropriate for mental health promotion among young people given the importance of peer influence at this point in the life course. 46 These approaches have the potential to be low resource and sustainable in resource-constrained settings, 42 though still requiring investment and engagement 47 to be productively implemented.

This scoping review will map the evidence for in-person, group-based, peer-facilitated mental health promotion interventions for young people in higher education settings globally. Once conducted, this review will enable progress in bringing together evidence for such preventive approaches that are applicable for resource-constrained LMIC contexts. This will serve as a step towards building the field of evidence-informed mental health promotion among students in higher education settings in LMICs.

Research questions

What range of in-person, group-based, peer-facilitated mental health promotion interventions are offered to higher education students on campuses globally?

What are the common practices, processes and outcome measures used for these interventions?

Objectives

To map the literature on group-format, peer-facilitated, in-person interventions for mental health promotion for higher education students on campuses globally.

To describe these peer-facilitated group interventions and the ways they have been evaluated.

To describe common practice elements and those identified as having positive effects, no effects or negative effects, if possible.

Methods and analysis

Methodology for the review will be based on that presented by Arksey and O’Malley 48 and further developed by Levac et al. 49 A key recommendation from this development relates to clarifying and linking the purpose of the review with the research question. 49 The purpose of the review will be to identify common practice elements of group-format, peer-facilitated mental health promotion interventions used in higher education populations. We will include a grey literature search as recommended by Joanna Briggs Institute guidance. 50 51 We will also use an iterative approach to identifying and selecting studies, 49 including identifying studies from the reference lists of relevant studies, with discussion within the review team. The following steps will be undertaken to investigate the review question and are detailed in the sections that follow:

Identifying the research question.

Identifying relevant studies.

Study selection.

Charting the data.

Collating, summarising and reporting results.

The Population, Concept, Context mnemonic has been identified as appropriate for scoping review methodology. 50 The population of interest is defined as students studying at higher education campuses; the concept is defined as any mental health promotion intervention delivered in person by a peer and in a group format; context is defined as articles from HICs and LMICs. This scoping review therefore has the following research questions:

Primary research question

What evidence is available globally for peer-facilitated, group-format in-person interventions for mental health promotion for students on higher education campuses?

Secondary research questions

What are the common practice elements and delivery methods for interventions reported in this body of literature?

What research gaps are identified by the literature included in the scoping review?

The review will be conducted from June to December 2024. Relevant studies will be identified through searches of electronic databases, including Medline, CINAHL, Scopus, PsycINFO and ERIC. Searches will be conducted using Boolean operators (AND, OR, NOT) and truncation functions appropriate for each database. Search strings will be developed in line with literature on peer approaches in higher education 52 and according to each database. Searches will include, for example: “mental health literacy” OR “mental health promotion” OR “psychoeducation” AND “intervention” OR “program” OR “mental health intervention” AND “group intervention” OR “peer” OR “peer support” OR “peer health education” AND “student” OR “campus” OR “university” OR “college” AND “depression” OR “anxiety” OR “mental health literacy” OR “positive mental health” OR “self care” OR “resilience” OR “stress management”. We will seek assistance from a specialist librarian in refining search strings for databases based on these key terms to generate search results feasible for review by the study team. We will take an iterative approach to the searches 50 to include relevant keywords and to maintain feasibility of the number of records to review. Review authors will conduct backward citation tracking for all articles selected for inclusion as an approach to identify further relevant studies. If required, review authors will contact corresponding authors of these articles to identify relevant studies. 50 Relevant studies will be downloaded and managed using Rayyan software to conduct screening. The grey literature search will be conducted with the terms above using databases such as WorldCat, OpenMD, Eldis, OpenGrey, Canadian Agency for Drugs and Technologies in Health and Grey Matters. Programme reports, case studies and manuals will be included.
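To make the search construction concrete, the following minimal sketch (our illustration, not part of the protocol) shows how the example term groups above could be OR-combined within concepts and AND-combined across concepts, before adapting field tags and truncation to each database:

```python
# Illustrative only: combine the protocol's example term groups into one
# Boolean query string. Real searches would adapt truncation and field
# tags to each database (Medline, CINAHL, Scopus, PsycINFO, ERIC).

concept_groups = [
    ["mental health literacy", "mental health promotion", "psychoeducation"],
    ["intervention", "program", "mental health intervention"],
    ["group intervention", "peer", "peer support", "peer health education"],
    ["student", "campus", "university", "college"],
    ["depression", "anxiety", "positive mental health", "self care",
     "resilience", "stress management"],
]

def build_query(groups):
    """OR the terms within each concept group, then AND the groups together."""
    ored = ["(" + " OR ".join(f'"{term}"' for term in group) + ")"
            for group in groups]
    return " AND ".join(ored)

print(build_query(concept_groups))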

Titles and abstracts of identified studies will be subjected to double-screening for relevance, with records divided among authors to facilitate faster screening. Titles and abstracts will be screened according to the inclusion and exclusion criteria developed in advance for answering the review question (figure 1). Articles identified as relevant in this first screening process will then be included in a full-text review conducted by all authors (double-screening, with two authors reviewing each article). Agreement and disagreement on full-text inclusion will be discussed by the reviewing authors, and any disagreements resolved through discussion with the full review team.

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram describing the process of article review and inclusion and exclusion for the scoping review.

Inclusion criteria

The purposes of the scoping review mean that included articles will describe in-person interventions delivered on higher education campuses in HICs and LMICs, by peers. Articles from HICs will be included as there may be practice elements and delivery approaches from HICs that are relevant for advancing the field in LMICs, where peer-delivered approaches may be less well developed. We will include articles that describe group-format interventions delivered in higher education settings among student participants of any gender, and published in peer-reviewed journals between 2008 and 2023 (a 15-year range, for feasibility in the number of records to review and to capture recent developments in the field). We will include English-language studies and all study types including randomised controlled trials, pilot studies and descriptive studies of intervention development. We will not exclude papers that do not report where the study was conducted.

Exclusion criteria

Articles describing individual and group interventions delivered by clinical professionals (eg, psychologists, social workers, counsellors or other university staff) will be excluded, as well as articles reporting on clinical treatment interventions as opposed to prevention interventions. Articles reporting on interventions not delivered in person, by peers or in groups (eg, online/web based) will also be excluded. Non-English-language papers and those published outside of the time period indicated will be excluded.

Data extraction and management

Data extraction will be conducted from PDF or other electronic files into an Excel-based data management file (online supplemental file 1). We will conduct a pilot test of this abstraction form with initial articles to ensure consistency in data extraction between team members.


The charting process will provide a descriptive summary of the evidence presented in included studies, 50 as the initial step towards answering the primary and secondary research questions. The draft charting table includes the fields: author, publication date, country/countries, aims, population and sample size, methods, demographics of peers and/or target population, intervention type, peer training, number of sessions/duration, retention, comparisons, duration of intervention, outcomes and details of measures. The data charting table completed by each reviewer will be collated and reviewed by the whole review team to address any inconsistencies.
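As an illustration only, one row of the charting table could be represented as follows. The snake_case field names paraphrase the list above, and CSV stands in for the Excel-based data management file described in the protocol:

```python
# A minimal sketch of the charting (data extraction) table, using the
# fields listed in the protocol. Field names are our paraphrases; the
# actual file is an Excel-based data management file.
import csv

CHARTING_FIELDS = [
    "author", "publication_date", "country", "aims",
    "population_and_sample_size", "methods",
    "demographics_of_peers_or_target_population", "intervention_type",
    "peer_training", "number_of_sessions", "retention", "comparisons",
    "duration_of_intervention", "outcomes", "measures",
]

with open("charting_table.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=CHARTING_FIELDS)
    writer.writeheader()
    # Missing fields are left blank; values here are placeholders.
    writer.writerow({
        "author": "Example et al.",
        "publication_date": "2020",
        "country": "South Africa",
        "intervention_type": "peer-facilitated group workshops",
    })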

Assessment of study quality

For this scoping review, we will not conduct a formal assessment of risk of bias using one of the established reporting tools (eg, 53 54 ) as the aim is to provide an overview of existing evidence regardless of methodological quality. We will however use the Template for Intervention Description and Replication (TIDieR) 55 checklist to assess completeness in the reporting of included intervention studies. Available since 2014, this guide may have been used by study authors to describe interventions and can be used in this scoping review to provide a characterisation of interventions along with charting of data.

Summarising and reporting findings

Appropriate tables will be used to give the reader an overall picture of the spread of evidence across publication years and country of origin. A narrative description of the studies will also be presented (including the TIDieR characterisation of interventions), aligning findings with the primary and secondary research questions. 50 For ease of use for the reader, findings will be presented under the conceptual headings ‘Definition of peer’, ‘Intervention practice elements’, ‘Intervention delivery’ and ‘Research gaps’. The final manuscript will be prepared using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews. 56 Depending on the amount of literature found, data abstraction may also include categorising the extracted information into themes to identify patterns across the literature, and according to high-income, low-income or lower middle-income country status (given differences in availability of resources across these settings).
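As a sketch only, the planned spread-of-evidence table could be produced as a year-by-country cross-tabulation. The pandas tooling and the rows below are our assumptions for illustration; the protocol does not prescribe particular software:

```python
# Illustrative only: cross-tabulate included studies by publication year
# and country of origin. pandas is assumed tooling, and the rows are
# made-up placeholders, not review data.
import pandas as pd

studies = pd.DataFrame({
    "year":    [2010, 2015, 2015, 2021],
    "country": ["USA", "USA", "South Africa", "India"],
})

# Counts per year/country combination, with row and column totals.
print(pd.crosstab(studies["year"], studies["country"], margins=True))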

Ethical considerations and dissemination

No primary data will be collected from research participants to produce this review. All data will be collated from published peer-reviewed studies already in the public domain. This study therefore does not require ethical approval from a Human Research Ethics Committee. To minimise potential bias and ensure an accurate reflection of the scope of the literature, we have a review team with four members who will work collaboratively. These researchers are senior scientists at the South African Medical Research Council, with backgrounds in quantitative and qualitative research in public health and mental health. We will publish the review in an open-access, peer-reviewed journal accessible to researchers in LMICs. We will also disseminate findings through our established Community Advisory Board and stakeholder engagement platforms of the South African Medical Research Council.

Patient and public involvement

Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

The United Nations Sustainable Development Goals (SDGs) prioritise mental wellness in SDG 3. 57 This indicates high-level commitment to improvement in mental well-being, which requires a response from all sectors of society, and quality research to guide this response. 58 Addressing the social determinants of mental health, 12 that is, the social context that precedes the development of mental health conditions (eg, through mental health promotion), holds great potential for preventing the growing burden of mental disorders globally, and among young people in particular, who make up a large proportion of the global population. Young people in higher education face developmental and academic stressors which can make them vulnerable to mental health conditions. 59–61

Prevention efforts that promote mental health will need to begin to address vulnerability as well as improve coping and resilience. Intervention research to develop and test interventions in this area is required. 58 The scoping review outlined in this protocol will contribute to moving the mental health promotion field in LMICs forward through mapping group-format, peer-facilitated mental health promotion practice elements. This will add to the body of work on adaptation of complex interventions, 62 63 incorporating learning from HIC settings with practice elements that are appropriate and can be tailored for LMIC settings. Evidence for approaches that are feasible, acceptable and, crucially, tailored for LMIC contexts is required, given the risk factors associated with mental health conditions in these settings. 60 64 For example, such approaches may give prominence to developing social support, promoting aspects of well-being including social, academic and spiritual well-being, and building resilience and coping strategies which are crucial in LMIC contexts. 65 Through mapping the body of literature, this scoping review will also contribute evidence on the state of the field of task-sharing for mental health promotion in LMIC settings. Finally, findings from the review will identify gaps and opportunities for researchers and practitioners interested in implementing group-format, peer-led approaches for improving mental health for students in LMICs and in HICs.

Ethics statements

Patient consent for publication

Not applicable.


Supplementary materials

Supplementary data

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

  • Data supplement 1

Contributors All authors conceptualised the study and developed the methods collaboratively. PM, MTM and YS contributed to conceptualisation of research questions, development of search strategy and scoping review methods. CB-S led the development of the research questions and methods and produced the initial draft of this manuscript. All authors reviewed and approved the final manuscript.

Funding CB-S, MTM, PM and YS are supported in this work by funds from the South African Medical Research Council Flagship projects funding.

Competing interests None declared.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


  • Open access
  • Published: 31 May 2024

Public involvement and engagement in scientific research and higher education: the only way is ethics?

  • Claire Nollett 1 ,
  • Matthias Eberl 2 , 3 ,
  • Jim Fitzgibbon 4 ,
  • Natalie Joseph-Williams 5 , 6 &
  • Sarah Hatch 7  

Research Involvement and Engagement volume 10, Article number: 50 (2024)


Involving and engaging the public in scientific research and higher education is slowly becoming the norm for academic institutions in the United Kingdom and elsewhere. Driven by a wide range of stakeholders including regulators, funders, research policymakers and charities, public involvement and public engagement are increasingly seen as essential in delivering open and transparent activity that is relevant and positively impacts our society. It is obvious that any activities involving and engaging members of the public should be conducted safely and ethically. However, it is not clear whether conducting activities ethically means they require ethical approval from a research ethics committee.

Although there is some guidance available from government organisations (e.g. the UK Health Research Authority) to suggest that ethical approval is not required for such activities, requests from funders and publishers to have ethical approval in place are commonplace in the authors’ experience. We explore this using case studies from our own institution.

We conclude that any public-facing activity whose purpose is to systematically investigate the knowledge, attitudes and experiences of members of the public, as research with human participants, requires prior approval from an ethics committee. In contrast, engaging and involving members of the public and drawing on lived experience to inform aspects of research and teaching does not. However, lack of clarity around this distinction often results in the academic community seeking ethical approval ‘just in case’, leading to wasted time and resources and erecting unnecessary barriers to public involvement and public engagement. Instead, ethical issues and risks should be appropriately considered and mitigated by the relevant staff within their professional roles, be it academic or a professional service. Often this can involve following published guidelines and conducting an activity risk assessment, or similar. Moving forward, it is critical that academic funders and publishers acknowledge the distinction and agree on an accepted approach to avoid further exacerbating the problem.

Plain English summary

Involving and engaging members of the public is recognised best practice in university research and teaching. Involvement and engagement activities (for instance, working with the public to design a research study) continue to increase in priority and are an important part of an academic’s role. However, there is often confusion amongst researchers and educators around whether involving the public in these activities requires prior ethical approval, similar to what would be the case when inviting members of the public to participate in a clinical research study, or to donate samples such as blood for experiments. As an example, sometimes researchers are asked for ethical approval by scientific journals when trying to publish the findings from their public involvement and engagement work, when in fact this is not needed. The ongoing uncertainty about the difference between actual research on the one hand and public involvement and engagement on the other wastes precious time and resources, and is a barrier to scientists working with the public. We have developed guidance for academic staff on when ethical approval is and is not required, using examples from our own experience. We wrote this article to bring awareness to this problem; share our views with the wider academic community; encourage discussion around the problem and possible solutions; and ultimately contribute to educating on when research ethics approval is needed, and when not.


Public involvement (PI) is ‘important, expected and possible in all types of health and social care research’ [ 1 ]. It is now commonly embedded and reported in health research papers in the UK, with approximately half mentioning public involvement activities [ 2 ]. Public engagement (PE) is also encouraged and recognised by funders and other stakeholders across the higher education sector to raise awareness, increase trust and transparency, share knowledge, foster learning and deliver positive impact to society [ 3 ].

In 2019, the UK Standards Partnership published the UK Standards for Public Involvement ‘to help researchers and organisations improve the quality and consistency of public involvement in health and care research’ [ 4 ], and a large knowledge base is developing around how to do public involvement well. However, PI is not without its challenges, as identified both in the literature (e.g. [ 5 ]) and through our own experience as academic researchers, professional services staff and members of several national public involvement committees. Key issues include how to efficiently pay and reimburse public contributors within organisations, how to effectively evaluate the impact, and how to provide inclusive opportunities and reach under-served groups to increase the diversity of those involved [ 6 ].

The Research Excellence Framework (REF) 2029, the UK’s national assessment of the quality of research produced by its higher education institutions held every 6–7 years, will see a 25% weighting of returns with respect to the social, economic and political influence of the research conducted. The 2029 round will in fact be the first REF assessment where impact will be measured as “Engagement and Impact” (our emphasis), alongside an accompanying statement to evidence engagement and impact activity beyond case studies [ 7 ]. As with PI, researchers face challenges in delivering PE, including achieving the inclusion of under-served communities [ 8 ] and how to evaluate impact [ 3 ].

With individual researchers and their host institutions increasingly embracing PI and PE as part of their research and scholarship activities, there is one issue that we have found particularly contentious with researchers, employers, funders and publishers across both involvement and engagement, and that is the focus of this commentary: the role of ethical approval in PI and PE activity.

Public involvement, sometimes referred to as Patient & Public Involvement (PPI) in health and social care research, is defined as ‘research being carried out ‘with’ or ‘by’ members of the public, rather than ‘to’, ‘about’ or ‘for’ them’ [ 9 ]. PE, adopting the UK’s National Coordinating Centre for Public Engagement’s definition, is a ‘myriad of ways in which the activity and benefits of higher education and research can be shared with the public’ [ 10 ]. PE is by definition a two-way process, involving interaction and listening, with the goal of generating mutual benefit. Both PI and PE are distinct from human participation in research whereby a member of the public agrees via informed consent to be a participant in research, e.g. receiving a study intervention, donating samples or sharing lived experiences. Whilst health and social care research involving human participants requires approval from a research ethics committee (REC), PI and PE activities typically do not.

In the UK, ethical approval is granted by a REC under the auspices of the National Health Service (NHS) for research on patients or healthcare professionals, or by a local review committee or panel for research that does not include NHS patients. In academic research, this would usually be a university or school REC (referred to here as an Institutional Review Board, IRB). Other countries may use different approaches, but the general need for RECs to approve research with human participants is ubiquitous. With regard to public involvement, the UK Health Research Authority (HRA), which is responsible for all NHS RECs, explicitly states that ‘You do not need to submit an application to a Research Ethics Committee in order to involve the public in the planning or the design stage of research, even if the people involved are patients’ [ 11 ]. This advice would also apply to university ethics committees. However, despite this clear distinction, we have encountered and become aware of situations in which investigators were asked to acquire ethical approval for activities with the public – including PI, PE and impact activities. This highlights a potential misunderstanding of the nature of PI and PE, and their role alongside research. Whilst either activity can raise ethical considerations for the individuals involved, requests to acquire research ethics approval for PI and PE need to be challenged within the academic community to increase awareness, understanding and best practice around these activities. Seeking unnecessary approval adds a heavy additional burden on researchers which effectively acts as a barrier to carrying out PI and PE; can significantly delay timely activities; and uses valuable resources.

We propose that the requests to gain ethical approval for PI and PE activities stem largely from three main issues.

Firstly, ‘grey’ areas, such as a blurring of the boundary between qualitative research and PI and PE activities, including confusion amongst the research community over the differences between research involvement, engagement and participation.

Secondly, a perception amongst the research community that it is best to seek ethical approval ‘just in case’ or to ‘be on the safe side’, e.g. if asked by journal editors when trying to publish, rather than complete appropriate risk assessments to address any ethical considerations when carrying out PI and PE.

And finally, lack of knowledge of an alternative recognised process on how to evidence that PI and PE activities with the public have been conducted in an ethical manner, if not approved by an NHS REC or local IRB.

Despite guidance indicating other ways to address ethical concerns in PI and PE [ 12 , 13 , 14 , 15 , 16 , 17 , 18 ], researchers, funders and publishers appear to be turning increasingly to university IRBs as the (perceived) ultimate arbiters of deciding ethical issues related to PI and PE activities. We see the need to highlight this as a growing problem and suggest ways the issues above can be overcome. We will firstly explore in more detail the distinction between qualitative research and PI and PE activities before outlining examples from our own experience around the three issues identified, and then proceeding to make recommendations for moving forwards.

Public involvement and engagement vs. qualitative research

Distinguishing between whether activities with members of the public constitute PI and PE or qualitative research (and therefore require ethical approval) is a particularly ‘grey’ area [ 19 ]. This is especially true when consulting with a number of people at one time in what is usually referred to as a ‘focus group’. Going forward, it may be helpful to distinguish between ‘focus groups’, which are used for research, and ‘discussion groups’ used for PI and PE [ 20 ].

Several authors and organisations have described the difference between the two activities and developed useful side-by-side comparisons [ 19 , 20 ]. In focus groups which are part of research, people attending are research participants who receive a standard Participant Information Sheet and provide informed consent. Their input will usually be recorded via an audio device, transcribed verbatim, treated as ‘data’, and systematically be analysed to answer a research question. For this, ethical approval is usually required. On the other hand, the contributions of people attending PI discussion groups will be recorded only as key points (e.g. a list of key themes emerging or key priorities discussed by the group in relation to a specific topic) to help shape and guide the research itself, such as agreeing which research outcome measures to use, helping to shape the intervention or the development of data collection materials like participant information sheets or interview guides. PI discussion groups do not require ethical approval but should be conducted in an ethical manner. Those involved should still be provided with information about the activity up front to ensure they are clear what their involvement will entail, and they may be asked to provide agreement or consent, but not in the formally documented way required for research. This is discussed in more detail in the recommendations.

Another grey area concerns whether direct quotes gathered from people in a discussion group can be used in a publication. Whilst ethical approval is not required for this, we do advise gaining documented agreement if you wish to do this, e.g. an email from the group member agreeing to quotes being used in a publication to illustrate the key points identified (not as data). In some cases, researchers will need to combine PI activities with a qualitative research approach and there may be confusion regarding which activities require approval. For example, an investigator may wish to interview new mothers as research participants to get their views on motherhood (research participation). This would require ethical approval. But prior to interviews, they may want to involve a separate group of new mothers in a discussion to help shape the topic guide for the interviews (PI). This would not need ethical approval [ 21 ].

The extent of the problem: examples from our own experience

Through requesting examples from colleagues on their experiences, we uncovered many different situations within our own institution highlighting a difference of opinion on whether research ethics should be sought for PI and PE activity. We here outline three examples, giving the background to the project, the activity undertaken and the issues encountered.

Writing a training program with charity service users and staff – request from charity and publication to seek ethical approval from the university IRB for the project.

This project involved service users and charity staff in writing a mental health training curriculum for staff to identify depression in service users. Staff and service user input was sought through online meetings and email feedback. The attendees gave their opinions (based on their lived experience) on what should be included in the curriculum, and the key points were summarised to inform curriculum development. The information they gave was not treated as data to answer a research question and was not systematically analysed using qualitative methods. In this respect, the HRA states that ‘if you are collecting opinions rather than study data, your activity is likely an involvement activity’ [ 22 ].

Regardless of the above considerations, the project lead was asked by third sector organisations to seek university IRB approval, to ensure the service users would be treated in an ethical manner. An academic colleague agreed this was a good idea ‘just in case’ it was questioned by others, in particular by a journal editor when seeking to publish (which indeed it was). However, we view this as unnecessary given the activity was not classed as research and therefore not in the remit of the IRB. The IRB provided written agreement that ethical review was not required for this project and the project team agreed a standard engagement risk assessment would consider and address any ethical issues.

Co-producing an educational online resource for school children – request from publication to seek ethical approval for the project.

This co-production project, working with researchers, a PI and PE professional, school teachers and web designers, aimed to develop an educational online resource for school age children and their teachers. This interdisciplinary team of experts was involved in four online workshops to support the delivery and development of a website that would support teachers and enhance learning. All individuals involved fully signed up to the co-production focus of the project and provided verbal agreement to take part in the workshops and off-line discussions. However, when trying to publish the co-production process, the journal editor stressed that according to journal policy ‘research involving human subjects, human material or human data must have been approved by an appropriate ethics committee’.

The authors explained that the project did not involve human subjects, human material or human data (as it was not research) and therefore in their opinion did not require ethical approval. The journal editor disagreed, arguing that the project was a research study that collected and analysed data, and that the teachers and web designers involved in this project were human participants of the study and data had been generated of their opinions. The editor recommended seeking either retrospective ethical approval or else removing all human data. The team saw no alternative but to withdraw their original manuscript and submit the work elsewhere.

Co-production project involving people from minority ethnic backgrounds in discussion about inclusive health research – project investigators not comfortable including quotes from public contributors due to lack of informed consent.

This project, involving researchers, an artist, charity project workers serving the most ethnically diverse ward in Wales, and local residents, aimed to answer the question: 'How can people from minority ethnic backgrounds influence health research in terms of both what and how this research is done?' Eight co-production workshops drawing on the participatory democracy approach were held, delivering a set of recommendations for the health research community. In advance of these workshops, a university IRB Chair helped to clarify that ethical approval was not needed.

When publishing this work, the researchers did not include quotes obtained from the workshops because informed consent had not been sought (as it was not research) [23]. On reflection, the authors would have liked to gain agreement for the residents' quotes to be used, even in the absence of a requirement for documented informed consent.

Identified exceptions

Whilst PI and PE activities do not generally require ethical approval, there are at least two scenarios where approval is required. The first is when systematically comparing two methods of involvement and/or engagement to understand which is better, i.e. answering a research question about PI/PE to produce generalisable or transferable findings. The second is when public members come into direct contact with study participants or their data, e.g. when assisting with research interviews or analysing the transcripts. In this situation, ethical approval is required because human participants are involved in the research.

Recommendations for moving forwards

We encourage the research community, including researchers, publishers, reviewers, funders and ethics committees, to better appreciate the difference between PI and PE and research involving human participants; to recognise that all involved stakeholders operate within professional boundaries; and to work together to agree an alternative accepted approach when PI and PE activity raises ethical considerations (e.g. when working with vulnerable groups or publishing public contributor quotes). The responsibility for determining whether research ethics approval is required falls on the individuals or team planning the activity. We understand that it is tempting to seek research ethics approval for PI and PE activity 'just in case' or 'to be on the safe side', but we believe this is detrimental for several reasons, including that it:

Sustains the confusion between qualitative research and PI and PE activity, and the different purposes of each.

Wastes valuable researcher and committee time and resources.

Undermines the importance of the research ethics approval process.

Delays PI and PE activities in the research process, potentially leading to missing out on the benefits of earlier involvement.

Undermines co-production principles such as equality and shared responsibility between researchers and members of the public. The process of acquiring ethical approval itself asserts a hierarchy whereby one researcher is identified as Chief/Principal Investigator and other members of the team are listed below an identified leader.

Acts as an additional barrier and disincentive to researchers carrying out PI and PE activity.

Figure 1. Simple flow diagram to support researchers in deciding on the need for research ethics approval via an IRB.

This growing problem needs to be addressed through education and through solutions acceptable to the community as a whole, providing confidence in the decisions made and assurance that health and safety and any risks associated with the proposed PI and PE activity have been carefully considered and approved. Here we present key recommendations, based on our internal guidance (Appendix 1), for those conducting public involvement and engagement activities, offering alternative courses of action when faced with these challenges.

Purpose – Consider the purpose of the activity. Is it to answer a scientific or clinical question (research), or to help shape, guide or disseminate the research (PI/PE)? If you are unsure whether your project is research, you can consult the UK Health Research Authority's 'Is my study research?' decision tool. After you answer three questions (1. Are the participants in your study randomised to different groups? 2. Does your study protocol demand changing treatment/care/services from accepted standards for any of the patients/service users involved? 3. Is your study designed to produce generalisable or transferable findings?), the tool confirms whether your study would be considered research. The result can be downloaded, and further advice can be sought [24]. The HRA table 'Defining Research' can also help provide clarification [25]. A minimal sketch of this decision logic is shown below.
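
To make the triage concrete, the sketch below encodes the three HRA questions as a small Python function. It is only an illustration of the decision rule described above: the function name and the simplification that a single 'yes' indicates research are our assumptions, and the official tool remains authoritative.

```python
def is_likely_research(randomised: bool,
                       changes_standard_care: bool,
                       generalisable_findings: bool) -> bool:
    """Encode the three HRA screening questions: answering 'yes' to any
    one suggests the activity is research rather than PI/PE.
    Simplified sketch; the official tool at
    https://www.hra-decisiontools.org.uk/research/ is authoritative."""
    return randomised or changes_standard_care or generalisable_findings

# Example: a PI/PE workshop gathering opinions to shape a curriculum.
print(is_likely_research(False, False, False))  # False -> likely PI/PE, not research
```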

Internally, a simple flow diagram (Fig. 1) has been created to support researchers in deciding on the need for research ethics approval when carrying out public involvement activity.

Risk assessment – To ensure PI and PE activities are conducted in a safe and ethical manner, particularly when engaging and/or involving 'vulnerable' groups, refer to published guidance on conducting ethical PI and PE [12, 13, 14, 15, 16], consider completing a specifically designed PI and PE risk assessment (see Appendix 2 for an example) or using the PIRIT tool [26] to assess your planned activities, and undertake adequate training. Use the same considerations as you might for research or teaching, e.g. what to do if an individual becomes upset in a discussion group, how to support them, and where to refer them. Also consider safety and the protection of anonymity and confidentiality of personal data. Use the UK Standards for Public Involvement [4] to guide your thinking around accessibility and inclusivity when completing the assessment. If possible, involve a public contributor, and have the assessment signed off by a senior academic or responsible member of staff in your organisation.

Adequate information and agreement to take part – Ensure that public members invited to take part in PI and PE activity agree for you to use their anonymous quotes in any output. Note, however, that standard Participant Information Sheets and Informed Consent Forms are not needed, as formal consent is not required.

Language – To avoid confusion for reviewers and publishers, think carefully about the language you use to describe your PI and PE activities. For example, use the term 'discussion group' rather than 'focus group'; refer to members of the public as 'attendees' not 'participants' and their input as 'contributions' rather than 'data'; and describe 'summarising key points or themes' as opposed to 'thematic analysis' (if that is indeed what you are doing). An illustrative mapping of these terms follows below.
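
Purely as an illustration of this renaming advice, a hypothetical terminology map might look like the following; the dictionary and the naive substitution function are ours, not part of any published guidance.

```python
# Hypothetical map from research-style terms to PI/PE-appropriate wording.
PI_PE_TERMS = {
    "focus group": "discussion group",
    "participants": "attendees",
    "data": "contributions",
    "thematic analysis": "summarising key points",
}

def reword(text: str) -> str:
    """Naive substitution pass over a draft description of PI/PE activity."""
    for research_term, pi_pe_term in PI_PE_TERMS.items():
        text = text.replace(research_term, pi_pe_term)
    return text

print(reword("We ran a focus group and analysed the data."))
# -> 'We ran a discussion group and analysed the contributions.'
```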

Written confirmation – Some institutions have established infrastructure to support researchers through a self-assessment process for governance and ethics, providing a confirmatory statement as to whether ethical approval is required, which can be produced if challenged by funders and publishers [27]. However, not all institutions have this facility, and until this area of contention is resolved, some individuals may wish to seek written confirmation from their local IRB. In our experience, a letter confirming that approval is not required is accepted by journal editors. Liaise with your local IRB to determine whether this is within their remit.

Training – Researchers and support staff need training on when to seek ethical approval and on how to manage the ethical, risk, and health and safety aspects of PI and PE in a considered, widely accepted and non-burdensome way.

Conclusions

Our experience suggests that ambiguity remains in the academic community about whether ethical approval is needed for PI and PE activities. We believe this stems from: (1) the grey area between qualitative research and PI and PE activities; (2) researchers seeking approval 'just in case' it is requested by funders, publishers or authorities (based on previous experience); (3) funders, publishers and authorities not being clear on the distinction and equally asking for approval 'just in case'; and (4) the lack of an alternative recognised way to evidence that ethical issues have been considered and mitigated. We have used real-world examples to demonstrate the issues encountered in a single institution and make several recommendations aimed at researchers for addressing this area of contention going forward. We appreciate that our views may be framed by our experience of conducting PI and PE in a healthcare context in the UK, and the experiences of researchers in other disciplines and countries may vary significantly.

We hope this commentary triggers debate in the community to highlight, educate and clarify the position on research ethics and PI and PE activity amongst researchers, funders and journal editors. Our experience shows that this issue is effectively acting as a barrier to researchers conducting PI and PE activity and publishing PI and PE learning. An alternative recognised process needs to be established by the community to resolve this growing problem.

Data availability

Not applicable.

Abbreviations

HRA: Health Research Authority

IRB: Institutional review board

PE: Public engagement

PI: Public involvement

REC: Research Ethics Committee

REF: Research Excellence Framework

1. Health Research Authority. Putting people first – embedding public involvement in health and social care research. 2023. https://www.hra.nhs.uk/planning-and-improving-research/best-practice/public-involvement/putting-people-first-embedding-public-involvement-health-and-social-care-research/

2. Lang I, King A, Jenkins G, Boddy K, Khan Z, Liabo K. How common is patient and public involvement (PPI)? Cross-sectional analysis of frequency of PPI reporting in health research papers and associations with methods, funding sources and other factors. BMJ Open. 2022;12:e063356.

3. Eberl M, Joseph-Williams N, Nollett C, Fitzgibbon J, Hatch S. Overcoming the disconnect between scientific research and the public. Immunol Cell Biol. 2023;101(7):590.

4. UK Public Involvement Standards Development Partnership. UK Standards for Public Involvement. 2019. https://drive.google.com/file/d/1U-IJNJCfFepaAOruEhzz1TdLvAcHTt2Q/view

5. Staniszewska S, Denegri S, Matthews R, Minogue V. Reviewing progress in public involvement in NIHR research: developing and implementing a new vision for the future. BMJ Open. 2018;8(7):e017124.

6. Hatch S, Fitzgibbon J, Tonks AJ, Forty L. Diversity in patient and public involvement in healthcare research and education—realising the potential. Health Expect. 2023;27(1):e13896.

7. Research Excellence Framework. Research Excellence Framework 2029. 2024. https://www.ref.ac.uk/

8. Nguyen Thanh H, Cheah P, Chambers M. Identifying 'hard-to-reach' groups and strategies to engage them in biomedical research: perspectives from engagement practitioners in Southeast Asia. Wellcome Open Res. 2019;4(102).

9. National Institute for Health and Care Research. Briefing notes for researchers: public involvement in NHS, health and social care research. 2021.

10. National Co-ordinating Centre for Public Engagement. Introducing Public Engagement. 2024. https://www.publicengagement.ac.uk/introducing-public-engagement

11. Health Research Authority. What do I need to do? 2020. https://www.hra.nhs.uk/planning-and-improving-research/best-practice/public-involvement/what-do-i-need-do/

12. Pandya-Wood R, Barron D, Elliott J. A framework for public involvement at the design stage of NHS health and social care research: time to develop ethically conscious standards. Res Involv Engagem. 2017;3(3).

13. Canadian Institutes of Health Research. Ethics Guidance for Developing Partnerships with Patients and Researchers. 2020. https://cihr-irsc.gc.ca/e/51910.html#2

14. Hersh D, Israel M, C. S. The ethics of patient and public involvement across the research process: towards partnership with people with aphasia. Aphasiology. 2021.

15. Abma T, Groot B, Widdershoven B. The ethics of public and service user involvement in health research: the need for participatory reflection on everyday ethical issues. Am J Bioeth. 2019;19:23–5.

16. Groot B, Abma T. Ethics framework for citizen science and public and patient participation in research. BMC Med Ethics. 2022;23.

17. National Institute for Health and Care Research. Ethical dimensions of community engagement and involvement in global health research. 2021. https://www.nihr.ac.uk/documents/ethical-dimensions-of-community-engagement-and-involvement-in-global-health-research/28258#references

18. Martineau JT, Minyaoui A, Boivin A. Partnering with patients in healthcare research: a scoping review of ethical issues, challenges, and recommendations for practice. BMC Med Ethics. 2020;21.

19. Hanley B, Staley K, Stewart D, Barber R. Qualitative research and patient and public involvement in health and social care research: what are the key differences? 2019. https://www.learningforinvolvement.org.uk/content/resource/qualitative-research-and-patient-and-public-involvement-in-health-and-social-care-research-what-are-the-key-differences/

20. Doria N, Condran B, Curtis Maillet LB, Dowling D, Levy LA. Sharpening the focus: differentiating between focus groups for patient engagement vs. qualitative research. Res Involv Engagem. 2018;4.

21. Morgan H, Thomson G, Crossland N, Dykes F, Hoddinott P, 'BIBS' study team. Combining PPI with qualitative research to engage 'harder-to-reach' populations: service user groups as co-applicants on a platform study for a trial. Res Involv Engagem. 2016;24(7).

22. Health Research Authority. Accessing study support and advice services. 2024. https://www.hra.nhs.uk/planning-and-improving-research/research-planning/access-study-support-advice-services/

23. Bridges S, Lamont-Robinson C, Herbert A, Din M, Smith C, Ahmed N, et al. Talking trials: an arts-based exploration of attitudes to clinical trials amongst minority ethnic members of the South Riverside community of Cardiff. Health Expect. 2023;26(3).

24. Health Research Authority. Is my study research? 2024. https://www.hra-decisiontools.org.uk/research/

25. Health Research Authority. Defining Research Table. 2022. https://www.hra-decisiontools.org.uk/research/docs/DefiningResearchTable_Oct2022.pdf

26. Marie Curie Research Centre. Public Involvement in Research Impact Toolkit (PIRIT). https://www.cardiff.ac.uk/marie-curie-research-centre/patient-and-public-involvement/public-involvement-in-research-impact-toolkit-pirit

27. University of Surrey. Ethics Guide. 2022. https://issuu.com/universityofsurrey/docs/surrey_ethics_cop_v20_single?fr=sNDM5NzUwNzU0OTI


Acknowledgements

We are grateful to our colleagues Martina Svobodoba, Sarah Bridges and Dr Vicky Shepherd for providing useful insights and resources from their experiences and to Dr Emma Yhnell for her helpful review and comment on the first draft.

Author information

Authors and Affiliations

Centre for Trials Research, Cardiff University, 7th Floor, Neuadd Meirionnydd, Heath Park Campus, Cardiff, UK

Claire Nollett

Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff, UK

Matthias Eberl

Systems Immunity Research Institute, Cardiff University, Cardiff, UK

School of Medicine, Lead Public Contributor, Cardiff University, Cardiff, UK

Jim Fitzgibbon

Division of Population Medicine, School of Medicine, Cardiff University, Cardiff, UK

Natalie Joseph-Williams

Health and Care Research Wales Evidence Centre, Cardiff, UK

Public Involvement and Engagement Team, School of Medicine, Cardiff University, Cardiff, UK

Sarah Hatch


Contributions

CN and SH drafted the first version; CN, SH, ME and NJW added case studies; ME, NJW and JF contributed to revised versions and all authors read and approved the final manuscript.

Corresponding authors

Correspondence to Claire Nollett or Sarah Hatch.

Ethics declarations

Ethics approval and consent to participate

Not applicable – no participants involved.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Authors' information

CN is Academic Lead for Public Involvement and Engagement in the Centre for Trials Research, Cardiff University. SH is the Public Involvement and Engagement Manager for the School of Medicine, Cardiff University, alongside researchers ME and NJW who are the Joint Academic Leads for Public Involvement and Engagement in the School of Medicine, Cardiff University. ME is also the Engagement Lead for the Systems Immunity Research Institute at Cardiff University and the Engagement Secretary for the British Society for Immunology. JF was the Lead Public Contributor in the School of Medicine at the time of writing.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Nollett, C., Eberl, M., Fitzgibbon, J. et al. Public involvement and engagement in scientific research and higher education: the only way is ethics? Res Involv Engagem 10, 50 (2024). https://doi.org/10.1186/s40900-024-00587-x


Received: 15 February 2024

Accepted: 24 May 2024

Published: 31 May 2024

DOI: https://doi.org/10.1186/s40900-024-00587-x


Keywords: Research ethics committee; Ethical approval


Peer Review, Higher Education


Janosch Baumann & Christian Schneijderberg


Synonyms: Assessment; Evaluation; Expert examination; Referee

In striving to identify high quality in research, teaching, and other academic work, peer review relies on expert judgment. Peer review is the core steering mechanism for the distribution of resources and for granting prestige in science.

Peer Review in Science

The principle of peer review is widely used in science for the assessment and evaluation ( Academic Evaluation ) of research (e.g., manuscripts and grant proposals), people (e.g., jobs, promotion, and scholarships), and structures and procedures (e.g., organizations and programs). Despite the great variety of applications, research about peer review mostly focuses on research and the peer review of journal manuscripts (e.g., Hirschauer 2010 ) and research grants (e.g., Lamont 2009 ). Relying on expert judgment, peer review has become the core steering mechanism for the distribution of resources and for granting prestige in science. Studies about peer review address...


Bornmann, Lutz. 2011. Scientific peer review. Annual Review of Information Science and Technology 45: 197–245.

Bornmann, Lutz, Sandra Mittag, and Hans-Dieter Daniel. 2006. Quality assurance in higher education – Meta-evaluation of multi-stage evaluation procedures in Germany. Higher Education 52: 687–709.

Gläser, Jochen, and Grit Laudel. 2005. Advantages and dangers of 'remote' peer evaluation. Research Evaluation 14: 186–198.

Hirschauer, Stefan. 2010. Editorial judgments: A praxeology of 'voting' in peer review. Social Studies of Science 40: 71–103.

Huutoniemi, Katri. 2012. Communicating and compromising on disciplinary expertise in the peer review of research proposals. Social Studies of Science 42: 897–921.

Lamont, Michèle. 2009. How professors think. Cambridge/London: Harvard University Press.

Lamont, Michèle. 2012. Toward a comparative sociology of valuation and evaluation. Annual Review of Sociology 38: 201–221.

Langfeldt, Liv, Bjørn Stensaker, Lee Harvey, Jeroen Huisman, and Don Westerheijden. 2010. The role of peer review in Norwegian quality assurance: Potential consequences for excellence and diversity. Higher Education 59: 391–405.

Lee, Carole, Cassidy Sugimoto, Guo Zhang, and Blaise Cronin. 2013. Bias in peer review. Journal of the American Society for Information Science and Technology 64: 2–17.

Mallard, Grégoire, Michèle Lamont, and Joshua Guetzkow. 2009. Fairness as appropriateness: Negotiating epistemological differences in peer review. Science, Technology & Human Values 34: 573–606.

Musselin, Christine. 2013. How peer review empowers the academic profession and university managers: Changes in relationships between the state, universities and the professoriate. Research Policy 42: 1165–1173.

Mulligan, Adrian, Louise Hall, and Ellen Raphael. 2013. Peer review in a changing world. Journal of the American Society for Information Science and Technology. doi:10.1002/asi.22798.

Olbrecht, Meike, and Lutz Bornmann. 2010. Panel peer review of grant applications: What do we know from research in social psychology on judgment and decision-making in groups? Research Evaluation 4: 293–304.

Zuckerman, Harriet, and Robert K. Merton. 1971. Patterns of evaluation in science: Institutionalisation, structure and functions of the referee system. Minerva 9: 66–100.


Author information

Authors and Affiliations

University of Kassel, Kassel, Germany

Janosch Baumann

International Centre for Higher Education Research, University of Kassel, Kassel, Germany

Christian Schneijderberg


Corresponding author

Correspondence to Janosch Baumann.

Editor information

Editors and Affiliations

Department of Education, Seoul National University, Seoul, Korea (Republic of)

Jung Cheol Shin

CIPES, Faculty of Economics, University of Porto, Porto, Portugal

Pedro Teixeira


Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Baumann, J., Schneijderberg, C. (2017). Peer Review, Higher Education. In: Shin, J., Teixeira, P. (eds) Encyclopedia of International Higher Education Systems and Institutions. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-9553-1_318-1

Download citation

DOI: https://doi.org/10.1007/978-94-017-9553-1_318-1

Received: 24 March 2017

Accepted: 09 April 2017

Published: 20 April 2017

Publisher Name: Springer, Dordrecht

Print ISBN: 978-94-017-9553-1

Online ISBN: 978-94-017-9553-1


Group-format, peer-facilitated mental health promotion interventions for students in higher education settings: a scoping review protocol. BMJ Open. 2024 Jun 03;14(6):e080629.

Introduction

Young people in higher education face various stressors that can make them vulnerable to mental ill-health. Mental health promotion in this group therefore has important potential benefits. Peer-facilitated and group-format interventions may be feasible and sustainable. The scoping review outlined in this protocol aims to map the literature on group-format, peer-facilitated, in-person interventions for mental health promotion for higher education students attending courses on campuses in high and low/middle-income countries.

METHODS AND ANALYSIS

Relevant studies will be identified by searching electronic databases, including Medline, CINAHL, Scopus, ERIC and PsycINFO. Searches will be conducted using Boolean operators (AND, OR, NOT) and truncation functions appropriate for each database. We will also include a grey literature search. We will include articles involving student participants of any gender, published in peer-reviewed journals between 2008 and 2023. We will include English-language studies and all study types, including randomised controlled trials, pilot studies and descriptive studies of intervention development. A draft charting table has been developed, which includes the fields: author, publication date, country/countries, aims, population and sample size, demographics, methods, intervention type, comparisons, peer training, number of sessions/duration of intervention, outcomes and details of measures; a sketch of such a record appears below.
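
As a sketch of what one row of that charting table could look like in code, the dataclass below uses the protocol's field list; the class name and the all-string types are our assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class ChartingRecord:
    """One row of the draft charting table described in the protocol."""
    author: str
    publication_date: str
    countries: str
    aims: str
    population_and_sample_size: str
    demographics: str
    methods: str
    intervention_type: str
    comparisons: str
    peer_training: str
    sessions_and_duration: str
    outcomes: str
    measures: str

# Hypothetical usage while charting an included study.
row = ChartingRecord("Doe J", "2021", "Kenya", "Promote student wellbeing",
                     "Undergraduates, n=120", "18-24, mixed gender", "RCT",
                     "Peer-facilitated group sessions", "Waitlist control",
                     "2-day facilitator training", "8 weekly sessions",
                     "Wellbeing, anxiety", "WEMWBS; GAD-7")
```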

ETHICS AND DISSEMINATION

No primary data will be collected from research participants to produce this review so ethics committee approval is not required. All data will be collated from published peer-reviewed studies already in the public domain. We will publish the review in an open-access, peer-reviewed journal accessible to researchers in low/middle-income countries. This protocol is registered on Open Science Framework (https://osf.io/agbfj/).

Open access

Published: 05 June 2024

Comparison of the effects of burn assessment mission game with feedback lecture on nursing students’ knowledge and skills in the burn patients’ assessment: a randomized clinical trial

  • Amirreza Nasirzade 1 ,
  • Kolsoum Deldar 2 ,
  • Razieh Froutan 1 , 3 &
  • Mohammad Taghi Shakeri 4  

BMC Medical Informatics and Decision Making volume 24, Article number: 157 (2024)


Learning burn patient assessment is very important for nursing students, but it can be distressing. This study aimed to compare the effects of the feedback lecture method with a serious game (the BAM Game) on nursing students' knowledge and skills in the assessment of burn patients.

In this randomized controlled clinical trial, 42 nursing students in their 5th semester at Mashhad University of Medical Sciences School of Nursing and Midwifery, were randomly assigned to intervention (BAM game, available for two weeks) and control (feedback lecture method presented in two 90-minute sessions) groups. Two weeks after the intervention, all students were evaluated for their knowledge (using knowledge assessment test) and skills (using an Objective Structured Clinical Examination). Statistical analysis involved independent t-test, Fisher’s exact test, analysis of covariance (ANCOVA), and univariable and multivariable ordinal logistic regression models.

Following the intervention, the skill scores were 16.4 (SD 2.2) for the intervention group and 11.8 (SD 3.8) for the control group. Similarly, the knowledge scores were 17.4 (SD 2.2) for the intervention group and 14.7 (SD 2.6) for the control group. Both differences were statistically significant ( P  < .001). These differences remained significant even after adjusting for various factors such as age, gender, marital status, residence, university entrance exam rank, and annual GPA ( P  < .05). Furthermore, the BAM game group showed significantly higher skills rank than the feedback lecture group across most stations (eight of ten) ( P  < .05) in the univariable analysis. Multivariable analysis also revealed a significantly higher skills score across most stations even after adjusting for the mentioned factors ( P  < .05). These results suggest that the BAM game group had higher skills scores over a range of 1.5 to 3.9 compared to the feedback lecture group.

Conclusions

This study demonstrated that nursing students who participated in the BAM game group exhibited superior performance in knowledge acquisition and skill development, compared to those in the control group. These results underscore a significant enhancement in educational outcomes for students involved with the BAM game, confirming its utility as a potent and effective pedagogical instrument within the realm of nursing education.

Trial registration

Iranian Registry of Clinical Trials: IRCT20220410054483N1, Registration date: 18/04/2022.


Introduction

Incorrect or late assessment exposes burn patients to physical, mental, and psychological complications, imposes an extremely heavy economic burden on them [1, 2], and worsens the life-threatening complications caused by burns. If competent nurses assess burn victims early, they can prevent many burn complications and emergencies, such as circulatory disorders, airway damage, and compartment syndrome, and save patients' lives [3, 4, 5]. Preventing such emergencies, however, requires capable and skilful nurses [6], and these skills and competencies depend on improving one's knowledge and cognition [7].

Lecturers and professors are continually challenged to develop teaching-learning methods with appropriate content and structure that can effectively enhance nursing students' assessment knowledge and skills, allowing them to access these skills anytime and anywhere [8, 9, 10]. One of the primary concerns of universities worldwide is to enhance the professional competency of students and establish a strong link between theoretical knowledge and specialized skills [11]. Currently, nursing students acquire patient assessment knowledge and skills through traditional methods, including lectures [12]. Among lecture methods, the feedback lecture can better involve students with different learning characteristics [13]. Although feedback lectures encourage critical thinking and problem-solving skills [14], they offer a limited number of repetitions for understanding educational materials and require prolonged follow-up time to tailor learning concepts [15]. Given the challenge of translating written knowledge from reference books into clinical skills, many lecturers are turning to new technologies and approaches [16, 17] to strengthen decision-making processes and clinical reasoning [18].

Serious games in medical education employ game elements with goals that extend beyond mere entertainment [19]. Game-based learning boosts an individual's motivation and engagement in learning the desired concepts and makes the learning experience enjoyable, convenient, and effective [20, 21]. A serious game simulates real-world events or processes with the aim of educating users and has shown better outcomes than traditional classroom learning [22]. Serious games have proved beneficial in training health professionals and patients [23, 24, 25, 26], and in teaching anaesthesia techniques [27], surgical procedures [28], and the principles of cardiopulmonary resuscitation [29].

Serious games can elevate learners’ motivation and improve the efficacy of the learning environment. As a result, they have become a vital component of educational programs in universities. In well-designed serious games, users feel as though they are actively participating in and learning from a real-world experience [ 30 ]. Game-based learning environments provide individuals the chance to make mistakes without fear of serious consequences [ 31 ].

Sometimes, the educational content for medical students contains distressing and unpleasant material that leads them to disengage from it and avoid memorizing it [32, 33]. For instance, students often show reluctance to engage with the burn course due to its distressing scenes and severe injuries [34, 35]. Moreover, the lack of clinical experience can significantly hinder nursing students' ability to develop essential assessment skills [36]. Developing professional competence in health-related fields is a top priority for universities globally, necessitating a close connection between theoretical knowledge and practical skills [11]. Serious games, as a novel and appealing teaching method, could mitigate such challenges; this approach has been deemed more effective for acquiring knowledge and cognitive skills [37].

The creation of an educational board game for teaching burn care has been shown to enhance the knowledge of healthcare members [38], but, to the best of our knowledge, no study has yet explored game-based teaching of burn patient assessment. Training nursing students in patient assessment skills is crucial to avoid fatal complications of burns. This study aimed to compare the impact of the feedback lecture method and a newly designed serious game on nursing students' knowledge and skills in assessing burn patients.

Study type and participants

In this randomized controlled clinical trial, all 5th-semester nursing students at the Mashhad University of Medical Sciences (MUMS) School of Nursing and Midwifery (n = 44) were eligible to participate. They were randomly assigned to intervention and control groups. The randomization was carried out by an external third party using random number lists obtained from an online randomization website (www.randomization.com). Assistance from a statistical consultant (blinded) was sought to randomly allocate the participants; an illustrative sketch of such an allocation follows below. Since all eligible students were included in the study, a sample size calculation was not performed.
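
For illustration only, a list-based allocation in the spirit of the one described above could look like the following sketch; the seed, function name, and group labels are our assumptions, and the trial itself used lists generated at www.randomization.com by a third party.

```python
import random

def allocate(participant_ids, seed=42):
    """Randomly split participants into two equal study arms."""
    rng = random.Random(seed)          # fixed seed for a reproducible list
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"BAM_game": shuffled[:half], "feedback_lecture": shuffled[half:]}

# 44 eligible 5th-semester students, 22 per arm before exclusions.
groups = allocate([f"S{i:02d}" for i in range(1, 45)])
print(len(groups["BAM_game"]), len(groups["feedback_lecture"]))  # 22 22
```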

Inclusion and exclusion criteria

Inclusion criteria were undergraduate nursing students studying at the School of Nursing and Midwifery of the MUMS who had access to the Internet through a mobile phone or computer and were assessing burn patients for the first time, meaning they had no previous knowledge or skill in this field. According to the nursing curriculum at the MUMS, this course is offered in the 5th semester, and in previous semesters students have no theoretical knowledge of, or practical encounters with, burn patients in classrooms, workshops, or hospitals.

Exclusion criteria included students who were unwilling to continue the study, students in the control group who were absent in more than one session, and students in the BAM game (Burn Assessment Mission game) group who were absent for a third of the defined time.

This study employed a single-blind design, meaning the statistical consultant was unaware of the subjects' allocation to the intervention or control groups.

Preparation of educational content involved extracting the required material from valid burn references [ 39 , 40 ] and then localizing it based on experts’ opinions.

Design of BAM game

The development of the serious game was inspired by the challenges encountered in teaching fifth-semester nursing students how to assess burn patients: the assessment scenes were unpleasant, and the clinical conditions were not conducive to effective teaching or active learning. Consequently, burn specialists, nursing professors, health informatics specialists, and a software engineering team conducted several brainstorming sessions to design the BAM game to address this challenge. In the BAM game, students could study the educational content before any real clinical encounter, becoming familiar with the crucial points in assessing burn patients through real images and videos, thereby solidifying their learning. This design ensured that students could interact with the material in a dynamic and engaging manner, enhancing the effectiveness and enjoyment of the learning process.

The first step involved preparing a comprehensive educational package based on the university’s educational protocol, created by the research team’s professors. Next, multiple educational scenarios were written based on the most common, real cases referred to the burn department. Relevant videos and images were also prepared for the BAM game design phase.

Then, the engineering team built the BAM game using the PHP language. The user interface was built with HTML/CSS and JavaScript, the submitted data were processed by PHP, and MySQL was used to manage data storage. Students could enter the BAM game after registering and setting their usernames and passwords. Initially, the importance of assessing burn patients and the rationale behind the game's design were explained to them. Students could then view the guide and rules for playing the game, as well as the educational content in multimedia form (Multimedia Appendix 1: "A quick review of the BAM game").

The titles of the stages included assessments of circumferential burn injuries (limbs, chest, and abdomen), electrical burn, thermal burn (head and face), chemical burn, carbon monoxide poisoning, inhalation burn, delayed burn, and the extent and depth of burn (Table  1 ).

Questions related to each stage were presented in the form of short films or images of real patients with burns, and students had to answer the questions within a certain time limit. Those who chose the correct option were encouraged, and their learning was confirmed. Those who chose a wrong answer received a message stating that their choice was wrong, without being referred to the correct option, and it was explained why their answer was incorrect. Students could pass the current stage if they answered at least 60% of the questions correctly. They could also compare their scores with other students and see their rank in the classroom at any time (see the sketch below). The game featured graphic elements, light colors, various emoji symbols indicating happiness and sadness, encouraging sound effects, stars and medals, an attractive appearance, and a sense of competition and excitement.
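
The stage-gating and ranking mechanics just described can be summarised in a short sketch. The function names and tie-handling are our assumptions, and the actual game was implemented in PHP rather than Python.

```python
def stage_passed(correct: int, total: int, threshold: float = 0.60) -> bool:
    """A student advances to the next stage after answering at least
    60% of its questions correctly, as described above."""
    return total > 0 and correct / total >= threshold

def class_ranking(scores):
    """Leaderboard mirroring the in-game rank comparison: students
    sorted by score, highest first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(stage_passed(6, 10))                     # True: 60% threshold reached
print(class_ranking({"S01": 80, "S02": 95}))   # [('S02', 95), ('S01', 80)]
```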

Intervention group

Students in the intervention group learned the educational content through the BAM serious game and could use the game for two weeks. Every week, a reminder message (SMS) was sent to encourage participation. The BAM game logged how many times and for how long each student used it, so students who were inactive for one third of the required time were excluded from the study.

Control group

Students in the control group received educational content related to the assessment of burn patients through feedback lectures in two 90-minute sessions; each 90 minutes was divided into three 30-minute parts. The lecturer spoke about the topic for the first 25 minutes and then answered students' questions for five minutes, using related pictures or videos depending on the educational conditions and content. In total, three 25-minute lectures with three 5-minute active discussions were held in each session. These sessions started before the intervention for the BAM group began, to prevent contamination. All participants also signed confidentiality agreements about the importance of not sharing information about the sessions with peers from the other group. Finally, we conducted regular check-ins and monitoring of both groups to ensure adherence to the study protocols, including brief interviews or surveys to detect whether any sharing of information had occurred. Furthermore, access to the software was restricted to each student via a uniquely defined code, ensuring that students without this code could not enter the BAM game even if they had the software installation file.

Outcome measurement

Our participants had no prior theoretical knowledge of, or clinical encounters with, burn assessment and management, so there was no need to conduct a pre-test.

Two weeks after completing the educational course, all students were evaluated in two stages: (I) participation in the knowledge assessment test (30 min); and (II) objective structured clinical examination (OSCE) for skills assessment.

To measure the knowledge of all students, 20 multiple-choice questions were used. The cutoff points were selected as follows: "Low level (0–34%)", "Moderate level (35–69%)", and "High level (70% and above)". Scores for the low level range from 0 to 7, for the moderate level from 7.1 to 14, and for the high level from 14.1 to 20. Higher scores indicate greater knowledge, and a score of 14 (70%) or higher indicates a sufficient understanding of burn patient assessment. The passing score was set at 10.

To evaluate skills, a checklist was prepared by the nursing professors on our team, assessing the students' skills in various dimensions of burn assessment using the OSCE approach. This exam included 10 stations (scenarios) covering burn injuries of the limbs, chest and abdomen, head and face, carbon monoxide poisoning and inhalation injuries, chemical and electrical burns, delayed burns, and assessment of burn extent and depth, according to the syllabus. Each station received a numerical score ranging from 1 to 5; to simplify calculations, these scores were then scaled to a maximum of 20. The final score for each student was calculated by adding up the scores of all 10 stations and dividing by 10 (a worked sketch follows below). The cutoff points were the same as those for the knowledge questionnaire, described previously. Students had to complete each station within five minutes, and the passing score for each station was set at 10.
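
A worked sketch of this scoring scheme follows. We assume that "scaled to a maximum of 20" means multiplying each 1-5 station score by four, which is our reading rather than something the paper states explicitly; the station scores in the example are invented.

```python
def osce_final_score(station_scores):
    """Ten stations, each scored 1-5; scale to a 20-point maximum and
    average across stations, as described above."""
    assert len(station_scores) == 10
    scaled = [s * 4 for s in station_scores]   # 1-5 -> 4-20 (assumed scaling)
    return sum(scaled) / len(scaled)

def level(score_out_of_20):
    """Apply the cutoffs shared by the knowledge test and the OSCE."""
    pct = score_out_of_20 / 20 * 100
    if pct <= 34:
        return "Low"
    if pct <= 69:
        return "Moderate"
    return "High"

scores = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]        # hypothetical station scores
print(osce_final_score(scores), level(osce_final_score(scores)))  # 16.4 High
```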

Reliability and Validity

The face and content validity, comprehensiveness, clarity and difficulty of the knowledge questionnaire were assessed and, after some modifications, confirmed by expert opinion (seven professors in the field of burn management). The scale demonstrated a satisfactory level of content validity, with a content validity ratio (CVR) ranging from 0.72 to 0.91 and a scale content validity index (S-CVI) of 0.87 (ranging from 0.82 to 0.97), and face validity with a mean impact score of 2.35 (ranging from 2.24 to 4.45). The reliability, as measured by the Kuder-Richardson Formula 20 (KR-20), was 0.7.
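
KR-20 is a standard reliability statistic for dichotomously scored items, so a minimal implementation can be given without reference to the study's (unpublished) item data; the toy response matrix below is invented purely for illustration.

```python
import numpy as np

def kr20(items: np.ndarray) -> float:
    """Kuder-Richardson Formula 20 for 0/1-scored items.
    Rows are examinees, columns are items:
    KR-20 = k/(k-1) * (1 - sum(p_i * q_i) / var(total score))."""
    k = items.shape[1]
    p = items.mean(axis=0)                       # proportion correct per item
    pq_sum = (p * (1 - p)).sum()
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - pq_sum / total_var)

# Toy data: 5 examinees answering 4 dichotomous questions.
responses = np.array([[1, 1, 0, 1],
                      [1, 0, 0, 1],
                      [0, 0, 0, 0],
                      [1, 1, 1, 1],
                      [1, 1, 0, 0]])
print(round(kr20(responses), 2))  # 0.85 for this toy matrix
```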

The content validity of the OSCE checklist was confirmed by expert consensus and iterative review and revision, and the reliability of the skill assessment checklist items was confirmed using the inter-rater reliability method (ICC = 0.86).

Statistical Analyses

Statistical analysis was conducted using IBM SPSS Statistics software, version 28 (IBM Corp., Armonk, NY, USA). Normality of the numeric variables was checked and confirmed by the Kolmogorov-Smirnov test. Data are presented as mean (SD) or median (25th-75th percentile) for normally and non-normally distributed numeric variables, respectively, and as frequency (percent) for categorical variables. Between-group comparisons of baseline measures and demographic variables were carried out by independent t-tests, Mann-Whitney tests, and Fisher-Freeman-Halton exact tests, as appropriate. Correlations among the main variables were measured using the Pearson correlation test. To assess the effect of the intervention on the knowledge and skills total scores, analysis of covariance (ANCOVA) was used after controlling for covariates (age, gender, marital status, residence, university entrance exam rank, and annual grade point average, or GPA). The effect of the intervention on skills scores across stations was assessed using univariable and multivariable ordinal logistic regression models; in the multivariable analyses, the same covariates were adjusted for. All analyses were carried out using the intention-to-treat approach, and P values less than 0.05 were considered significant. A sketch of the main modelling steps is given below.
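
For readers who prefer code to prose, the two main modelling steps could look like the following statsmodels sketch. This is our reconstruction under assumed column names in a hypothetical tidy dataset, not the authors' code (the analysis was done in SPSS).

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Hypothetical tidy dataset: one row per student (column names assumed).
df = pd.read_csv("bam_trial.csv")

# ANCOVA on the skills total score: group effect adjusted for covariates.
ancova = smf.ols(
    "skills_total ~ C(group) + age + C(gender) + C(marital_status)"
    " + C(residence) + entrance_rank + gpa",
    data=df,
).fit()
print(ancova.summary())

# Ordinal logistic regression on one station's 1-5 skill rank,
# adjusted for the same style of covariates (0/1 indicator for the BAM arm).
olr = OrderedModel(
    df["station1_score"],
    df[["group_bam", "age", "entrance_rank", "gpa"]],
    distr="logit",
).fit(method="bfgs", disp=False)
print(olr.summary())
```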

Participants’ profile

Forty-four participants were recruited to this study. In the first-step evaluation for eligibility, two students were excluded (they declined to participate). Finally, 42 students were analyzed in the intervention (n = 21) and control (n = 21) groups (Fig. 1).

Figure 1. CONSORT flow diagram.

No significant difference was observed in terms of age, gender, marital status, residence, university entrance exam rank and annual GPA between intervention and control groups (Table  1 ).

Significant differences were observed between the intervention and control groups in terms of the skills total score (P < .05) and the knowledge score (P < .05). The differences remained significant after adjusting for age, gender, marital status, residence, university entrance exam rank and annual GPA (both P < .05) (Table 2).

Correlations among main outcomes

The results showed a significant positive correlation between the knowledge score and the skills total score (r = .91, P < .05): the higher the knowledge score, the higher the skills total score.

Results of univariable and multivariable ordinal logistic regressions comparing intervention and control groups

The results of the univariable ordinal logistic regressions showed that for stations 1 to 8, the BAM game group had a significantly higher skills rank than the feedback lecture group (all P < .05) (Table 3 and Table S1), with the BAM game group scoring higher by a range of 1.6 to 3.9. After adjusting for age, gender, marital status, residence, university entrance exam rank, and annual GPA, the multivariable ordinal logistic regressions indicated significantly higher skills scores across these stations (all P < .05), with the BAM game group scoring higher by a range of 1.5 to 3.9 (Table 3; Fig. 2).

Figure 2. Median skills scores at the 10 stations across the intervention and control groups.

This study compared the effectiveness of serious game-based learning with feedback lectures in enhancing students’ knowledge and skills in assessing burn patients. The intervention and control groups were initially similar in key demographic characteristics. The results showed that students who participated in the serious game-based learning method significantly improved their assessment knowledge and skills compared to those who received feedback lectures. The analysis using univariable ordinal logistic regressions compared the two educational approaches across eight stations. It revealed that participants in the BAM game group exhibited significantly higher skill levels in eight out of ten stations compared to the feedback lecture group. The multivariable ordinal logistic regression analysis, which accounted for age, gender, marital status, place of residence, university entrance exam ranking, and annual GPA, confirmed these findings. Both analyses showed that the BAM game-based teaching method significantly enhances skill levels compared to feedback lectures, with this advantage remaining significant across various evaluation settings, even after adjusting for demographic and academic variables.

The study results indicated that serious games improved burn patient assessment knowledge and skills compared with the feedback lecture in nursing students. The majority of research highlights the effectiveness of education through serious games. For example, Farsi et al. (2021) reported the positive effect of serious games on teaching cardiopulmonary resuscitation (CPR) to nursing students, finding that using simulations and serious games in education could significantly increase students' mean knowledge and skill scores [29]. However, the authors were concerned about the effectiveness of their training approaches in imparting CPR knowledge, as evidenced by low scores (below 70%) on post-test knowledge questionnaires in both groups. This raises the question of whether the knowledge questionnaire was too challenging or the training methods failed to convey the necessary CPR concepts effectively; they therefore suggested that incorporating direct instruction, such as lectures, might improve understanding of CPR knowledge. In our research, the average knowledge scores of students in both groups exceeded 70% (14 out of 20), demonstrating the beneficial impact of both educational approaches on student knowledge. For skill scores, however, only the intervention group's average surpassed the 70% threshold, indicating a desirable level of skill; these scores were also significantly higher than those of the control group.

Several studies have reported the effectiveness of serious games (SGs) in knowledge improvement [41, 42, 43, 44, 45, 46]. Serious games positively influence learning by providing interactive experiences that increase focus and motivation. They enhance critical thinking, problem-solving, and decision-making skills, fostering cognitive development. Active participation in SGs promotes the practical application of knowledge, improving retention and real-world applicability. Additionally, SGs increase users' sense of control, encouraging the application of learned content in real-life situations [47]. Characteristics such as attractiveness, variety, an interactive environment, easy access without time limits, repeatability, a sense of competition, use of multimedia profiles, random questions, sharing of results, and invitations to play through social networks have been considered effective factors [44].

The advantages of serious gaming have also been applied to enhance practical skills. Johnsen et al. (2018) used a serious game (containing two simulated courses on providing care for patients with chronic obstructive pulmonary disease at home and in the hospital) to teach clinical reasoning and decision-making skills to nursing students, finding that both courses were educationally valuable, easy to use, and highly acceptable among nursing students [48]. Other studies reported the positive effect of serious games on nursing students' knowledge and skills in the resuscitation of infants, including ventilation and chest massage [49], and of adults [50]. Researchers have also focused on serious games as facilitators of education in large groups [51, 52, 53]. Serious games have been shown to enhance practical and procedural abilities among nursing students by offering an immersive and secure setting for improving clinical reasoning and decision-making [54]. Nursing educators are encouraged to use SGs to enhance cognitive skills and attention, improve judgment, foster time-efficient decision-making, facilitate safe decision practices, and promote decision exploration [55]. In our research, students in the intervention group demonstrated significantly better skills at eight stations compared with the control group. Yet, when it came to evaluating burn extent (station 9) and burn depth (station 10), the practical skills of the two groups showed no significant differences. Given the critical role that understanding the extent and depth of burns plays in foundational training for subsequent topics, such as fluid resuscitation calculations, these topics were emphasized in the control group through repeated discussion and review with numerous practical examples; consequently, control-group students could achieve skills scores comparable to the intervention group on stations 9 and 10. The significant differences observed elsewhere may be attributed to the repeated exposure to educational content facilitated by the BAM game, whereas the control group received the information only once during the traditional lecture. Additionally, gamification elements such as competition, points, medals, encouraging emoji, and background music likely improved student engagement with the potentially disturbing content of burn victim assessment; these elements are typically absent in a standard classroom setting.

However, there are some conflicting findings. Dankbaar et al. (2017) examined the effect of a serious game on students' knowledge of patient safety principles and showed that although the serious game improved students' knowledge of patient safety, it had no effect on their patient safety practice, so it did not differ from traditional methods [56]. Tubelo et al. (2019) also found no difference between serious games and classroom teaching in improving students' knowledge of primary health care [57]. One possible reason for this contradiction is the difference in the methods and content used in these studies to train and assess students: they focused on patient safety and primary health care, which involve no unpleasant, stressful, threatening, or distressing content, so students could learn them by the lecture method, whereas training on burn patients and similar cases involves many distressing images and unpleasant scenes that students are reluctant to view and review frequently.

Strengths and Limitations

Some educational processes, such as the assessment of burn patients, are very stressful for students because they require close attention and concentration, and any mistake can threaten the patient's life. The students in the intervention group in our study could experience better learning in a safe environment compared with the control group. We found no other study that has compared serious game-based learning with traditional teaching methods in assessing burn patients.

One limitation of this study is that BAM was designed to assess the most common types of burns in patients referred to our burn center, which may not be applicable to other departments or countries. Recruiting nursing students from only one university may be considered another limitation, because it limits the generalizability of the findings. The small sample size, due to the small number of qualified students, was a third limitation. On the other hand, this study was principally designed as a basic evaluation of our game, which will be developed further in future phases.

Nursing students’ lack of knowledge and weak clinical skills are among the challenges of the educational system and can negatively affect the prognosis of burn victims and the quality of life of those who recover from these injuries. If students do not acquire sufficient knowledge and skills to assess patients, they will be unable to correctly assess patients’ problems and complications in the future. A serious game gives students the opportunity to access educational content without time limits and to experience unpleasant burn scenes in a game-like environment, making it an effective and efficient educational method. In addition, the game’s attractive environment makes students eager to review the content, and its competitive nature motivates them to correct their mistakes. We therefore hope that, given these advantages over traditional methods, this new educational method will be incorporated into the lesson plans and curricula for nursing students.

This research has the potential to equip nursing managers and educators with new educational methods for enhancing students’ knowledge and assessment skills in caring for burn patients. It also informs the development of improved clinical skills training, helping to bridge the gap between theory and practice, a prevalent issue in health education systems. By enhancing students’ knowledge and assessment skills, we can potentially reduce post-burn complications and thereby lower patient costs. Additionally, improved core assessments can minimize organ damage from acute complications, ultimately improving patient satisfaction and well-being.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

GPA: Grade Point Average

Independent t-test

Fisher’s exact test

Mann-Whitney test

Deeter L, Seaton M, Carrougher GJ, McMullen K, Mandell SP, Amtmann D, Gibran NS. Hospital-acquired complications alter quality of life in adult burn survivors: report from a burn model system. Burns. 2019;45(1):42–7.

Rouzfarakh M, Deldar K, Froutan R, Ahmadabadi A, Mazlom SR. The effect of rehabilitation education through social media on the quality of life in burn patients: a randomized, controlled, clinical trial. BMC Med Inf Decis Mak. 2021;21(1):70.

Lee K. The World Health Organization (WHO). Routledge; 2008.

Subrata SA. A concept analysis of burn care in nursing. Scand J Caring Sci. 2021;35(1):75–85.

Latimer S, Chaboyer W, Gillespie BM. Inviting patients to participate in their pressure injury care: the next step in prevention. DeepesTissues: Wounds Australia Newsl 2018:19–22.

Boehm D, Schröder C, Arras D, Siemers F, Siafliakis A, Lehnhardt M, Dadras M, Hartmann B, Kuepper S, Czaja K-U. Fluid Management as a risk factor for intra-abdominal compartment syndrome in burn patients: a total body surface area—independent Multicenter Trial Part I. J Burn Care Res. 2019;40(4):500–6.

Jaspers ME, van Haasterecht L, van Zuijlen PP, Mokkink LB. A systematic review on the quality of measurement techniques for the assessment of burn wound depth or healing potential. Burns. 2019;45(2):261–81.

Dewart G, Corcoran L, Thirsk L, Petrovic K. Nursing education in a pandemic: academic challenges in response to COVID-19. Nurse Educ Today. 2020;92:104471.

Dreimane S, Upenieks R. Intersection of serious games and learning motivation for medical education: a literature review. Int J Smart Educ Urban Soc (IJSEUS). 2020;11(3):42–51.

Deldar K, Froutan R, Sedaghat A, Mazlom SR. Continuing nursing education: use of observational pain assessment tool for diagnosis and management of pain in critically ill patients following training through a social networking app versus lectures. BMC Med Educ. 2020;20(1):247.

Sattar MU, Palaniappan S, Lokman A, Hassan A, Shah N, Riaz Z. Effects of virtual reality training on medical students’ learning motivation and competency. Pakistan J Med Sci. 2019;35(3):852.

Solomon Y. Comparison between problem-based learning and lecture-based learning: effect on nursing students’ Immediate Knowledge Retention. Adv Med Educ Pract. 2020;11:947.

Heidari T, Kariman N, Heidari Z, AmiriFarahani L. Comparison effects of feedback lecture and conventional lecture method on learning and quality of teaching. J Arak Univ Med Sci. 2010;12(4):34–43.

Mällinen S, Sasaki DGG. Developing student-centered assessment for a postgraduate course designed for Basic Education teachers. Revista Ibero-Americana De Estudos Em Educação. 2018;13(1):520–5.

Dehghanzadeh S, Jafaraghaee F. Comparing the effects of traditional lecture and flipped classroom on nursing students’ critical thinking disposition: a quasi-experimental study. Nurse Educ Today. 2018;71:151–6.

Björn A, Pudas-Tähkä S-M, Salanterä S, Axelin A. Video education for critical care nurses to assess pain with a behavioural pain assessment tool: a descriptive comparative study. Intensive Crit Care Nurs. 2017;42:68–74.

Nickel F, Hendrie JD, Bruckner T, Kowalewski KF, Kenngott HG, Müller-Stich BP, Fischer L. Successful learning of surgical liver anatomy in a computer-based teaching module. Int J Comput Assist Radiol Surg. 2016;11(12):2295–301.

Kaczmarczyk J, Davidson R, Bryden D, Haselden S, Vivekananda-Schmidt P. Learning decision making through serious games. Clin Teach. 2016;13(4):277–82.

Deterding S, Dixon D, Khaled R, Nacke L. From game design elements to gamefulness: defining gamification. In: Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments; 2011: 9–15.

Hamari J, Koivisto J, Sarsa H. Does gamification work? A literature review of empirical studies on gamification. In: 2014 47th Hawaii International Conference on System Sciences. IEEE; 2014: 3025–3034.

Gounaridou A, Siamtanidou E, Dimoulas C. A serious game for mediated education on Traffic Behavior and Safety Awareness. Educ Sci. 2021;11(3):127.

Xue Y, Chen G, Miao G, Liu C. Research on the design and effect of serious game technology transfer in experiential education. In: 2021 IEEE International Conference on Educational Technology (ICET); 18–20 June 2021: 52–56.

McCoy L, Lewis JH, Dalton D. Gamification and Multimedia for Medical Education: a Landscape Review. J Osteopath Med. 2016;116(1):22–34.

Krishnamurthy K, Selvaraj N, Gupta P, Cyriac B, Dhurairaj P, Abdullah A, Krishnapillai A, Lugova H, Haque M, Xie S, et al. Benefits of gamification in medical education. Clin Anat. 2022;35(6):795–807.

Szeto MD, Strock D, Anderson J, Sivesind TE, Vorwald VM, Rietcheck HR, Weintraub GS, Dellavalle RP. Gamification and Game-based strategies for Dermatology Education: Narrative Review. JMIR Dermatol. 2021;4(2):e30325.

Klaassen R, Bul KCM, Op den Akker R, Van der Burg GJ, Kato PM, Di Bitonto P. Design and evaluation of a pervasive coaching and gamification platform for Young Diabetes patients. Sensors. 2018;18(2):402.

Ribeiro MAO, Corrêa CG, Nunes FLS. Gamification as a learning strategy in a simulation of dental anesthesia. In: 2017 19th Symposium on Virtual and Augmented Reality (SVR); 1–4 Nov 2017: 271–278.

Sharifzadeh N, Tabesh H, Kharrazi H, Tara F, Kiani F, Rasoulian Kasrineh M, Mirteimouri M, Tara M. Play and learn for surgeons: a serious game to educate medical residents in uterine artery ligation surgery. Games Health J. 2021;10(4):220–7.

Farsi Z, Yazdani M, Butler S, Nezamzadeh M, Mirlashari J. Comparative effectiveness of simulation versus serious game for training nursing students in cardiopulmonary resuscitation: a randomized control trial. Int J Comput Games Technol. 2021;2021.

Malekipour A. Serious games in medical education: why, what and how. J Med Educ Dev. 2017;12(1):100–13.

Sousa MJ, Rocha Á. Leadership styles and skills developed through game-based learning. J Bus Res. 2019;94:360–6.

Campillo-Ferrer J-M, Miralles-Martínez P, Sánchez-Ibáñez R. Gamification in higher education: impact on student motivation and the acquisition of social and civic key competencies. Sustainability. 2020;12(12):4822.

Kaur DP, Mantri A, Horan B. Enhancing student motivation with use of augmented reality for interactive learning in engineering education. Procedia Comput Sci. 2020;172:881–5.

Kahn SA, Goldman M, Daul M, Lentz CW. The burn surgeon: an endangered species. Can exposure in medical school increase interest in burn surgery? J Burn Care Res. 2011;32(1):39–45.

Sreedharan S, Cleland H, Lo C. Plastic surgical trainees’ perspectives toward burn surgery in Australia and New Zealand: changes in the last 17 years? Burns. 2021;47(8):1766–72.

Øgård-Repål A, De Presno ÅK, Fossum M. Simulation with standardized patients to prepare undergraduate nursing students for mental health clinical practice: an integrative literature review. Nurse Educ Today. 2018;66:149–57.

Huizenga JC, ten Dam GTM, Voogt JM, Admiraal WF. Teacher perceptions of the value of game-based learning in secondary education. Comput Educ. 2017;110:105–15.

Whittam AM, Chow W. An educational board game for learning and teaching burn care: a preliminary evaluation. Scars Burns Healing. 2017;3:2059513117690012.

Herndon DN. Total burn Care. 5th ed. Elsevier Health Sciences; 2018.

Hinkle JL, Cheever KH. Brunner and Suddarth’s textbook of medical-surgical nursing. Wolters Kluwer India Pvt Ltd; 2018.

Bayram Ş, Çalışkan N. Mobile serious game on nursing students’ knowledge, motivation, satisfaction, and views: tracheostomy care example. J Innovative Healthc Practices. 2023;4(2):118–29.

Bayram SB, Caliskan N. Effect of a game-based virtual reality phone application on tracheostomy care education for nursing students: a randomized controlled trial. Nurse Educ Today. 2019;79:25–31.

Min A, Min H, Kim S. Effectiveness of serious games in nurse education: a systematic review. Nurse Educ Today. 2022;108:105178.

Mitchell G, Leonard L, Carter G, Santin O, Brown Wilson C. Evaluation of a ‘serious game’ on nursing student knowledge and uptake of influenza vaccination. PLoS ONE. 2021;16(1):e0245389.

Akbari F, Nasiri M, Rashidi N, Zonoori S, Amirmohseni L, Eslami J, Torabizadeh C, Havaeji FS, Bigdeli Shamloo MB, Paim CPP. Comparison of the effects of virtual training by serious game and lecture on operating room novices’ knowledge and performance about surgical instruments setup: a multi-center, two-arm study. BMC Med Educ. 2022;22(1):1–9.

Chittaro L, Sioni R. Serious games for emergency preparedness: evaluation of an interactive vs. a non-interactive simulation of a terror attack. Comput Hum Behav. 2015;50:508–19.

Chittaro L. Improving knowledge retention and perceived control through serious games: a study about assisted emergency evacuation. IEEE Trans Vis Comput Graph. 2023.

Johnsen HM, Fossum M, Vivekananda-Schmidt P, Fruhling A, Slettebø Å. Nursing students’ perceptions of a video-based serious game’s educational value: a pilot study. Nurse Educ Today. 2018;62:62–8.

Sarvan S, Efe E. The effect of neonatal resuscitation training based on a serious game simulation method on nursing students’ knowledge, skills, satisfaction and self-confidence levels: a randomized controlled trial. Nurse Educ Today. 2022;111:105298.

Creutzfeldt J, Hedman L, Felländer-Tsai L. Effects of pre-training using serious game technology on CPR performance–an exploratory quasi-experimental transfer study. Scand J Trauma Resusc Emerg Med. 2012;20(1):1–9.

Tan PL, Hay DB, Whaites E. Implementing e-learning in a radiological science course in dental education: a short-term longitudinal study. J Dent Educ. 2009;73(10):1202–12.

Schwarz D, Štourač P, Komenda M, Harazim H, Kosinová M, Gregor J, Hůlek R, Smékalová O, Křikava I, Štoudek R, et al. Interactive algorithms for teaching and learning acute medicine in the network of medical faculties MEFANET. J Med Internet Res. 2013;15(7):e135.

Mettler T, Pinto R. Serious games as a means for scientific knowledge Transfer—A case from Engineering Management Education. IEEE Trans Eng Manage. 2015;62(2):256–65.

Idrissi EME, Chemsi W, El Kababi G, Radid K. The impact of serious game on the nursing students’ learning, behavioral Engagement, and motivation. Int J Emerg Technol Learn (iJET). 2022;17(01):18–35.

Calik A, Kapucu S. The Effect of Serious games for nursing students in clinical decision-making process: a pilot randomized controlled trial. Games Health J. 2022;11(1):30–7.

Dankbaar ME, Richters O, Kalkman CJ, Prins G, Ten Cate OT, van Merrienboer JJ, Schuit SC. Comparative effectiveness of a serious game and an e-module to support patient safety knowledge and awareness. BMC Med Educ. 2017;17(1):1–10.

Tubelo RA, Portella FF, Gelain MA, de Oliveira MMC, de Oliveira AEF, Dahmer A, Pinto MEB. Serious game is an effective learning method for primary health care education of medical students: a randomized controlled trial. Int J Med Informatics. 2019;130:103944.

Acknowledgements

We wish to thank the Deputy of Research of Mashhad University of Medical Sciences for their cooperation in this project. Also, we would like to acknowledge the students who kindly participated in our study.

This research was derived from a nursing Master’s thesis conducted at the School of Nursing and Midwifery of MUMS (research code: 4001428).

Author information

Authors and Affiliations

Department of Medical Surgical Nursing, School of Nursing and Midwifery, Mashhad University of Medical Sciences, Mashhad, Iran

Amirreza Nasirzade & Razieh Froutan

Department of Information Technology, School of Allied Medical Sciences, Shahroud University of Medical Sciences, Shahroud, Iran

Kolsoum Deldar

Nursing and Midwifery Care Research Center, Mashhad University of Medical Sciences, Mashhad, Iran

Razieh Froutan

Department of Epidemiology and Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran

Mohammad Taghi Shakeri

Contributions

AN, KD, and RF contributed to the conception and research design; AN, KD, and RF collected the data; KD, RF, and MTS analyzed and interpreted the data. All authors drafted and revised the manuscript.

Corresponding author

Correspondence to Razieh Froutan.

Ethics declarations

Ethical approval

Research approval was obtained from the Ethics Committee of Mashhad University of Medical Sciences (code of ethics: MUMS.NURSE.REC.1400.096). The principles of confidentiality were observed, and written informed consent was obtained from all participants. This RCT was also registered in the Iranian Registry of Clinical Trials (IRCT20220410054483N1, https://irct.behdasht.gov.ir/trial/62878); registration date: 18/04/2022.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Nasirzade, A., Deldar, K., Froutan, R. et al. Comparison of the effects of burn assessment mission game with feedback lecture on nursing students’ knowledge and skills in the burn patients’ assessment: a randomized clinical trial. BMC Med Inform Decis Mak 24, 157 (2024). https://doi.org/10.1186/s12911-024-02558-4

Received: 05 January 2024

Accepted: 28 May 2024

Published: 05 June 2024

DOI: https://doi.org/10.1186/s12911-024-02558-4

Keywords

  • Serious game
  • Feedback lecture
  • Nursing students
  • Burn assessment
  • Randomized controlled trial

BMC Medical Informatics and Decision Making

ISSN: 1472-6947
