Reliability and Validity
Reliability means that individual scores from an instrument should be the same or nearly the same from one administration of the instrument to another; the instrument can then be assumed to be free of bias and measurement error (68). Alpha coefficients are often used to report an estimate of internal consistency. Coefficients of .70 or higher indicate adequate reliability when the stakes are moderate; coefficients of .80 or higher are appropriate when the stakes are high.
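For readers who want to see how an internal-consistency estimate is actually obtained, the alpha coefficient can be computed directly from a matrix of item scores. The sketch below is illustrative only; the data are invented, and a real analysis would use an instrument's full item set:

```python
from statistics import variance  # sample variance (ddof = 1)

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents' item-score lists."""
    k = len(scores[0])                                  # number of items
    items = list(zip(*scores))                          # transpose to per-item tuples
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])  # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Five respondents answering four items on a 1-5 scale (invented data)
data = [[4, 5, 4, 5],
        [3, 3, 4, 3],
        [5, 5, 5, 4],
        [2, 2, 3, 2],
        [4, 4, 4, 5]]
print(round(cronbach_alpha(data), 2))  # 0.93, above the .80 high-stakes threshold
```

Because every item here tracks the same underlying pattern across respondents, the item variances are small relative to the variance of the totals, which is what pushes alpha toward 1.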
Validity means that individual scores from a particular instrument are meaningful, make sense, and allow researchers to draw conclusions from the sample to the population being studied (69). Researchers often refer to "content" or "face" validity: the extent to which questions on an instrument are representative of the possible questions a researcher could ask about that particular content or those skills.
Watson-Glaser Critical Thinking Appraisal-FS (WGCTA-FS)
The WGCTA-FS is a 40-item inventory created to replace Forms A and B of the original test, which participants reported were too long (70). This inventory assesses test takers' skills in:
(a) Inference: whether an individual discriminates among degrees of truth or falsity of inferences drawn from given data
(b) Recognition of assumptions: whether an individual recognizes whether assumptions are clearly stated
(c) Deduction: whether an individual decides if certain conclusions follow from the information provided
(d) Interpretation: whether an individual considers the evidence provided and determines whether generalizations from the data are warranted
(e) Evaluation of arguments: whether an individual distinguishes strong and relevant arguments from weak and irrelevant arguments
Researchers investigated the reliability and validity of the WGCTA-FS for subjects in academic fields. Participants included 586 university students. Internal consistencies for the total WGCTA-FS among undergraduate and graduate students majoring in psychology, educational psychology, and special education ranged from .74 to .92. Correlations between course grades and total WGCTA-FS scores for all groups ranged from .24 to .62 and were significant at the p < .05 or p < .01 level. In addition, internal consistency and test-retest reliability for the WGCTA-FS have been measured at .81. The WGCTA-FS was found to be a reliable and valid instrument for measuring critical thinking (71).
Cornell Critical Thinking Test (CCTT)
There are two forms of the CCTT, X and Z. Form X is for students in grades 4-14. Form Z is for advanced and gifted high school students, undergraduate and graduate students, and adults. Reliability estimates for Form Z range from .49 to .87 across the 42 groups who have been tested. Measures of validity were computed in standard conditions, roughly defined as conditions that do not adversely affect test performance. Correlations between Level Z and other measures of critical thinking are about .50 (72). The CCTT is reportedly as predictive of graduate school grades as the Graduate Record Examination (GRE), a measure of aptitude, and the Miller Analogies Test, and tends to correlate with grades between .2 and .4 (73).
California Critical Thinking Disposition Inventory (CCTDI)
Facione and Facione have reported significant relationships between the CCTDI and the CCTST. When faculty focus on critical thinking in planning curriculum development, modest cross-sectional and longitudinal gains have been demonstrated in students' CTS (74). The CCTDI consists of seven subscales and an overall score. The recommended cut-off score for each scale is 40, the suggested target score is 50, and the maximum score is 60. Scores below 40 on a specific scale indicate weakness in that CT disposition, and scores above 50 on a scale indicate strength in that dispositional aspect. An overall score below 280 shows serious deficiency in disposition toward CT, while an overall score of 350 (while rare) shows across-the-board strength. The seven subscales are analyticity, self-confidence, inquisitiveness, maturity, open-mindedness, systematicity, and truth-seeking (75).
In a study of instructional strategies and their influence on the development of critical thinking among undergraduate nursing students, Tiwari, Lai, and Yuen found that, compared with lecture students, PBL students showed significantly greater improvement in the overall CCTDI (p = .0048) and the Truth-seeking (p = .0008), Analyticity (p = .0368), and Critical Thinking Self-confidence (p = .0342) subscales from the first to the second time points; in the overall CCTDI (p = .0083) and the Truth-seeking (p = .0090) and Analyticity (p = .0354) subscales from the second to the third time points; and in the Truth-seeking (p = .0173) and Systematicity (p = .0440) subscale scores from the first to the fourth time points (76).

California Critical Thinking Skills Test (CCTST)
Studies have shown that the California Critical Thinking Skills Test captures gain scores in students' critical thinking over one quarter or one semester. Multiple health science programs have demonstrated significant gains in students' critical thinking using site-specific curricula. Studies conducted to control for re-test bias showed no testing effect from pre- to post-test means using two independent groups of CT students. Since behavioral science measures can be affected by social-desirability bias (the participant's desire to answer in ways that would please the researcher), researchers are urged to have participants take the Marlowe-Crowne Social Desirability Scale at the same time when measuring pre- and post-test changes in critical thinking skills. The CCTST is a 34-item instrument. It has been correlated with the CCTDI in a sample of 1,557 nursing education students; the resulting correlation, r = .201, is significant at p < .001. Significant relationships between the CCTST and other measures, including the GRE total, GRE-Analytic, GRE-Verbal, GRE-Quantitative, the WGCTA, and the SAT Math and Verbal, have also been reported. The two forms of the CCTST, A and B, are considered statistically equivalent. Depending on the testing context, KR-20 alphas range from .70 to .75. The newest version is CCTST Form 2000; depending on the testing context, its KR-20 alphas range from .78 to .84 (77).
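The correlations reported above (e.g., r = .201 between the CCTST and the CCTDI) are Pearson product-moment correlations. As a quick illustration of how such an r is computed from paired scores, here is a minimal sketch; the data are invented and do not reproduce any reported value:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))   # co-deviation sum
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))          # sqrt of x's deviation sum
    sy = math.sqrt(sum((b - my) ** 2 for b in y))          # sqrt of y's deviation sum
    return cov / (sx * sy)

# Six students' CCTST totals paired with another test score (invented data)
cctst = [21, 25, 18, 30, 24, 27]
other = [540, 610, 500, 690, 580, 600]
print(round(pearson_r(cctst, other), 2))  # 0.97
```

An r near 1 like this would be far stronger than the modest .2-.6 relationships the instruments above typically report; the invented data were chosen only to make the computation easy to follow.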
The Health Science Reasoning Test (HSRT)
Items within this inventory cover the domain of CT cognitive skills identified by a Delphi group of experts whose work resulted in the development of the CCTDI and CCTST. This test measures health science undergraduate and graduate students' CTS. Although test items are set in health sciences and clinical practice contexts, test takers are not required to have discipline-specific health sciences knowledge. For this reason, the test may have limited utility in dental education (78).
Preliminary estimates of internal consistency show that overall KR-20 coefficients range from .77 to .83 (79). The instrument has moderate reliability on the analysis and inference subscales, although the factor loadings appear adequate. The low KR-20 coefficients may be a result of small sample size, variance in item response, or both (see the following table).
Table 8. Estimates of Internal Consistency and Factor Loading by Subscale for HSRT

Subscale | KR-20 | Factor Loadings |
---|---|---|
Inductive | .76 | .332-.769 |
Deductive | .71 | .366-.579 |
Analysis | .54 | .369-.599 |
Inference | .52 | .300-.664 |
Evaluation | .77 | .359-.758 |
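The KR-20 coefficients in the table are the dichotomous-item special case of the alpha coefficient: each item is scored right (1) or wrong (0). A minimal sketch of the computation on invented right/wrong data (not actual HSRT items):

```python
from statistics import variance  # sample variance (ddof = 1)

def kr20(responses):
    """KR-20 for dichotomously scored (0/1) items, one list per respondent."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose to per-item tuples
    pq_sum = 0.0
    for item in items:
        p = sum(item) / len(item)               # proportion answering correctly
        pq_sum += p * (1 - p)                   # item variance for a 0/1 item
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# Six test takers on a five-item right/wrong quiz (invented data)
answers = [[1, 1, 1, 1, 1],
           [1, 1, 1, 0, 1],
           [1, 0, 1, 1, 0],
           [0, 1, 0, 1, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 1, 0, 0]]
print(round(kr20(answers), 2))  # 0.7
```

A value around .70, as here, would sit at the moderate-stakes threshold discussed in the reliability section above.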
Professional Judgment Rating Form (PJRF)
The scale consists of two sets of descriptors. The first set relates primarily to the attitudinal (habits of mind) dimension of CT. The second set relates primarily to CTS.
A single rater should know the student well enough to respond to at least 17 of the 20 descriptors with confidence. If not, the validity of the ratings may be questionable. If a single rater is used and ratings over time show some consistency, comparisons between ratings may be used to assess changes. If more than one rater is used, then inter-rater reliability must be established among the raters to yield meaningful results. While the PJRF can be used to assess the effectiveness of training programs for individuals or groups, participants' actual skills are best measured by an objective tool such as the California Critical Thinking Skills Test.
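Inter-rater reliability of the kind mentioned above is commonly summarized with a chance-corrected agreement statistic such as Cohen's kappa (one common choice among several, not a PJRF requirement). A minimal sketch with invented ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    n = len(rater_a)
    # Observed proportion of exact agreement
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal distribution
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two raters scoring ten students on a 4-point scale (invented data)
a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
b = [4, 3, 2, 2, 4, 1, 2, 3, 3, 2]
print(round(cohens_kappa(a, b), 2))  # 0.72
```

Kappa discounts the agreement two raters would reach by guessing from their own rating habits, so it is a more conservative figure than raw percent agreement (here, 80 percent agreement yields a kappa of .72).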
Teaching for Thinking Student Course Evaluation Form
Course evaluations typically ask for responses of "agree" or "disagree" to items focusing on teacher behavior; the questions typically do not solicit information about student learning. Because contemporary thinking about curriculum is interested in student learning, this form was developed to address the differences in pedagogy, subject matter, learning outcomes, student demographics, and course level characteristic of education today. The form also grew out of a recognition of the limitations of the "one size fits all" approach to teaching evaluations. It offers information about how a particular course enhances student knowledge, sensitivities, and dispositions, and it gives students an opportunity to provide feedback that can be used to improve instruction.
Holistic Critical Thinking Scoring Rubric
This assessment tool uses a four-point classification schema that lists particular opposing reasoning skills for select criteria. One advantage of a rubric is that it offers clearly delineated components and scales for evaluating outcomes. This rubric explains how students' CTS will be evaluated, and it provides a consistent framework for the professor as evaluator. Users can add or delete any of the statements to reflect their institution's effort to measure CT. Like most rubrics, this form is likely to have high face validity, since the items tend to be relevant to or descriptive of the target concept. The rubric can be used to rate student work or to assess learning outcomes. Experienced evaluators should engage in a process leading to consensus regarding what kinds of things should be classified and in what ways (80). If used improperly or by inexperienced evaluators, it may yield unreliable results.
Peer Evaluation of Group Presentation Form
This form offers a common set of criteria to be used by peers and the instructor to evaluate student-led group presentations regarding concepts, analysis of arguments or positions, and conclusions (81). Users have an opportunity to rate the degree to which each component was demonstrated. Open-ended questions give users an opportunity to cite examples of how concepts, the analysis of arguments or positions, and conclusions were demonstrated.
Table 9. Proposed Universal Criteria for Evaluating Students' Critical Thinking Skills

Accuracy
Adequacy
Clarity
Completeness
Consistency
Depth
Fairness
Logic
Precision
Realism
Relevance
Significance
Specificity
Aside from the use of the above-mentioned assessment tools, Dexter et al. recommended that all schools develop universal criteria for evaluating students' development of critical thinking skills (82). Their rationale for the proposed criteria is that if faculty give feedback using these criteria, graduates will internalize the skills and use them to monitor their own thinking and practice (see the table above).
DISCLAIMER: The data in this section are fictitious and do not, in any way, represent any of the programs at Gallaudet University. The information is intended only as an example.
A rubric is a scoring guide used to assess performance against a set of criteria. At a minimum, it is a list of the components you are looking for when you evaluate an assignment. At its most advanced, it is a tool that divides an assignment into its parts and provides explicit expectations of acceptable and unacceptable levels of performance for each component.
1 – Checklists, the least complex form of scoring system, are simple lists indicating the presence, NOT the quality, of the elements. Therefore, checklists are not frequently used in higher education for program-level assessment, but faculty may find them useful for scoring and giving feedback on minor student assignments or on practice drafts of assignments.
Example 1: Critical Thinking Checklist
The student…
__ Accurately interprets evidence, statements, graphics, questions, etc.
__ Identifies the salient arguments (reasons and claims)
__ Analyzes and evaluates major alternative points of view
__ Draws warranted, judicious, non-fallacious conclusions
__ Justifies key results and procedures, explains assumptions and reasons
__ Fair-mindedly follows where evidence and reasons lead
Example 2: Presentation Checklist
The student…
__ engaged audience
__ used an academic or consultative American Sign Language (ASL) register
__ used adequate ASL syntactic and semantic features
__ cited references adequately in ASL
__ stayed within allotted time
__ managed PowerPoint presentation technology smoothly
2 – Basic Rating Scales are checklists of criteria that evaluate the quality of elements and include a scoring system. The main drawback with rating scales is that the meaning of the numeric ratings can be vague. Without descriptors for the ratings, the raters must make a judgment based on their perception of the meanings of the terms. For the same presentation, one rater might think a student rated “good,” and another rater might feel the same student was “marginal.”
| Excellent 5 | Good 4 | Fair 3 | Marginal 2 | Inadequate 1 |
---|---|---|---|---|---|
Accurately interprets evidence, statements, graphics, questions, etc. | | | | |
Identifies the salient arguments (reasons and claims) | |||||
Analyzes and evaluates major alternative points of view | | | | |
Draws warranted, judicious, non-fallacious conclusions | |||||
Justifies key results and procedures, explains assumptions and reasons | |||||
Fair-mindedly follows where evidence and reasons lead |
3 – Holistic Rating Scales use a short narrative of characteristics to award a single score based on an overall impression of a student’s performance on a task. A drawback of holistic rating scales is that they do not identify specific areas of strength and weakness, so they are less useful for focusing your improvement efforts. Use a holistic rating scale when the projects to be assessed vary greatly (e.g., independent study projects submitted in a capstone course) or when the number of assignments to be evaluated is large (e.g., reviewing all the essays from applicants to determine who will need developmental courses).
Rating scale.
Not meeting 1 | Approaching 2 | Meeting 3 | Exceeding 4 | |
---|---|---|---|---|
Source: The Holistic Critical Thinking Scoring Rubric: A Tool for Developing and Evaluating Critical Thinking. Retrieved April 12, 2010 from Insight Assessment.

4 – Analytic Rating Scales are rubrics that include explicit performance expectations for each possible rating on each criterion. Analytic rating scales are especially appropriate for complex learning tasks with multiple criteria, but evaluate carefully whether this is the most appropriate tool for your assessment needs. They can provide more detailed feedback on student performance and more consistent scoring among raters; the disadvantage is that they can be time-consuming to develop and apply. Results can be aggregated to provide detailed information on the strengths and weaknesses of a program.

Example: Critical Thinking Portion of the Gallaudet University Rubric for Assessing Written English
Pre-College Skills 1 | Emerging Skills 2 | Developing Skills 3 | Mastering Skills 4 | Exemplary Skills 5 |
---|---|---|---|---|
1. Assignment lacks a central point. | 2. Displays central point, although not clearly developed. | 3. Displays adequately-developed central point. | 4. Displays clear, well-developed central point. | 5. Central point is uniquely displayed and developed. |
1. Displays no real development of ideas. | 2. Develops ideas superficially or inconsistently. | 3. Develops ideas with some consistency and depth. | 4. Displays insight and thorough development of ideas. | 5. Ideas are uniquely developed. |
1. Lacks convincing support for ideas. | 2. Provides weak support for main ideas. | 3. Develops adequate support for main ideas. | 4. Develops consistently strong support for main ideas. | 5. Support for main ideas is uniquely accomplished. |
1. Includes no analysis, synthesis, interpretation, and/or other critical manipulation of ideas. | 2. Includes little analysis, synthesis, interpretation, and/or other critical manipulation of ideas. | 3. Includes analysis, synthesis, interpretation and/or other critical manipulation of ideas in most parts of the assignment. | 4. Includes analysis, synthesis, interpretation, and/or other critical manipulation of ideas, throughout. | 5. Includes analysis, synthesis, interpretation, and/or other critical manipulation of ideas, throughout— leading to an overall sense that the piece could withstand critical analysis by experts in the discipline. |
1. Demonstrates no real integration of ideas (the author’s or the ideas of others) to make meaning. | 2. Begins to integrate ideas (the author’s or the ideas of others) to make meaning. | 3. Displays some skill at integrating ideas (the author’s or the ideas of others) to make meaning. | 4. Is adept at integrating ideas (the author’s or the ideas of others) to make meaning. | 5. Integration of ideas (the author’s or the ideas of others) is accomplished in novel ways. |
There are two ways to approach building an analytic rating scale: the logical method and the organic method. The first steps, beginning with determining the best tool for the assessment, are the same for both.
Tip: Adding numbers to the ratings can make scoring easier. However, if you plan to use the rating scale for course-level grading as well, a meaning must be attached to each score: for example, what is the minimum score that would be considered acceptable for a “C”?
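One way to attach such a meaning is an explicit score-to-grade mapping agreed on in advance. The sketch below is purely illustrative; the cut-off percentages and the six-criterion, 24-point rubric are assumptions, not recommendations:

```python
def rubric_to_grade(score, max_score=24):
    """Map a rubric total to a letter grade via percentage cut-offs (assumed thresholds)."""
    pct = score / max_score * 100
    if pct >= 90:
        return "A"
    if pct >= 80:
        return "B"
    if pct >= 70:
        return "C"   # here, 70% is the assumed minimum acceptable score for a "C"
    if pct >= 60:
        return "D"
    return "F"

# A six-criterion rubric scored 1-4 per criterion (max total of 24)
print(rubric_to_grade(17))  # 17/24 is about 71%, so "C"
```

Whatever thresholds a program adopts, writing them down as an explicit mapping keeps course-level grading consistent across sections and raters.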
Components of Analytic Rating Scales

Criteria | Excellent | Good | Inadequate |
---|---|---|---|
Descriptive characteristics (entered in the appropriate table cell) | | | |

An analytic rating scale lists the criteria as rows and the rating levels as columns, with a brief description of the expected performance written in each cell; the descriptions for a given criterion should be mutually exclusive across levels. With the logical method, the level descriptions are written from a definition of the intended outcome; with the organic method, they are derived from the characteristics that distinguish actual student assignments, along with examples of inconsistent performance and suggested corrections.
Tips: Keep the list of characteristics manageable by including only the critical evaluative components; extremely long, overly detailed lists make a rating scale hard to use.

In addition to keeping descriptions brief, keep the language consistent. Below are several ideas for keeping descriptors consistent:
Use identical phrasing for the parts of a descriptor that do not change across levels:

3 | 2 | 1 |
---|---|---|
the effects of … | the effects of … | the effects of … |
Keep the aspects of a performance the same across the levels, adding adjectives or adverbial phrases to show the qualitative difference:
3 | 2 | 1 |
---|---|---|
provides a … | provides a … | provides a … |
shows a … | shows a … | shows a … |
3 | 2 | 1 |
---|---|---|
uses correctly and independently | uses with occasional peer or teacher assistance | uses only with teacher guidance |
A word of warning: numeric references on their own can be misleading. They are best paired with a qualitative reference (e.g., "three appropriate and relevant examples") to avoid rewarding quantity at the expense of quality.
3 | 2 | 1 |
---|---|---|
provides … examples | provides … examples | provides … example |
uses … relevant strategies | uses … relevant strategies | uses … relevant strategies |
Use rating scales for program-level assessment to see trends in strengths and weaknesses of groups of students.
For more information on using course-level assessment to provide feedback to students and to determine grades, see the University of Hawaii’s “Part 7. Suggestions for Using Rubrics in Courses” and the section on converting rubric scores to grades in Craig A. Mertler’s “Designing Scoring Rubrics for Your Classroom.”
Adapted from sources below:
Allen, Mary. (January, 2006). Assessment Workshop Material . California State University, Bakersfield. Retrieved DATE from http://www.csub.edu/TLC/options/resources/handouts/AllenWorkshopHandoutJan06.pdf
http://www.uhm.hawaii.edu/assessment/howto/rubrics.htm
http://www.teachervision.fen.com/teaching-methods-and-management/rubrics/4523.html?detoured=1
Mueller, Jon. (2001). Rubrics. Authentic Assessment Toolbox. Retrieved April 12, 2010 from http://jonathan.mueller.faculty.noctrl.edu/toolbox/rubrics.htm
http://en.wikipedia.org/wiki/Rubric_(academic)
Tierney, Robin & Marielle Simon. (2004). What’s Still Wrong With Rubrics: Focusing on the Consistency of Performance Criteria Across Scale Levels . Practical Assessment, Research & Evaluation, 9(2).