Brian Stecher and Laura Hamilton, measurement experts at RAND, share the results of their study on how schools and districts should consider assessments for measuring competencies like creativity and global understanding.
Public school systems are expected to promote a wide variety of skills and accomplishments in their students, including both academic achievement and a broader set of competencies, such as creativity, adaptability, and global awareness.
The latter outcomes, which are often referred to as “21st century skills” or “21st century competencies,” are increasingly central in policy discussions because they are seen as critical components of college and career readiness. Assessing these competencies can provide educators with a broader set of indicators they can use to inform instruction and set goals with students. However, evidence about the effects of testing suggests that caution and careful planning are warranted when developing and launching a new assessment system.
We conducted a comprehensive review of assessments of 21st century competencies, including widely used measures and some that are in development, and we talked with educators about how these measures are being used and about the challenges and limitations associated with them. Based on this review, we identified a set of key lessons learned for those who develop or implement these new assessment systems.
1. Determine the purpose of the assessment. Don’t buy into a measure of “stick-to-it-iveness” or “character” because the title sounds good or because someone claims it is an important skill. Begin with a thoughtful analysis of what competencies you want to promote, identify the specific decisions that will be made on the basis of the assessment, and then look for measures that are appropriate for those purposes.
2. Do not try to measure everything. Not all 21st century competencies can be measured equally well, and competencies that are not well defined are particularly difficult to measure. It is easy to label concepts as 21st century skills and talk about them as if they shared many characteristics, but that is an oversimplification. For example, there are subtle differences between persistence, resilience, and grit. It can be quite difficult to specify exactly which of these factors matter and equally difficult to distinguish among them in measurement.
3. Understand that requirements for technical quality increase with the stakes. Tests that will be used to make consequential decisions need to meet higher technical standards than tests that are used for lower-stakes decisions. If you are going to attach important decisions to the results—student promotion or retention, eligibility for higher-level courses, or financial incentives—then the measures need to meet higher standards for reliability and validity. If, instead, they will be used primarily by teachers to make interim decisions about lessons or assignments, which can be changed in the face of new information, the bar for validity and reliability is not quite as high (but these factors are still important to consider).
4. Weigh costs and benefits. The cost of assessment (both expenditures and time) should be weighed against the value of the uses it will serve. Consider the time involved in taking, scoring, and interpreting the assessment, as well as the cost to purchase it, and weigh that against the potential benefits of having the information.
5. Recognize that more-complex assessments may be needed to measure more-complex competencies. One of the reasons most schools do not presently measure students’ ability to monitor their own learning or students’ persistence in solving unstructured problems is that these things are difficult to measure. For example, some competencies may not be revealed in a single sitting or performance event but may only be identifiable across a series of events.
6. Understand that innovation often comes with a cost. Innovative assessments (involving simulations, remote collaboration, etc.) can require substantial time and resources (e.g., training, computing power, telecommunications infrastructures). These factors should be considered when comparing costs and benefits.
7. Leverage partnerships. If assessments of important competencies do not exist, districts can work with partners to develop them (partners can include other districts, researchers, and assessment organizations).
8. Recognize that context and culture matter. Assessments that work in one setting might not work as well in another. It is often necessary to conduct additional research to validate measures locally. There are obvious advantages to adopting measures that have been developed already rather than starting from scratch, and many people point to assessments used internationally as examples. However, care must be taken when using an assessment in a very different cultural context. It may not work the same way as it did originally.
9. Use data to drive instruction and create learning opportunities. Acquiring information about students’ understanding of 21st century competencies can make educators and students more intentional about improving the competencies. One way assessments can lead to improvement is by signaling to students and teachers which skills and competencies are important. If it is important enough to measure, then educators are more likely to attend to it.
10. Set realistic and systemic expectations for improvement. Educators (and learning scientists) do not know as much about teaching and learning 21st century competencies as they do about teaching traditional academic content, so expectations for improvement need to be realistic. While we are optimistic about the promise of measuring 21st century skills, we are realistic about the difficulties that have to be overcome. This is also true when it comes to teaching these competencies. You cannot teach “persistence” or “oral communication” through a single lesson or course; you have to change students’ educational experiences over an extended period of time.
11. Monitor for unintended consequences. Assessments can have unintended consequences, which should be monitored in each local context. The potential downsides of “teaching to the test” (narrowing curriculum, reallocating class time to unproductive practice activities, etc.) exist for tests of 21st century skills just as they do for traditional tests of academic achievement. Educators need to be careful not to drive instruction toward exercises designed solely to prepare students for tests of intrapersonal skills or other competencies.
12. Create a balanced system of assessments. Measures of 21st century competencies should be part of a balanced assessment strategy. Assessment time and resources should be used thoughtfully to provide educators with information about student performance on the outcomes that matter to society. Choices should be driven by the criteria of importance and usefulness.
The growing availability of measures and data systems that support assessment of 21st century competencies makes it likely that this form of assessment will become more widespread. By designing and implementing an assessment system that reflects the key lessons presented above, educators and policymakers can help ensure that new assessments contribute to improved student results and that they avoid the mistakes that characterized many previous assessment reforms.
Brian Stecher is a senior social scientist and the associate director of RAND Education, and Laura Hamilton is a senior behavioral scientist at the nonprofit, nonpartisan RAND Corporation.
The opinions expressed in Global Learning are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.