Statewide Testing: Problem or Solution for Failing Schools?


Standardized and criterion-referenced statewide testing now plays an enormous role in U.S. education. This article looks at what has become a social frenzy, at distorted perceptions about the role of testing, and at what testing can tell us about how well we are meeting the overall needs of families and children in the many communities of America. It offers suggestions for a more reasonable testing and evaluation scenario, including the principles set forth for useful assessment by the National Association for the Education of Young Children (NAEYC).

Key Words: testing, assessment, failing schools

    1. Anne K. Soderman is Professor and Acting Chair, Department of Family and Child Ecology, Michigan State University, East Lansing, Michigan, 48824.

    Nine-year-old Lamont is taking the state's 4th grade reading test, and it's clear that he's having trouble. With his head bobbing and nose pointing to each word in the passage, he scans back and forth, whispering aloud each of the words. When finished, he moves on to the questions. Slumping down in his chair after a brief time, he throws his pencil down on the test booklet and folds his arms, a defiant look on his face. Only two answers have been bubbled in, but he's had enough.

    Lamont's discomfort is matched by his teacher's. Watching the stress he and several of the other children are experiencing, she is upset about having to put them in a situation she knows is frustrating for them. However, she is also angry that she will be held responsible for their failure to perform satisfactorily.

    Currently, 38 states reward or sanction schools on the basis of children's performance on state-developed assessments, with sanctions including written warnings, threatened intervention, removal of administrators, funding penalties, and even complete takeovers. By April 2001, more than 800 of Michigan's 3,128 public schools were deemed at risk of losing accreditation and of state takeover if fewer than 25 percent of their students met state standards on the Michigan Educational Assessment Program (MEAP) tests. Trophies are awarded to top-performing schools, and high school students doing well are awarded $2,500 scholarships to college (Johnson, 2001).

    A proposal by the Bush Administration to test all U.S. children in grades 3-8 has met with mixed reactions from parents and educators. Those schools that make too little progress would be given additional aid to improve; however, if children's performance is still considered inadequate after two years, it would be mandated that all children attending a "low-functioning" school be offered the option of attending another school and taking their state-funded support with them (Toppo, 2001). Less talked about is the fact that most of these same children will also take with them their lack of experience and skill development, lack of family support, and higher rates of special needs. Unless the "new" school is better able to cope with these children in terms of accumulated learning deficits and unmet personal needs, the "problem" will simply be shifted from one context to another, although it may be better hidden and less of a social and political irritation. Moreover, the children of families who are unable to provide transportation or do whatever it takes to find a more hopeful learning environment will be left in schools in which future test scores are sure to be even more negatively skewed.

    In order to prepare children for better testing outcomes, 49 states are developing or already have developed educational standards (Morrow, 2001). While there have been boycotts and protests across the country and internationally against the abuses of standardized tests, and while several highly reputable professional organizations (National Research Council; American Educational Research Association; American Federation of Teachers) have come out against using a single measure to make educational decisions about children, 12 states have begun to use tests to determine promotion (Heubert and Hauser, 1999).

    This swell in testing has been costly to already financially burdened school districts, and it has been estimated that more than $423 million per year is now spent on testing in the U.S. Dollars to buy copyrighted achievement tests, coaching materials and additional staff to prepare children for upcoming state testing, and efforts to analyze local test data undoubtedly draw a large share of resources from other areas of school budgets.

    All of the publicity surrounding test scores has actually begun to reshape school districts and school communities. More affluent families who can afford to do so skirt districts that are average or below average. When considering a move to a new home, these families first check on the neighborhood school's state test scores to determine whether purchasing a home there would be in their children's best interests; as a result, the reputation and makeup of particular communities and the schools within them are increasingly reflective of state testing outcomes and household income.

    A Testing Frenzy

    High-stakes testing has been labeled a social "frenzy" by Harvard's Howard Gardner (2001), who says it is an inappropriate response to a perceived "problem" and one that is feeding on itself. The stakes are so high, in fact, that some schools have been accused of cheating to get scores into an acceptable range. Accusations have been leveled that teachers have directed students to the correct answers or prepped them ahead of time on the vocabulary that will be included in the reading selections; principals have been charged with filling in the correct answers on official answer sheets prior to sending them on for state analysis (Lansing State Journal, February 9, 2001). In low-income areas where schools are more vulnerable to loss of funding, pressure to practice for testing has tended to bring all other classroom activity to a standstill (Toppo, 2001). Other consequences of this intense focus on testing include "cannibalized" curricula, with few opportunities for children to experiment in order to construct knowledge or to discuss current events (since those things wouldn't be on the test, anyway). Recess has been eliminated in some elementary schools, and physical education, electives, and the arts have become more vulnerable because they "don't count" either (Kohn, 2001).

    Concentration on "upping" test scores has drawn attention away from matching instruction to children's experiential and developmental levels. A fourth grade teacher in a low-performing school noted that he knew he should be planning literacy activities in his classroom that would more closely match the children's developmental levels in order to scaffold them forward; however, saying that he "felt between a rock and a hard place," he was also struggling with the pressure to prepare the children for the state test they would have to take in January. "It's crazy," he lamented. "I'm using their time to prepare them for a test I know most of them can't pass. In the meantime, I could be using the time to enable them to come closer next year when they'll be facing the same problem." There is no question that learning skills in order to replicate them on tests, rather than to apply them across practical situations, is now more highly targeted. For example, when standardized reading tests used in a school district emphasize decoding rather than comprehension, phonics becomes more important on a day-to-day basis than highlighting what actually happened in the story or taking time to have children relate the events and information to the central issues in their own lives.

    Distorted Perceptions

    Trying to create a more balanced perspective, Michigan's state school superintendent Tom Watkins has indicated that the fixation on the MEAP is distorting perceptions of what schools should be about because the state has come to rely on one criterion, one score, as a measure of how its schools are doing. This heavy reliance on one measure may lull teachers, parents, administrators and politicians into thinking we're actually doing something about the "problem" of failing schools. Scores on such tests, while highly reliable in telling us what children are not able to do, are only part of the picture; conclusions based on results are frequently misleading about the many positive and negative experiences in children's lives that lead to school success or school failure.

    When test scores are published in local newspapers, there is no accompanying information about the fact that as many as 87 percent of the children attending a particular school are poor and may come from highly chaotic family situations. Many of them may not eat well balanced, nutritious meals on a daily basis (having breakfast or a snack provided by the school on the day of the test won't help much). Estimates are that most do not get to bed at a reasonable hour. There may be poor monitoring of TV viewing and video involvement, little attention given to homework, and precious little literacy activity in the child's home. Such children are often subject to worrying about adult problems, which draw their attention away from their work in the classroom. One first grader, being tested on his reading accuracy, responded to his teacher's comment about the importance of learning to read: "Well, yeah. I'm going to learn to read really, really good so I can get a job when I grow up. Then I'll be able to get a car and go find my real mom" (he is currently in a foster home).

    In another school where children's state test scores are primarily in the satisfactory category, there is a predictable positive correlation with the more supportive home environments and lower mobility rates of the children who are in attendance. Alfie Kohn, author of The Schools Our Children Deserve: Moving Beyond Traditional Classrooms and "Tougher Standards," suggests: "Don't let anyone tell you that standardized tests are not accurate measures. The truth of the matter is that they offer a remarkably precise method for gauging the size of the homes near the school where the test was administered." He bases this claim on a number of studies documenting the fact that SES accounts for "an overwhelming proportion of the variance in test scores" (2001:349). According to Kohn, only four variables explain 89 percent of the variance: the number of parents living at home; parents' educational background; type of community; and the state's poverty rates. Of these four factors, 74 percent of the variance is derived from only one: whether or not two parents are in the home.

    Toward a More Reasonable Testing and Evaluation Scenario

    Many of the statewide tests that have been developed across the nation are extremely well constructed in a psychometric sense, and they have exposed and documented the weaknesses in many of our schools, communities, and the homes from which our children come. Extensive efforts have been made to see that the tests are free of bias and that content is valid for the grade levels tested. If only all children shared the same family and community contexts, the tests would be more helpful. However, most statewide tests are criterion-referenced. That means that a "satisfactory level" (and how many correct answers constitute that level) is arbitrarily set by the state. These tests are not standardized, i.e., normed on such characteristics as gender, ethnicity, or socioeconomic status (SES) differences. While there is widespread agreement that schools must be accountable for transmitting knowledge and skills to children in a more effective manner and that our goal is (and should be) to have all children be able to perform at satisfactory levels, many children coming from lower SES populations are clearly not up to the task. Statewide testing alone will never solve the growing phenomenon of children at risk for school failure. Only different inputs will eventually influence the outcomes we seek. Rather than more rigorous standardized testing, we need to make sure that the daily experiences all children have in their classrooms are growth producing, engaging, and useful to them.

    Because the children and families that schools serve are dramatically different from one another, educational approaches must also differ in order to be effective. For example, strategies such as smaller classes and having two adults in classrooms in troubled areas, where children are less able to control their behavior, are more likely to produce better testing results. Making sure that teachers are trained to design high quality classroom experiences and expectations and to match them closely to children's experiential and developmental levels will ultimately produce better test outcomes. Using culturally sensitive and leveled materials, rather than one-size-fits-all basal textbooks and "dumbed down" workbooks, is more likely to produce better testing results than we are now getting. Collecting useful, "authentic" data about children on an ongoing and consistent basis, long before children ever see a state assessment, is more likely to produce better statewide testing results. Having children of all ages involved in sharing that data with their parents, producing pride in the children related to their work and better understanding on the part of their parents, is more likely to produce better testing results.

    For the past decade, there has been an attempt to balance formal testing with assessment procedures that can provide a more accurate picture of children's progress in order to facilitate true learning and development (Kostelnik, Soderman, and Whiren, 1999; Soderman, Gregory, and O'Neill, 1999). The principles set out by the National Association for the Education of Young Children in 1996 for children ages 3-8 would seem to have merit for all public school assessment. They are as follows:

    1. Curriculum and assessment are integrated throughout the program; assessment is congruent with and relevant to the goals, objectives, and content of the program.
    2. Assessment results in benefits to the child, such as needed adjustments in the curriculum or more individualized instruction and improvements in the program.
    3. Children's development and learning in all the domains (physical, social, emotional, and cognitive), as well as their dispositions and feelings, are formally and routinely assessed by teachers' observing children's activities and interactions, listening to them as they talk, and using children's constructive errors to understand their learning.
    4. Assessment provides teachers with useful information to successfully fulfill their responsibilities: to support children's learning and development, to plan for individuals and groups, and to communicate with parents.
    5. Assessment involves regular and periodic observation of the child in a wide variety of circumstances that are representative of the child's behavior in the program over time.
    6. Assessment relies primarily on procedures that reflect the ongoing life of the classroom and typical activities of the children. Assessment avoids approaches that place children in artificial situations, impede the usual learning and developmental experiences of the classroom, or divert children from their natural learning processes.
    7. Assessment relies on demonstrated performance during real, not contrived, activities; for example, real reading and writing activities rather than only skills testing (Engel, 1990; Teale, 1988).
    8. Assessment utilizes an array of tools and a variety of processes including, but not limited to, collections of representative work by children (artwork, stories they write, tape recordings of their reading), records of systematic observations of teachers, records of conversations and interviews with children, teachers' summaries of children's progress as individuals and as groups (Chittenden & Courtney, 1989; Goodman, Goodman & Hood, 1989).
    9. Assessment recognizes individual diversity of learners and allows for differences in styles and rates of learning. Assessment takes into consideration children's ability in English, their stage of language acquisition, and whether they have been given the time and opportunity to develop proficiency in their native language as well as in English.
    10. Assessment supports children's development and learning; it does not threaten children's psychological safety or feelings of self-esteem.
    11. Assessment supports parents' relationships with their children and does not undermine parents' confidence in their children's or their own ability; nor does it devalue the language and culture of the family.
    12. Assessment demonstrates children's overall strengths and progress, what children can do — not just their wrong answers or what they cannot do or do not know.
    13. Assessment is an essential component of the teacher's role. Because teachers can make maximal use of assessment results, the teacher is the primary assessor.
    14. Assessment is a collaborative process involving children and teachers, teachers and parents, school and community. Information from parents about each child's experiences at home is used in planning instruction and evaluating children's learning. Information obtained from assessment is shared with parents in language they can understand.
    15. Assessment encourages children to participate in self-evaluation.
    16. Assessment addresses what children can do independently and what they can demonstrate with assistance because the latter shows the direction of their growth.
    17. Information about each child's growth, development, and learning is systematically collected and recorded at regular intervals. Information such as samples of children's work, descriptions of their performance, and anecdotal records is used for planning instruction and communicating with parents.
    18. A regular process exists for periodic information sharing between teachers and parents about children's growth and development and performance. The method of reporting to parents does not rely on letter or numerical grades but rather provides more meaningful, descriptive information in narrative form (NAEYC, 1996a, 15-16).

    Things move very quickly today in public education. Within a five-year period or less, valued strategies and progress from one perspective can be quickly devalued and eliminated as those pushing a completely different perspective gain in popularity. Teachers often choose to "wait it out" or are slow to implement new learning strategies because they know additional changes may quickly wash them away. Parents are often confused about whether new directions are wise ones, and many don't have time to even think about it. Politicians want fast results. These are, however, all adult concerns.

    Children, on the other hand, are eager to learn, at least in the early stages of their education. Bowman, Donovan, and Burns (2001:237) suggest that the "pace and content of that learning is shaped by the opportunity that the environment presents for interactions and growth." The authors note the need to design assessments that have greater ecological validity in supporting children's educational journey from "natal culture to school culture to the culture of the larger society." The greatest danger in using assessments and tests with children is to do so without consideration of cultural influences or variations in development, possibly leading to misuse and misinterpretations, lack of fusion between assessment and instruction, or inappropriate policy decisions or external action.


    That more rigorous testing is an answer for the ills that face America's schools is unreasonable and flawed thinking. Gardner (2001) compares this with taking the temperature of a sick person repeatedly in order to improve their health. Common sense and a closer look at differences in communities should tell us that school failure, as measured by these tests, has more to do with failing families and failing communities than with the schools that the children attend. There is no documented evidence that teachers in failing schools are any less effective, less well trained or less motivated to help children learn than teachers in more successful schools. What is true is that educators in many of our low-performing schools are dealing with an entirely different set of challenges. In failing schools, there is far more absenteeism, tardiness, problem behavior, poverty, and family transition and, noticeably, less parent participation. This is not to say that there are not exceptions in a school's ability to overcome the odds, or that these realities should serve as excuses for not continuing to look for solutions. Blaming teachers and administrators who do their best on a daily basis to deal with very trying situations and yet have unsatisfactory test scores, however, is not a solution to failing schools. The intense focus on test outcomes that have much to do with factors outside of educators' control is prompting good teachers and administrators in failing schools to look for positions in more affluent districts where children and families are often in better shape to learn.

    Statewide tests that are well constructed to be as fair as possible can serve to measure how well society is meeting the overall needs of families and children in the many diverse communities of America. If our ultimate goal is to set high standards for U.S. education and to bring all children into a ring of school success, we must widen our perspectives about what it takes to produce future generations of truly educated citizens. That's a complex and daunting task that will continue to take some very serious work by the best minds in the business, and the cooperation of the thousands of administrators and teachers across the country who every day go to war against school failure.


    Bowman, B. T., Donovan, M. S., and Burns, M. S. (Eds.). (2001). Eager to Learn: Educating our Preschoolers. Washington, D.C.: National Academy Press.

    Gardner, H. (January 15, 2001). Constant testing isn't the way to measure education. Lansing State Journal, 5A.

    Heubert, J. P. and Hauser, R. M. (Eds.). (1999). High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, D.C.: National Academy Press.

    Johnson, M. (September 8, 2000). Engler promises accreditation reform. Lansing State Journal, 2B.

    Kohn, A. (January, 2001). Fighting the tests: A practical guide to rescuing our schools. Phi Delta Kappan, 82 (5), 348-357.

    Kostelnik, M. J., Soderman, A. K., and Whiren, A. P. (1999). Developmentally Appropriate Curriculum: Best Practices in Early Childhood Education. Upper Saddle River, NJ: Merrill.

    Lansing State Journal. Editorial, April 7, 2001.

    Morrow, J. (May, 2001). Undermining standards. Phi Delta Kappan, 653-659.

    Toppo, G. (March 1, 2001). Teachers fear overkill on standardized testing. Lansing State Journal, 2B.

    Soderman, A. K., Gregory, K. S., and O'Neill, L. T. (1999). Scaffolding Emergent Literacy: A Child Centered Approach, Preschool through Grade 5. Boston: Allyn & Bacon.