THE MOST EFFECTIVE SCHOOL DISTRICTS IN MASSACHUSETTS

A study of performance on the first year of the MCAS tests

Sponsored by

the University of Massachusetts

Donahue Institute

by

Dr. Robert D. Gaudet,

Senior Research Associate

January, 1999

 

"Education, then, beyond all other devices of human origin,

is the great equalizer of the conditions of men..."

-- Horace Mann

 

Testing plays an important role in most of the contemporary school reform efforts in the United States. The Massachusetts education reform effort is no exception. Its testing vehicle is the Massachusetts Comprehensive Assessment System or, as it's commonly known, the MCAS.

The chief objective of the state's education reform initiative is to enable public school students to achieve a certain level of knowledge and skill. The Massachusetts Department of Education has established this level by setting out what students are expected to learn in each basic subject. School districts are supposed to see to it that their students learn what they're expected to learn. The purpose of the MCAS is to gauge periodically how students are doing as they try to achieve this level of knowledge and skill.

Each year, in every district in the state, the MCAS tests are given to public school students in grades four, eight, and ten. They cover such academic subjects as math, science, and literacy skills. The test scores are broken down by individual student, school, and district. The scores for individual students are available to their parents, teachers, principals, and superintendents. The scores for entire schools and districts are available to the general public.

With the MCAS, the state has, for the first time in its history, an evaluation mechanism that measures how much progress students are making toward well-defined goals. At the same time, individual school districts are urged to anticipate and complement the MCAS by developing their own parallel methods of assessing how their students are doing. Thus, the education reform effort uses assessment as a way to help all students move toward a high level of academic achievement.

Just as this effort views higher student achievement as its end, it views the improvement of the public schools as its chief means to achieve this end. What happens in school is by no means the only or even the leading influence on how pupils currently perform on standardized academic tests. However, what happens in school obviously is the only means that is currently within the control of the schools themselves. So it's the only means of reform that is at the disposal of the education improvement effort as it now exists.

Improving Our Schools

Thus, the more the test scores can be used to inform decisions about how to alter what happens in school, the better the chances of making the schools more effective in helping their students to improve their performance on standardized academic tests like the MCAS. Properly used, the results can pinpoint which approaches to teaching and learning are working and which are not. The MCAS also includes an array of diagnostic tools that let teachers and administrators spot areas where students perform poorly, so that they can work with the students to mend the weaknesses.

Consequently, the essence of education reform in Massachusetts can be summed up in a few words: Better student performance, through more effective schools.

However, for the MCAS to fulfill its intended role in the current education reform effort, there are at least two important conditions that have to be met.

FIRST, the tests, and other assessments, must be fair and accurate. They must measure what children have learned, rather than just their social or economic background. They must not be biased for, or against, any group of students.

 

 

SECOND, they must be used to make the public schools more effective. Thus, the scores should drive an ongoing analysis of what makes the school experience effective. They must provide teachers with a critical piece of information about the potential learning problems and possibilities of individual students. And the information must be used as a basis for helping all students to do better.

To meet the second condition, we must be able to use the MCAS scores as one tool to discern the effectiveness of our schools. We must be able to establish how effective they are today, and to track the rise or fall of their effectiveness in the future. Thus, finding ways to measure school effectiveness is essential to education reform.

Measuring Effectiveness

Student academic performance, including how students do on MCAS tests, is influenced by two broad sets of factors: school factors and non-school factors. The first entails what happens in school, and thus what is within the control of the school district itself. The second entails conditions outside the schools, such as the demographic profile of the students and the community. As we look at a given district's average score on an MCAS test, we have to be able to discern how much of the score is tied to school factors, and how much of the score is explained by non-school factors.

How well do the school design and the curriculum promote learning for all? Are teachers top-notch professionals who have both the skills and commitment to teach all students? Are professional development activities rigorously aligned with efforts to increase student achievement? Is there strong, solid leadership in the school? Are there high expectations for all? Are parents full partners in their children's education? Are there adequate resources to do the job? These are all questions about school factors. [1]

In the research reported in this paper, non-school factors consist largely of the overlapping demographic conditions of family life and community life. We use six such conditions in a given school district: its median level of educational attainment, its median income level, its percentage of households above the poverty line, its percentage of single-parent families, its percentage of non-English-speaking households, and its level of private school enrollment. Statistical analysis shows that these factors form much of the non-school influence on how the state's students do on such standardized tests as the MCAS. [2]

As we all know, students in advantaged districts tend to get higher standardized test scores than students in disadvantaged districts. Thus, if a district's students get a high average score on an MCAS or other standardized test, the test score by itself doesn't tell us how much of the score is explained by school factors and how much is explained by non-school factors. A high score might be tied more to advantaged demography than to what actually happens in the district's schools. The score by itself isn't a sound guide to how effective the school district is.

We cannot begin to zero in on just how effective the school district itself is unless we can distinguish between the respective influences of the two types of factors. Only then can we discern how effectively the district itself performs, and how much it contributes to its students' average performance on the MCAS.

The Effectiveness Index provides insight into this distinction, and consequently provides some measure of the school district's contribution to its students' performance. Thus, it supplies a crucial piece of insight as to which schools are more effective.

For a given district, the Effectiveness Index (EI) gauges the impact that school factors have on the average MCAS score. The greater the positive impact of the school factors, the higher the district's Effectiveness Index will be.

The Index is calculated in the following manner: For a given district, the six demographic factors are used as the basis for projecting a likely average score on the MCAS. The demographically likely score is then compared to the average score that the students in the district actually received. The Effectiveness Index is the number that represents the difference between the likely score and the actual score.

If the number is negative -- if the actual score is lower than the likely score -- then this suggests that what is happening in the schools in the district is not enabling its students to perform beyond the demographic expectations for them. If the number is positive -- if the actual score is higher than the likely score -- then this suggests that what is happening in the schools is helping the district's students to surpass the demographic expectations for them. (For a fuller account of the development of the Effectiveness Index, please see Appendix B.)
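The sign convention described above can be illustrated with a minimal sketch. The function name and the sample scores are hypothetical and are not taken from the study's data; the only thing drawn from the text is that the index is positive when the actual score exceeds the demographically likely score.

```python
def effectiveness_index(actual_score: float, likely_score: float) -> float:
    """Return the actual score minus the demographically likely score.

    A positive result means the district scored above its demographic
    expectation; a negative result means it scored below.
    """
    return actual_score - likely_score


# Hypothetical figures, for illustration only.
above_expectation = effectiveness_index(actual_score=235.0, likely_score=230.0)
below_expectation = effectiveness_index(actual_score=226.0, likely_score=230.0)
```

Under these made-up figures the first district would have a positive index and the second a negative one.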

 

 

 

What the Effectiveness Index Tells Us: Statewide Results

We applied the Effectiveness Index to the MCAS scores of the 200 largest school districts in the state. These districts comprise 93 percent of the total population.

The demographic differences between the 200 largest districts explain 86% of the variation in the districts' overall average test scores -- that is, their scores for all of their test-taking students for the nine MCAS tests combined. Thus, though demography isn't destiny in this case, it sets a strong tendency.

A simple way to depict the respective contributions that demography and the schools make to the average level of student performance on the MCAS is this:

DEMOGRAPHY + School = Average Score

Nonetheless, a number of districts achieved test scores that are significantly higher than their demography predicts.

 

Four Types of School Districts

The Effectiveness Index lets us identify three types of school districts: effective, noteworthy, and ineffective.

An EFFECTIVE district meets two specifications:

1) Its Effectiveness Index is a positive number -- that is, its actual score on the test is higher than its demographically likely score.

2) Its actual score is equal to or higher than the average MCAS score for the state as a whole.

Thus, Stoneham, whose demography places it in the middle of the state's demographic ladder, is an example of an effective district. Its actual score is substantially higher than its likely score. And its actual score is higher than the statewide average score. Indeed, its actual score ranks it 46th among the state's 200 largest school districts. Its demography would predict that its score would rank 111th.

 

A NOTEWORTHY district fits the first specification but doesn't fit the second. Since its performance helps its students to go beyond their demography, it is still worthy of note. For what such a district is doing can hold useful lessons for districts that are demographically similar, but do not outscore their demography. And such a district is more likely to deliver a return on future public investment than an ineffective district is.

Here, Everett and Worcester are outstanding examples. Everett's overall score on all nine tests combined is much higher than its demography predicts. Worcester's scores on the grade four tests surpass its demographic prediction.

An INEFFECTIVE district has a negative index number, and its actual score is less than the average MCAS score for the state as a whole.
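The three definitions above can be expressed as a short sketch. The function and argument names are hypothetical. The text does not say how to label a district with a negative index but a score at or above the state average, or a district with an index of exactly zero, so this sketch returns "unclassified" for those cases as an assumption.

```python
def classify_district(index: float, actual_score: float, state_average: float) -> str:
    """Label a district using the definitions given in the text."""
    if index > 0:
        # Positive index: effective if at or above the state average,
        # otherwise noteworthy.
        if actual_score >= state_average:
            return "effective"
        return "noteworthy"
    if index < 0 and actual_score < state_average:
        return "ineffective"
    # Assumption: cases the text leaves undefined are not forced into a type.
    return "unclassified"
```

For example, a district with a positive index and a score below the state average would be labeled noteworthy.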

For each of these three types of districts, the Effectiveness Index sets a baseline for improvement. If a district is ineffective, then its short-term goal should be to become effective. If a district is noteworthy, its short-term goal should be to get its test scores high enough to exceed the statewide average. If a district is effective, its short-term goal should be to raise its actual test scores further, so that it will become a fourth type of district -- a SUCCESSFUL district.

A SUCCESSFUL district, as presently defined, gets a 75% pass rate on each of the tested subjects in each of the three grades.
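As a rough sketch, the definition above amounts to checking every grade-and-subject pass rate against the 75% mark. The function name, the data layout, and the sample rates are hypothetical, and reading "a 75% pass rate" as "at least 75%" is an assumption.

```python
def is_successful(pass_rates: dict, threshold: float = 0.75) -> bool:
    """Return True only if every tested grade/subject meets the threshold.

    pass_rates maps (grade, subject) pairs to the share of students passing,
    expressed as a fraction between 0 and 1.
    """
    # An empty table cannot demonstrate success.
    return bool(pass_rates) and all(rate >= threshold for rate in pass_rates.values())


# Hypothetical pass rates, for illustration only.
sample_rates = {
    (4, "language arts"): 0.81,
    (4, "math"): 0.77,
    (8, "science"): 0.68,
}
```

With these made-up rates the district would fall short, because one test is below the threshold.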

Making the Grade

Currently, no successful districts exist, because none of the state's 200 largest districts meet the department's definition of a successful district. Presumably, some districts will improve their test scores by enough in the near future to do so.

However, a number of districts did come close to achieving success. On five of the nine MCAS tests, 75% of Harvard's students earned passing scores. Harvard got an overall score of 2208 on all nine of the MCAS tests combined, higher than any of the other 199 districts. In Medfield and Wellesley, 75% of the students earned a passing score on several of the tests.

None of these districts is among the 50 most demographically disadvantaged in the state.

This study uses the Effectiveness Index to identify, and then to rank, on each of several fronts, the 50 most effective districts and 10 most noteworthy districts. Thus, on pages TK through TK, there are four sets of rankings:

Overall performance: all three subjects and all three grades combined

Grade four performance: all three subjects combined

Grade eight performance: all three subjects combined

Grade ten performance: all three subjects combined

 

The Importance of Reading and Writing

For each of the three grades, this study also ranks the districts that are effective or noteworthy in language arts -- essentially, reading and writing.

This is because reading and writing are necessary conditions for doing well on the MCAS tests. This is true even of the tests in mathematics. Many of the problems on the mathematics tests, particularly in grades eight and ten, are word problems. You cannot understand these problems if you cannot understand the words. In all subjects, moreover, many questions call for a written answer, as short as a sentence or two or as long as an essay of several paragraphs.

Finally, the children who took the grade four MCAS tests in May of 1998 entered first grade in 1994, the first full year that the Education Reform Act was in force. Thus, they are the children of this long-term education reform initiative. The entire span of their K-12 experience will be shaped by the requirements established by the reform act. How well these children do in their school careers will be the first full measure of the impact of the act.

The three sets of language arts rankings appear on pages TK through TK.

This study highlights these districts because they might have lessons to offer to other districts to help them to enhance their contribution to their students' future performance on the MCAS.

 

 

In time, we might depict the contributions that demography and school make to the average level of student performance on the MCAS tests in this fashion:

Demography + SCHOOL = AVERAGE SCORE

 

Middle Massachusetts

In the demographic ranking of the 200 largest school districts, 100 districts are concentrated in the demographic middle of the state. These districts, with 2 million people, make up what might be called Middle Massachusetts. They may be well-suited to play a crucial role in the short-term future of education reform.

For the state as a whole, as we've seen, demographic differences between the 200 largest districts explain 86% of the variation in the districts' average overall test scores. All or much of the other 14% of the variation is probably explained by the differences in how the school districts themselves behave.

However, this 14% isn't spread out evenly across all 200 districts. Little of this variation is found in the most advantaged districts, where the relationship between demography and test scores is generally strong. Thus, these advantaged districts generally get high test scores. The same holds for the most disadvantaged districts. Here, the relationship between demography and test scores is also generally strong. The actual test scores of these disadvantaged districts are well below the statewide average.

The pattern in Middle Massachusetts is different. Its districts exhibit a wide range of test scores -- even though their demography is relatively similar. Thus, much of this 14% of the variation is concentrated in the 100 MiddleMass districts.

This variation can be seen in these bar graphs. For each of the 20 districts closest to the demographic middle of the state -- thus, the districts that form the middle of Middle Massachusetts itself -- the tip of one bar represents its demography, and the tip of the other bar represents its MCAS test score.

Since the demographic variation is slight, but the variation in test scores is great, this pattern suggests that much of the variation is explained less by demography than by differences in what the schools of MiddleMass are doing.

Further, the test scores of MiddleMass districts with high positive numbers on the Effectiveness Index are just as high as the scores of many of the advantaged districts. Thus, the scores of a MiddleMass district like East Longmeadow are very nearly equal to the scores of Longmeadow, one of the more demographically advantaged districts in the state.

Some MiddleMass districts did particularly well on the grade four tests. The scores of Shrewsbury, Pembroke, and East Longmeadow were equal to or higher than the scores of such advantaged districts as Norwell, Cohasset, and Duxbury. To be sure, these advantaged districts had high scores, as one would expect. What's less expected is that districts farther down the demographic ladder did just as well.

So, if more MiddleMass districts become as effective as East Longmeadow, Shrewsbury, and Pembroke, then more MiddleMass districts will get test scores as high as the test scores of the advantaged districts.

Moreover, insofar as MiddleMass districts are demographically similar, what makes for effective schools in an effective district in MiddleMass is more likely to make for effective schools in an ineffective district in Middle Massachusetts.

Thus, in the short run, Middle Massachusetts can be an especially fruitful place to seek, and expect to find, a relatively swift rise in MCAS test scores.

Education Reform in Massachusetts

The Education Reform Act of 1993 provides an opportunity to transform our schools. The MCAS can be the backbone of our effort to do so. It can assess the performance of districts, schools, and individual students, and it can inform the public about its schools. More importantly, the MCAS's built-in diagnostics can help teachers to help all children learn better. Under the act, increased state funds provide substantial amounts of new money for districts to use for reform.

Massachusetts stands at a critical crossroads. The elements are in place for exciting statewide reform, but the barriers to change are substantial. This study captures the role that demography plays in student performance. Again, though demography is not destiny, it does establish a tendency. If we overlook the tendency of disadvantaged districts to produce low scores, then we will continue to consign the children of those districts to a future of unfulfilled potential.

Without a broad long-term effort to do what is needed to enhance performance by all, we can expect more of the same: a polarization of academic performance that troubles even those whose children are fortunate enough to have been born into a situation that makes a powerful contribution to their academic success.

------------------------------------------------------------

FOOTNOTES

1. Per pupil expenditure [PPE] is a school factor, but our measures of it are not always reliable. There is no standard accounting procedure for establishing PPE. For example, some systems might include teacher retirement costs, capital costs, federal funds, and long-term disability obligations in their per-pupil spending figure. Others might not. So comparisons across districts are difficult to make.

2. Other family and community conditions are crucial to student success, but are hard to observe and measure. One would have to monitor many families and communities closely over time to discern how family and community behavior affect school outcomes. How many books are read in the family? How much time is taken up by TV-watching? How do the community's adults treat children other than their own? Does the community mentor its young people? It's hard to get reliable answers to such questions. But we do know that the children of advantaged families and communities are more likely on average to have resources and support, and children of less advantaged situations are less likely to have them. So we use gross measures of such support as a proxy for answers to the more specific questions that are so hard to pursue.

For more information, please contact Robert Gaudet at the UMass Donahue Institute. Phone (617)

 

APPENDIX A: Community Listings

 

APPENDIX B: Deriving the Effectiveness Index

The Effectiveness Index (EI) is derived by comparing actual scores on standardized tests with scores as predicted by a model which factors in the role community characteristics play in educational outcomes.

The Community Effects Factor (CEF) model was developed in a doctoral dissertation (Education Achievement Communities: A New Model for "Kind of Community" in Massachusetts Based on an Analysis of Community Characteristics Affecting Educational Outcomes, May 1999, University of Massachusetts, Amherst). That work is the basis for determining school effectiveness. The model examines the relationship between selected demographic characteristics and educational outcomes. These characteristics include: average education level, average income, poverty rate, single-parent status, language spoken, and percentage of school-age population enrolled in private schools. These variables were chosen because they correlate with achievement and because the education literature identifies them as connected to academic performance.

In order to refine the CEF model, it is first necessary to account for the impact of these demographic variables on each other. This can be done through principal component analysis, a statistical technique that reduces many variables to a few salient ones that have the most impact on an outcome. Once the factors have been identified, a regression analysis produces the equations that can be used either to build a kind-of-community model or to predict expected district performance on achievement tests. The degree to which a community's characteristics lift or lower test scores is reflected in a Community Effects Factor (CEF), a measure of demography.
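The two-stage procedure described above can be sketched in a simplified form. This is not the dissertation's actual model: it keeps only the first principal component (found by power iteration) where the text speaks of "a few" factors, it assumes every demographic column varies across districts, and all function names and input data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each demographic column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]  # assumes no column is constant
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def first_component(z, iterations=200):
    """Score each district on the leading principal component."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    v = [1.0] * p  # starting guess for the power iteration
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(row, v)) for row in z]


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def effectiveness_indexes(demographics, actual_scores):
    """Actual score minus the score predicted from the demographic factor."""
    factor = first_component(standardize(demographics))
    slope, intercept = fit_line(factor, actual_scores)
    predicted = [intercept + slope * f for f in factor]
    return [a - p for a, p in zip(actual_scores, predicted)]
```

In this sketch, the component score plays the role of a single demographic factor, and the regression residual plays the role of the Effectiveness Index. A fuller treatment would retain several components and fit a multiple regression.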

The CEF, which is a measure of the demographic lift or drag of each community concerning educational achievement, is a good point of departure for analyzing school and school district effectiveness. The CEF identifies expected levels of performance based on community characteristics which, for better or worse, are very powerful indicators of educational achievement in Massachusetts. In this analysis, Weston is the most demographically advantaged community in the state in terms of educational outcomes (CEF = +2.8), and Lawrence is the least advantaged (CEF = -4.8). The CEF has a strong relationship, or correlation, to test scores.

Correlation is a process that identifies the interdependence of one variable with another. Correlation simply shows "the extent to which two things typically run together." [The Economist, 6 Dec. 1997, p. 82]. Correlation is not equivalent to causation; it can only reveal tendencies between variables, not identify causes. Correlations simply demonstrate relationships. A perfect correlation would be 1.0. For example, the correlation between inches and feet is 1.0 because it is a perfect linear fit; 12 inches always equal one foot. Correlations in real world situations involving human behavior are never 1.0.
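The inches-and-feet example can be checked with a short sketch of the standard Pearson correlation coefficient. The function name and the sample lengths are chosen only for illustration.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


feet = [1, 2, 3, 4]
inches = [12 * f for f in feet]  # 12 inches to the foot
r = pearson(feet, inches)
```

Because inches are an exact multiple of feet, the coefficient here comes out as a perfect 1.0, as the text describes.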

The correlation, or the connection, between spending (Per-Pupil Expenditure or PPE) and achievement in Massachusetts is .28, which is relatively low. While spending clearly matters, merely increasing spending levels has a relatively weak impact on results. Increasingly, many people are coming to the realization that how a system spends money is more important than how much money it spends. The achievement outcome accounted for by the community effects factor (CEF) is much stronger; that relationship is .86. This is not to say the community context, the CEF, is the most important determinant of school success, but it is a significant element that must be a major consideration in any plan to improve education in disadvantaged areas.

The Effectiveness Index was generated in the following manner:

o Utilize the 1998 MCAS results as an outcome indicator for achievement in each of the state's most populous 200 communities. (NOTE: This model does not evaluate results in the 151 smallest communities of the state, which comprise about 7% of the population.)

o Utilize the CEF model to predict a score for each district. This predicted score is based solely on community characteristics as they affect educational outcomes.

o Compare the actual to the predicted score. Systems whose actual scores are significantly higher than predicted scores and whose absolute scores are at or above state average are identified as effective. Systems with positive effectiveness indexes but scores below state average are identified as noteworthy.

The basis for this model was developed as a doctoral dissertation. The following provides detail on the statistics behind the model which is used to predict expected scores based on demographics.