By Thomas W. Bice

**Executive Summary**

Plaintiffs in the *Sheff vs. O’Neill* lawsuit currently before Connecticut’s courts seek to improve minority students’ educational outcomes by instituting magnet schools in the City of Hartford and opening schools in surrounding districts to inner-city Hartford students. The principal rationale underlying these demands holds that greater racial/ethnic balances in schools’ student bodies will bring about better academic performance among minority students. The study reported here was undertaken to test this presumption.

Using publicly available, aggregate data from 139 Connecticut high schools, the study estimates direct effects of schools’ racial compositions on tenth-graders’ scores on the four areas tested by the Connecticut Academic Performance Test (CAPT) — mathematics, science, reading, and writing. It does so using statistical methods that estimate these effects while adjusting test scores for effects of known determinants of academic performance.

The several analyses consistently find that schools’ racial compositions have no appreciable effect on academic performance among black and Hispanic students. The percentages of schools’ student bodies accounted for by white youths do not significantly influence any of the measures of academic performance investigated, including average Index scores, percentages of students meeting standards, and percentages of students who require remediation.

**We conclude from our findings that the Sheff plaintiffs’ presumption that schools’ racial/ethnic compositions directly influence educational performance is incorrect.**

From a brief survey of the education reform literature **we also conclude that Sheff plaintiffs’ demand for magnet schools overlooks complexities encountered in urban school reform.**

**Introduction**

In 1996 the Supreme Court of the State of Connecticut heard a suit in which plaintiffs sought to improve the educational circumstances and outcomes of the City of Hartford’s minority students, principally blacks and Latinos. The court ordered the State to effect a more balanced distribution of racial/ethnic groups among schools in the Hartford area and remanded the case to the Superior Court. In April of 2002 the Superior Court reopened the case. The plaintiffs’ preferred remedies are (1) to replace Hartford’s existing schools with magnet schools and (2) to compel public schools in Hartford’s surrounding suburbs to reserve portions of their capacities for inner-city students who might choose to attend them. The rationale underlying these proposed remedies presumes that attaining racial and ethnic balances within the region’s schools will have positive effects on the educational performance of minority students. This paper examines the tenability of that logic.

Although many studies find that schools’ racial/ethnic compositions are associated with educational outcomes, there is good reason to believe that this correlation is not a causal one. Based on their study of a national sample of more than 7,000 high school seniors, Chubb and Moe concluded that “[o]nce other influences of [academic] achievement are included in the [statistical] model, individual gains [in test scores] are virtually unaffected by the percentage of the student body that is black.” Grissmer and his associates conclude from their thorough study of trends in the so-called black-white test score gap that school desegregation narrowed the gap in the South during the 1960s and 1970s, but not elsewhere. Reflecting on these and other recent studies, Jencks and Phillips reckon that “…racial mix does not seem to have much effect on changes in reading scores after the sixth grade or on math scores at any age,” and that “…desegregation in northern schools might raise blacks’ reading scores today, but the gain would be modest.”

Studies conducted in various settings have shown that many interacting factors are implicated in educational performance. Among these are the quality of teachers and other school resources, parents’ encouragement of and involvement in their children’s education, and schools’ cultures, which in varying degrees either encourage or discourage academic performance. Schools’ racial/ethnic compositions are undoubtedly associated with some of these causative factors. An assessment of the effect of schools’ racial/ethnic compositions on academic performance therefore must take account of a web of interrelationships that includes contextual as well as other education-related factors. Accordingly, we estimate the *direct* relationship between students’ educational performance and the percentages of white students in their schools by adjusting for effects of known determinants of educational outcomes.

Figure 1 depicts the system of interrelationships that guides our analyses. The model indicates that towns’ and school districts’ socioeconomic situations affect education through two pathways. First, wealthier towns and districts and those with highly educated adults are likely to devote more funds to education. This expectation is depicted by the links from socioeconomic status (*SES*) and *Rural* to indicators of schools’ resources (*Resources*). Second, we hypothesize that economically well-off communities have relatively higher proportions of parents who attach great importance to their children’s academic performance and translate that value into childrearing practices that reinforce schools’ educational missions. That effect is measured by *Family*.

Our model also recognizes the crucial importance of what transpires within schools and among students. That effect is indicated by *Culture*, which attempts to tap the degree to which schools and their student bodies value academic objectives and translate this into supportive educational expectations and practices. Sociologists who coined the term “student culture” have found that schools’ and their student bodies’ cultures variously stress different values. Some place great emphasis upon athletics; others accentuate social ends; and others lay emphasis on academic performance. Our *Culture* relates primarily to this latter dimension.

Finally, the quality of teaching and administration in schools is a critically important factor. Regrettably, none of our indicators even approximates the type of data needed to capture these phenomena. *Resources*, as we will see, taps only structural dimensions of school quality. We do not suggest that our structural measures capture the more human and process-driven aspects of education. We therefore depict effects of these omitted factors with dashed lines leading to and emanating from *Quality of Teaching & Administration*.

The following section describes the methods employed in this study. Following this, we present quantitative results that address the associations implied by our conceptual model and the effects of schools’ racial compositions on black and Hispanic students’ educational performance. In the closing section, we spell out our findings’ implications for the *Sheff* case and for education more generally.

**Data and Methods**

**Data**

All of the data in this report are from published reports. Data pertaining to towns are from the Connecticut Department of Economic and Community Development, and information pertaining to schools and educational factors is from several reports issued by the Connecticut Department of Education.

**Towns and School Districts**

Data pertaining to towns and cities are from the Department of Economic and Community Development’s website. The Department publishes town profiles that include demographic, socioeconomic, business-related, and other types of information for all of Connecticut’s 169 towns. We assembled our community context information from those profiles.

For each town we recorded the following data, all of which we presume relate to towns’ socioeconomic environments.

● Population size (*Pop*)

● Population density (*Density*)

● Percent of dwelling units that are single-family (*SHouse*)

● Percent of dwellings that are owner-occupied (*OwnOcc*)

● Racial/ethnic composition: percent white (*%White*), percent black (*%Black*), and percent Hispanic (*%Hisp*)

● Per capita income (*Income*)

● Value of the town’s equalized grand list per capita (*GrandList*)

● Education levels of adults (persons 25 years of age and older): percent not graduating high school (*NoHigh*), percent with bachelor’s degrees or higher (*Bach*)

● Per capita number of books circulated annually by town libraries (*Reading*)

Our intent in gathering these data was to devise a concise yet reliable indicator of towns’ socioeconomic environments. To accomplish that we performed factor analyses on these data, which resulted in the two identifiable factors shown in Table 1. The coefficients in this table are factor loadings, which indicate the correlation of each variable with the underlying factors. For instance, *Pop*, our measure of towns’ population sizes, is highly negatively loaded onto the *Rural* factor and has a negligible loading on *SES*. Variables in the upper panel of the table are components of the *Rural* factor; those in the lower panel are loaded on *SES*.

The *Rural* factor arrays towns along a continuum from rural through suburban to urban. The more rural communities thus have high scores on this factor, and large cities have low scores. For instance, the town of Hartland has the highest score on *Rural*. Hartland has about 2,000 residents who are scattered thinly (*Density* = 60.5 people per square mile) and has values on other *Rural* variables that we associate with small-town Connecticut. Hartford lies at the other end of the *Rural* spectrum, closely positioned near Bridgeport, New Haven, and New London. Windsor Locks, Old Lyme, and Sharon fall in the mid region.

The *SES* factor arrays towns along a socioeconomic dimension that groups towns by levels of income, wealth, and education. Fairfield County towns are clustered at the high end of *SES*. New Canaan leads, followed closely by Darien, Weston, and Greenwich. Putnam, Griswold, and Killingly — all small towns — are at the low end of *SES*. Connecticut’s larger towns lie between the extremes. For instance, New London ranks 131st, Hartford 101st, and Bridgeport 97th from the top of *SES*.

Most regular/traditional school districts in Connecticut coincide with town borders. The sixteen regional school districts that encompass two or more towns are the exceptions. As we intend to analyze community contextual variables applied to school districts, we aggregated data from participating towns for each district. These district aggregations are the population-weighted mean values on *Rural* and *SES* of the towns that participate in particular districts.
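The district aggregation described above can be sketched in a few lines. This is only an illustration of a population-weighted mean: the function name, the dictionary keys, and the two-town district with its populations and factor scores are all hypothetical and are not taken from the town profiles.

```python
def population_weighted_mean(towns, key):
    """Population-weighted mean of the factor score `key` across a district's towns."""
    total_pop = sum(t["pop"] for t in towns)
    if total_pop <= 0:
        raise ValueError("district population must be positive")
    return sum(t["pop"] * t[key] for t in towns) / total_pop


# Hypothetical regional district made up of two towns (all values invented).
district = [
    {"town": "Town A", "pop": 6000, "rural": 1.20, "ses": 0.40},
    {"town": "Town B", "pop": 2000, "rural": 0.80, "ses": -0.20},
]

district_rural = population_weighted_mean(district, "rural")
district_ses = population_weighted_mean(district, "ses")
```

Because Town A holds three quarters of the district's population in this made-up case, the district's scores sit three quarters of the way toward Town A's values.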

**Schools**

The Department of Education publishes on its website Strategic School Profiles for all of Connecticut’s public schools. These Profiles contain a wealth of quantitative data addressing demographic characteristics of schools’ student bodies, types of courses taken by the previous year’s graduating classes, scores on academic achievement tests, and other information. Our data pertaining to the state’s high schools are from these Profiles.

Our selection of variables was guided by the types of factors depicted in our conceptual model, namely *Family*, *Culture*, and *Resources*.

*Family*

We chose three variables to indicate families’ influences on their children’s education: school attendance, physical fitness, and dropout rates. Each of these variables in some degree stems from families’ decisions (or lack thereof). School attendance falls when parents permit truancy and when illnesses interfere with the exercise of normal activities. Physical fitness is to some extent a consequence of children’s diets and exercise regimes. Dropping out of school likewise bespeaks a lack of parental interest in education, a lack of control over their children, or both.

The Profiles list for each school its average school attendance percentage, the percentage of tenth graders who pass all four physical fitness tests administered by the school, and the cumulative four-year dropout rate for students in the previous year’s graduating class. We factor analyzed these data and created a *Family* factor from the resulting factor scores. *Family* factor loadings are 0.783 for school attendance, 0.682 for physical fitness, and -0.773 for dropout rates.

The Avon School District leads on *Family*, followed closely by several Fairfield County towns. The low end of the continuum is occupied by schools located in the state’s larger towns. For instance, seven of the eight lowest ranked schools are in the Hartford, New Haven, New London, and Bridgeport school districts.

*Culture*

The data available to construct an indicator of schools’ culture are relatively satisfactory, at least for measuring their *academic* orientations. The Strategic School Profiles give for each school the types of courses taken by members of the previous year’s graduating class over their four high school years. These include the percentages who took four or more units in mathematics, three or more in science, four or more in social studies, two or more in the arts, and two or more in vocational education. Additionally, the Profiles supply information on the percentage of the previous year’s graduating class who took the Scholastic Assessment Test (SAT), which is usually taken by students who aspire to go on to college, and the percentages of seniors who scored 600 or more on each of the SAT’s two sections (Quantitative and Verbal). The Profiles also give the school-wide percentages of students who were retained in grade during the previous year.

We estimated a graduating class-wide SAT score based on the assumption that no one who elected not to take the test would have achieved scores of 600 or higher had they participated. Our *SAT Score* indicator thus is simply the product of the percentage who take the test and the mean percentage of those who score 600 or more across the test’s two sections.
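As a rough illustration of this indicator, the sketch below multiplies the share of the class taking the test by the mean share of test-takers scoring 600 or more on the two sections. The function name, the argument names, and the choice to express the result as a percentage of the whole class (dividing the participation rate by 100) are assumptions made for the example; the input values are invented.

```python
def sat_score_indicator(pct_tested, pct_600_quant, pct_600_verbal):
    """Class-wide percent scoring 600+, assuming no non-taker would have reached 600.

    pct_tested      -- percent of the graduating class who took the SAT
    pct_600_quant   -- percent of test-takers scoring 600+ on the Quantitative section
    pct_600_verbal  -- percent of test-takers scoring 600+ on the Verbal section
    """
    mean_pct_600 = (pct_600_quant + pct_600_verbal) / 2.0
    return (pct_tested / 100.0) * mean_pct_600


# Hypothetical school: 80% tested; 25% and 15% of takers scored 600+.
example_score = sat_score_indicator(80, 25, 15)
```

In this made-up case the mean share at 600 or above is 20 percent of test-takers, which corresponds to 16 percent of the whole graduating class.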

We factor analyzed these data and created a *Culture* factor score for each school from the results. The eight variables’ loadings on *Culture* are shown below. Four of the five subject matter areas have positive loadings, while vocational education is negatively associated with what might be termed an academically oriented culture. Failure rates’ (*Retained in Grade*) negative loading and the positive loadings associated with SAT participation and scores are consistent with that view.

We find that the state’s wealthier, suburban towns score high on this factor, while the inner city high schools are clustered at its low end. The Madison, Ridgefield, Westport, and Darien districts have the highest scores; Bridgeport, Hartford, and New London lie near the bottom of *Culture’s* distribution of districts.

*School Resources*

Strategic School Profiles supply numerous potential indicators of schools’ resources. We selected three types, namely, variables pertaining to (1) teacher quality, (2) library resources, and (3) technology.

Teacher quality is admittedly a difficult variable to measure, particularly when the available data pertain only to credentialing, as is the case with the Profiles. Our indicator of teacher quality is therefore crude, based as it is on only the total number of teachers per student, the number of teachers with master’s degrees or higher, and the number of teachers trained and qualified as mentors, assessors, or cooperating teachers. For each of the two credentialing variables we computed a per-student ratio, that is, the number of credentialed teachers per student. We then factor analyzed the three indicators to compute *Teacher* factor scores. The factor loadings are -0.921 for the ratio of students per teacher, 0.870 for master’s-level teachers per student, and 0.530 for mentor-trained teachers per student.

A *Library* factor was similarly constructed from data on the number of printed library volumes per student and the number of subscriptions per student.

Finally, we constructed a *Technology* factor from data describing the availability of electronic equipment. These include the percentages of classrooms that are wired for voice communications, are equipped for video presentations and data transmission, and that are connected to the Internet.

As we are not for present purposes interested in effects of particular intra-school features and believe that none of the factors we were able to construct is likely to be of much explanatory value, we created a super-factor based on a factor analysis of the three sets of factor scores. The individual loadings of *Teacher*, *Library*, and *Technology* on this so-called *Resources* factor are, respectively, 0.848, 0.873, and 0.252.

The alignment of schools on the resulting *Resources* factor is not easily summarized. At the high end are such diverse schools as some of those in Fairfield County and other relatively wealthy districts. At the low end one finds schools in the Bristol and West Hartford districts.

**Educational Outcomes: CAPT Scores**

The CAPT is a state-mandated examination that is administered annually to tenth graders. The test covers four areas, namely, mathematics, science, reading, and writing. The Department of Education reports three versions of scores for each of these four skill areas: (1) percentages of students scoring in each of four levels (ranging from “requiring remediation” to “meeting standards”), (2) average “Index Scores,” and (3) scale scores.

We analyze each of the four CAPT area scores separately, examining for each the Index scores, the percentages of students meeting standards, and the percentages requiring remediation.

**Selection of Schools**

The Department of Education’s Strategic School Profiles list 192 public academic institutions that provide high school-level education categorized as shown in Table 2. As the *Sheff* case focuses on conventional schools, we confine our analyses to regular/traditional and magnet schools. Of these 162 institutions, complete data from the various sources we employed were available for 139 schools.

The loss of some schools resulted from the unavailability of CAPT scores. The Department of Education does not report results for subgroups of ten or fewer students. Therefore no data are available for schools that have fewer than that number of tenth graders. As the overwhelming majority of students in Connecticut schools are white, this policy results in relatively sparse data for minority groups. Table 3 shows the availability of CAPT scores among the 139 schools in our analysis.

Our 139 schools yield 240 observed sets of aggregate CAPT scores. As percentages of eligible children for whom data are reported for each of the areas differ among schools, this number varies slightly among CAPT areas.

**Racial Composition**

The principal policy-related question in our analysis pertains to the effect of schools’ racial/ethnic compositions on educational outcomes, in this case CAPT scores. This factor is measured by two variables: (1) a dummy variable that measures average differences in CAPT scores among white, black, and Hispanic students and (2) the racial compositions of schools’ student bodies.

*Black & Hispanic*

A dummy variable is employed in statistical analyses to capture differences among categorical groups. A dummy variable is defined by *k-1* categories, where k is the number of groups. The omitted category is the reference group to which the other categories are compared.

In our analyses, we employ white students as the reference group and measure differences with *Black*, which indicates the average difference between white students’ CAPT scores and those of black students and *Hispanic*, which measures the corresponding difference for Hispanic students.
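A minimal sketch of this coding scheme follows. With three groups there are two dummy variables, and white students, the reference group, are coded zero on both. The function and constant names are illustrative only.

```python
GROUPS = ("white", "black", "hispanic")  # "white" is the omitted reference group


def dummy_code(group):
    """Return the k-1 = 2 dummy variables for one observation's group."""
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    return {
        "Black": 1 if group == "black" else 0,
        "Hispanic": 1 if group == "hispanic" else 0,
    }


coded = {g: dummy_code(g) for g in GROUPS}
```

Under this coding, the regression coefficient on each dummy is read as that group's average difference from the reference group, holding the other predictors constant.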

*Racial Composition*

Given the *Sheff* plaintiffs’ logic, the definition of the variable measuring the racial compositions of schools deserves special attention.

The *Sheff* argument suggests that schools’ racial compositions affect only minority students’ educational performance. In effect, *Sheff* hypothesizes an interaction effect in which schools’ racial compositions will affect minority students’ educational outcomes but not those of white students. Accordingly, measuring schools’ compositions simply by the percentages of schools’ student bodies that are white (*%White*) does not provide an appropriate estimate of racial composition’s effects on educational outcomes.

Scatter plots of CAPT scores across schools’ racial compositions demonstrate that patterns of CAPT scores differ among white, black, and Hispanic students. Figure 2 shows that white students’ CAPT scores have relatively little variance overall and are not highly correlated with the percentages of their schools’ student bodies accounted for by white students (*%White*). The best-fitting straight line through these CAPT scores is

Y = 71.313 + 0.024X,

where Y refers to the CAPT score and X to *%White*. That line accounts for only 13.1 percent of the variance in CAPT scores.

The scatter plots of black and Hispanic students’ CAPT scores across *%White* differ from that for white students, and *%White* has a greater effect. Figure 3 shows that black students’ scores are distributed rather evenly across *%White*, with a tendency for higher scores to be found among schools with the highest percentages of white students. The best-fitting straight line is:

Y = 41.579 + 0.158X

This statistical model explains 16.4 percent of the total variance in black students’ CAPT scores and is thus a slightly better fit than that for white students’ scores.

The scatter plot of Hispanic students’ CAPT scores across *%White* (Figure 4) also indicates a slight tendency for CAPT scores to be higher among schools with higher percentages of white students. However, the best-fitting curve through these points accounts for only ten percent of the variance in CAPT scores and bends slightly downward among schools with the highest percentages of white students:

Y = 39.039 + 0.609X – 0.005X²
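To make the three fitted relationships easier to compare, the sketch below evaluates them at chosen values of *%White*. The coefficients are those reported above. Reading the last term of the Hispanic fit as quadratic in X is an assumption, made because the text describes that fit as bending downward at high values of *%White*. The function names are illustrative.

```python
def white_fit(pct_white):
    """Fitted CAPT score for white students: Y = 71.313 + 0.024X."""
    return 71.313 + 0.024 * pct_white


def black_fit(pct_white):
    """Fitted CAPT score for black students: Y = 41.579 + 0.158X."""
    return 41.579 + 0.158 * pct_white


def hispanic_fit(pct_white):
    """Fitted CAPT score for Hispanic students, with the last term assumed quadratic."""
    return 39.039 + 0.609 * pct_white - 0.005 * pct_white ** 2


# Under the quadratic reading, the Hispanic curve peaks where its slope is zero:
# 0.609 - 2 * 0.005 * X = 0.
hispanic_peak = 0.609 / (2 * 0.005)

# Fitted scores at 20 and 90 percent white, for side-by-side comparison.
comparison = {
    x: (white_fit(x), black_fit(x), hispanic_fit(x)) for x in (20, 90)
}
```

On this reading the Hispanic curve turns downward at roughly 61 percent white, while the white and black lines rise steadily across the whole range.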

Taken together, these three scatter plots and best-fitting lines clearly indicate that *%White* does not have a consistent effect across white, black, and Hispanic students. We therefore indicate schools’ racial compositions in our analyses with three variables: *White*%White*, *Black*%White*, and *Hispanic*%White*. *White*%White* takes the value of zero for all black and Hispanic students’ CAPT scores and, for white students, the percentage of the school’s student body accounted for by white students. *Black*%White* takes the value of zero for all white and Hispanic students’ CAPT scores and, for black students, the percentage of the school’s student body accounted for by white students. *Hispanic*%White* likewise is zero for black and white students’ CAPT scores and equals *%White* for Hispanic students.
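The construction of these three variables can be sketched as follows. Each observation is one group's aggregate score in one school, and exactly one of the three variables carries the school's *%White* value. The function name and the example values are hypothetical.

```python
def composition_terms(group, pct_white):
    """Return the three group-specific %White variables for one observation."""
    if group not in ("white", "black", "hispanic"):
        raise ValueError(f"unknown group: {group!r}")
    return {
        "White*%White": pct_white if group == "white" else 0.0,
        "Black*%White": pct_white if group == "black" else 0.0,
        "Hispanic*%White": pct_white if group == "hispanic" else 0.0,
    }


# Hypothetical school that is 65 percent white, seen from each group's row.
rows = [composition_terms(g, 65.0) for g in ("white", "black", "hispanic")]
```

Coding the variables this way lets a regression estimate a separate *%White* slope for each group rather than forcing one slope on all three.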

**Statistical Methods**

Our analytic objectives dictate the use of multivariate statistical estimation models. Recognizing that the niceties of such approaches might not be accessible to the lay reader, we describe their features below in order to elucidate some of the statistical jargon that appears in our discussion of findings.

The models we estimate attempt to test which of two hypotheses regarding the effect of schools’ racial compositions on CAPT scores is more tenable, that of the *Sheff* plaintiffs or the one we pose as an alternative. This analytical objective focuses attention on our estimates of effects of five variables (*Black*, *Hispanic*, *White*%White*, *Black*%White*, and *Hispanic*%White*), particularly the latter three.

**Multivariate Analysis**

The purpose of multivariate analysis is to estimate the individual effect of each of a set of predictor variables on a dependent variable when effects of all other predictor variables included in the statistical model are adjusted (or “held constant”). We use a statistical procedure known as multivariate regression analysis, whose statistical model is

Y = c + b₁X₁ + b₂X₂ + … + bₖXₖ + e,

where Y denotes an observed value of the “dependent variable,” an observed CAPT score. Each X stands for the value of an observed predictor variable. The c term is an unknown “constant” that the model estimates. Each b (“regression coefficient”) indicates the model’s estimate of its corresponding predictor variable’s effect on Y when effects of all other Xs are adjusted. The e term is the “error term,” which measures effects of all other potential Xs that are not included in the model and measurement error in the included ones. In a linear statistical model such as we employ, the sum of the constant and the b-weighted X terms equals the *predicted* CAPT score; the error term is the difference between the observed and predicted scores.

Results from applying this model to our data yield quantitative estimates of the unknowns, the c, b, and e terms. The overall fit of the model to the underlying data is indicated by the size of e. This is expressed as 1 - R², where R² equals the percentage of the observed variance in Y “explained” by the entire model. R² ranges from zero percent (when the model’s predictor variables in combination explain nothing) to 100 percent (when the model perfectly fits the underlying observed data).

The constant term c estimates the value of Y when all predictor variable values equal zero. Each b estimates the independent effect of its corresponding X score. These range from zero, which indicates no effect, to unbounded positive and negative values. Positively signed values indicate that increases in the X variable are associated with higher Y (CAPT) scores; negatively signed values indicate that increases in the X variable are associated with lower Y (CAPT) scores. More specifically, a b’s quantitative value indicates the amount that Y changes when X changes by one unit and when effects of all other Xs are adjusted. Thus, a b equal to, say, 0.50 indicates that a one-unit increase in X brings about a 0.50 increase in Y (CAPT score).
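For readers who want to see the mechanics, here is a minimal, dependency-free sketch of least-squares estimation for a model of this form. It solves the normal equations by Gaussian elimination, which is one common way to obtain c and the b coefficients; it is not a claim about the software used for this study. The function names and the five-row data set are invented, and the data are built so that Y = 40 + 0.5·X1 + 2·X2 exactly.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(xs, ys):
    """Fit Y = c + b1*X1 + ... + bk*Xk; return ([c, b1, ..., bk], R-squared)."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 carries the constant c
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return coef, 1.0 - ss_res / ss_tot


# Invented data generated from Y = 40 + 0.5*X1 + 2*X2 with no error term.
xs = [(10, 0), (20, 1), (30, 0), (40, 1), (50, 2)]
ys = [45.0, 52.0, 55.0, 62.0, 69.0]
coefficients, r_squared = ols(xs, ys)
```

Because the invented scores contain no error, the fit recovers c = 40, b1 = 0.5, and b2 = 2 up to rounding, and R² is essentially 1. The b1 of 0.5 reads exactly as in the text: a one-unit increase in X1 goes with a 0.5-point increase in Y, holding X2 constant.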

**Statistical Hypotheses**

Two competing hypotheses are at risk in our analyses. The *Sheff* hypothesis, in effect, states that [*insert figure*].

In words, the variables *Black*, *Hispanic*, and *White*%White* will have no effect on CAPT scores when effects of other predictor variables are adjusted; and *Black*%White* and *Hispanic*%White* will have positive effects.

Our competing hypothesis holds that none of these variables will significantly affect CAPT scores when effects of other predictors are adjusted. Rather, we expect that other predictors, which we regard as being true determinants of educational outcomes, will account for all of the variance in CAPT scores that our predictor variables are able to explain.

**Statistical Significance**

Finally, a word on statistical significance. When samples of observed data are selected from larger universes of data, statistical analyses infer population *parameters* from statistics computed on sample data. Sample statistics rarely equal the corresponding parameters exactly. Statistical theory, however, permits us to estimate for each sample statistic a range within which the corresponding parameter is likely to fall. Ranges can be constructed for various levels of confidence. It follows that greater confidence is associated with wider ranges, and vice versa. Conventionally, analysts employ 95 percent confidence as the point that distinguishes “statistically significant” findings from “statistically insignificant” ones. The former are denoted by “p < 0.05,” that is, if the parameter were actually zero, an estimate as far from zero as the one observed would be expected in fewer than five percent of all possible samples of a particular size drawn from the specified universe of data. Equivalently, the 95 percent range around a statistically significant estimate excludes zero. This variation is popularly described as the effect of chance; more strictly, it measures “sampling error.”

As our observed data were not selected by any known sampling method from a larger universe of data, conventional interpretations cannot be assigned to estimated tests of statistical significance. Nevertheless, we report “p-values” and use them as rough indicators of statistical significance. In all tables reporting regression results “*” indicates that the coefficient is statistically significant at the 90 percent level of confidence and “**” at the 95 percent level. We are not slavishly attached to this criterion, however. We consider these results along with regression coefficients’ stability across the various specifications of our statistical model and with the substantive interpretations that models suggest.
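The relationship between confidence ranges and the significance markers can be illustrated with a short sketch. It assumes a normal approximation, with critical values of 1.645 and 1.960 for 90 and 95 percent confidence; that approximation, the function names, and the coefficient and standard-error values are assumptions for the example and are not results from this study. As the text notes, such intervals are only rough guides when the data are not a random sample.

```python
Z_90 = 1.645  # normal critical value for 90 percent confidence (assumed approximation)
Z_95 = 1.960  # normal critical value for 95 percent confidence (assumed approximation)


def confidence_interval(estimate, std_error, z):
    """Range of estimate plus or minus z standard errors."""
    return (estimate - z * std_error, estimate + z * std_error)


def excludes_zero(interval):
    low, high = interval
    return low > 0 or high < 0


def significance_marker(estimate, std_error):
    """Return '**' at 95 percent confidence, '*' at 90 percent, otherwise ''."""
    if excludes_zero(confidence_interval(estimate, std_error, Z_95)):
        return "**"
    if excludes_zero(confidence_interval(estimate, std_error, Z_90)):
        return "*"
    return ""


# Hypothetical coefficients, each with a standard error of 0.2.
markers = [significance_marker(b, 0.2) for b in (0.50, 0.35, 0.10)]
```

With these invented values, the first coefficient is marked at the 95 percent level, the second only at the 90 percent level, and the third not at all. The 95 percent interval is always the wider of the two, which is the trade-off between confidence and precision described above.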

**Results**

**Descriptive Findings**

As Figure 4 shows, all groups’ educational performance is associated with the racial compositions of the schools they attend. All groups’ CAPT scores increase across *%White*. However, white students’ average CAPT scores are uniformly higher than those of black and Hispanic youths at all levels of *%White*.

The findings regarding black and Hispanic students would appear to support *Sheff* plaintiffs’ assumption that alteration of the racial compositions of schools affects minority groups’ educational performance. However, the fact that white students’ scores are associated with the racial compositions of their schools is inexplicable by that logic. Were we to interpret *%White* effects causally, we would conclude that increasing the numbers of white students in schools would improve white students’ academic performance or, conversely, that increasing numbers of minority students would diminish white students’ educational outcomes. The former view, we believe, is untenable; the latter is politically explosive.

A more reasonable interpretation of the correlation between *%White* and CAPT scores is that schools’ racial compositions are associated with other factors that cause variations in educational outcomes. The multivariate analyses that follow investigate that possibility.

Correlations of *%White* with other causative factors specified by our model and with indicators of educational performance are shown in Table 4. These zero-order correlations show that *%White* is positively associated with all hypothesized determinants of educational outcomes and with the mean CAPT scores. In turn, groups’ average CAPT scores are correlated with all hypothesized determinants.

Overall, the zero-order correlation coefficients in Table 4 lend support to our conceptual model. All hypothesized causative factors are associated with CAPT scores, and community level factors are closely related to family and educational variables. They also indicate that education-enhancing factors and favorable educational outcomes are more prevalent in Connecticut’s more rural and wealthier communities than in its less wealthy urban areas.

We also find that *%White* is correlated not only with educational outcome indicators but with hypothesized causative factors as well. This suggests that the *%White*-CAPT association might merely be an artifact of *%White*’s covariation with the causative factors. Tests of this suspicion require multivariate analyses, to which we now turn.

**Estimation of Effects**

Our assessment of causative factors implicated in educational outcomes proceeds stepwise through our model. We begin by estimating effects of *Black*, *Hispanic*, *White*%White*, *Black*%White*, and *Hispanic*%White* alone. We then successively introduce community variables (*Rural* and *SES*) and then family and educational variables (*Family*, *Culture*, and *Resources*). Of particular interest in these analyses are coefficients associated with *White*%White*, *Black*%White*, and *Hispanic*%White*. If our logic holds, these variables’ estimated effects will diminish as other predictor variables are introduced.

Table 5 shows the regression coefficients estimated by the four specifications of our statistical model. Model I gives estimates of regression coefficients associated with *Black*, *Hispanic*, *White*%White*, *Black*%White*, and *Hispanic*%White* when CAPT Math Index scores are regressed on them alone. These data indicate that white students’ aggregate CAPT Math Index scores are on average more than thirty-seven points higher than those of black students and thirty-five points higher than those of Hispanic students after accounting for schools’ racial compositions. Regression coefficients associated with *White*%White*, *Black*%White*, and *Hispanic*%White* show that increasing *%White* boosts aggregate CAPT Math Index scores for all students. A ten-percentage-point increase in *%White* raises white students’ scores by about 1.4 points, those of black students by 2.7 points, and those of Hispanic students by about 3.3 points.

Already, our findings lead us to doubt the *Sheff* logic. The model’s R² of 83 percent indicates that this specification explains a relatively large portion of the observed variance in CAPT scores. However, the *Black* and *Hispanic* coefficients remain large and statistically significant, and the *White\*%White* coefficient is statistically significant. White students’ average CAPT Math Index scores are considerably higher than those of minority students regardless of schools’ racial compositions. Moreover, increasing percentages of white students in schools’ student bodies increase aggregate performance among all students, whites as well as minority groups. For instance, the model’s predicted CAPT Math Index score for white students in a school with twenty percent white students is 70. In the same school, the predicted CAPT score for black students is about 34, and that for Hispanic students is 38. In a school in which white students make up ninety percent of the student body, predicted aggregate CAPT scores for white students, black students, and Hispanic students are, respectively, about 80, 53, and 60.
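The predicted scores quoted above can be checked against the per-ten-point slopes reported for Model I with a small arithmetic sketch. It assumes a simple linear extrapolation from the quoted twenty-percent predictions; the slopes and starting values are the rounded figures given in the text, not the Table 5 coefficients themselves, and the names `slope`, `score_at_20`, and `predict` are illustrative.

```python
# Approximate slopes per percentage point of %White, taken from the text
# (1.4, 2.7, and 3.3 points per ten percentage points).
slope = {"white": 0.14, "black": 0.27, "hispanic": 0.33}

# Predicted scores at 20 percent white, as quoted in the text (rounded).
score_at_20 = {"white": 70.0, "black": 34.0, "hispanic": 38.0}


def predict(group, pct_white):
    """Linear extrapolation from the quoted 20-percent prediction."""
    return score_at_20[group] + slope[group] * (pct_white - 20)


# Predictions for a school that is 90 percent white.
predicted_at_90 = {g: round(predict(g, 90), 1) for g in slope}
print(predicted_at_90)
```

This yields roughly 79.8, 52.9, and 61.1 for white, black, and Hispanic students, close to the quoted values of about 80, 53, and 60. The small gap for Hispanic students presumably reflects rounding in the quoted slope and predictions.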

Model II adds community variables, *Rural* and *SES*. This leads to a slightly better statistical fit with the data (R² = 88.3%). *Rural* and *SES* raise CAPT Math Index scores. The effect of racial composition on scores remains positive and statistically significant for black students and Hispanic students, while that effect among white students is virtually zero.

Model III estimates effects of racial/ethnic variables in conjunction with family and school-level variables. In this specification the average white-black (*Black*) and white-Hispanic (*Hispanic*) differences remain, as do effects of schools’ racial compositions (*White\*%White*, *Black\*%White*, and *Hispanic\*%White*). Coefficients associated with all of the other predictors in Model III are consistent with our conceptual model. All are positively signed, and all but one (*Resources*) are statistically significant.

Finally, Model IV includes all predictor variables. The absolute values of the coefficients associated with indicators of racial composition are virtually zero among white students and black students and are small but statistically significant among Hispanic students. We conclude from these results that schools’ percentages of white students have no effect on educational outcomes as measured by CAPT Math Index scores among white students and black students and perhaps a trivial effect on those of Hispanic students.

In addition to overall averages as measured by Index scores, we are interested in extreme scores as well, namely, those of students who meet standards and those of students whose scores are so low as to indicate a need for special intervention.

Table 6 gives parameter estimates for these math outcomes based on the full model. In both cases, the estimated effects of schools’ racial compositions on minority students’ CAPT Math scores are near zero and not statistically significant.

Tables 7 through 9 show results from analyses of CAPT Science, Reading, and Writing scores. In each case, results indicate that, when effects of other pertinent factors are adjusted for, racial compositions of schools have no effect on minority students’ CAPT scores.

**Summary and Conclusions**

**Summary**

This paper was occasioned by the *Sheff vs. O’Neill* court case in which plaintiffs presume that establishing more racially and ethnically balanced student bodies will lead to improved educational performance among minority group students, principally black and Hispanic youths.

The study reported here was undertaken to test the racial/ethnic composition effect implied by *Sheff*. We examined this hypothesized effect among Connecticut’s tenth-grade students for whom aggregate scores on the Connecticut Academic Performance Test (CAPT) are reported. We posited a conceptual model of educational outcomes that draws together influences of students’ communities, families, and intra-school factors. Using multivariate regression analyses whose specifications were defined by this conceptual model, we compared white, black, and Hispanic students’ academic performance across the 139 Connecticut schools for which we could assemble complete data sets. Each statistical model included indicators of schools’ racial mixes, which permitted us to estimate these variables’ *direct* effects after adjusting for effects of our conceptual model’s hypothesized determinants of academic performance.

Our principal findings are summarized as follows.

● Based on conventional quantitative criteria, our statistical models fit the observed data very well.

● Hypothesized determinants of educational performance pertaining to influences of communities’ socioeconomic features and of families’ and schools’ academic cultures are rather consistently associated with academic performance.

● Once effects of these factors are adjusted for, schools’ racial compositions have no effect on tenth-graders’ educational outcomes.

**Limitations**

While our analyses confidently reject the notion that altering schools’ racial/ethnic compositions will improve minority groups’ academic performance, we note that the study suffers from limitations. First, our findings are based on non-experimental observations. The ideal design for estimating causal effects would observe changes in academic performance among students randomly assigned to various schools. Such a design is both infeasible and undesirable, however. Moreover, as our intent was to examine a single, specific question, the statewide scope of the available non-experimental data permits broader generalization.

Second, as we investigated only tenth-graders, we cannot assume that our results apply to other student cohorts. We therefore suggest that additional research be carried out on test scores (e.g., the Connecticut Mastery Test [CMT]) of elementary school students.

Third, our analyses are based exclusively on readily available aggregated data, both test scores and predictor variables. In particular, *Rural* and *SES* apply district-wide variables to individual students, and *Family* and *Culture* apply school-level measures to white, black, and Hispanic students alike. Measuring individual differences among students on these and other relevant dimensions most certainly would add to our study’s sensitivity. Additionally, analyses based on individual-level data that more thoroughly measure family and cultural influences would provide policy makers with more direction as to the scope and content of needed reforms. Such investigations certainly should focus on the quality of schools’ teachers and administration, factors that publicly available information does not address.

**Conclusions**

Our analyses consistently yield a troubling finding that deserves particular attention. In all statistical models that estimate white-black and white-Hispanic achievement differences, we find that, on average, white students outperform both black students and Hispanic students even with effects of other predictors held constant. Indeed, the average differences in white-black and white-Hispanic aggregate test scores remain almost unchanged from the mean difference observed when no other variables are considered. We cannot estimate precise extents of persisting white-black and white-Hispanic differences from the data at hand. On the other hand, other studies report gaps of similar magnitudes between white students’ performance and that of black students, which endure even when many determinants are adjusted.

These enduring test score gaps between white and minority students point to schools themselves and to the orientations, encouragement, and emotional intelligence that students bring to schools from their homes. *Sheff* plaintiffs’ demand for replacing Hartford’s existing inner-city schools with magnet schools takes a step in this direction, but a woefully incomplete one. The history of school reform in the United States is replete with instances of “single bullet reforms” that regrettably are often short-lived and ineffective. Magnet schools might improve educational outcomes in some settings. The same might be said for charter schools, contracted schools, and other alternatives to prevailing arrangements.

But mere generalities do not suffice, for they mask considerable diversity in the particulars that actually affect outcomes. Beneath brand names lies a host of design decisions that involve far-reaching considerations of schools’ missions and management as well as their modes of accountability and relationships to other community resources. **In that regard, one might reasonably question the basic tenet of the proposed Sheff remedy, which envisions students traveling from suburban Hartford towns to attend inner-city Hartford magnet schools. Experience elsewhere shows that such arrangements often fail to attract students from suburban communities, for parents generally prefer sending their children to local schools.** This has been found in Minnesota, the first state to adopt such reforms, and in Massachusetts, where, as of the mid-1990s, fewer than twenty-five percent of the state’s schools, and none of Boston’s suburban communities, participated in the statewide program.

**At least in broad outline, considerable agreement exists among educational experts as to where attention should be focused. Instead of mandating a general type of reform, Connecticut’s policy makers should initiate a planning process that takes what is known to work and adapts these lessons to Hartford’s particular circumstances.** Such planning and design must then be followed by a long-term political commitment to oversee, support, and continually adjust whatever reforms are adopted. As Hill and Celio note,

[a]ny city’s reform can take a decade. That is a tragically long time, given the costs to children. However, unless education reform is taken seriously, as an effort requiring serious thinking, testing, careful use of evidence, and continuous refinement, America’s urban public schools are likely to be no better off in ten years than they are now.

**Our study’s findings reject that part of Sheff that expects redistributions of students among schools to bring about better academic performance among minority students. Accumulated knowledge and experience in urban education reform likewise eschew a simplistic embracing of the magnet school (or any other reform) label.** The lawsuit is thus without substantive merit. Indeed, the courts are inappropriate venues for the sorts of hard thinking and political will that are required to bring about better education for Hartford’s children.