Analysis of Reading Performance in the Elementary Schools 2001-02

Introduction

Reading may be the most fundamental skill needed for success in school and the workplace in our modern society. Understanding math, science, and all other academic subjects depends largely on a student’s ability to read. It is fitting, then, that as Rochester School Department moves forward in using data to advise policy decisions and drive instructional improvement, we should concentrate on reading as the first topic of study.

The primary tool used to disaggregate and analyze the data in this study is the Quality School Portfolio (QSP), computer software developed by the Center for Research on Evaluation, Standards, and Student Testing (CRESST) at UCLA. This tool permits us to combine data from a large number of sources and to compare the performance of students on several measures and in various groupings. Such analysis allows the district to select groups of students with common characteristics, such as gender, ethnicity, or participation in kindergarten or special instructional programs. We can then compare the performance of the groups to see if one group is consistently outperforming another. With that information, we can dig deeper to see what may have contributed to the differences and increase our use of strategies or resources that show promise for improving student performance.

This study focuses on the elementary level because that is where students acquire their initial reading skills and where the school can have the greatest impact on future success.

Purpose of this Study

This document is a beginning in the process of data-informed decision making. The analysis contained herein will, in all likelihood, lead to more questions at individual schools rather than providing a large number of concrete program-level recommendations for immediate instructional change.
That is one of the goals of scientific inquiry: to lead the investigator to more focused questions that will lead to additional answers and, in turn, foster continuation of the inquiry in a constant cycle of growth. The study is intended to answer only the basic questions about how our district is performing and to serve as a departure point for individual schools in studying their own performance and developing building-level improvement efforts.

In analyzing data about school performance, it is important to start with questions related to student standards and achievement. Accordingly, this study is also intended to discover those things that are working effectively for Rochester students as well as the areas where we are not performing as we would like to. There is no attempt to focus exclusively either on the positive or negative aspects of district performance. Where causation cannot be established, we will be careful to note that more study is needed. Where statistical methodology is used, an explanation will be included regarding the meanings and limitations of the statistics. In the web-based version of the study, statistical terms are linked to their definitions. In the paper version, an Appendix is included that explains statistical concepts and terms used in the study.

Limitations of the Study

This is not a scientifically or statistically rigorous experimental model with control and treatment groups. We are not, at this point, attempting to prove causation, although some conclusions may be drawn when we encounter very strong evidence that a particular practice or resource is having a positive or negative impact on student learning. It is, rather, an observation of the state of reading performance of various groups of elementary students within the school district and the district as a whole.
It should be noted, as has been stated to the Board on numerous occasions, that for a number of important reasons these analyses cannot accurately measure individual teacher performance. This is not simply an administrative choice intended to avoid a politically charged issue. Rather, it is based on the fact that the analysis of teacher quality based on student performance requires a sophistication in both the control of variables and analysis of data that is well beyond the capability and resources of an individual school district such as Rochester. There are several factors that make this so. First, because we have not controlled for the numerous variables that affect student performance, we cannot isolate what learning changes are attributable to teacher effect as opposed to demographic, school climate, prior year issues, or random factors. Additionally, if any inferences can ever be drawn regarding individual teacher performance, it will require multiple years of tracking in addition to controlling for the factors listed above and numerous others. The reader is, therefore, cautioned not to use the evidence from this or other district studies to reach conclusions about individual teacher performance.

Finally, while some inferences can be made at the building level, the reader is cautioned to refrain from considering the evidence to be strong, particularly for the smaller buildings. To establish statistical significance, a minimum sample or population is required because, as the sample gets smaller, the error of estimate increases. To avoid the possibility of giving incomplete or inaccurate information that would lead to wrong conclusions, we have avoided analysis where a sufficient population cannot be identified.

Basis of the Analysis

One of the strengths of data analysis through QSP is that we are not limited to a single measure, such as the state's NHEIAP assessment or a standardized test.
Trying to measure a school's overall performance based on a single test taken three times during a student's 13 years in the district, as is done with NHEIAP, is tantamount to asking a physician to make a complex diagnosis with nothing more than a couple of blood pressure readings. Just as the physician will use a number of tests to narrow the possible diagnoses, then if necessary, order another series of tests based on the results of the first, the district must look at numerous sets of data to eliminate dead ends and focus on more productive threads in the research. And, similar to evaluation of our physical health, while there are general conditions that contribute to quality education, there are also unique circumstances for schools and individual students that need to be considered in developing an effective program. As we are in the early stages of data use for policy development, much of this analysis is based on scores on the standardized reading assessments used in Rochester, the Iowa Test of Basic Skills, given each spring to grades 2, 4, and 5; and the Gates MacGinitie Reading Inventory which is given both Fall and Spring of each year to grades 1 through 8. While we participate in the state's NHEIAP Assessment, it is not included in this study because there is no reading component to that test. It is to be considered in inquiries into the other four core curriculum areas: language arts, mathematics, science and social studies. In addition to the standardized tests, the district is embarking on a series of surveys to assess non-academic factors that may contribute to academic performance. Additionally, individual schools are strongly encouraged to identify other measures and develop local assessment tools to create a capability to predict student achievement and design specific instructional strategies based on those predictions. 
Reliability of the Measures of Student Achievement

Rochester School Department assesses reading proficiency by a number of means, including teacher-made tests, chapter tests from the reading series, and standardized tests. The two standardized tests named in the previous section have been nationally validated as measuring similar reading skills. Use of both assessments decreases the possibility of measurement bias that is sometimes present when using a single instrument. Chart 1 shows a strong correlation between the two measures, which is one indication that they are measuring similar skills and that the two tests are administered consistently by Rochester School Department staff. This is important to establish as we use the test results to inform policy decisions. Chart 1
This chart indicates that, in general, students who scored low on the Gates test also scored low on ITBS (those points in the lower left quadrant of the chart) and those who scored high on one tended to score high on the other (points in the upper right quadrant). As expected, most of the scores clustered around the center, or median, of the scoring.

The score used throughout this study to measure both performance and progress is the Normal Curve Equivalent, or NCE, which has a range from 0 to 99 and compares a student’s performance with the performance of other students throughout the nation. Standardized tests like the ITBS and Gates are designed to compare the achievement of students rather than measure mastery of a specific curriculum. They are built to be stable, forcing a majority of scores into the middle, in conformity with a normal distribution.
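For readers who want to relate NCE scores to the more familiar national percentile rank, the conversion can be sketched as follows. This is a minimal illustration that assumes the conventional NCE scaling (mean 50, standard deviation about 21.06 on a normal curve); the function names are hypothetical and are not part of QSP or of either test publisher's materials.

```python
from statistics import NormalDist

# Conventional NCE scaling (assumed here): mean 50, standard deviation about
# 21.06, chosen so that NCE 1, 50, and 99 line up with percentile ranks 1, 50, 99.
NCE_MEAN = 50.0
NCE_SD = 21.06


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a national percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return NCE_MEAN + NCE_SD * z


def nce_to_percentile(nce: float) -> float:
    """Convert an NCE score back to a national percentile rank."""
    z = (nce - NCE_MEAN) / NCE_SD
    return 100.0 * NormalDist().cdf(z)


if __name__ == "__main__":
    for nce in (33, 45, 50, 65):
        print(f"NCE {nce} is roughly percentile {nce_to_percentile(nce):.0f}")
```

Under this scaling, NCEs and percentile ranks agree at 1, 50, and 99 but diverge in between, so an NCE of 33 corresponds to roughly the 21st national percentile. Unlike percentile ranks, NCEs form an equal-interval scale, which is why they can reasonably be averaged and subtracted.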
General Achievement Level

During the 2000-01 school year, Rochester showed an interesting pattern of reading performance between Fall and Spring testing. Specifically, in the Fall administration of the Gates-MacGinitie Reading Inventory, Rochester students tested slightly behind the norming sample used for the Gates assessment. That is, students in Rochester taking the test in the Fall did not perform as well as students in the national norming population who took the test in the Fall (Chart 2). For grades 2 through 5, the district's average is between the 45th and 48th NCE, which is very near to average performance. In contrast, incoming first graders, on average, entered the year at the 33rd NCE, which represents a dramatic difference.

All things being equal, one would expect the NCE to remain fairly constant from Fall to Spring because, while our students took the test shortly after a summer vacation, so did the students in the original norming population. This may suggest that Rochester students, when entering first grade, are genuinely behind other entering first graders in reading and that Rochester must work to bring these students up to catch the rest of the nation. However, with only one year of data, it is difficult to determine whether the low first grade scores are related to actual student achievement or factors more closely related to the time and manner in which the test was administered. This is an area that needs to be monitored longitudinally. Chart 2
Whatever the
reason for the low Fall scores, Rochester students and
teachers appear to have made up the difference and moved
slightly beyond the norming population by the May
administration of the assessment (see Chart 3). At
that time the average performance in all grades had risen
to the 50th NCE or
higher. In a normal distribution, the 50th NCE is the
anticipated mean and median score for a relatively large
student population. For grades 2 through 5, the increase
is within the standard error of measurement, indicating
that it could be a chance occurrence. However, the
increase in first grade scores is significant. Without
speculating at this point on what is causing the dramatic
increase, it is probably related to something being
done by the schools. The gain appears to be too large to
be occurring through random factors. Chart 3
Results from the Iowa Test of Basic Skills Reading sub-test (Chart 4), also administered in the spring to grades 2, 4 and 5, correlate positively with the Gates Spring administration, as noted in Chart 1, above. The overall performance of elementary students in grades where the test is given is at or near the 50th NCE. Second grade is the lowest, bordering on the 48th NCE, which is still within the standard error for a student population the size of Rochester’s. Chart 4
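The strength of the relationship between two assessments is usually summarized with a Pearson correlation coefficient. The sketch below shows how such a coefficient could be computed from paired scores; the score lists are invented for illustration and are not Rochester data, and the study itself does not report which coefficient or software routine was used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no spread")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical paired Spring NCE scores for five students (illustration only).
gates_spring = [35, 42, 50, 58, 71]
itbs_spring = [38, 40, 53, 55, 69]
r = pearson_r(gates_spring, itbs_spring)

if __name__ == "__main__":
    print(f"r = {r:.2f}")
```

A coefficient near +1 means students tend to rank similarly on both tests, a value near 0 means no linear relationship, and a negative value means high scorers on one test tend to score low on the other.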
While
averages provide some useful information, they give an
incomplete picture of performance. When relying solely on
averages, a few outlying scores can dramatically affect
the results. A distribution of Rochester's scores shows
that our population approximates a normal distribution
(Chart 5), but that our scores are shifted slightly
toward the high end, indicating slightly higher than
average performance on the test and corroborating the
information on the bar charts of averages. The anomalous
bar at the NCE range 97-99 is something that should be
studied further if it shows up in subsequent years. It may
be caused by the ceiling of the Gates-MacGinitie Reading
Test, which was not necessarily designed to differentiate
exceptional performance. Chart 5
One of the major reasons to use QSP or similar tools is to permit a school or district to disaggregate, or break down, the results into sub-groups of students to determine what characteristics are shared by students achieving at various levels. If the district can isolate the populations of students who are performing well and determine what strategies and resources might be contributing to the success, then we may be able to apply those things to other students. Likewise, if the district can find common characteristics of low achieving students, either student characteristics or characteristics of the services they receive, then measures can be taken to change or overcome the barriers to learning. The following sections constitute a first look at some of the possible factors that can affect student learning.

Gender Differences

One of the national issues in education has to do with gender equity: the question of whether boys and girls are given the same opportunities to learn in school. The paired columns in Chart 6 demonstrate that the difference in reading performance between males and females in the Rochester School Department was minor in school year 2000-01. In all cases, the differences are less than 2 NCE points, which is within the standard error of estimate for a population our size and can be accounted for by chance. It should be noted, however, that the differences consistently favor female students in grades 1 through 4 and favor males in grade 5. At this point it is entirely uncertain whether the directionality is random, a cohort effect, or caused by instructional differences.

The gender differences within individual schools are much greater, but that appears to be a function of sample size rather than educational practice. We draw this conclusion partially because, in the smaller schools, the differences, while consistently larger, are entirely unpredictable regarding the direction of the difference.
That is, in one school boys will perform much higher in one grade; girls perform much higher in the very next grade; and the trend will reverse once again in the following grade. This phenomenon needs to be reviewed annually to see if the patterns are consistent within certain grades and schools or whether, as noted above, they follow the cohort of students. Chart 6
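The effect of group size on how far an average can wander by chance can be sketched with the standard error of a mean. The example below is only illustrative: it assumes the national NCE standard deviation of about 21.06 applies to each group and uses made-up group sizes, since this study does not report the enrollment figures or the exact error calculation behind its statements.

```python
from math import sqrt

# Assumption: each group's scores spread like the national NCE scale (SD about 21.06).
NCE_SD = 21.06


def standard_error_of_mean(n: int, sd: float = NCE_SD) -> float:
    """Standard error of an average score for a group of n students."""
    if n < 1:
        raise ValueError("group size must be at least 1")
    return sd / sqrt(n)


def standard_error_of_difference(n_a: int, n_b: int, sd: float = NCE_SD) -> float:
    """Standard error of the difference between two independent group averages."""
    if n_a < 1 or n_b < 1:
        raise ValueError("group sizes must be at least 1")
    return sd * sqrt(1.0 / n_a + 1.0 / n_b)


if __name__ == "__main__":
    # Hypothetical group sizes, from a single small class to a district-wide grade.
    for n in (10, 25, 100, 400):
        print(f"n = {n:3d}: standard error of the mean = {standard_error_of_mean(n):.2f}")
```

With two groups of 200 students each, the standard error of the difference is about 2 NCE points; with two groups of 25 it is about 6. A gap of several points between boys and girls in a single small school can therefore arise by chance alone, which is consistent with the unpredictable direction of the school-level differences described above.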
Racial and Ethnic Factors

There are too few minority students in Rochester to conduct meaningful statistical analysis. This does not suggest, however, that race and ethnicity should not be considered in educational improvement efforts or that there are no racially related differences in instruction and student performance. The small numbers can camouflage serious problems that can permanently affect the lives of an important population within our community. It is imperative that individual schools, and the district, continuously review performance by race and ethnicity, and observe whether instruction, expectations, or interactions are different between teachers and students on the basis of race and ethnicity and whether interactions among students vary consistently by race.

Kindergarten Participation

National research has established that student participation in kindergarten has a relationship to reading achievement and other academic success. The influence, while not dramatic in all grades, is evidenced in Rochester (Table) with differences ranging from 0.79 NCE points to almost 6 NCE points. Again, the single year of data will not show clear patterns from year to year, but the data do suggest that the disadvantage to students who have not attended kindergarten is not ameliorated as they progress through the grades. This is also consistent with national research showing the impact of educational experiences in one grade on progress in subsequent grades.
Please note that there was no universal public kindergarten program available in Rochester for any of the students who were in first grade or above during School Year 2000-01, so none of the students in this study attended a public kindergarten other than the Title I kindergarten for a limited number of qualifying children who do not reflect a normal distribution of abilities. The district will need to measure the effectiveness of its public kindergarten program in a longitudinal study to determine exactly what effect it is having on student performance in reading and all other subjects. Regardless of a parent's choice of kindergarten, it is apparent from these data and the national research that the district needs to encourage increased participation in kindergarten programs for all students at the appropriate age.

Analysis Based on Annual Progress

Perhaps a more telling measure of the quality of a school’s instruction than student performance on standardized tests is annual progress. Students enter the classroom with native abilities and previous experiences for which the teacher cannot be held responsible. If, for example, a student enters a school five years behind in reading, it is probably unreasonable to expect the teacher to overcome the entire deficit in a single year.

In measuring growth using Normal Curve Equivalents, we note that an NCE gain of zero from one year to the next represents one full year of growth. That is, if a student is at the 50th NCE in the first year, then an NCE score of 50 in the second year would indicate that the student had kept his or her relative place with other students who were all one year older and had grown academically during that year. An NCE gain of 1 point or more represents progress at a faster rate than the general population, or in the case of Fall and Spring testing, more than 9 months of academic growth with 9 months of instruction.
A loss of NCE points represents less progress than the instructional time represented.

We have not used the Grade Equivalent score for these measures. Although subject to frequent misunderstanding and misinterpretation, the Grade Equivalent can be somewhat useful when talking with a parent about an individual student. However, it is of little value in discussing an entire student body. The same information can be obtained, and more clearly, using the NCE.

Fall to Spring Gains and ITBS Spring to Spring Gains

As noted above, the low Fall scores, and subsequent gains, between the Gates Fall and Spring testing cycle may represent something more than just teacher and school effect. We are not certain what factors might be contributing to the low first grade scores. It could be proximity to the first day of school; a lack of test-taking strategies for our first graders that is not lacking in the norming sample; test validity for entering first graders; other factors related to the test administration; lack of a consistent kindergarten curriculum for those students who attended a variety of programs; or an actual deficit in the preparation of our first graders. This will require a look at multiple years of data.

What we can see is that, by far, the most dramatic gains for Rochester’s students are made during the first grade year. On average, Rochester’s first graders made gains of almost 20 NCE points during their first year in the school district. On average, gains in all grades exceeded a single year's rate of growth. Regardless of the reasons for the increases, this is good news for Rochester’s children. Chart 7
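The gain measure described in this section is simply each student's Spring NCE minus his or her Fall NCE, averaged within a grade. A minimal sketch of that calculation follows; the student records are invented for illustration and the function name is hypothetical, not a QSP feature.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (grade, fall_nce, spring_nce). Not district data.
records = [
    (1, 30, 52),
    (1, 36, 54),
    (2, 46, 50),
    (2, 48, 49),
]


def average_gain_by_grade(rows):
    """Average Fall-to-Spring NCE gain for each grade.

    A gain of 0 means a student kept pace with the norming population;
    a positive gain means faster-than-expected growth, a negative gain slower.
    """
    gains = defaultdict(list)
    for grade, fall_nce, spring_nce in rows:
        gains[grade].append(spring_nce - fall_nce)
    return {grade: mean(values) for grade, values in sorted(gains.items())}


gains = average_gain_by_grade(records)

if __name__ == "__main__":
    for grade, gain in gains.items():
        print(f"Grade {grade}: average gain {gain:+.1f} NCE points")
```

Only students with both a Fall and a Spring score can contribute to such an average, so students who enter or leave mid-year would have to be excluded or handled separately.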
Title I Impact on Performance and Gains

In general, it is expected that schools eligible for Title I funds will have lower achievement than non-Title I schools. The program is targeted to schools with high poverty and other factors correlated to lower student achievement. This is not necessarily a reflection on the quality of curriculum and instruction in those schools, but represents factors that are generally beyond the control of the school. Rochester is no different in this respect.

Chart 8 shows the student achievement on the Gates Fall administration. This chart shows the relative proportions of students in these schools who fall into each quartile of the NCE scores on the Gates-MacGinitie Reading Test. In the Fall administration, Title I schools had a much larger proportion of their students falling into the bottom 25 percent than the non-Title I schools. Conversely, there was a smaller proportion of the students in the Title I schools in the top two quartiles. This is not surprising. Chart 8
Conversely, by Spring, the Title I schools actually have a lower proportion of their students in the bottom quartile than the non-Title I schools and have leveled the performance of the top two quartiles somewhat. This suggests that there is something positive happening in the Title I schools that may be absent in the non-Title I schools. Chart 9
Further evidence of this effect can be found in comparing reading gains between Title I and non-Title I schools. The bar on the left side of Chart 10 represents the 9 month NCE gains by the three Title I schools, Maple Street School, School Street School and William Allen School. The middle bar represents gains in the other five elementary schools, and the right hand bar is the average gain for all of the elementary schools in the district. Chart 10
One possible factor in this dramatic difference could be the relative performance of first graders in the Fall administration and the proportion of low-scoring first graders attending the Title I schools. This low performance was noted above. However, when the first grade impact is removed from the equation (Chart 11), the relative difference in growth between Title I and non-Title I schools remains as dramatic. Chart 11
There are many possible explanations for these results. Future studies at the district level will need to isolate differences in the schools and determine which characteristics may have an impact on learning. It can be speculated, however, that one major factor is the availability of resources for Title I schools. There are more services available in reading for first through third graders in all Title I schools, and those resources are made available to all students in the Title I school-wide programs. While a national debate has raged for years regarding whether more money makes a difference in quality of education, research has now established quite clearly that money spent carefully does have a positive impact on student learning. The challenge for districts with a high level of resources, whether for individual schools through Title I or from other treasuries, is to determine how to use those resources most wisely.

Gains by Achievement Quartile

Unfortunately, not all of Rochester’s news in student growth is good. While it is important for schools, and in particular those involved with Title I, to move their lowest achieving students as far as they possibly can, it is equally important to challenge the brightest and most motivated students to work to their potential. Chart 12 shows the growth patterns of students from where they began the year, as measured by the Fall administration of the Gates Reading Test. It demonstrates what Dr. William Sanders calls a shed pattern. That is, those students in the bottom quartile are receiving extra services or services of a type that encourage exceptional growth, while the students in the middle and top are making fewer gains, possibly because so much effort is focused on the bottom quartile. In Rochester's case in reading, this pattern shows a decline with each higher quartile, with a net average loss of NCE status by the students who began the year in the top quartile.
If this trend represents actual effects from instruction and is not the result of artificially low Fall performance by incoming first graders, then it is a disturbing trend that needs to be addressed quickly by individual schools. Chart 12
The same pattern is repeated when looking at gains from fourth to fifth grade on the ITBS Reading sub-scale. When looking at the gains from Spring 2000 to Spring 2001, the greatest gains are made by the students starting at the lowest point, with net losses for students who were in the highest quartile in the previous year (Chart 13). Chart 13
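A quartile breakdown of this kind can be sketched as follows. The example assumes quartiles are formed by ranking students on their starting score and splitting them into four equal-sized groups; the study does not state whether local or national quartile cut points were used, and the score pairs below are invented for illustration.

```python
from statistics import mean


def mean_gain_by_starting_quartile(pairs):
    """Average gain for each starting-score quartile, lowest quartile first.

    pairs is a list of (starting_nce, ending_nce) tuples. Students are ranked
    by starting score and split into four equal-sized groups (an assumption);
    tied starting scores are kept in their original order.
    """
    ordered = sorted(pairs, key=lambda pair: pair[0])
    n = len(ordered)
    if n < 4:
        raise ValueError("need at least 4 students to form quartiles")
    groups = [[] for _ in range(4)]
    for rank, (start, end) in enumerate(ordered):
        groups[rank * 4 // n].append(end - start)
    return [mean(group) for group in groups]


# Hypothetical (fall_nce, spring_nce) pairs showing a downward-sloping pattern.
pairs = [
    (20, 34), (28, 38),   # bottom quartile
    (40, 46), (45, 49),
    (52, 54), (58, 60),
    (70, 69), (80, 77),   # top quartile
]
quartile_gains = mean_gain_by_starting_quartile(pairs)

if __name__ == "__main__":
    for label, gain in zip(("Q1 (lowest)", "Q2", "Q3", "Q4 (highest)"), quartile_gains):
        print(f"{label}: average gain {gain:+.1f} NCE points")
```

In the invented data the average gain falls with each higher quartile and turns negative in the top group, which is the shape the charts describe.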
This pattern appears to be consistent and seems to confirm what has been expressed by some parents and school district critics, but has not previously been measurable. The Board has received a request for a more sophisticated look at this phenomenon through the research and analysis of Dr. William Sanders. Dr. Sanders has the largest database of comparative student growth in the world and has the means to determine with greater certainty in which schools this effect is occurring and to provide additional information to assist in reversing the trend.

Impact of Reading Intervention

That gains are high in the bottom quartile is not at issue. The school district intentionally places additional resources in schools for children at risk, specifically to accelerate their learning and bring them closer to the appropriate level for their age and grade. The interventions include the use of reading specialists to provide individual instruction, work with special education teachers on Individualized Education Plans, and a specific reading curriculum known as Wilson Reading System. Currently, there are fewer than ten teachers in the Rochester School Department who are fully trained and certified in Wilson Reading. The district has not trained more teachers in use of Wilson techniques because the program is relatively new to the district and because it is an intensive program for very small groups and carries a higher cost than some other interventions. The first year data show, however, that the Wilson Reading program is highly effective for the students served by it. Average gains for students receiving Wilson instruction are greater than 12 NCE points, more than 1.5 times as great as gains for students receiving only regular reading instruction. Chart 14
Average gain is only one part of the equation, of course. A second way to look at the effectiveness of interventions is to set standards for growth and see how the interventions affect students in meeting these standards. A reasonable standard of growth is to remain at the same NCE; that is, no gain or loss in NCE standing reflects one year of reading growth in one year of instruction. Because the Fall scores for first graders seem unreasonably low, Chart 15 reflects a gain of 5 NCE points for the standard to have been met. Below Average Gain represents 2 or fewer NCE points. An Average Gain is defined as 3 to 5 NCE points. Meets Standards is from 5 to 10 points of gain; and Exceeds Standards is defined as 11 NCE points or more. These represent relatively high standards as far as gains are concerned.

As with the average gains shown in Chart 14, above, Wilson students are showing exceptional growth, with more than 60 percent exceeding the standard (shown in the dark green) and more than 80 percent of them meeting or exceeding the standard. Overall, our district’s reading instruction seems to be effective, with nearly 75% of all students having met or exceeded the standard for gains, but the gains in the Wilson program do exceed even those of students with other reading interventions. Chart 15
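The four gain bands can be expressed as a simple classification rule. The sketch below assumes whole-number NCE gains and, because a gain of exactly 5 points appears in both the Average Gain and Meets Standards definitions, treats 5 as Meets Standards to match the stated 5-point standard. That tie-break is an assumption made here, and the sample gains are invented.

```python
from collections import Counter

BANDS = ("Below Average Gain", "Average Gain", "Meets Standards", "Exceeds Standards")


def gain_band(gain: int) -> str:
    """Classify a whole-number NCE gain into one of the four bands.

    Assumption: a gain of exactly 5 counts as Meets Standards.
    """
    if gain <= 2:
        return "Below Average Gain"
    if gain < 5:
        return "Average Gain"
    if gain <= 10:
        return "Meets Standards"
    return "Exceeds Standards"


def band_shares(gains):
    """Share of students in each band, as fractions that sum to 1."""
    if not gains:
        raise ValueError("need at least one gain score")
    counts = Counter(gain_band(g) for g in gains)
    return {band: counts[band] / len(gains) for band in BANDS}


if __name__ == "__main__":
    # Hypothetical gains for eight students (illustration only).
    sample_gains = [-3, 1, 4, 5, 8, 11, 15, 20]
    for band, share in band_shares(sample_gains).items():
        print(f"{band}: {share:.0%}")
```

The share meeting or exceeding the standard is then the sum of the last two bands.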
Recommendations

As noted in the first section, the purpose of this study is not to provide specific, detailed recommendations for principals and teachers to apply in individual classes. Rather, it is to introduce them to some of the issues that are basic to the school district and provide them with a departure point for their own studies, to find their own schools' strengths and weaknesses. Accordingly, it is imperative that every school in the district begin looking at its own data, whether based on the findings of this study or on other topics of more urgency to the individual building and classrooms. Based on the results of this analysis, however, the following recommendations are made for the entire district:
Conclusion

As noted at the beginning of this document, there is much more study needed in virtually every academic and non-academic area. This should be the first of a large number of studies of school performance in Rochester. It is hoped that the results of such studies will be improved instruction and higher student performance. At the very least, an awareness of the real issues and performance should give everyone in Rochester a better idea of what kinds of support are needed by the School Department.
The Rochester Schools