When the No Child Left Behind (NCLB) Act of 2001 was first introduced, it carried with it a simplistic goal: to ensure that all children reach "rigorous" standards of performance by the year 2014 without exception. While the merits of such a goal are debatable[1], NCLB established a clear expectation for all schools. Adequate Yearly Progress (AYP) further delineated the measures by which schools would have to achieve interim goals to ensure that they would meet NCLB by 2014.
For example, if a school has 60 percent of its students at or above a performance standard called "proficient" (using assessments and procedures deemed acceptable by the United States Department of Education (USDE) via the Peer Review Process[2]), then by 2014, the other 40 percent must also be proficient. If the school started counting in 2004, it would have 10 years to meet this requirement. Therefore, one would expect (using a simple linear function) that an additional four percent of the school's students, drawn from those previously designated "less than proficient," would have to reach (or grow, stretch, learn, or progress to) the "proficiency standard" each year if the 2014 goal is to be met. If the percentage of proficient students does not rise by four points each year, then the school could be considered as failing to meet its AYP target and could fall under USDE sanctions.
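The linear arithmetic in this example can be sketched in a few lines of Python. The function name and its parameters are illustrative only and are not part of NCLB or any USDE procedure; the sketch simply assumes equal annual increments from the baseline year to 100 percent in 2014.

```python
def linear_ayp_targets(baseline_pct, start_year, end_year=2014):
    """Return {year: target percent proficient} along a straight line to 100.

    Hypothetical helper: assumes equal yearly steps, as in the example above.
    """
    years_remaining = end_year - start_year
    if years_remaining <= 0:
        raise ValueError("start_year must be earlier than end_year")
    # Percentage points that must be added each year.
    annual_step = (100.0 - baseline_pct) / years_remaining
    return {
        start_year + i: round(baseline_pct + annual_step * i, 2)
        for i in range(years_remaining + 1)
    }


# The school from the example: 60 percent proficient in 2004.
targets = linear_ayp_targets(baseline_pct=60, start_year=2004)
print(targets[2005], targets[2014])
```

For the school in the example, the step works out to four percentage points per year, so the 2005 target is 64 percent and the 2014 target is 100 percent.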
None of this is new, but one might assume such an AYP model would require individual student growth (since words like "grow," "stretch," and "progress" imply growth). It does not! In fact, growth measures for individuals arguably run against the premise of NCLB, which is that what matters is attaining proficient status, not growth. We will see later in this text that recent changes initiated by the USDE "soften" this position somewhat.
NCLB Is a Status Model, Not a Growth Model
Without pretending to be visionary, one can clearly see that the emphasis of the provisions of NCLB is status, namely that all students become proficient by 2014. While the definition of "proficient" is debatable (some supporters of NCLB have used the term "proficient" synonymously with "on grade level"), the current status of the students is what is important: "Are they proficient?" The gyrations that students, teachers, parents, and others must go through to get all students to the proficient level are really secondary.
Students who are way below proficient (way below grade level) will have to "catch up" (i.e., grow at a faster pace than their peers) if they are to become proficient by 2014. Similarly, students who are way above proficient (way above grade level) will not have to accelerate their growth to meet the needs of NCLB. So, status is the measure, but growth is the route students must take to become proficient, especially struggling students.
Yet, with the exception of the "safe harbor" provisions of NCLB,[3] students and schools are not given credit for growth, only for reaching the proficient standard. Conceptually then, a student who is performing two grade levels below expectation can make up the "lost time" by growing at a tremendous rate but still might not reach the proficient level and therefore would not get credit for improvement under NCLB as it currently stands.
AYP Measures
The AYP measures for a school are really just one of the ways compliance with the requirements of NCLB is enforced independent of student growth. Consider a school with 60 percent of its grade four students at or above the proficient standard in reading in 2005. If we assume the AYP target is four percent more students becoming proficient each year (as in the previous example), then the expectation is that in 2006, 64 percent of the grade four students will be at or above proficient. Note, however, that this is a different group of students than last year. Because the status data required by NCLB is cross-sectional and not longitudinal, it does not take into account differences in the ability of groups of students from year to year. This can be more than a little confusing as schools follow their students' performances across the grades.
What if, for example, this school's cohort of grade four students missed their proficient target the previous year by two percent? That is, when they were in grade three, two percent fewer students reached the proficient standard than was targeted by AYP. What this means is that in addition to the four percent improvement expected this year in grade four due to AYP, additional improvement is expected to "make up" for the ground lost in third grade. But the AYP target did not take this into account. As such, actual student achievement for the cohort of students in this example must be much greater than the AYP target of an additional four percent to reach proficient status. Students and schools get no credit for this added effort.
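The cohort arithmetic in this example can be made explicit with a small sketch. The function name is hypothetical, and the grade three figures (a 60 percent target and 58 percent actually proficient) are made-up values chosen only to reproduce the two-point shortfall described above.

```python
def required_cohort_gain(prior_target_pct, prior_actual_pct, annual_step_pct):
    """Percentage-point gain a cohort needs this year to get back on the target line.

    Illustrative only: AYP itself does not compute this cohort-level figure.
    """
    # Ground lost last year (zero if the cohort met or beat its target).
    shortfall = max(0, prior_target_pct - prior_actual_pct)
    return annual_step_pct + shortfall


# Hypothetical cohort: grade three target was 60 percent, but only 58 percent
# reached proficient; the AYP step is four points per year.
gain_needed = required_cohort_gain(prior_target_pct=60,
                                   prior_actual_pct=58,
                                   annual_step_pct=4)
print(gain_needed)
```

Under these assumed numbers the cohort must gain six points, not four, and the extra two points are invisible to the cross-sectional AYP target.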
Consider Figures 1 and 2. These data were provided to me by Bob Linn at the November 2005 CASMA-ACT Invitational Conference in Iowa City, Iowa.

Figure 1. NAEP Math Trends. Slide presented at the CASMA-ACT Conference, Robert L. Linn, CRESST and the University of Colorado at Boulder, November 5, 2005.
Figure 1 shows that, using existing National Assessment of Educational Progress (NAEP) mathematics performance and the requirements of NCLB, the projections of student growth needed to meet AYP requirements for grade 4 and grade 8 NAEP mathematics are very unrealistic. The trends for reading are not much better, as displayed in Figure 2.
Figure 2. NAEP Reading Trends. Slide presented at the CASMA-ACT Conference, Robert L. Linn, CRESST and the University of Colorado at Boulder, November 5, 2005.
Figure 2 shows that reading on NAEP has mostly been flat up to now. Yet, AYP under NCLB would require steep improvements if the goal of all students reaching proficiency by 2014 is to be realized. How likely do you think it is that real NAEP data, once collected, will "suddenly" jump up and follow the projected growth path? According to NCLB, it must!
The requirements of NCLB and AYP are for students to reach proficiency by 2014. However, as the examples and the empirical NAEP data have demonstrated, in order to achieve this goal, students will have to grow individually at a much faster pace than has been seen to date, and those students currently behind will not only have to "catch up," but continue to grow at a higher rate in the future. To this end, recent actions by the USDE have provided an opportunity to use individual growth measures under NCLB.[4]
Growth Models and Value-Added
Measures of student growth in education have been with us for a long time. These growth measures take many different shapes and sizes. For example, there are the standard scores associated with most intelligence tests[5]; vertical scale scores like those used on the Pearson PASeries Mathematics assessment called Quantiles®; simple pre-test/post-test differences like those used with Pearson Assessments Basic Achievement Skills Inventory; simple differences between normal-curve equivalents (NCEs) like those previously used under the old Title I funding; or true developmental scale scores like Lexiles®, found on the Pearson PASeries.
It was not so long ago that most state education agencies routinely required measures of growth for their assessments. Oregon used the "RIT scale" or "vertical Rasch units" as it evolved. Florida has studied the use of vertical scales in addition to its horizontal scale score system for almost 10 years. Similarly, Texas had the "Texas Learning Index" (a standard score-based scale for measuring growth) and now uses the "Texas Growth Index" for aggregated or district-level growth modeling. Prior to NCLB, most states required such growth measures as ways to enhance the interpretation of individual student performance and, as such, to provide valuable feedback to students, teachers, and parents targeted to improved learning.
Value-Added Models for Accountability
The idea of using measures of student growth for program improvement developed in conjunction with the need for individual student growth measures for improved instruction. One of the first examples may have been the "Tennessee Career Ladder" in the mid-1980s.[6] (In many ways, this sparked the "value-added" movement, also known as the "Sanders model" of value-added assessment.[7])
Unlike traditional measures of student growth, these types of value-added models use sophisticated statistical models to parse the effects of student background and other achievement characteristics so that teacher effect can be isolated. Conceptually, these early value-added models required a simple formula. A student was "predicted" to reach a certain level on an assessment via the value-added model, but if the student working under a particular teacher was able to beat this prediction, the difference (predicted vs. actual) was the "value added" by this particular teacher. As you may imagine, such a concept was well received because it allowed the history of the students to be taken into account when looking at the "teacher effect," allowing teachers to be evaluated fairly against each other. Such a concept has been greatly debated.
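The "predicted vs. actual" idea can be illustrated with a deliberately simplified sketch. It is not the Sanders model: the statistical machinery that produces each prediction is omitted entirely, the predictions are taken as given, and the teacher names and scores are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def teacher_value_added(records):
    """Average (actual - predicted) score per teacher.

    records: iterable of (teacher, predicted_score, actual_score) tuples.
    Simplified illustration; real value-added models estimate the
    predictions and the teacher effect jointly.
    """
    residuals = defaultdict(list)
    for teacher, predicted, actual in records:
        residuals[teacher].append(actual - predicted)
    return {teacher: mean(diffs) for teacher, diffs in residuals.items()}


# Made-up data: two students per teacher with identical predictions.
records = [
    ("Teacher A", 2050, 2080),
    ("Teacher A", 2120, 2130),
    ("Teacher B", 2050, 2040),
    ("Teacher B", 2120, 2120),
]
value_added = teacher_value_added(records)
print(value_added)
```

In this toy data, Teacher A's students beat their predictions by 20 points on average, while Teacher B's students fall 5 points short, even though both teachers started with students predicted to score the same.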
Growth Models and NCLB
So far, the chronology of growth models has followed a simple flow. First, prior to NCLB, state education agencies (as well as parents, teachers, and others) wanted measures of student growth for the purpose of improved instruction. Second, sophisticated mathematical and measurement models had evolved so that comparisons of teacher merit could be attempted without the complicating factor of different levels of skills and backgrounds for the students assigned to the teacher. Third, NCLB came along with its emphasis on student "status" and desire to see all students proficient. The story would end there if not for a speech from Dr. Margaret Spellings, the U.S. Secretary of Education, who encouraged a "pilot program" for growth modeling within the parameters of NCLB.
The motivation for this pilot program was not teacher accountability. On the contrary, the perceived motivation was to provide "relief" to schools who struggle to bring students up to the proficient level. Recall that one of the problems with the status nature of AYP was that if a school instructed its (presumably below-level) students and made tremendous improvement, it still might not meet the requirements of AYP. This would be because students might not have reached the proficient standard even if they did grow more than a full grade. As you might guess, this would be very disheartening to schools, students, and teachers. Dr. Spellings summarized her goals:
There is nothing inconsistent between this [growth] pilot and the bright lines of the law. A growth model is not a way around accountability standards. It's a way for states that are already raising achievement and following the bright-line principles of the law to strengthen accountability.
In her letter refining the requirements of the growth model pilot,[8] Dr. Spellings continued to define her expectations:
...in response to educators across the country, the Department has been exploring how accountability models that measure improvements in student achievement (i.e., "growth models") could be one such tool to help schools meet the requirements of NCLB.
...our belief is that growth models may show promise for measuring school accountability, giving schools credit for improvement over time, and measuring individual student progress.
The essence of the letter is that more than "one year's progress" is needed for each year in school if all students are to become proficient by 2014. As such, Dr. Spellings indicates that measuring student growth is critical to understanding what improvements need to be made in order to achieve this goal. Readers can investigate on their own the multitude of requirements to be fulfilled before a state education agency will be allowed to participate in the pilot, but consider the following:
Many of these requirements have implications for traditional growth models. For example, if student demographic and other background characteristics cannot be taken into account, then modifications to models like those proposed by Sanders will have to be made. If one year of progress for one year of instruction is not enough, "average cohort gain" models or other regression-based models would have to be modified. Requiring the same proficiency expectations for all would require "individual curve/growth" models to be modified if not rejected outright. In short, the requirements of the growth pilot suggested by the USDE are no simple way to meet the requirements of NCLB.
Who Was Selected to Participate?
Since this article was conceptualized, the USDE has announced which states qualify to participate in the growth model pilot program. Proposals from eight states (Alaska, Arizona, Arkansas, Delaware, Florida, North Carolina, Oregon, and Tennessee) were formally reviewed via a peer review process (others may have been submitted but were not reviewed due to a failure to fulfill other NCLB requirements prior to submission). Only two states were accepted: Tennessee and North Carolina. The other states will be allowed to resubmit their applications and/or negotiate requirements with the USDE. Accepted programs will be scrutinized by the USDE.
What is Pearson Educational Measurement Doing?
Measures of student growth have been with us for a long time. The perceived need for measures of growth is really a need for improved student learning. Sophisticated mathematical and measurement models have been developed so that individual student growth can be used (arguably) to index teacher merit. NCLB started as a status model and only very recently accepted two states into a pilot program for growth models. Given this, where has Pearson Educational Measurement (PEM) been? The answer is, almost everywhere!
PEM has been assisting our state clients in researching, implementing, and supporting growth modeling for many years. Much of our research can be found at pearsonedmeasurement.com and is updated periodically. PEM developed both the Texas Learning Index and the Texas Growth Index referenced earlier in this article. PEM has also advised such states as Florida, California, New York, New Jersey, Minnesota, Washington, and Utah regarding measures of student growth, vertical scales, and value-added models. PEM research in this regard includes professional presentations and publications such as:
Current Research Initiatives
PEM is currently working with the State of Texas on the next generation of growth scales called the "Reaching the Standards" model or RTS. The RTS provides a yearly growth target for students who are below the passing standard. This growth target indicates the performance level the student needs to reach in a future year to be on track for reaching the passing standard by graduation or 2014 under NCLB.
For each student, the RTS model evaluates how far the student is from the passing standard in the current year. That distance is then divided by the number of years the student has left before graduation or before 2014, whichever is smaller. The result is set as the amount of growth the student needs to make in the next year to be on track for passing. This amount can then be recalculated each year to take into account actual student performance from grade to grade.
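A minimal sketch of this calculation follows, assuming the description above is the whole rule. The function and parameter names are hypothetical, the sketch works in raw scale-score points rather than the standard deviation units the RTS actually reports, and the example student is invented (the 2100 passing standard is the TAKS value cited in this article).

```python
def rts_growth_target(current_score, passing_standard,
                      years_to_graduation, current_year, deadline_year=2014):
    """Growth needed next year to stay on track for the passing standard.

    Illustrative reading of the RTS description, not PEM's implementation.
    """
    distance = passing_standard - current_score
    if distance <= 0:
        return 0.0  # already at or above the passing standard
    # Use whichever horizon is shorter: graduation or the NCLB deadline.
    years_available = min(years_to_graduation, deadline_year - current_year)
    if years_available <= 0:
        raise ValueError("no years remain to reach the passing standard")
    return distance / years_available


# Invented student: scores 1900 in 2006 and graduates in five years.
target_growth = rts_growth_target(current_score=1900, passing_standard=2100,
                                  years_to_graduation=5, current_year=2006)
print(target_growth)
```

Here the student is 200 points below passing with five years available, so the target is 40 points of growth next year. Rerunning the function with the following year's actual score gives the recalculated target.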
The RTS Model was primarily designed for use in accountability. It meets the guidelines for growth models to be used in state accountability systems presented in the Spellings letter. The RTS describes how far in current-year standard deviation units a student is from the passing standard and sets a growth target for a future year (in standard deviation units) that will put the student on track for passing. The RTS Model will provide growth targets for all students taking an assessment with a passing standard, such as the Texas Assessment of Knowledge and Skills (TAKS). A scaled score of 2100 is used as the passing standard (i.e., proficient) on the TAKS test. For example, if a student meets or exceeds his/her growth target, that student is considered to be on track for passing. The numbers of students who do not meet the 2100 passing standard but are on track for passing based on the RTS model at the school/district/state levels can be evaluated as an alternate indicator of performance, and perhaps as more than "status" evidence for the school to claim they are meeting the requirements of AYP under NCLB.
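The two ideas in this paragraph, distance expressed in standard deviation units and a school-level count of on-track students, can be sketched as follows. All names are hypothetical, the standard deviation of 200 and the student scores are invented, and the classification rule is only one plausible reading of "meets or exceeds his/her growth target."

```python
def distance_in_sd_units(score, passing_standard, sd):
    """Distance below the passing standard in current-year SD units."""
    return (passing_standard - score) / sd


def summarize_on_track(students, passing_standard=2100):
    """Count students who passed, are on track, or are not on track.

    students: iterable of (current_score, target_score) pairs, where
    target_score is the growth target previously set for this year.
    """
    counts = {"passed": 0, "on_track": 0, "not_on_track": 0}
    for score, target in students:
        if score >= passing_standard:
            counts["passed"] += 1
        elif score >= target:
            counts["on_track"] += 1
        else:
            counts["not_on_track"] += 1
    return counts


# Invented school data: (current score, growth target for this year).
students = [(2150, 2100), (2040, 2030), (1990, 2020), (2080, 2080)]
summary = summarize_on_track(students)
sd_distance = distance_in_sd_units(score=1900, passing_standard=2100, sd=200)
print(summary, sd_distance)
```

In this made-up school, one student passes outright and two more are below 2100 but on track, which is the kind of alternate indicator the paragraph describes.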
[1] See the Council of Chief State School Officers, "NCLB Implementation Issues and Opportunities for Action," http://www.ccsso.org/content/pdfs/IssuesAndOpportunities.pdf.
[2] See NCLB, "Standards and Assessment Peer Review Guidance: Information and Examples for Meeting Requirements of the No Child Left Behind Act of 2001."
[3] See U.S. Department of Education, http://www.ed.gov/policy/elsec/guid/secletter/020724.html.
[4] See U.S. Department of Education, http://www.ed.gov/news/pressreleases/2005/11/11182005.html.
[5] See Zachary, R. A., & Gorsuch, R. L. "Continuous Norming." Journal of Clinical Psychology, 41 (1), 86-94, 1985.
[6] Furtwengler, C. "Tennessee's Career Ladder Plan: They Said It Couldn't Be Done," Educational Leadership, November, ASCD, 1985.
[7] See Sanders, William L. and Sandra P. Horn, "Research Findings from the Tennessee Value-Added Assessment System (TVAAS) Database: Implications for Educational Evaluation and Research."
[8] See U.S. Department of Education, http://www.ed.gov/policy/elsec/guid/secletter/051121.html.